Wednesday, 2 December 2009

SPARQL - extension




I know that there is a working group looking at extending the current SPARQL specification/protocol. There are also some discussions outside the working group that outline requirements for such an extension.

As a contribution to these efforts, in this post I want to start outlining things (queries) that are not possible to write using SPARQL in its current state. The need for such query syntax and support stems from the demands of practical Semantic Web development work. Here are some examples. I would appreciate your comments if I am incorrectly assuming that these sorts of queries are not possible to write.

1. A query to verify the cardinality of property values.
For example, in DBPedia, a Band (<http://dbpedia.org/ontology/Band>) has the property <http://dbpedia.org/ontology/genre> with no cardinality restrictions (i.e. a Band can have any number of genres).

Using SPARQL, one cannot extract the Bands that have more than x genres.

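The following query is what one would like to write, but it is not valid SPARQL, since the language currently has no COUNT aggregate:

SELECT ?band
WHERE
{
?band rdf:type <http://dbpedia.org/ontology/Band> .
?band <http://dbpedia.org/ontology/genre> ?genre .
FILTER (COUNT(?genre) > 3)
}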
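Until something like this is supported, one workaround is to fetch the (band, genre) pairs with a plain SELECT and do the counting on the client. Here is a minimal Python sketch of that idea, assuming the result rows are already available as (band, genre) tuples. The function name bands_with_more_genres_than and the sample rows are made up for illustration and are not real DBPedia data:

```python
from collections import defaultdict

def bands_with_more_genres_than(rows, threshold):
    """Return the bands that have more than `threshold` distinct genres.

    `rows` is an iterable of (band_uri, genre_uri) pairs, e.g. the bindings
    of a plain "SELECT ?band ?genre" query (hypothetical input format).
    """
    genres_by_band = defaultdict(set)
    for band, genre in rows:
        genres_by_band[band].add(genre)  # a set ignores duplicate rows
    return sorted(b for b, g in genres_by_band.items() if len(g) > threshold)

# Illustrative rows only, not real DBPedia data.
rows = [
    ("ex:BandA", "ex:Rock"), ("ex:BandA", "ex:Pop"),
    ("ex:BandA", "ex:Jazz"), ("ex:BandA", "ex:Folk"),
    ("ex:BandB", "ex:Rock"), ("ex:BandB", "ex:Rock"),
]
print(bands_with_more_genres_than(rows, 3))
```

The obvious drawback is that every (band, genre) pair has to travel to the client before anything can be filtered out, which is exactly why doing the count inside the query would be preferable.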

2. Support for negation. SPARQL has no explicit negation operator. Negation is clearly an important feature to have, and I will demonstrate its usefulness with one scenario.

In the DBPedia ontology, the classification hierarchy for "Organisation" includes "SportsTeam" as one of its subclasses.


If you are mapping your ontology to DBPedia's and you have a slightly different hierarchy in which "SportsTeam" is not a subclass of "Organisation", then in order to retrieve all the instances of the Organisation class that are not "SportsTeam" instances you will need negation (ALL(Organisation) - ALL(SportsTeam)).

The alternative is a cumbersome solution in which the query retrieves the instances of every "Organisation" subclass except SportsTeam and then merges them.
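Another workaround is to run two simple queries, one for the instances of Organisation and one for the instances of SportsTeam, and subtract the second result from the first on the client. Here is a minimal Python sketch of that subtraction, assuming both result sets are already available as lists of URIs. The function name instances_excluding and the sample URIs are hypothetical:

```python
def instances_excluding(all_instances, excluded_instances):
    """Emulate ALL(Organisation) - ALL(SportsTeam) on the client side."""
    return sorted(set(all_instances) - set(excluded_instances))

# Illustrative URIs only, standing in for two separate query results.
organisations = ["ex:AcmeCorp", "ex:CityFC", "ex:OpenUniversity"]
sports_teams = ["ex:CityFC"]
print(instances_excluding(organisations, sports_teams))
```

This keeps each query trivial, but it needs two round trips and moves work that belongs in the query engine into application code.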

3. Regex support in CONSTRUCT queries.

One of the purposes of CONSTRUCT queries is to map ontologies and datasets. It would be quite useful to be able to specify regex-based URI patterns as part of CONSTRUCT queries. This is particularly applicable where one wants to reuse the unique and human-readable identifiers from DBPedia. For example, it would be useful to have something like:

CONSTRUCT
{
<http://myurischeme/resource/"unique id from DBPedia URI"> rdf:type myOntology:myType.
}
WHERE
{
<http://dbpedia.org/resource/ABC> rdf:type <http://dbpedia.org/ontology/Person>
}
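For now, this kind of URI rewriting has to happen outside the query. Here is a minimal Python sketch of the mapping the CONSTRUCT above is meant to express, assuming DBPedia resource URIs follow the http://dbpedia.org/resource/<id> layout. The helper names to_local_uri and construct_type_triple are hypothetical, and the target namespace and type are the placeholders from the query:

```python
import re

# Assumed URI layout; the target namespace is the placeholder from the query above.
DBPEDIA_RESOURCE = re.compile(r"^http://dbpedia\.org/resource/(?P<local_id>.+)$")
TARGET_PREFIX = "http://myurischeme/resource/"

def to_local_uri(dbpedia_uri):
    """Map a DBPedia resource URI to the local scheme, keeping its identifier."""
    match = DBPEDIA_RESOURCE.match(dbpedia_uri)
    if match is None:
        raise ValueError("not a DBPedia resource URI: " + dbpedia_uri)
    return TARGET_PREFIX + match.group("local_id")

def construct_type_triple(dbpedia_uri, type_uri="myOntology:myType"):
    """Build the (subject, predicate, object) triple the CONSTRUCT would emit."""
    return (to_local_uri(dbpedia_uri), "rdf:type", type_uri)

print(construct_type_triple("http://dbpedia.org/resource/ABC"))
```

The triples produced this way still have to be serialised and loaded into the target store by hand, which is the step a regex-aware CONSTRUCT would make unnecessary.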