up version
cmungall committed Mar 10, 2018
1 parent 82933ab commit 09c3718
Showing 6 changed files with 206 additions and 1 deletion.
173 changes: 173 additions & 0 deletions SPECIFICATION.md
@@ -0,0 +1,173 @@
# Specification

Sparqlprog is a subset of Prolog, roughly equivalent to
[datalog](https://en.wikipedia.org/wiki/Datalog), that can be compiled
to SPARQL. Sparqlprog can also be executed directly by a native logic
programming engine, so that large database queries can be delegated
to a remote service and combined with local programming. See the
[README](README.md) for more information.

## Queries and Compound Terms

A query is a boolean combination of `compound terms`. A compound term
is composed of a predicate plus zero or more arguments, where each
argument can be a term or a variable.

Predicates may be built-in or defined. The core built-in predicate is
`rdf/3` (the `/3` denotes the predicate has 3 arguments).

The most basic query is a query for all triples:

`rdf(S,P,O)`

Variables are denoted by a leading uppercase letter. In the above
example, all arguments are variables, so the query succeeds for all
triples. It is equivalent to the SPARQL query

`SELECT * WHERE {?s ?p ?o}`

The following query unifies the variable `Cls` with all subjects of a
triple in which the predicate is `rdf:type` and the value/object is
`owl:Class`. Note that "Class" is in single quotes to avoid being
treated as a variable:

`rdf(Cls,rdf:type,owl:'Class')`
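
To see how these two queries behave under native execution, here is a minimal sketch in plain Prolog. It assumes no semweb library is loaded and defines a toy `rdf/3` as ordinary facts. The `ex:` resources and the helper predicates `all_triples/1` and `classes/1` are invented for illustration and are not part of sparqlprog.

```prolog
% Toy triple store: rdf/3 as plain facts (hypothetical data, illustration only).
rdf(ex:animal, rdf:type,        owl:'Class').
rdf(ex:cat,    rdf:type,        owl:'Class').
rdf(ex:cat,    rdfs:subClassOf, ex:animal).
rdf(ex:tom,    rdf:type,        ex:cat).

% All triples: every argument of rdf/3 is an unbound variable.
all_triples(Triples) :-
    findall(t(S, P, O), rdf(S, P, O), Triples).

% All subjects typed as owl:Class. 'Class' is quoted so that it is
% read as an atom rather than as a variable.
classes(Classes) :-
    findall(Cls, rdf(Cls, rdf:type, owl:'Class'), Classes).
```

With these facts, `classes(Cs)` collects `ex:animal` and `ex:cat`, while `ex:tom` is skipped because its type is `ex:cat` rather than `owl:'Class'`.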

## Boolean combinations

Terms can be combined with any combination of conjunction,
disjunction or negation. These are denoted by the symbols ',', ';' and
'\+' respectively. Formally these are all predicates (conjunction and
disjunction are binary, negation is unary), but they can be written
using operator notation for syntactic convenience.

The ',/2' predicate denotes conjunction:

`rdf(Cls,rdf:type,owl:'Class'),rdf(Cls,rdfs:subClassOf,Super)`

The ';/2' predicate denotes disjunction:

`rdf(Obj,rdf:type,owl:'ObjectProperty');rdf(Obj,rdf:type,owl:'DataProperty')`

The '\+/1' predicate denotes negation:

...

Parentheses can be used to group combinations.
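
The three connectives can be tried out natively with a small sketch like the one below. It again assumes a toy `rdf/3` defined as plain facts. The `ex:` data and the predicate names `class_with_super/2`, `property/1` and `root_class/1` are hypothetical, and the negation example only shows how `\+` behaves in a standard Prolog engine, not how sparqlprog translates it.

```prolog
% Toy triple store (hypothetical data, illustration only).
rdf(ex:cat,      rdf:type,        owl:'Class').
rdf(ex:animal,   rdf:type,        owl:'Class').
rdf(ex:cat,      rdfs:subClassOf, ex:animal).
rdf(ex:eats,     rdf:type,        owl:'ObjectProperty').
rdf(ex:has_name, rdf:type,        owl:'DataProperty').

% Conjunction: classes that have an asserted superclass.
class_with_super(Cls, Super) :-
    rdf(Cls, rdf:type, owl:'Class'),
    rdf(Cls, rdfs:subClassOf, Super).

% Disjunction, grouped with parentheses: object or data properties.
property(P) :-
    (   rdf(P, rdf:type, owl:'ObjectProperty')
    ;   rdf(P, rdf:type, owl:'DataProperty')
    ).

% Negation: classes with no asserted superclass.
% Cls is bound by the first goal before \+ is evaluated.
root_class(Cls) :-
    rdf(Cls, rdf:type, owl:'Class'),
    \+ rdf(Cls, rdfs:subClassOf, _).
```

Placing the positive `rdf/3` goal before the negated one matters in native Prolog, because `\+` never binds variables.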

## Rules

A rule is written `Head :- Body`, where the head of the rule is a single term and the body is any boolean combination of terms.

The following rule defines `is_a/2` which is trivially equivalent to an `rdf/3` call with the RDFS subclass predicate:

```
is_a(A,B) :- rdf(A,rdfs:subClassOf,B).
```

Multiple rules for the same predicate are treated as a disjunction. E.g. the
following definition of `is_a/2` succeeds when the subject is a subclass of
the object, or an instance of it.

```
is_a(A,B) :- rdf(A,rdfs:subClassOf,B).
is_a(A,B) :- rdf(A,rdf:type,B).
```
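
As a rough illustration of the disjunctive reading, the sketch below runs the two-clause definition natively against a toy `rdf/3` and compares it with a single rule whose body uses ';'. The `ex:` facts and the name `is_a_alt/2` are assumptions made for this example.

```prolog
% Toy triple store (hypothetical data, illustration only).
rdf(ex:cat, rdfs:subClassOf, ex:mammal).
rdf(ex:tom, rdf:type,        ex:cat).

% Two rules for the same predicate...
is_a(A,B) :- rdf(A,rdfs:subClassOf,B).
is_a(A,B) :- rdf(A,rdf:type,B).

% ...give the same answers as one rule with an explicit disjunction.
is_a_alt(A,B) :-
    (   rdf(A,rdfs:subClassOf,B)
    ;   rdf(A,rdf:type,B)
    ).
```

Both `is_a(A,B)` and `is_a_alt(A,B)` enumerate the subclass pair first and the instance pair second.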

Recursive rules *cannot* be written in sparqlprog. For example, if you write:

```
is_a(A,B) :- rdf(A,rdfs:subClassOf,B).
is_a(A,B) :- is_a(A,Z),is_a(Z,B).
```

This cannot be directly converted to SPARQL. However, in this particular example, a property path can be used - see below.
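
When a program is run natively instead of being compiled to SPARQL, recursion is available, but the left-recursive second rule above is liable to loop under standard depth-first resolution. A minimal sketch of a right-recursive formulation follows. It assumes acyclic toy data, and `is_a_direct/2` and `is_a_star/2` are hypothetical names chosen for this example.

```prolog
% Toy triple store (hypothetical, acyclic data, illustration only).
rdf(ex:cat,    rdfs:subClassOf, ex:mammal).
rdf(ex:mammal, rdfs:subClassOf, ex:animal).

% One asserted subclass step.
is_a_direct(A,B) :- rdf(A,rdfs:subClassOf,B).

% Transitive closure. The recursive call comes after a direct step,
% so each call moves one level up the hierarchy and stops at the top.
is_a_star(A,B) :- is_a_direct(A,B).
is_a_star(A,B) :- is_a_direct(A,Z), is_a_star(Z,B).
```

This only terminates on all solutions when the subclass graph has no cycles. Cyclic data would need tabling or an explicit visited list.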

For examples of rules, see some of the programs in the [ontologies](prolog/sparqlprog/ontologies/) folder.

## Builtins

The builtins typically correspond to predicates defined in the
swi-prolog semweb package. Additional definitions can optionally be
imported.

The two core builtins are:

* `rdf(Subject, Predicate, Object)` (triple queries)
* `rdf(Subject, Predicate, Object, Graph)` (quad queries)

Additionally:

* `rdfs_subclass_of/2` - inferred subClassOf
* `rdfs_individual_of/2` - inferred `rdf:type`

There are also predicate builtins for a subset of SPARQL functions that return booleans, such as

* `str_starts/3`
* `str_ends/3`
* `between/3`

The complete list is not yet implemented.

Comparison operators are also supported. We use the standard Prolog infix operators, e.g.

```
A @=< B
```

for string comparisons, and

```
A =< B
```

for numeric comparisons.
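
In standard Prolog the less-than-or-equal operators are spelled `=<` (arithmetic comparison) and `@=<` (standard order of terms). The sketch below shows how the two differ when run natively. The wrapper names `num_leq/2` and `term_leq/2` are hypothetical and exist only to make the example loadable as a file.

```prolog
% Arithmetic comparison: both sides are evaluated as numbers first.
num_leq(A, B) :- A =< B.

% Standard order of terms: atoms are compared character by character.
term_leq(A, B) :- A @=< B.

% Examples:
%   num_leq(9, 10)       succeeds, because 9 is numerically smaller.
%   term_leq('10', '9')  succeeds, because '1' sorts before '9'.
%   term_leq('9', '10')  fails, for the same reason.
```

The second and third examples show why the choice of operator matters when values that look numeric are stored as text.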

## Functions

In addition to predicates, functions can be used in a query. These can be builtins, but functions can also be defined.

`bind/2` can be used to explicitly set a value.


* concat
* ucase

## Predicate paths


## Aggregate Queries

## Namespaces

## Programs and Modules

## Execution of sparqlprog queries

A query can be executed using the `pl2sparql` command, or from any
language via an API call to a sparqlprog pengine. See the
[README](README.md) for details.

Additionally, queries can be executed from within a prolog program using `??/2`.

For example:

```
TODO
```

## Embedding within logic programs

Any sparqlprog program should be executable directly using a prolog
engine that defines the `rdf/3` predicate, such as swi-prolog.

Additionally, native execution and remote execution can be mixed.









2 changes: 2 additions & 0 deletions bin/pq-ncit
@@ -0,0 +1,2 @@
#!/bin/sh
pl2sparql -u sparqlprog/endpoints -u sparqlprog/ontologies/ncit -s ncit "$@"
2 changes: 2 additions & 0 deletions bin/pq-ontobee
@@ -0,0 +1,2 @@
#!/bin/sh
pl2sparql -u sparqlprog/endpoints -u sparqlprog/ontologies/ontobee -s ontobee "$@"
5 changes: 5 additions & 0 deletions examples/gocam-examples.sh
@@ -15,3 +15,8 @@ pq-go "kinase_activity(A),part_of(A,P),signal_transduction(P),enabled_by(A,G)"
# ---
pq-go "kinase_activity(A),regulates(A,A2),enabled_by(A,G)"

# ---
# user to model
# ---
pq-go -C "rdf(X,dc:contributor,Y),rdf(X,rdf:type,owl:'Ontology')"

23 changes: 23 additions & 0 deletions examples/ontobee-examples.sh
@@ -0,0 +1,23 @@
# All properties in PATO
pq-ontobee "in_ontology(X,pato,owl:'ObjectProperty')"

# All properties in NCIT, plus labels
pq-ontobee -f prolog "in_ontology(X,ncit,owl:'ObjectProperty'),label(X,N)" "x(X,N)"

# All properties in NCIT, auto-labels, contract URIs, show domain and range
pq-ontobee -u sparqlprog/ontologies/ncit -l -f csv "in_ontology(X,ncit,owl:'ObjectProperty'),owl:domain(X,D),owl:range(X,R)" "x(X,D,R)"

# Same, explicit
pq-ontobee -u sparqlprog/ontologies/ncit -f csv "in_ontology(P,ncit,owl:'ObjectProperty'),label(P,PN),owl:domain(P,D),owl:range(P,R),label(D,DN),label(R,RN)" "x(P,PN,D,DN,R,RN)"

# all property usages across ontobee (TODO: select DISTINCT)
pq-ontobee "rdf(_,owl:onProperty,P,G),label(P,PN)" "x(P,PN,G)"

# also:
pq-ontobee "aggregate_group(count(P),[P,G],rdf(_,owl:onProperty,P,G),Num)"

# all triples with a literal with a trailing whitespace
pq-ontobee 'rdf(C,P,V),is_literal(V),str_ends(str(V)," ")'

# all redundant subclass assertions
pq-ontobee -l "subClassOf(A,B),subClassOf(B,C),subClassOf(A,C)"
2 changes: 1 addition & 1 deletion pack.pl
@@ -1,6 +1,6 @@
name('sparqlprog').
title('Logic programming with SPARQL').
-version('0.0.2').
+version('0.0.3').
author('Chris Mungall','[email protected]').
author('Samer Abdallah','[email protected]').
maintainer('Chris Mungall','[email protected]').