From 7157bf39e102f1cfb60a13c1dd8465a659a1cd42 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 14 Dec 2016 16:10:06 +0100 Subject: [PATCH 01/30] Add the constraints CIP - General outline - Describe the node uniqueness constraint --- .../CIP2016-12-14-Constraint-syntax.adoc | 174 ++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc new file mode 100644 index 0000000000..d668cbab4d --- /dev/null +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -0,0 +1,174 @@ += CIP2016-12-16 - Constraints syntax +:numbered: +:toc: +:toc-placement: macro +:source-highlighter: codemirror + +*Author:* Mats Rydberg + +[abstract] +.Abstract +-- +This CIP describes syntax and semantics for Cypher constraints. +These are language constructs that impose restrictions on the shape of the data graph, and how statements are allowed to change it. +-- + +toc::[] + +== Motivation + +Constraints provide utility for shaping the data graph. + +== Background + +Cypher has a loose notion of schema, in which nodes and relationships may take very heterogeneous forms, both in terms of properties and in graph patterns. +Constraints allows us to bound the heterogeneous nature of the property graph into a more regular form. + +== Proposal + +This CIP includes the following proposed constraints: + +* Node property uniqueness constraint +* Node property existence constraint +* Relationship property existence constraint + +Each constraint is detailed in its own below section. + +Once a constraint has been created, it may not be amended. +Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. + +==== Constraint names + +All constraints require the user to specify a nonempty _name_ at constraint creation time. 
+This name is subsequently the handle with which a user may refer to the constraint, e.g. when dropping it. + +// TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? + +=== Syntax overview + +The syntax for all constraints follows the same basic outline. + +.Grammar definition for constraint syntax. +[source, ebnf] +---- +constraint command = create-constraint | drop-constraint ; +create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", constraint-pattern, "REQUIRE", constraint-expr ; +constraint-name = symbolic-name ; +constraint-pattern = node-pattern | simple-pattern ; +constraint-expr = uniqueness-constraint | existence-constraint ; +drop-constraint = "DROP", "CONSTRAINT", constraint-name ; +---- + +The constraint expressions vary depending on the actual constraint (see the detailed sections). + +.Example of dropping a constraint with name foo: +[source, cypher] +---- +DROP CONSTRAINT foo +---- + +=== Semantics overview + +The semantics are defined for each type of constraint, but some characteristics are shared: + +* When a statement tries to create a constraint on a graph where the data does not pass the constraint criterion, that statement will raise an error. +* When a statement tries to create a constraint with a name that already exists, that statement will raise an error. +* When a statement tries to drop a constraint referencing a name that does not exist, that statement will raise an error. +* When an updating statement tries to modify the graph in such a way that it would violate a constraint, that statement will raise an error. + +=== Node property uniqueness constraint + +This constraint enforces that there can not be duplicate values of a certain property for a certain type of node. +For example, that among nodes labeled with `:Person`, each `email` property must be unique. 
+ +==== Syntax + +.Grammar definition for node property uniqueness constraint: +[source, ebnf] +---- +uniqueness-constraint = "UNIQUE", property-expression, { ",", property-expression } ; +---- + +.Example of single-property uniqueness constraint: +[source, cypher] +---- +CREATE CONSTRAINT unique_person_email +FOR (p:Person) +REQUIRE UNIQUE p.email +---- + +.Example of multiple-property uniqueness constraint: +[source, cypher] +---- +CREATE CONSTRAINT unique_person_details +FOR (p:Person) +REQUIRE UNIQUE p.name, p.email, p.address +---- + +==== Semantics + +A property uniqueness constraint is applied on nodes with a specific label, for one or more property keys. +The constraint applies to all nodes where the property exist; its value must be non-null. +When more than one property key is defined as part of the constraint, the uniqueness applies only to nodes where _all_ of the properties exist (are non-null). +The uniqueness mandates that two distinct nodes within the domain of the constraint can not have the same combination of values for the defined properties (respectively). + +===== Example + +Consider the graph created by the following statement: + +[source, cypher] +---- +CREATE (:Color {name: 'white', rgb: 255}) +CREATE (:Color {name: 'black', rgb: 0}) +CREATE (:Color {name: 'very, very dark grey', rgb: 0}) // rounding error! +---- + +Due to the duplication of the `rgb` property, the following attempt at creating a constraint will fail: + +[source, cypher] +---- +CREATE CONSTRAINT only_one_color_per_rgb +FOR (c:Color) +REQUIRE UNIQUE c.rgb +---- + +Suppose that we would rather like to have one color node per name _and_ RGB value (to work around the rounding errors). 
+We could then use the following constraint, without modifying our data: + +[source, cypher] +---- +CREATE CONSTRAINT unique_color_nodes +FOR (c:Color) +REQUIRE UNIQUE c.rgb, c.name +---- + +=== Interaction with existing features + +The main interaction between the constraints and the rest of the language happens during updating statements. +Existing constraints will cause certain updating statements to fail; in fact, that's the main purpose. + +=== Alternatives + +Plenty of alternative syntaxes have been discussed: + +* `GIVEN`, `CONSTRAIN`, `ASSERT` instead of `FOR` +* `ASSERT`, `ENFORCE`, `IMPLIES` instead of `REQUIRE` + +The use of existing expression to express uniqueness, instead of using a new keyword `UNIQUE`, on the form: +---- +FOR (p:Person), (q:Person) +REQUIRE p.email <> q.email AND p <> q +---- +which quickly becomes unwieldy for multiple properties. + +== What others do + +// TODO: SQL syntax for constraints + +== Benefits to this proposal + +Constraints make Cypher's notion of schema more well-defined, and allows users to keep graphs in a more regular, easier to manage form. + +== Caveats to this proposal + +For an implementing system, some constraints may prove challenging to enforce, as they generally require scanning through large parts of the graph to look for conflicting entities. 
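The uniqueness semantics added by the patch above can be sketched in Python. This is only an illustrative model under stated assumptions, not part of the CIP or of any implementation: nodes are modelled as plain dictionaries, and the helper name `uniqueness_violations` is hypothetical. It mirrors the `:Color` example from the patch: only nodes that carry the label and have all listed properties are considered, and any value combination shared by two such nodes counts as a violation.

```python
from collections import defaultdict


def uniqueness_violations(nodes, label, keys):
    """Return groups of nodes that share a value combination for `keys`.

    Hypothetical helper: a node is modelled as {"labels": set, "props": dict}.
    """
    groups = defaultdict(list)
    for node in nodes:
        props = node["props"]
        # Only nodes with the label where all listed properties are non-null
        # are subject to the constraint.
        if label in node["labels"] and all(props.get(k) is not None for k in keys):
            groups[tuple(props[k] for k in keys)].append(node)
    return [group for group in groups.values() if len(group) > 1]


# The three :Color nodes created in the patch's example.
colors = [
    {"labels": {"Color"}, "props": {"name": "white", "rgb": 255}},
    {"labels": {"Color"}, "props": {"name": "black", "rgb": 0}},
    {"labels": {"Color"}, "props": {"name": "very, very dark grey", "rgb": 0}},
]

# REQUIRE UNIQUE c.rgb: two nodes share rgb 0, so creation would fail.
rgb_conflicts = uniqueness_violations(colors, "Color", ["rgb"])

# REQUIRE UNIQUE c.rgb, c.name: every (rgb, name) pair is distinct.
pair_conflicts = uniqueness_violations(colors, "Color", ["rgb", "name"])
```

A node that lacks one of the listed properties is skipped entirely, which is why two `:Color` nodes without an `rgb` value would not conflict with each other under this model.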
From 6448277604c5b51c645a0e3896d824f2eb44569e Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 14 Dec 2016 16:36:32 +0100 Subject: [PATCH 02/30] Add property existence constraint syntax --- .../CIP2016-12-14-Constraint-syntax.adoc | 37 +++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index d668cbab4d..1846c17824 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -142,6 +142,43 @@ FOR (c:Color) REQUIRE UNIQUE c.rgb, c.name ---- +[[existence]] +=== Property existence constraints + +Property existence constraints are defined for both nodes and relationships, but the semantics are the same. +For this reason we will go over both constraints in the same section. + +==== Syntax + +.Grammar definition for property existence constraint: +[source, ebnf] +---- +existence-constraint = "exists", "(", property-expression, ")" ; +---- + +.Example of node property existence constraint: +[source, cypher] +---- +CREATE CONSTRAINT colors_must_have_rgb +FOR (c:Color) +REQUIRE exists(c.rgb) +---- + +.Example of relationship property existence constraint: +[source, cypher] +---- +CREATE CONSTRAINT rates_have_quality +FOR ()-[l:RATED]-() +REQUIRE exists(l.rating) +---- + +==== Semantics + +Property existence constraints enforce that the value of the specified property is non-null for all entities in the constraint domain. + + +===== Example + === Interaction with existing features The main interaction between the constraints and the rest of the language happens during updating statements. 
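The existence semantics described in this patch can be modelled in the same spirit. The sketch below is an assumption-laden illustration rather than the CIP's definition: entities (nodes or relationships) are dictionaries, `kinds` stands in for either node labels or a relationship type, and `existence_violations` is a hypothetical helper name.

```python
def existence_violations(entities, kind, prop):
    """Return entities of the given kind whose `prop` is missing or null.

    Hypothetical helper: `kind` is a node label or a relationship type, and
    each entity is modelled as {"kinds": set, "props": dict}.
    """
    return [
        entity
        for entity in entities
        if kind in entity["kinds"] and entity["props"].get(prop) is None
    ]


entities = [
    {"kinds": {"Color"}, "props": {"name": "white", "rgb": 255}},
    {"kinds": {"Color"}, "props": {"name": "mystery"}},  # no rgb: violates
    {"kinds": {"Person"}, "props": {"name": "Ada"}},  # not a :Color, ignored
]

# Models REQUIRE exists(c.rgb) FOR (c:Color).
missing_rgb = existence_violations(entities, "Color", "rgb")
```

Under this model, creating `colors_must_have_rgb` would be rejected while `missing_rgb` is non-empty, and any later update that produced such an entity would be rejected as well.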
From 6b667b0cb91077ba73cb3f47f76f644b3c578a37 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 15 Dec 2016 10:13:46 +0100 Subject: [PATCH 03/30] Introduce the concept of domain - Define domain for uniqueness - Define domain for existence --- .../CIP2016-12-14-Constraint-syntax.adoc | 22 ++++++++++++------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 1846c17824..f3e058a939 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -28,9 +28,9 @@ Constraints allows us to bound the heterogeneous nature of the property graph in This CIP includes the following proposed constraints: -* Node property uniqueness constraint -* Node property existence constraint -* Relationship property existence constraint +* <<uniqueness,Node property uniqueness constraint>> +* <<existence,Node property existence constraint>> +* <<existence,Relationship property existence constraint>> Each constraint is detailed in its own below section. @@ -76,6 +76,10 @@ The semantics are defined for each type of constraint, but some characteristics * When a statement tries to drop a constraint referencing a name that does not exist, that statement will raise an error. * When an updating statement tries to modify the graph in such a way that it would violate a constraint, that statement will raise an error. +The constraints define a _domain_ within which the constraint applies. +The domain is defined by the constraint pattern. + +[[uniqueness]] === Node property uniqueness constraint This constraint enforces that there can not be duplicate values of a certain property for a certain type of node. @@ -107,10 +111,10 @@ REQUIRE UNIQUE p.name, p.email, p.address ==== Semantics -A property uniqueness constraint is applied on nodes with a specific label, for one or more property keys. -The constraint applies to all nodes where the property exist; its value must be non-null. 
-When more than one property key is defined as part of the constraint, the uniqueness applies only to nodes where _all_ of the properties exist (are non-null). -The uniqueness mandates that two distinct nodes within the domain of the constraint can not have the same combination of values for the defined properties (respectively). +The domain of a property uniqueness constraint is defined as all the nodes with a specific label where the specified property key(s) exist (are non-null). +When more than one property key is defined as part of the constraint, only nodes where _all_ of the properties exist are part of the domain. + +The uniqueness mandates that two distinct nodes within the domain can not have the same combination of values for the defined properties (respectively). ===== Example @@ -174,8 +178,10 @@ REQUIRE exists(l.rating) ==== Semantics -Property existence constraints enforce that the value of the specified property is non-null for all entities in the constraint domain. +The domain of a node property existence constraint consists of all nodes with the specified label. +Similarly, the domain of a relationship property existence constraint consists of all relationships with the specified type. +Property existence constraints mandates that the value of the specified property exists (is non-null) for all entities in the domain. 
===== Example From bbb46f4fd81ba4e34dba0560992e69f25767d36d Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 15 Dec 2016 10:39:49 +0100 Subject: [PATCH 04/30] Add example for existence constraint - Use hex integers for rgb examples --- .../CIP2016-12-14-Constraint-syntax.adoc | 32 +++++++++++++++++-- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index f3e058a939..9201b0ec45 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -122,9 +122,9 @@ Consider the graph created by the following statement: [source, cypher] ---- -CREATE (:Color {name: 'white', rgb: 255}) -CREATE (:Color {name: 'black', rgb: 0}) -CREATE (:Color {name: 'very, very dark grey', rgb: 0}) // rounding error! +CREATE (:Color {name: 'white', rgb: 0xffffff}) +CREATE (:Color {name: 'black', rgb: 0x000000}) +CREATE (:Color {name: 'very, very dark grey', rgb: 0x000000}) // rounding error! ---- Due to the duplication of the `rgb` property, the following attempt at creating a constraint will fail: @@ -185,6 +185,32 @@ Property existence constraints mandates that the value of the specified property ===== Example +Consider the graph containing `:Color` nodes. +Each color has an integral RGB value representation in a property `rgb`. +Users may lookup color nodes to extract their RGB values for application processing. +Users may also add new color nodes to the graph. + +Suppose the query that looks up the RGB value of a color with a given name looks like this: + +[source, cypher] +---- +MATCH (c:Color {name: $name}) +WHERE exists(c.rgb) +RETURN c.rgb +---- + +The `WHERE` clause protects the application from receiving `null` values back for user-defined colors where the RGB values have not been specified correctly. 
+It may however be eliminated by the introduction of a node property existence constraint: + +[source, cypher] +---- +CREATE CONSTRAINT colors_must_have_rgb +FOR (c:Color) +REQUIRE exists(c.rgb) +---- + +Any updating statement that would create a `:Color` node without specifying a `rgb` property for it would now fail. + === Interaction with existing features The main interaction between the constraints and the rest of the language happens during updating statements. From bf6e05c36643b56025ec022d7b3a76b99acb0ee5 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 15 Dec 2016 10:53:20 +0100 Subject: [PATCH 05/30] Add SQL examples --- .../CIP2016-12-14-Constraint-syntax.adoc | 44 ++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 9201b0ec45..47a0802749 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -232,7 +232,49 @@ which quickly becomes unwieldy for multiple properties. == What others do -// TODO: SQL syntax for constraints +In SQL, the following constraints exist (http://www.w3schools.com/sql/sql_constraints.asp): + +* `NOT NULL` - Indicates that a column cannot store NULL value +* `UNIQUE` - Ensures that each row for a column must have a unique value +* `PRIMARY KEY` - A combination of a `NOT NULL` and `UNIQUE`. Ensures that a column (or combination of two or more columns) have a unique identity which helps to find a particular record in a table more easily and quickly +* `FOREIGN KEY` - Ensure the referential integrity of the data in one table to match values in another table +* `CHECK` - Ensures that the value in a column meets a specific condition +* `DEFAULT` - Specifies a default value for a column +The next chapters will describe each constraint in detail. 
+ +The property existence constraints represent the same functionality as the `NOT NULL` SQL constraint. +The node property uniqueness constraint represents the `PRIMARY KEY` SQL constraint. + +SQL constraints may be introduced at table creation time (in a `CREATE TABLE` statement), or in an `ALTER TABLE` statement: + +.Creating a persons table in SQL Server / Oracle / MS Access: +[source, sql] +---- +CREATE TABLE Persons +( + P_Id int NOT NULL UNIQUE, + LastName varchar(255) NOT NULL, + FirstName varchar(255)) +---- + +.Creating a persons table in MySQL: +[source, sql] +---- +CREATE TABLE Persons +( + P_Id int NOT NULL, + LastName varchar(255) NOT NULL, + FirstName varchar(255), + UNIQUE (P_Id) +) +---- + +.Adding a named composite `UNIQUE` constraint in MySQL / SQL Server / Oracle / MS Access: +[source, sql] +---- +ALTER TABLE Persons +ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName) +---- == Benefits to this proposal From a7521d292dcfd88cd41abb61ff95168491276bd5 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 15 Dec 2016 11:12:54 +0100 Subject: [PATCH 06/30] Update standardisation scope --- docs/standardisation-scope.adoc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/standardisation-scope.adoc b/docs/standardisation-scope.adoc index 3663597885..5e6ead0a9b 100644 --- a/docs/standardisation-scope.adoc +++ b/docs/standardisation-scope.adoc @@ -44,6 +44,10 @@ It is the goal of this project to create a good and feature-rich standard langua * `allShortestPaths()` * `shortestPath()` +=== Commands + +* `CREATE CONSTRAINT` + === Operators ==== General @@ -210,7 +214,6 @@ It is the goal of this project to create a good and feature-rich standard langua === Commands -* `CREATE CONSTRAINT` * `CREATE INDEX` === Operators From 27cd6e6f1b3fd3898cfeb9424b871f49f697f3b0 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 15 Dec 2016 12:23:50 +0100 Subject: [PATCH 07/30] Add grammar rules for new constraint syntax --- 
grammar/basic-grammar.xml | 2 +- grammar/commands.xml | 45 ++++++++++++++++++++- tools/grammar/src/test/resources/cypher.txt | 13 ++++++ 3 files changed, 57 insertions(+), 3 deletions(-) diff --git a/grammar/basic-grammar.xml b/grammar/basic-grammar.xml index 64a78a90a9..69a71c427c 100644 --- a/grammar/basic-grammar.xml +++ b/grammar/basic-grammar.xml @@ -146,7 +146,7 @@ - + : &WS; diff --git a/grammar/commands.xml b/grammar/commands.xml index 3466bee5b6..6ffa9ebe47 100644 --- a/grammar/commands.xml +++ b/grammar/commands.xml @@ -44,7 +44,7 @@ xmlns:rr="http://opencypher.org/railroad" xmlns:oc="http://opencypher.org/opencypher"> - + @@ -57,10 +57,51 @@ + + + + + + + + + + CREATE &SP; CONSTRAINT &SP; &SP; + FOR &SP; &SP; + REQUIRE &SP; + + + + DROP &SP; CONSTRAINT &SP; + + + + + + + + + ( &var; &label; ) + ( ) - [ &var; ] - ( ) + + + + + + + - + + UNIQUE &SP; &WS; , &WS; + + + + exists ( ) + + + CREATE &SP; diff --git a/tools/grammar/src/test/resources/cypher.txt b/tools/grammar/src/test/resources/cypher.txt index 786eeadb69..8c3b14c25d 100644 --- a/tools/grammar/src/test/resources/cypher.txt +++ b/tools/grammar/src/test/resources/cypher.txt @@ -312,3 +312,16 @@ CALL db.labels() YIELD * WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ CALL db.labels() YIELD x WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ +CREATE CONSTRAINT foo +FOR (p:Person) +REQUIRE UNIQUE p.name§ +CREATE CONSTRAINT bar +FOR (p:Person) +REQUIRE UNIQUE p.name, p.email§ +CREATE CONSTRAINT baz +FOR (p:Person) +REQUIRE exists(p.name)§ +CREATE CONSTRAINT cru +FOR ()-[r:REL]-() +REQUIRE exists(r.property)§ +DROP CONSTRAINT foo_bar_baz§ From 08f8eb7876db65f5bf86d995a7ce1e9fbb74d264 Mon Sep 17 00:00:00 2001 From: Petra Selmer Date: Wed, 15 Feb 2017 10:49:59 +0000 Subject: [PATCH 08/30] Edited the textual contents of the CIP --- .../CIP2016-12-14-Constraint-syntax.adoc | 100 +++++++++--------- 1 file changed, 49 insertions(+), 51 
deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 47a0802749..a38d5c6238 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -17,12 +17,12 @@ toc::[] == Motivation -Constraints provide utility for shaping the data graph. +Constraints provide the means by which various aspects of the data graph may be controlled. == Background -Cypher has a loose notion of schema, in which nodes and relationships may take very heterogeneous forms, both in terms of properties and in graph patterns. -Constraints allows us to bound the heterogeneous nature of the property graph into a more regular form. +Cypher has a loose notion of a schema, in which nodes and relationships may take very heterogeneous forms, both in terms of properties and in graph patterns. +Constraints allow us to mould the heterogeneous nature of the property graph into a more regular form. == Proposal @@ -32,7 +32,7 @@ This CIP includes the following proposed constraints: * <<existence,Node property existence constraint>> * <<existence,Relationship property existence constraint>> -Each constraint is detailed in its own below section. +Each constraint is detailed in the sections below. Once a constraint has been created, it may not be amended. Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. @@ -61,7 +61,7 @@ drop-constraint = "DROP", "CONSTRAINT", constraint-name ; The constraint expressions vary depending on the actual constraint (see the detailed sections). 
-.Example of dropping a constraint with name foo: +.Example of dropping a constraint with name `foo`: [source, cypher] ---- DROP CONSTRAINT foo @@ -69,12 +69,12 @@ DROP CONSTRAINT foo === Semantics overview -The semantics are defined for each type of constraint, but some characteristics are shared: +The following list describes the situations in which an error will be raised: -* When a statement tries to create a constraint on a graph where the data does not pass the constraint criterion, that statement will raise an error. -* When a statement tries to create a constraint with a name that already exists, that statement will raise an error. -* When a statement tries to drop a constraint referencing a name that does not exist, that statement will raise an error. -* When an updating statement tries to modify the graph in such a way that it would violate a constraint, that statement will raise an error. +* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. +* Attempting to create a constraint with a name that already exists. +* Attempting to drop a constraint referencing a non-existent name. +* Attempting to modify the graph in such a way that it would violate a constraint. The constraints define a _domain_ within which the constraint applies. The domain is defined by the constraint pattern. @@ -82,18 +82,18 @@ The domain is defined by the constraint pattern. [[uniqueness]] === Node property uniqueness constraint -This constraint enforces that there can not be duplicate values of a certain property for a certain type of node. -For example, that among nodes labeled with `:Person`, each `email` property must be unique. +This constraint enforces that there cannot be duplicate values of some property `p` for any node labeled with some label `l`. +For example, this constraint can specify that the `email` property must be unique for all nodes labeled with `:Person`. 
==== Syntax -.Grammar definition for node property uniqueness constraint: +.Grammar definition for the node property uniqueness constraint: [source, ebnf] ---- uniqueness-constraint = "UNIQUE", property-expression, { ",", property-expression } ; ---- -.Example of single-property uniqueness constraint: +.Example of a single-property uniqueness constraint: [source, cypher] ---- CREATE CONSTRAINT unique_person_email @@ -101,7 +101,7 @@ FOR (p:Person) REQUIRE UNIQUE p.email ---- -.Example of multiple-property uniqueness constraint: +.Example of a multiple-property uniqueness constraint: [source, cypher] ---- CREATE CONSTRAINT unique_person_details @@ -111,10 +111,10 @@ REQUIRE UNIQUE p.name, p.email, p.address ==== Semantics -The domain of a property uniqueness constraint is defined as all the nodes with a specific label where the specified property key(s) exist (are non-null). +The domain of a property uniqueness constraint is defined as all the nodes with a specific label where the specified property key(s) exist (i.e. are not null). When more than one property key is defined as part of the constraint, only nodes where _all_ of the properties exist are part of the domain. -The uniqueness mandates that two distinct nodes within the domain can not have the same combination of values for the defined properties (respectively). +The uniqueness constraint mandates that two distinct nodes within the domain cannot have the same combination of values for the defined properties. ===== Example @@ -127,7 +127,7 @@ CREATE (:Color {name: 'black', rgb: 0x000000}) CREATE (:Color {name: 'very, very dark grey', rgb: 0x000000}) // rounding error! 
---- -Due to the duplication of the `rgb` property, the following attempt at creating a constraint will fail: +Owing to the duplication of the `rgb` property, the following attempt at creating a constraint will fail: [source, cypher] ---- @@ -149,18 +149,18 @@ REQUIRE UNIQUE c.rgb, c.name [[existence]] === Property existence constraints -Property existence constraints are defined for both nodes and relationships, but the semantics are the same. -For this reason we will go over both constraints in the same section. +Property existence constraints are defined for both nodes and relationships; these have the same semantics. +We now describe both of these. ==== Syntax -.Grammar definition for property existence constraint: +.Grammar definition for the property existence constraint: [source, ebnf] ---- existence-constraint = "exists", "(", property-expression, ")" ; ---- -.Example of node property existence constraint: +.Example of a node property existence constraint: [source, cypher] ---- CREATE CONSTRAINT colors_must_have_rgb @@ -168,7 +168,7 @@ FOR (c:Color) REQUIRE exists(c.rgb) ---- -.Example of relationship property existence constraint: +.Example of a relationship property existence constraint: [source, cypher] ---- CREATE CONSTRAINT rates_have_quality @@ -181,16 +181,16 @@ REQUIRE exists(l.rating) The domain of a node property existence constraint consists of all nodes with the specified label. Similarly, the domain of a relationship property existence constraint consists of all relationships with the specified type. -Property existence constraints mandates that the value of the specified property exists (is non-null) for all entities in the domain. +The property existence constraint mandates that the value of the specified property exists (i.e. is not null) for all entities in the domain. ===== Example Consider the graph containing `:Color` nodes. -Each color has an integral RGB value representation in a property `rgb`. 
-Users may lookup color nodes to extract their RGB values for application processing. -Users may also add new color nodes to the graph. +Each color is represented as an integer-type RGB value in a property `rgb`. +Users may look up nodes labeled with `:Color` to extract their RGB values for application processing. +Users may also add new `:Color`-labeled nodes to the graph. -Suppose the query that looks up the RGB value of a color with a given name looks like this: +The following query retrieves the RGB value of a color with a given `name`: [source, cypher] ---- @@ -199,8 +199,8 @@ WHERE exists(c.rgb) RETURN c.rgb ---- -The `WHERE` clause protects the application from receiving `null` values back for user-defined colors where the RGB values have not been specified correctly. -It may however be eliminated by the introduction of a node property existence constraint: +The `WHERE` clause may be used to prevent an application from retrieving `null` values for user-defined colors where the RGB values have not been specified correctly. +It may, however, be eliminated by the introduction of a node property existence constraint: [source, cypher] ---- @@ -213,54 +213,52 @@ Any updating statement that would create a `:Color` node without specifying a `r === Interaction with existing features -The main interaction between the constraints and the rest of the language happens during updating statements. -Existing constraints will cause certain updating statements to fail; in fact, that's the main purpose. +The main interaction between the constraints and the rest of the language occurs during updating statements. +Existing constraints will cause updating statements that would violate them to fail, thereby fulfilling the main purpose of this feature. 
=== Alternatives -Plenty of alternative syntaxes have been discussed: +Alternative syntaxes have been discussed: * `GIVEN`, `CONSTRAIN`, `ASSERT` instead of `FOR` * `ASSERT`, `ENFORCE`, `IMPLIES` instead of `REQUIRE` -The use of existing expression to express uniqueness, instead of using a new keyword `UNIQUE`, on the form: +The use of an existing expression to express uniqueness -- instead of using a new keyword `UNIQUE` -- becomes unwieldy for multiple properties, as exemplified by the following: ---- FOR (p:Person), (q:Person) REQUIRE p.email <> q.email AND p <> q ---- -which quickly becomes unwieldy for multiple properties. == What others do In SQL, the following constraints exist (http://www.w3schools.com/sql/sql_constraints.asp): -* `NOT NULL` - Indicates that a column cannot store NULL value -* `UNIQUE` - Ensures that each row for a column must have a unique value -* `PRIMARY KEY` - A combination of a `NOT NULL` and `UNIQUE`. Ensures that a column (or combination of two or more columns) have a unique identity which helps to find a particular record in a table more easily and quickly -* `FOREIGN KEY` - Ensure the referential integrity of the data in one table to match values in another table +* `NOT NULL` - Indicates that a column cannot store a null value. +* `UNIQUE` - Ensures that each row for a column must have a unique value. +* `PRIMARY KEY` - A combination of a `NOT NULL` and `UNIQUE`. Ensures that a column (or a combination of two or more columns) has a unique identity, reducing the resources required to locate a specific record in a table. +* `FOREIGN KEY` - Ensures the referential integrity of the data in one table matches values in another table. * `CHECK` - Ensures that the value in a column meets a specific condition -* `DEFAULT` - Specifies a default value for a column -The next chapters will describe each constraint in detail. +* `DEFAULT` - Specifies a default value for a column. 
-The property existence constraints represent the same functionality as the `NOT NULL` SQL constraint. -The node property uniqueness constraint represents the `PRIMARY KEY` SQL constraint. +The property existence constraints correspond to the `NOT NULL` SQL constraint. +The node property uniqueness constraint corresponds to the `PRIMARY KEY` SQL constraint. -SQL constraints may be introduced at table creation time (in a `CREATE TABLE` statement), or in an `ALTER TABLE` statement: +SQL constraints may be introduced at table creation time in a `CREATE TABLE` statement, or in an `ALTER TABLE` statement: -.Creating a persons table in SQL Server / Oracle / MS Access: +.Creating a `Person` table in SQL Server / Oracle / MS Access: [source, sql] ---- -CREATE TABLE Persons +CREATE TABLE Person ( P_Id int NOT NULL UNIQUE, LastName varchar(255) NOT NULL, FirstName varchar(255)) ---- -.Creating a persons table in MySQL: +.Creating a `Person` table in MySQL: [source, sql] ---- -CREATE TABLE Persons +CREATE TABLE Person ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, @@ -272,14 +270,14 @@ CREATE TABLE Persons .Adding a named composite `UNIQUE` constraint in MySQL / SQL Server / Oracle / MS Access: [source, sql] ---- -ALTER TABLE Persons +ALTER TABLE Person ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName) ---- == Benefits to this proposal -Constraints make Cypher's notion of schema more well-defined, and allows users to keep graphs in a more regular, easier to manage form. +Constraints make Cypher's notion of schema more well-defined, allowing users to maintain graphs in a more regular, easier-to-manage form. == Caveats to this proposal -For an implementing system, some constraints may prove challenging to enforce, as they generally require scanning through large parts of the graph to look for conflicting entities. 
+Some constraints may prove challenging to enforce in a system seeking to implement the contents of this CIP, as these generally require scanning through large parts of the graph to locate conflicting entities. From 0d1ae5e92754131796b869eb8574bd748dbb02db Mon Sep 17 00:00:00 2001 From: Petra Selmer Date: Thu, 16 Feb 2017 09:09:02 +0000 Subject: [PATCH 09/30] More textual edits to the Constraints CIP --- .../CIP2016-12-14-Constraint-syntax.adoc | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index a38d5c6238..304fba0ac4 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -40,7 +40,7 @@ Should a user wish to change its definition, it has to be dropped and recreated ==== Constraint names All constraints require the user to specify a nonempty _name_ at constraint creation time. -This name is subsequently the handle with which a user may refer to the constraint, e.g. when dropping it. +This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. // TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? @@ -118,7 +118,11 @@ The uniqueness constraint mandates that two distinct nodes within the domain can ===== Example -Consider the graph created by the following statement: +Consider the graph created by the statement below. +The graph contains nodes labeled with `:Color`. +Each color is represented as an integer-type RGB value in a property `rgb`. +Users may look up nodes labeled with `:Color` to extract their RGB values for application processing. +Users may also add new `:Color`-labeled nodes to the graph. 
[source, cypher] ---- @@ -136,7 +140,7 @@ FOR (c:Color) REQUIRE UNIQUE c.rgb ---- -Suppose that we would rather like to have one color node per name _and_ RGB value (to work around the rounding errors). +Suppose that we would rather like to have one color node per `name` _and_ `rgb` value (to work around the rounding errors). We could then use the following constraint, without modifying our data: [source, cypher] @@ -185,10 +189,7 @@ The property existence constraint mandates that the value of the specified prope ===== Example -Consider the graph containing `:Color` nodes. -Each color is represented as an integer-type RGB value in a property `rgb`. -Users may look up nodes labeled with `:Color` to extract their RGB values for application processing. -Users may also add new `:Color`-labeled nodes to the graph. +Consider once again the graph containing `:Color` nodes. The following query retrieves the RGB value of a color with a given `name`: From 824a2c78e5fab78eb639d38123db7d668a680225 Mon Sep 17 00:00:00 2001 From: Petra Selmer Date: Fri, 17 Feb 2017 08:20:14 +0000 Subject: [PATCH 10/30] Amended w3 reference to denote alterations in 'quoted' text --- cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 304fba0ac4..f05c657908 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -232,7 +232,7 @@ REQUIRE p.email <> q.email AND p <> q == What others do -In SQL, the following constraints exist (http://www.w3schools.com/sql/sql_constraints.asp): +In SQL, the following constraints exist (inspired by http://www.w3schools.com/sql/sql_constraints.asp): * `NOT NULL` - Indicates that a column cannot store a null value. * `UNIQUE` - Ensures that each row for a column must have a unique value. 
From 410305681b7c1f85619af54e769b50d339d6aa69 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 14:17:44 +0100 Subject: [PATCH 11/30] Rework CIP - Specify general constraint language - Specify `UNIQUE` operator - Clearly define semantics for domain and expressions - List all concrete constraints in example section - Add several more examples --- .../CIP2016-12-14-Constraint-syntax.adoc | 174 ++++++++++-------- 1 file changed, 94 insertions(+), 80 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index f05c657908..979c38cb2e 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -26,40 +26,45 @@ Constraints allow us to mould the heterogeneous nature of the property graph int == Proposal -This CIP includes the following proposed constraints: +This CIP specifies the general syntax for constraint definition (and constraint removal), and provides several examples of possible use cases for constraints. +However, the specification does not otherwise specify or limit the space of expressible constraints that the syntax and semantics allow. -* <> -* <> -* <> - -Each constraint is detailed in the sections below. +===== Mutability Once a constraint has been created, it may not be amended. Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. -==== Constraint names +===== Constraint names All constraints require the user to specify a nonempty _name_ at constraint creation time. This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. // TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? -=== Syntax overview +=== Syntax -The syntax for all constraints follow the same basic outline. 
+The constraint syntax is defined as follows: .Grammar definition for constraint syntax. [source, ebnf] ---- constraint command = create-constraint | drop-constraint ; -create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", constraint-pattern, "REQUIRE", constraint-expr ; +create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", constraint-pattern, "REQUIRE", constraint-expr, { "REQUIRE", constraint-expr } ; constraint-name = symbolic-name constraint-pattern = node-pattern | simple-pattern ; -constraint-expr = uniqueness-constraint | existence-constraint ; +constraint-expr = uniqueness-expr | expression ; +uniquness-expr = "UNIQUE", property-expression, { ",", property-expression } drop-constraint = "DROP", "CONSTRAINT", constraint-name ; ---- -The constraint expressions vary depending on the actual constraint (see the detailed sections). +The constraint expression (`constraint-expr` above) is any expression that evaluates to a boolean value. +This allows for very complex concrete constraint definitions within the specified syntax. + +To that set of valid expressions, this CIP further specifies a special prefix operator `UNIQUE`, which is used to assert uniqueness of one or more property expressions. + +==== Removing constraints + +A constraint is removed by referring to its name. .Example of dropping a constraint with name `foo`: [source, cypher] @@ -67,41 +72,25 @@ The constraint expressions vary depending on the actual constraint (see the deta DROP CONSTRAINT foo ---- -=== Semantics overview +=== Semantics -The following list describes the situations in which an error will be raised: - -* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. -* Attempting to create a constraint with a name that already exists. -* Attempting to drop a constraint referencing a non-existent name. -* Attempting to modify the graph in such a way that it would violate a constraint. 
+The semantics for constraints follow these general rules: -The constraints define a _domain_ within which the constraint applies. -The domain is defined by the constraint pattern. +1. The constraint pattern define the constraint domain, where all entities that would be returned by a `MATCH` clause with the same pattern constitute the domain, with one notable exception (see <>). -[[uniqueness]] -=== Node property uniqueness constraint +2. The constraint expressions defined in the `REQUIRE` clauses of the constraint definition must all evaluate to `true`. Any other result raises an error (see <>). -This constraint enforces that there cannot be duplicate values of some property `p` for any node labeled with some label `l`. -For example, this constraint can specify that the `email` property must be unique for all nodes labeled with `:Person`. +3. [[domain-exception]]Entities for which a constraint expression evaluate to `null` under Cypher's ternary logic are _excluded_ from the constraint domain, even if they fit within the constraint pattern. -==== Syntax +==== Uniqueness -.Grammar definition for the node property uniqueness constraint: -[source, ebnf] ----- -uniqueness-constraint = "UNIQUE", property-expression, { ",", property-expression } ; ----- +The new operator `UNIQUE` is only valid as part of a constraint expression. +It takes as argument one or more property expressions, and asserts that the combination of the evaluated values of the expressions (forming a tuple) is unique across the constraint domain. -.Example of a single-property uniqueness constraint: -[source, cypher] ----- -CREATE CONSTRAINT unique_person_email -FOR (p:Person) -REQUIRE UNIQUE p.email ----- +The domain of the uniqueness expression is limited to entities for which _all_ properties defined as arguments to the `UNIQUE` operator exist. +In other words, property expressions which evaluate to `null` are not considered for uniqueness (see <>) above. 
-.Example of a multiple-property uniqueness constraint: +.Example of a constraint definition using `UNIQUE`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- CREATE CONSTRAINT unique_person_details @@ -109,14 +98,31 @@ FOR (p:Person) REQUIRE UNIQUE p.name, p.email, p.address ---- -==== Semantics +==== Errors + +The following list describes the situations in which an error will be raised: + +* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. +* Attempting to create a constraint with a name that already exists. +* Attempting to drop a constraint referencing a non-existent name. +* Attempting to modify the graph in such a way that it would violate a constraint. + +The constraints define a _domain_ within which the constraint applies. +The domain is defined by the constraint pattern. + +==== Compositionality + +It is possible to define multiple `REQUIRE` clauses within the scope of the same constraint. +The semantics between these is that of a conjunction between the constraint expressions of the clauses, such that the constraint is upheld if and only if for all `REQUIRE` clauses, the expression evaluates to `true`. + +This is useful not only for readability and logical separation of different aspects of the same constraint, but also for combining the use of the `UNIQUE` operator with other constraint expressions. -The domain of a property uniqueness constraint is defined as all the nodes with a specific label where the specified property key(s) exist (i.e. are not null). -When more than one property key is defined as part of the constraint, only nodes where _all_ of the properties exist are part of the domain. +=== Examples -The uniqueness constraint mandates that two distinct nodes within the domain cannot have the same combination of values for the defined properties. +In this section we provide several examples of constraints that are possible to express in the specified syntax. 
-===== Example +[NOTE] +The specification in this CIP is limited to the general syntax of constraints, and the following are simply examples of possible uses of the language defined by that syntax. None of the examples provided are to be viewed as mandatory for any Cypher implementation. Consider the graph created by the statement below. The graph contains nodes labeled with `:Color`. @@ -150,21 +156,18 @@ FOR (c:Color) REQUIRE UNIQUE c.rgb, c.name ---- -[[existence]] -=== Property existence constraints +Now, consider the following query which retrieves the RGB value of a color with a given `name`: -Property existence constraints are defined for both nodes and relationships; these have the same semantics. -We now describe both of these. - -==== Syntax - -.Grammar definition for the property existence constraint: -[source, ebnf] +[source, cypher] ---- -existence-constraint = "exists", "(", property-expression, ")" ; +MATCH (c:Color {name: $name}) +WHERE exists(c.rgb) +RETURN c.rgb ---- -.Example of a node property existence constraint: +The `WHERE` clause is here used to prevent an application from retrieving `null` values for user-defined colors where the RGB values have not been specified correctly. +It may, however, be eliminated by the introduction of a constraint asserting the existence of that property: + [source, cypher] ---- CREATE CONSTRAINT colors_must_have_rgb @@ -172,45 +175,53 @@ FOR (c:Color) REQUIRE exists(c.rgb) ---- -.Example of a relationship property existence constraint: +Any updating statement that would create a `:Color` node without specifying an `rgb` property for it would now fail. 
+ +Alternatively, we could extend our previous constraint definition with this new requirement: + [source, cypher] ---- -CREATE CONSTRAINT rates_have_quality -FOR ()-[l:RATED]-() -REQUIRE exists(l.rating) +CREATE CONSTRAINT color_schema +FOR (c:Color) +REQUIRE UNIQUE c.rgb, c.name +REQUIRE exists(c.rgb) ---- -==== Semantics - -The domain of a node property existence constraint are all nodes with the specified label. -Similarly, the domain of a relationship property existence constraint are all relationship with the specified type. - -The property existence constraint mandates that the value of the specified property exists (i.e. is not null) for all entities in the domain. - -===== Example +This composite constraint will make sure that all `:Color` nodes has a value for their `rgb` property, and that its value is unique for each `name`. -Consider once again the graph containing `:Color` nodes. - -The following query retrieves the RGB value of a color with a given `name`: +More complex constraint definitions are considered below: +.Property value limitations [source, cypher] ---- -MATCH (c:Color {name: $name}) -WHERE exists(c.rgb) -RETURN c.rgb +CREATE CONSTRAINT road_width +FOR ()-[r:ROAD]-() +REQUIRE 5 < r.width < 50 ---- -The `WHERE` clause may be used to prevent an application from retrieving `null` values for user-defined colors where the RGB values have not been specified correctly. -It may, however, be eliminated by the introduction of a node property existence constraint: +.Cardinality +[source, cypher] +---- +CREATE CONSTRAINT spread_the_love +FOR (p:Person) +REQUIRE size((p)-[:LOVES]->()) > 3 +---- +.Endpoint requirements [source, cypher] ---- -CREATE CONSTRAINT colors_must_have_rgb -FOR (c:Color) -REQUIRE exists(c.rgb) +CREATE CONSTRAINT can_only_own_things +FOR ()-[:OWNS]->(t) +REQUIRE (t:Vehicle) OR (t:Building) OR (t:Object) ---- -Any updating statement that would create a `:Color` node without specifying a `rgb` property for it would now fail. 
+.Label coexistence +[source, cypher] +---- +CREATE CONSTRAINT programmers_are_people_too +FOR (p:Programmer) +REQUIRE p:Person +---- === Interaction with existing features @@ -224,7 +235,7 @@ Alternative syntaxes have been discussed: * `GIVEN`, `CONSTRAIN`, `ASSERT` instead of `FOR` * `ASSERT`, `ENFORCE`, `IMPLIES` instead of `REQUIRE` -The use of an existing expression to express uniqueness -- instead of using a new keyword `UNIQUE` -- becomes unwieldy for multiple properties, as exemplified by the following: +The use of an existing expression to express uniqueness -- instead of using the operator `UNIQUE` -- becomes unwieldy for multiple properties, as exemplified by the following: ---- FOR (p:Person), (q:Person) REQUIRE p.email <> q.email AND p <> q @@ -279,6 +290,9 @@ ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName) Constraints make Cypher's notion of schema more well-defined, allowing users to maintain graphs in a more regular, easier-to-manage form. +Additionally, this specification is deliberately defining a constraint _language_ within which a great deal of possible concrete constraints are made possible. +This allows different implementers of Cypher to independently choose how to limit the scope of supported constraint expressions that fit their model and targeted use cases. + == Caveats to this proposal Some constraints may prove challenging to enforce in a system seeking to implement the contents of this CIP, as these generally require scanning through large parts of the graph to locate conflicting entities. 
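Note on the semantics introduced in this patch: the three general rules (the pattern defines the domain, every `REQUIRE` expression must evaluate to `true`, and entities for which an expression evaluates to `null` fall outside the domain) can be illustrated with a minimal Python sketch. It is only a model under stated assumptions, not part of the CIP: entities are plain dictionaries, `None` stands in for Cypher's `null`, and the helper names `check_predicate`, `check_unique` and `road_width_ok` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Entity = Dict[str, object]  # assumption: an entity is a dict of property values


def check_predicate(domain: Sequence[Entity],
                    predicate: Callable[[Entity], Optional[bool]]) -> List[Entity]:
    """Return the entities that violate the predicate.

    A result of None (standing in for null) removes the entity from the
    constraint domain, so only an explicit False counts as a violation.
    """
    return [entity for entity in domain if predicate(entity) is False]


def check_unique(domain: Sequence[Entity],
                 keys: Sequence[str]) -> List[Tuple[Entity, Entity]]:
    """Return pairs of entities that share the same tuple of property values.

    Entities lacking any of the listed properties are skipped, mirroring the
    rule that null-valued properties are not considered for uniqueness.
    """
    seen: Dict[tuple, Entity] = {}
    conflicts: List[Tuple[Entity, Entity]] = []
    for entity in domain:
        values = tuple(entity.get(key) for key in keys)
        if any(value is None for value in values):
            continue  # outside the constraint domain
        if values in seen:
            conflicts.append((seen[values], entity))
        else:
            seen[values] = entity
    return conflicts


def road_width_ok(road: Entity) -> Optional[bool]:
    """Hypothetical model of `REQUIRE 5 < r.width < 50` with ternary logic."""
    width = road.get("width")
    return None if width is None else 5 < width < 50
```

In this model a road without a `width` property never violates the constraint, whereas a road with `width` 60 does. That is the practical consequence of excluding `null` results from the domain.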
From fcaad28406eabd5e6d4d155fa4e267edb742b890 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 14:18:31 +0100 Subject: [PATCH 12/30] Remove Motivation section --- cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc | 4 ---- 1 file changed, 4 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 979c38cb2e..c323636296 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -15,10 +15,6 @@ These are language constructs that impose restrictions on the shape of the data toc::[] -== Motivation - -Constraints provide the means by which various aspects of the data graph may be controlled. - == Background Cypher has a loose notion of a schema, in which nodes and relationships may take very heterogeneous forms, both in terms of properties and in graph patterns. From 35c47309870a519f9b5aed7b245e82cf170b4864 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 14:21:01 +0100 Subject: [PATCH 13/30] Move Mutability and Name sections They are more appropriate under the Semantics and Syntax sections, respectively. --- .../CIP2016-12-14-Constraint-syntax.adoc | 24 +++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index c323636296..f4415df00f 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -25,18 +25,6 @@ Constraints allow us to mould the heterogeneous nature of the property graph int This CIP specifies the general syntax for constraint definition (and constraint removal), and provides several examples of possible use cases for constraints. However, the specification does not otherwise specify or limit the space of expressible constraints that the syntax and semantics allow. 
-===== Mutability - -Once a constraint has been created, it may not be amended. -Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. - -===== Constraint names - -All constraints require the user to specify a nonempty _name_ at constraint creation time. -This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. - -// TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? - === Syntax The constraint syntax is defined as follows: @@ -58,6 +46,13 @@ This allows for very complex concrete constraint definitions within the specifie To that set of valid expressions, this CIP further specifies a special prefix operator `UNIQUE`, which is used to assert uniqueness of one or more property expressions. +==== Constraint names + +All constraints require the user to specify a nonempty _name_ at constraint creation time. +This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. + +// TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? + ==== Removing constraints A constraint is removed by referring to its name. @@ -78,6 +73,11 @@ The semantics for constraints follow these general rules: 3. [[domain-exception]]Entities for which a constraint expression evaluate to `null` under Cypher's ternary logic are _excluded_ from the constraint domain, even if they fit within the constraint pattern. +==== Mutability + +Once a constraint has been created, it may not be amended. +Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. + ==== Uniqueness The new operator `UNIQUE` is only valid as part of a constraint expression. 
From 9d7aaf433240d7c664e51459c84f47cf604a16e3 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 14:26:56 +0100 Subject: [PATCH 14/30] Add cross-links Move Errors section --- .../CIP2016-12-14-Constraint-syntax.adoc | 28 +++++++++---------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index f4415df00f..f2680afda1 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -44,7 +44,7 @@ drop-constraint = "DROP", "CONSTRAINT", constraint-name ; The constraint expression (`constraint-expr` above) is any expression that evaluates to a boolean value. This allows for very complex concrete constraint definitions within the specified syntax. -To that set of valid expressions, this CIP further specifies a special prefix operator `UNIQUE`, which is used to assert uniqueness of one or more property expressions. +To that set of valid expressions, this CIP further specifies a special prefix operator `UNIQUE`, which is used to assert uniqueness of one or more property expressions (see <> for details). ==== Constraint names @@ -67,17 +67,27 @@ DROP CONSTRAINT foo The semantics for constraints follow these general rules: -1. The constraint pattern define the constraint domain, where all entities that would be returned by a `MATCH` clause with the same pattern constitute the domain, with one notable exception (see <>). +1. The constraint pattern define the constraint _domain_, where all entities that would be returned by a `MATCH` clause with the same pattern constitute the domain, with one notable exception (see <>). -2. The constraint expressions defined in the `REQUIRE` clauses of the constraint definition must all evaluate to `true`. Any other result raises an error (see <>). +2. 
The constraint expressions defined in the `REQUIRE` clauses of the constraint definition must all evaluate to `true`. 3. [[domain-exception]]Entities for which a constraint expression evaluate to `null` under Cypher's ternary logic are _excluded_ from the constraint domain, even if they fit within the constraint pattern. +==== Errors + +The following list describes the situations in which an error will be raised: + +* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. +* Attempting to create a constraint with a name that already exists. +* Attempting to drop a constraint referencing a non-existent name. +* Attempting to modify the graph in such a way that it would violate a constraint. + ==== Mutability Once a constraint has been created, it may not be amended. Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. +[[uniqueness]] ==== Uniqueness The new operator `UNIQUE` is only valid as part of a constraint expression. @@ -94,18 +104,6 @@ FOR (p:Person) REQUIRE UNIQUE p.name, p.email, p.address ---- -==== Errors - -The following list describes the situations in which an error will be raised: - -* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. -* Attempting to create a constraint with a name that already exists. -* Attempting to drop a constraint referencing a non-existent name. -* Attempting to modify the graph in such a way that it would violate a constraint. - -The constraints define a _domain_ within which the constraint applies. -The domain is defined by the constraint pattern. - ==== Compositionality It is possible to define multiple `REQUIRE` clauses within the scope of the same constraint. 
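Note on the Errors section moved by this patch: the four error situations can be sketched with a small in-memory registry. This is a hedged illustration only. The class `ConstraintRegistry`, the exception `ConstraintError` and the method names are hypothetical, constraints are modelled as per-entity Python predicates (so `UNIQUE` is out of scope here), and `None` stands in for `null`.

```python
from typing import Callable, Dict, List, Optional

Entity = Dict[str, object]  # assumption: an entity is a dict of property values
Predicate = Callable[[Entity], Optional[bool]]


class ConstraintError(Exception):
    """Raised in each of the four situations listed in the Errors section."""


class ConstraintRegistry:
    def __init__(self, entities: List[Entity]) -> None:
        self.entities: List[Entity] = list(entities)
        self.constraints: Dict[str, Predicate] = {}

    def create(self, name: str, predicate: Predicate) -> None:
        # Error: a constraint with this name already exists.
        if name in self.constraints:
            raise ConstraintError(f"constraint {name!r} already exists")
        # Error: existing data does not comply with the constraint.
        if any(predicate(entity) is False for entity in self.entities):
            raise ConstraintError(f"existing data violates {name!r}")
        self.constraints[name] = predicate

    def drop(self, name: str) -> None:
        # Error: dropping a constraint by a non-existent name.
        if name not in self.constraints:
            raise ConstraintError(f"no constraint named {name!r}")
        del self.constraints[name]

    def add(self, entity: Entity) -> None:
        # Error: an update that would violate an existing constraint.
        for name, predicate in self.constraints.items():
            if predicate(entity) is False:
                raise ConstraintError(f"update violates {name!r}")
        self.entities.append(entity)
```

The sketch also reflects the Mutability rule: there is no method for amending a constraint, so changing a definition means calling `drop` followed by `create`.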
From ec56ba95d6674c3f1e6d271d52b131fe4c001621 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 15:53:14 +0100 Subject: [PATCH 15/30] Add example for CIR-2017-172 References #172 --- cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index f2680afda1..2745d1f961 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -217,6 +217,16 @@ FOR (p:Programmer) REQUIRE p:Person ---- +Assuming a function `acyclic()` that takes a path as argument and returns `true` if and only if the same node does not appear twice in the path, otherwise `false`, we may express: + +.Constraint example from CIR-2017-172 +[source, cypher] +---- +CREATE CONSTRAINT enforce_dag_acyclic_for_R_links +FOR p = ()-[:R*]-() +REQUIRE acyclic(p) +---- + === Interaction with existing features The main interaction between the constraints and the rest of the language occurs during updating statements. 
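Note on the `acyclic()` example added by this patch: the CIP only assumes that such a function exists. A minimal Python sketch of the stated behaviour (return `true` if and only if no node appears twice in the path) could look as follows. Modelling a path as a sequence of hashable node identifiers is an assumption made for illustration.

```python
from typing import Hashable, Iterable


def acyclic(path_nodes: Iterable[Hashable]) -> bool:
    """Return True if and only if no node appears twice in the path.

    The path is modelled as the ordered sequence of its node identifiers;
    this representation is an assumption, not something the CIP defines.
    """
    seen = set()
    for node in path_nodes:
        if node in seen:
            return False  # the same node was visited twice
        seen.add(node)
    return True
```

Under this reading, a constraint such as `REQUIRE acyclic(p)` would reject any update that lets a `:R` path revisit a node.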
From c6a6d38b8e25ffb596fb5a5cb3328050b491f34a Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Wed, 1 Mar 2017 21:57:02 +0100 Subject: [PATCH 16/30] Support arbitrary patterns - Remove TODO - Add example using larger pattern - Add example using multiple `exists()` --- .../CIP2016-12-14-Constraint-syntax.adoc | 21 +++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 2745d1f961..070a9a5bde 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -33,9 +33,8 @@ The constraint syntax is defined as follows: [source, ebnf] ---- constraint command = create-constraint | drop-constraint ; -create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", constraint-pattern, "REQUIRE", constraint-expr, { "REQUIRE", constraint-expr } ; +create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", pattern, "REQUIRE", constraint-expr, { "REQUIRE", constraint-expr } ; constraint-name = symbolic-name -constraint-pattern = node-pattern | simple-pattern ; constraint-expr = uniqueness-expr | expression ; uniquness-expr = "UNIQUE", property-expression, { ",", property-expression } drop-constraint = "DROP", "CONSTRAINT", constraint-name ; @@ -51,8 +50,6 @@ To that set of valid expressions, this CIP further specifies a special prefix op All constraints require the user to specify a nonempty _name_ at constraint creation time. This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. -// TODO: Should we impose restrictions on the domain of constraint names, or are all Unicode characters allowed? - ==== Removing constraints A constraint is removed by referring to its name. 
@@ -185,6 +182,22 @@ This composite constraint will make sure that all `:Color` nodes has a value for More complex constraint definitions are considered below: +.Multiple property existence using conjunction +[source, cypher] +---- +CREATE CONSTRAINT person_properties +FOR (p:Person) +REQUIRE exists(p.name) AND exists(p.email) +---- + +.Using a larger pattern +[source, cypher] +---- +CREATE CONSTRAINT not_rating_own_posts +FOR (u1:User)-[:RATED]->(:Post)<-[:POSTED_BY]-(u2:User) +REQUIRE u1.name <> u2.name +---- + .Property value limitations [source, cypher] ---- From d355dcca588bddb6e99600eaba0bbcc1f5be78a9 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 3 Mar 2017 17:43:01 +0100 Subject: [PATCH 17/30] Introduce PRIMARY KEY constraint predicate - Rename `constraint-expr` to `constraint-predicate` - Limit scope of `UNIQUE` to single properties only - Update examples to reflect `PRIMARY KEY` --- .../CIP2016-12-14-Constraint-syntax.adoc | 79 +++++++++++++------ 1 file changed, 53 insertions(+), 26 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 070a9a5bde..89f284c5bc 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -32,23 +32,25 @@ The constraint syntax is defined as follows: .Grammar definition for constraint syntax.
[source, ebnf] ---- -constraint command = create-constraint | drop-constraint ; -create-constraint = "CREATE", "CONSTRAINT", constraint-name, "FOR", pattern, "REQUIRE", constraint-expr, { "REQUIRE", constraint-expr } ; -constraint-name = symbolic-name -constraint-expr = uniqueness-expr | expression ; -uniquness-expr = "UNIQUE", property-expression, { ",", property-expression } -drop-constraint = "DROP", "CONSTRAINT", constraint-name ; +constraint command = create-constraint | drop-constraint ; +create-constraint = "CREATE", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; +constraint-name = symbolic-name ; +constraint-predicate = expression | unique | primary-key ; +unique = "UNIQUE", property-expression ; +primary-key = "PRIMARY KEY", property-expression, { ",", property-expression } ; +drop-constraint = "DROP", "CONSTRAINT", constraint-name ; ---- -The constraint expression (`constraint-expr` above) is any expression that evaluates to a boolean value. -This allows for very complex concrete constraint definitions within the specified syntax. +The `REQUIRE` clause works exactly like the `WHERE` clause in a standard Cypher query, and additionally supports the special constraint operators `UNIQUE` and `PRIMARY KEY`. +This allows for very complex concrete constraint definitions (using custom predicates) within the specified syntax. -To that set of valid expressions, this CIP further specifies a special prefix operator `UNIQUE`, which is used to assert uniqueness of one or more property expressions (see <> for details). +For details on `UNIQUE` and `PRIMARY KEY`, see the dedicated sections below: <<uniqueness>>, <<primary-key>>. ==== Constraint names -All constraints require the user to specify a nonempty _name_ at constraint creation time. +The user may optionally specify a nonempty _name_ for a constraint at constraint creation time.
This name is subsequently the handle with which a user may refer to the constraint, for example when dropping it. +If no name is provided, the system generates a unique name. ==== Removing constraints @@ -87,26 +89,50 @@ Should a user wish to change its definition, it has to be dropped and recreated [[uniqueness]] ==== Uniqueness -The new operator `UNIQUE` is only valid as part of a constraint expression. +The new operator `UNIQUE` is only valid as part of a constraint predicate. +It takes as argument a single property expression, and asserts that the value of this property is unique across the domain of the constraint. +Following rule <<domain-exception>> above, entities for which the property is not defined (is `null`) are not part of the constraint domain. + +.Example of a constraint definition using `UNIQUE`, over the domain of nodes labeled with `:Person`: +[source, cypher] +---- +CREATE CONSTRAINT only_one_person_per_name +FOR (p:Person) +REQUIRE UNIQUE p.name +---- + +[[primary-key]] +==== Primary key + +The new operator `PRIMARY KEY` is only valid as part of a constraint predicate. It takes as argument one or more property expressions, and asserts that the combination of the evaluated values of the expressions (forming a tuple) is unique across the constraint domain. +It further asserts that the property expressions all exist on the entities of the domain, so rule <<domain-exception>> above does not apply. The domain of a primary key constraint is thus exactly the set of entities that match the constraint pattern. -The domain of the uniqueness expression is limited to entities for which _all_ properties defined as arguments to the `UNIQUE` operator exist. -In other words, property expressions which evaluate to `null` are not considered for uniqueness (see <>) above.
+.Example of a constraint definition using `PRIMARY KEY`, over the domain of nodes labeled with `:Person`: +[source, cypher] +---- +CREATE CONSTRAINT person_details +FOR (p:Person) +REQUIRE PRIMARY KEY p.name, p.email, p.address +---- -.Example of a constraint definition using `UNIQUE`, over the domain of nodes labeled with `:Person`: +A semantically equivalent constraint is achieved by composing the use of the `UNIQUE` operator with `exists()` predicates, as exemplified by: + +.Example of a constraint definition equivalent to the above `PRIMARY KEY` example: [source, cypher] ---- -CREATE CONSTRAINT unique_person_details +CREATE CONSTRAINT person_details FOR (p:Person) -REQUIRE UNIQUE p.name, p.email, p.address +REQUIRE UNIQUE p.name +REQUIRE UNIQUE p.email +REQUIRE UNIQUE p.address +REQUIRE exists(p.name) AND exists(p.email) AND exists(p.address) ---- ==== Compositionality It is possible to define multiple `REQUIRE` clauses within the scope of the same constraint. -The semantics between these is that of a conjunction between the constraint expressions of the clauses, such that the constraint is upheld if and only if for all `REQUIRE` clauses, the expression evaluates to `true`. - -This is useful not only for readability and logical separation of different aspects of the same constraint, but also for combining the use of the `UNIQUE` operator with other constraint expressions. +The semantics between these is that of a conjunction (under standard 2-valued boolean logic) between the constraint predicates of the clauses, such that the constraint is upheld if and only if for all `REQUIRE` clauses, the joint predicate evaluates to `true`. 
=== Examples @@ -144,7 +170,8 @@ We could then use the following constraint, without modifying our data: ---- CREATE CONSTRAINT unique_color_nodes FOR (c:Color) -REQUIRE UNIQUE c.rgb, c.name +REQUIRE UNIQUE c.rgb +REQUIRE UNIQUE c.name ---- Now, consider the following query which retrieves the RGB value of a color with a given `name`: @@ -168,17 +195,16 @@ REQUIRE exists(c.rgb) Any updating statement that would create a `:Color` node without specifying an `rgb` property for it would now fail. -Alternatively, we could extend our previous constraint definition with this new requirement: +If we also want to mandate the existence of the `name` property, we could use a `PRIMARY KEY` operator to capture all these requirements in a single constraint: [source, cypher] ---- CREATE CONSTRAINT color_schema FOR (c:Color) -REQUIRE UNIQUE c.rgb, c.name -REQUIRE exists(c.rgb) +REQUIRE PRIMARY KEY c.rgb, c.name ---- -This composite constraint will make sure that all `:Color` nodes has a value for their `rgb` property, and that its value is unique for each `name`. +This constraint will make sure that all `:Color` nodes has a value for their `rgb` and `name` properties, and that the combination is unique across all the nodes. More complex constraint definitions are considered below: @@ -269,8 +295,9 @@ In SQL, the following constraints exist (inspired by http://www.w3schools.com/sq * `CHECK` - Ensures that the value in a column meets a specific condition * `DEFAULT` - Specifies a default value for a column. -The property existence constraints correspond to the `NOT NULL` SQL constraint. -The node property uniqueness constraint corresponds to the `PRIMARY KEY` SQL constraint. +The `NOT NULL` SQL constraint is expressible using an `exists()` constraint predicate. +The `UNIQUE` SQL constraint is exactly as Cypher's `UNIQUE` constraint predicate. +The `PRIMARY KEY` SQL constraint is exactly as Cypher's `PRIMARY KEY` constraint predicate. 
SQL constraints may be introduced at table creation time in a `CREATE TABLE` statement, or in an `ALTER TABLE` statement: From 9438af11aa7838562231e655fb2186cfa05ba649 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Tue, 7 Mar 2017 15:30:30 +0100 Subject: [PATCH 18/30] Rename constraint operator to NODE KEY - Remove erroneous example for composing `NODE KEY` with `UNIQUE` and `exists()` - Rephrase example section to describe `NODE KEY` more accurately. --- .../CIP2016-12-14-Constraint-syntax.adoc | 47 +++++++------------ 1 file changed, 18 insertions(+), 29 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 89f284c5bc..6c4a48f534 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -35,16 +35,16 @@ The constraint syntax is defined as follows: constraint command = create-constraint | drop-constraint ; create-constraint = "CREATE", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; constraint-name = symbolic-name -constraint-predicate = expression | unique | primary-key ; +constraint-predicate = expression | unique | node-key ; unique = "UNIQUE", property-expression -primary-key = "PRIMARY KEY", property-expression, { ",", property-expression } +node-key = "NODE KEY", property-expression, { ",", property-expression } drop-constraint = "DROP", "CONSTRAINT", constraint-name ; ---- -The `REQUIRE` clause works exactly like the `WHERE` clause in a standard Cypher query, with the addition of also supporting the special constraint operators `UNIQUE` and `PRIMARY KEY`. +The `REQUIRE` clause works exactly like the `WHERE` clause in a standard Cypher query, with the addition of also supporting the special constraint operators `UNIQUE` and `NODE KEY`. 
This allows for very complex concrete constraint definitions (using custom predicates) within the specified syntax. -For details on `UNIQUE` and `PRIMARY KEY`, see the dedicated sections below: <<uniqueness>>, <<primary-key>>. +For details on `UNIQUE` and `NODE KEY`, see the dedicated sections below: <<uniqueness>>, <<node-key>>. ==== Constraint names @@ -101,32 +101,31 @@ FOR (p:Person) REQUIRE UNIQUE p.name ---- -[[primary-key]] -==== Primary key +[[node-key]] +==== Node key -The new operator `PRIMARY KEY` is only valid as part of a constraint predicate. It takes as argument one or more property expressions, and asserts that the combination of the evaluated values of the expressions (forming a tuple) is unique across the constraint domain. -It further asserts that the property expressions all exist on the entities of the domain, and thus avoids applicability of rule <<domain-exception>> above. The domain of a primary key constraint is thus exactly defined as all entities which fit the constraint pattern. +The new operator `NODE KEY` is only valid as part of a constraint predicate. It takes as argument one or more property expressions, and asserts that the combination of the evaluated values of the expressions (forming a tuple) is unique across the constraint domain. +It further asserts that the property expressions all exist on the entities of the domain, and thus avoids applicability of rule <<domain-exception>> above. +The domain of a node key constraint is thus exactly defined as all entities which fit the constraint pattern.
-.Example of a constraint definition using `PRIMARY KEY`, over the domain of nodes labeled with `:Person`: +.Example of a constraint definition using `NODE KEY`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- CREATE CONSTRAINT person_details FOR (p:Person) -REQUIRE PRIMARY KEY p.name, p.email, p.address +REQUIRE NODE KEY p.name, p.email, p.address ---- -A semantically equivalent constraint is achieved by composing the use of the `UNIQUE` operator with `exists()` predicates, as exemplified by: +In the context of a single property, a semantically equivalent constraint is achieved by composing the use of the `UNIQUE` operator with `exists()` predicates, as exemplified by: -.Example of a constraint definition equivalent to the above `PRIMARY KEY` example: +.Example of a constraint definition equivalent to a `NODE KEY` on a single property `name`: [source, cypher] ---- CREATE CONSTRAINT person_details FOR (p:Person) REQUIRE UNIQUE p.name -REQUIRE UNIQUE p.email -REQUIRE UNIQUE p.address -REQUIRE exists(p.name) AND exists(p.email) AND exists(p.address) +REQUIRE exists(p.name) ---- ==== Compositionality @@ -163,17 +162,6 @@ FOR (c:Color) REQUIRE UNIQUE c.rgb ---- -Suppose that we would rather like to have one color node per `name` _and_ `rgb` value (to work around the rounding errors). -We could then use the following constraint, without modifying our data: - -[source, cypher] ----- -CREATE CONSTRAINT unique_color_nodes -FOR (c:Color) -REQUIRE UNIQUE c.rgb -REQUIRE UNIQUE c.name ----- - Now, consider the following query which retrieves the RGB value of a color with a given `name`: [source, cypher] @@ -195,16 +183,17 @@ REQUIRE exists(c.rgb) Any updating statement that would create a `:Color` node without specifying an `rgb` property for it would now fail. 
-If we also want to mandate the existence of the `name` property, we could use a `PRIMARY KEY` operator to capture all these requirements in a single constraint: +If we instead want to make the _combination_ of the properties `name` and `rgb` unique, while simultaneously mandating their existence, we could use a `NODE KEY` operator to capture all these requirements in a single constraint: [source, cypher] ---- CREATE CONSTRAINT color_schema FOR (c:Color) -REQUIRE PRIMARY KEY c.rgb, c.name +REQUIRE NODE KEY c.rgb, c.name ---- This constraint will make sure that all `:Color` nodes has a value for their `rgb` and `name` properties, and that the combination is unique across all the nodes. +This would allow several `:Color` nodes named `'grey'`, as long as their `rgb` values are distinct. More complex constraint definitions are considered below: @@ -297,7 +286,7 @@ In SQL, the following constraints exist (inspired by http://www.w3schools.com/sq The `NOT NULL` SQL constraint is expressible using an `exists()` constraint predicate. The `UNIQUE` SQL constraint is exactly as Cypher's `UNIQUE` constraint predicate. -The `PRIMARY KEY` SQL constraint is exactly as Cypher's `PRIMARY KEY` constraint predicate. +The `PRIMARY KEY` SQL constraint is exactly as Cypher's `NODE KEY` constraint predicate. 
SQL constraints may be introduced at table creation time in a `CREATE TABLE` statement, or in an `ALTER TABLE` statement: From f31f09f16f7d9caedf2e41258657cc9bd5f36b29 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Tue, 7 Mar 2017 15:47:44 +0100 Subject: [PATCH 19/30] Use ADD for constraint creation - Add missing case for when an error should be raised --- .../CIP2016-12-14-Constraint-syntax.adoc | 43 ++++++++++--------- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 6c4a48f534..9a4943ebad 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -32,8 +32,8 @@ The constraint syntax is defined as follows: .Grammar definition for constraint syntax. [source, ebnf] ---- -constraint command = create-constraint | drop-constraint ; -create-constraint = "CREATE", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; +constraint command = add-constraint | drop-constraint ; +add-constraint = "ADD", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; constraint-name = symbolic-name constraint-predicate = expression | unique | node-key ; unique = "UNIQUE", property-expression @@ -76,15 +76,16 @@ The semantics for constraints follow these general rules: The following list describes the situations in which an error will be raised: -* Attempting to create a constraint on a graph where the data does not comply with the constraint criterion. -* Attempting to create a constraint with a name that already exists. +* Attempting to add a constraint on a graph where the data does not comply with the constraint criterion. +* Attempting to add a constraint with a name that already exists. 
+* Attempting to add a constraint that the underlying engine does not support enforcing. * Attempting to drop a constraint referencing a non-existent name. * Attempting to modify the graph in such a way that it would violate a constraint. ==== Mutability -Once a constraint has been created, it may not be amended. -Should a user wish to change its definition, it has to be dropped and recreated with an updated structure. +Once a constraint has been added, it may not be amended. +Should a user wish to change its definition, it has to be dropped and added anew with an updated structure. [[uniqueness]] ==== Uniqueness @@ -96,7 +97,7 @@ Following on rule <<domain-exception>> above, entities for which the property .Example of a constraint definition using `UNIQUE`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- -CREATE CONSTRAINT only_one_person_per_name +ADD CONSTRAINT only_one_person_per_name FOR (p:Person) REQUIRE UNIQUE p.name ---- @@ -112,7 +113,7 @@ The domain of a node key constraint is thus exactly defined as all entities whic .Example of a constraint definition using `NODE KEY`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- -CREATE CONSTRAINT person_details +ADD CONSTRAINT person_details FOR (p:Person) REQUIRE NODE KEY p.name, p.email, p.address ---- @@ -122,7 +123,7 @@ In the context of a single property, a semantically equivalent constraint is ach .Example of a constraint definition equivalent to a `NODE KEY` on a single property `name`: [source, cypher] ---- -CREATE CONSTRAINT person_details +ADD CONSTRAINT person_details FOR (p:Person) REQUIRE UNIQUE p.name REQUIRE exists(p.name) @@ -140,7 +141,7 @@ In this section we provide several examples of constraints that are possible to [NOTE] The specification in this CIP is limited to the general syntax of constraints, and the following are simply examples of possible uses of the language defined by that syntax.
None of the examples provided are to be viewed as mandatory for any Cypher implementation. -Consider the graph created by the statement below. +Consider the graph added by the statement below. The graph contains nodes labeled with `:Color`. Each color is represented as an integer-type RGB value in a property `rgb`. Users may look up nodes labeled with `:Color` to extract their RGB values for application processing. @@ -153,11 +154,11 @@ CREATE (:Color {name: 'black', rgb: 0x000000}) CREATE (:Color {name: 'very, very dark grey', rgb: 0x000000}) // rounding error! ---- -Owing to the duplication of the `rgb` property, the following attempt at creating a constraint will fail: +Owing to the duplication of the `rgb` property, the following attempt at adding a constraint will fail: [source, cypher] ---- -CREATE CONSTRAINT only_one_color_per_rgb +ADD CONSTRAINT only_one_color_per_rgb FOR (c:Color) REQUIRE UNIQUE c.rgb ---- @@ -176,7 +177,7 @@ It may, however, be eliminated by the introduction of a constraint asserting the [source, cypher] ---- -CREATE CONSTRAINT colors_must_have_rgb +ADD CONSTRAINT colors_must_have_rgb FOR (c:Color) REQUIRE exists(c.rgb) ---- @@ -187,7 +188,7 @@ If we instead want to make the _combination_ of the properties `name` and `rgb` [source, cypher] ---- -CREATE CONSTRAINT color_schema +ADD CONSTRAINT color_schema FOR (c:Color) REQUIRE NODE KEY c.rgb, c.name ---- @@ -200,7 +201,7 @@ More complex constraint definitions are considered below: .Multiple property existence using conjunction [source, cypher] ---- -CREATE CONSTRAINT person_properties +ADD CONSTRAINT person_properties FOR (p:Person) REQUIRE exists(p.name) AND exists(p.email) ---- @@ -208,7 +209,7 @@ REQUIRE exists(p.name) AND exists(p.email) .Using larger pattern [source, cypher] ---- -CREATE CONSTRAINT not_rating_own_posts +ADD CONSTRAINT not_rating_own_posts FOR (u1:User)-[:RATED]->(:Post)<-[:POSTED_BY]-(u2:User) REQUIRE u.name <> u2.name ---- @@ -216,7 +217,7 @@ REQUIRE u.name <> 
u2.name .Property value limitations [source, cypher] ---- -CREATE CONSTRAINT road_width +ADD CONSTRAINT road_width FOR ()-[r:ROAD]-() REQUIRE 5 < r.width < 50 ---- @@ -224,7 +225,7 @@ REQUIRE 5 < r.width < 50 .Cardinality [source, cypher] ---- -CREATE CONSTRAINT spread_the_love +ADD CONSTRAINT spread_the_love FOR (p:Person) REQUIRE size((p)-[:LOVES]->()) > 3 ---- @@ -232,7 +233,7 @@ REQUIRE size((p)-[:LOVES]->()) > 3 .Endpoint requirements [source, cypher] ---- -CREATE CONSTRAINT can_only_own_things +ADD CONSTRAINT can_only_own_things FOR ()-[:OWNS]->(t) REQUIRE (t:Vehicle) OR (t:Building) OR (t:Object) ---- @@ -240,7 +241,7 @@ REQUIRE (t:Vehicle) OR (t:Building) OR (t:Object) .Label coexistence [source, cypher] ---- -CREATE CONSTRAINT programmers_are_people_too +ADD CONSTRAINT programmers_are_people_too FOR (p:Programmer) REQUIRE p:Person ---- @@ -250,7 +251,7 @@ Assuming a function `acyclic()` that takes a path as argument and returns `true` .Constraint example from CIR-2017-172 [source, cypher] ---- -CREATE CONSTRAINT enforce_dag_acyclic_for_R_links +ADD CONSTRAINT enforce_dag_acyclic_for_R_links FOR p = ()-[:R*]-() REQUIRE acyclic(p) ---- From a2dc74d13411e1af79724a33107692f3c425e624 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Tue, 7 Mar 2017 16:05:37 +0100 Subject: [PATCH 20/30] Add specification for the return record --- .../CIP2016-12-14-Constraint-syntax.adoc | 41 +++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index 9a4943ebad..c7d033c1bb 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -25,6 +25,8 @@ Constraints allow us to mould the heterogeneous nature of the property graph int This CIP specifies the general syntax for constraint definition (and constraint removal), and provides several examples of possible use cases for constraints. 
However, the specification does not otherwise specify or limit the space of expressible constraints that the syntax and semantics allow. +This specification also covers the return structure of constraint commands, see <<return-record>>. + === Syntax The constraint syntax is defined as follows: @@ -134,6 +136,45 @@ REQUIRE exists(p.name) It is possible to define multiple `REQUIRE` clauses within the scope of the same constraint. The semantics between these is that of a conjunction (under standard 2-valued boolean logic) between the constraint predicates of the clauses, such that the constraint is upheld if and only if for all `REQUIRE` clauses, the joint predicate evaluates to `true`. +[[return-record]] +==== Return record + +Since constraints always are named, but user-defined names are optional, the system must sometimes generate a constraint name. +In order for a user to be able to drop such a constraint, the system-generated name is therefore returned in a standard Cypher result record. +The result record has a fixed structure, with three string fields: `name`, `definition`, and `details`. + +A constraint command will always return exactly one record, if successful. +Note that also `DROP CONSTRAINT` will return a record. + +===== Name + +This field contains the name of the constraint, either user- or system-defined. + +===== Definition + +This field contains the constraint definition, which is the contents of the constraint creation command following (and including) the `FOR` clause. + +===== Details + +The contents of this field are left unspecified, to be used for implementation-specific messages and/or details.
+ +Consider the following constraint: +[source, Cypher] +---- +ADD CONSTRAINT myConstraint +FOR (n:Node) +REQUIRE NODE KEY n.prop1, n.prop2 +---- + +A correct result record for it could be: + +---- +name | definition | details +----------------------------------------------------------------------- +myConstraint | FOR (n:NODE) | n/a + | REQUIRE NODE KEY n.prop1, n.prop2 | +---- + === Examples In this section we provide several examples of constraints that are possible to express in the specified syntax. From 52f3a5e43266f483e43ca2898903879b867cad77 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Thu, 4 May 2017 14:11:17 +0200 Subject: [PATCH 21/30] Add tests verifying NODE KEY works in grammar --- grammar/basic-grammar.xml | 2 ++ grammar/commands.xml | 15 ++++++++------- tools/grammar/src/test/resources/cypher.txt | 18 ++++++++++++------ 3 files changed, 22 insertions(+), 13 deletions(-) diff --git a/grammar/basic-grammar.xml b/grammar/basic-grammar.xml index 69a71c427c..9efef247de 100644 --- a/grammar/basic-grammar.xml +++ b/grammar/basic-grammar.xml @@ -716,6 +716,8 @@ ANY NONE SINGLE + NODE + KEY diff --git a/grammar/commands.xml b/grammar/commands.xml index 6ffa9ebe47..4cd3f16d08 100644 --- a/grammar/commands.xml +++ b/grammar/commands.xml @@ -66,7 +66,7 @@ - CREATE &SP; CONSTRAINT &SP; &SP; + ADD &SP; CONSTRAINT &SP; &SP; FOR &SP; &SP; REQUIRE &SP; @@ -88,17 +88,18 @@ - - + + + - - UNIQUE &SP; &WS; , &WS; + + UNIQUE &SP; - - exists ( ) + + NODE &SP; KEY &SP; &WS; , &WS; diff --git a/tools/grammar/src/test/resources/cypher.txt b/tools/grammar/src/test/resources/cypher.txt index 8c3b14c25d..3694daeee3 100644 --- a/tools/grammar/src/test/resources/cypher.txt +++ b/tools/grammar/src/test/resources/cypher.txt @@ -312,16 +312,22 @@ CALL db.labels() YIELD * WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ CALL db.labels() YIELD x WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ -CREATE CONSTRAINT foo 
+ADD CONSTRAINT foo FOR (p:Person) REQUIRE UNIQUE p.name§ -CREATE CONSTRAINT bar -FOR (p:Person) -REQUIRE UNIQUE p.name, p.email§ -CREATE CONSTRAINT baz +ADD CONSTRAINT baz FOR (p:Person) REQUIRE exists(p.name)§ -CREATE CONSTRAINT cru +ADD CONSTRAINT cru FOR ()-[r:REL]-() REQUIRE exists(r.property)§ DROP CONSTRAINT foo_bar_baz§ +ADD CONSTRAINT nodeKey +FOR (n:Node) +REQUIRE NODE KEY n.prop§ +ADD CONSTRAINT nodeKey +FOR (n:Node) +REQUIRE NODE KEY n.p1, n.p2, n.p3§ +ADD CONSTRAINT nodeKey +FOR (n:Node) +REQUIRE NODE KEY n.p1 ,n.p2, n.p3§ From 8eca4413993a82f2c6e13a35613ab2a44f9e35ce Mon Sep 17 00:00:00 2001 From: Petra Selmer Date: Wed, 17 Jan 2018 16:23:36 +0000 Subject: [PATCH 22/30] Reformatted title --- cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index c7d033c1bb..d377c6c684 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -1,4 +1,4 @@ -= CIP2016-12-16 - Constraints syntax += CIP2016-12-16 Constraints syntax :numbered: :toc: :toc-placement: macro From 53b04457adce9dcb9e7107fcef55037c4aee73f9 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 19 Jul 2019 17:22:14 +0200 Subject: [PATCH 23/30] Use CREATE instead of ADD --- .../CIP2016-12-14-Constraint-syntax.adoc | 34 ++++++++++--------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index d377c6c684..f8581aab37 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -34,8 +34,8 @@ The constraint syntax is defined as follows: .Grammar definition for constraint syntax. 
[source, ebnf] ---- -constraint command = add-constraint | drop-constraint ; -add-constraint = "ADD", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; +constraint command = create-constraint | drop-constraint ; +create-constraint = "CREATE", "CONSTRAINT", [ constraint-name ], "FOR", pattern, "REQUIRE", constraint-predicate, { "REQUIRE", constraint-predicate } ; constraint-name = symbolic-name constraint-predicate = expression | unique | node-key ; unique = "UNIQUE", property-expression @@ -99,7 +99,7 @@ Following on rule <<domain-exception>> above, entities for which the property .Example of a constraint definition using `UNIQUE`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- -ADD CONSTRAINT only_one_person_per_name +CREATE CONSTRAINT only_one_person_per_name FOR (p:Person) REQUIRE UNIQUE p.name ---- @@ -115,7 +115,7 @@ The domain of a node key constraint is thus exactly defined as all entities whic .Example of a constraint definition using `NODE KEY`, over the domain of nodes labeled with `:Person`: [source, cypher] ---- -ADD CONSTRAINT person_details +CREATE CONSTRAINT person_details FOR (p:Person) REQUIRE NODE KEY p.name, p.email, p.address ---- @@ -125,7 +125,7 @@ In the context of a single property, a semantically equivalent constraint is ach .Example of a constraint definition equivalent to a `NODE KEY` on a single property `name`: [source, cypher] ---- -ADD CONSTRAINT person_details +CREATE CONSTRAINT person_details FOR (p:Person) REQUIRE UNIQUE p.name REQUIRE exists(p.name) @@ -161,7 +161,7 @@ The contents of this field are left unspecified, to be used for implementation-s Consider the following constraint: [source, Cypher] ---- -ADD CONSTRAINT myConstraint +CREATE CONSTRAINT myConstraint FOR (n:Node) REQUIRE NODE KEY n.prop1, n.prop2 ---- @@ -199,7 +199,7 @@ Owing to the duplication of the `rgb` property, the following attempt at adding [source, cypher] ---- -ADD CONSTRAINT
only_one_color_per_rgb +CREATE CONSTRAINT only_one_color_per_rgb FOR (c:Color) REQUIRE UNIQUE c.rgb ---- @@ -218,7 +218,7 @@ It may, however, be eliminated by the introduction of a constraint asserting the [source, cypher] ---- -ADD CONSTRAINT colors_must_have_rgb +CREATE CONSTRAINT colors_must_have_rgb FOR (c:Color) REQUIRE exists(c.rgb) ---- @@ -229,7 +229,7 @@ If we instead want to make the _combination_ of the properties `name` and `rgb` [source, cypher] ---- -ADD CONSTRAINT color_schema +CREATE CONSTRAINT color_schema FOR (c:Color) REQUIRE NODE KEY c.rgb, c.name ---- @@ -242,7 +242,7 @@ More complex constraint definitions are considered below: .Multiple property existence using conjunction [source, cypher] ---- -ADD CONSTRAINT person_properties +CREATE CONSTRAINT person_properties FOR (p:Person) REQUIRE exists(p.name) AND exists(p.email) ---- @@ -250,7 +250,7 @@ REQUIRE exists(p.name) AND exists(p.email) .Using larger pattern [source, cypher] ---- -ADD CONSTRAINT not_rating_own_posts +CREATE CONSTRAINT not_rating_own_posts FOR (u1:User)-[:RATED]->(:Post)<-[:POSTED_BY]-(u2:User) REQUIRE u.name <> u2.name ---- @@ -258,7 +258,7 @@ REQUIRE u.name <> u2.name .Property value limitations [source, cypher] ---- -ADD CONSTRAINT road_width +CREATE CONSTRAINT road_width FOR ()-[r:ROAD]-() REQUIRE 5 < r.width < 50 ---- @@ -266,7 +266,7 @@ REQUIRE 5 < r.width < 50 .Cardinality [source, cypher] ---- -ADD CONSTRAINT spread_the_love +CREATE CONSTRAINT spread_the_love FOR (p:Person) REQUIRE size((p)-[:LOVES]->()) > 3 ---- @@ -274,7 +274,7 @@ REQUIRE size((p)-[:LOVES]->()) > 3 .Endpoint requirements [source, cypher] ---- -ADD CONSTRAINT can_only_own_things +CREATE CONSTRAINT can_only_own_things FOR ()-[:OWNS]->(t) REQUIRE (t:Vehicle) OR (t:Building) OR (t:Object) ---- @@ -282,7 +282,7 @@ REQUIRE (t:Vehicle) OR (t:Building) OR (t:Object) .Label coexistence [source, cypher] ---- -ADD CONSTRAINT programmers_are_people_too +CREATE CONSTRAINT programmers_are_people_too FOR 
(p:Programmer) REQUIRE p:Person ---- @@ -292,7 +292,7 @@ Assuming a function `acyclic()` that takes a path as argument and returns `true` .Constraint example from CIR-2017-172 [source, cypher] ---- -ADD CONSTRAINT enforce_dag_acyclic_for_R_links +CREATE CONSTRAINT enforce_dag_acyclic_for_R_links FOR p = ()-[:R*]-() REQUIRE acyclic(p) ---- @@ -308,6 +308,8 @@ Alternative syntaxes have been discussed: * `GIVEN`, `CONSTRAIN`, `ASSERT` instead of `FOR` * `ASSERT`, `ENFORCE`, `IMPLIES` instead of `REQUIRE` +* `ADD` instead of `CREATE` +** It is desirable for verb pairs for modifying operations to be consistent in the language, and recent discussions are (so far informally) suggesting `INSERT`/`DELETE` to be used for data modification, thus making `CREATE` and `DROP` available for schema modification such as constraints. The use of an existing expression to express uniqueness -- instead of using the operator `UNIQUE` -- becomes unwieldy for multiple properties, as exemplified by the following: ---- From 0ec02df35e29fc4a12b72595b43e9778166f4d7c Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 19 Jul 2019 17:29:46 +0200 Subject: [PATCH 24/30] Make textual clarifications --- cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc index f8581aab37..c8954dcf7a 100644 --- a/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc +++ b/cip/1.accepted/CIP2016-12-14-Constraint-syntax.adoc @@ -70,7 +70,7 @@ The semantics for constraints follow these general rules: 1. The constraint pattern define the constraint _domain_, where all entities that would be returned by a `MATCH` clause with the same pattern constitute the domain, with one notable exception (see <<domain-exception>>). -2. The constraint expressions defined in the `REQUIRE` clauses of the constraint definition must all evaluate to `true`. +2.
The constraint expressions defined in the `REQUIRE` clauses of the constraint definition must all evaluate to `true`, at all times. 3. [[domain-exception]]Entities for which a constraint expression evaluate to `null` under Cypher's ternary logic are _excluded_ from the constraint domain, even if they fit within the constraint pattern. @@ -78,7 +78,7 @@ The semantics for constraints follow these general rules: The following list describes the situations in which an error will be raised: -* Attempting to add a constraint on a graph where the data does not comply with the constraint criterion. +* Attempting to add a constraint on a graph where the data does not comply with a constraint predicate. * Attempting to add a constraint with a name that already exists. * Attempting to add a constraint that the underlying engine does not support enforcing. * Attempting to drop a constraint referencing a non-existent name. @@ -87,7 +87,7 @@ The following list describes the situations in which an error will be raised: ==== Mutability Once a constraint has been added, it may not be amended. -Should a user wish to change its definition, it has to be dropped and added anew with an updated structure. +Should a user wish to change a constraint definition, the constraint has to be dropped and added anew with an updated structure. [[uniqueness]] ==== Uniqueness @@ -158,7 +158,7 @@ This field contains the constraint definition, which is the contents of the cons The contents of this field are left unspecified, to be used for implementation-specific messages and/or details. -Consider the following constraint: +.Example: consider the following constraint: [source, Cypher] ---- CREATE CONSTRAINT myConstraint @@ -300,7 +300,7 @@ REQUIRE acyclic(p) === Interaction with existing features The main interaction between the constraints and the rest of the language occurs during updating statements. 
-Existing constraints will cause any updating statements to fail, thereby fulfilling the main purpose of this feature. +Existing constraints will cause some updating statements to fail, thereby fulfilling the main purpose of this feature. === Alternatives @@ -368,7 +368,7 @@ ADD CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName) Constraints make Cypher's notion of schema more well-defined, allowing users to maintain graphs in a more regular, easier-to-manage form. Additionally, this specification is deliberately defining a constraint _language_ within which a great deal of possible concrete constraints are made possible. -This allows different implementers of Cypher to independently choose how to limit the scope of supported constraint expressions that fit their model and targeted use cases. +This allows different implementers of Cypher to independently choose how to limit the scope of supported constraint expressions that fit their model and targeted use cases, while retaining a common and consistent semantic and syntactic model. 
== Caveats to this proposal From 4ef7b32893fc80e9614b061a6641e0a08e76d62b Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Mon, 22 Jul 2019 17:37:40 +0200 Subject: [PATCH 25/30] Update grammar to use CREATE Add test for DROP --- grammar/commands.xml | 2 +- tools/grammar/src/test/resources/cypher.txt | 13 +++++++------ 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/grammar/commands.xml b/grammar/commands.xml index 4cd3f16d08..01bc9bf947 100644 --- a/grammar/commands.xml +++ b/grammar/commands.xml @@ -66,7 +66,7 @@ - ADD &SP; CONSTRAINT &SP; &SP; + CREATE &SP; CONSTRAINT &SP; &SP; FOR &SP; &SP; REQUIRE &SP; diff --git a/tools/grammar/src/test/resources/cypher.txt b/tools/grammar/src/test/resources/cypher.txt index 3694daeee3..18d25f2560 100644 --- a/tools/grammar/src/test/resources/cypher.txt +++ b/tools/grammar/src/test/resources/cypher.txt @@ -312,22 +312,23 @@ CALL db.labels() YIELD * WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ CALL db.labels() YIELD x WHERE label CONTAINS 'User' AND foo + bar = foo RETURN count(label) AS numLabels§ -ADD CONSTRAINT foo +CREATE CONSTRAINT foo FOR (p:Person) REQUIRE UNIQUE p.name§ -ADD CONSTRAINT baz +CREATE CONSTRAINT baz FOR (p:Person) REQUIRE exists(p.name)§ -ADD CONSTRAINT cru +CREATE CONSTRAINT cru FOR ()-[r:REL]-() REQUIRE exists(r.property)§ DROP CONSTRAINT foo_bar_baz§ -ADD CONSTRAINT nodeKey +CREATE CONSTRAINT nodeKey FOR (n:Node) REQUIRE NODE KEY n.prop§ -ADD CONSTRAINT nodeKey +CREATE CONSTRAINT nodeKey FOR (n:Node) REQUIRE NODE KEY n.p1, n.p2, n.p3§ -ADD CONSTRAINT nodeKey +CREATE CONSTRAINT nodeKey FOR (n:Node) REQUIRE NODE KEY n.p1 ,n.p2, n.p3§ +DROP CONSTRAINT foo§ From dd6fbcd3adeedfa4850767ed9cad886dd628eead Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 3 Mar 2017 17:16:45 +0100 Subject: [PATCH 26/30] Add Neo4j index extension CIP --- .../neo4j/CIP2016-12-14-Neo4j-indexes.adoc | 138 ++++++++++++++++++ 1 file changed, 138 insertions(+) create mode 100644 
cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc diff --git a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc new file mode 100644 index 0000000000..f595b0212a --- /dev/null +++ b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc @@ -0,0 +1,138 @@ += CIP2016-12-16 - Neo4j Indexes +:numbered: +:toc: +:toc-placement: macro +:source-highlighter: codemirror + +*Author:* Mats Rydberg + +[abstract] +.Abstract +-- +This CIP details Neo4j's indexing extension to Cypher, which is based on the standardised constraints syntax. +-- + +toc::[] + +== Background + +In Neo4j, indexes are formed using label and property combinations. +This enables queries that reference these label/property combinations to use the index for faster lookup with reduced cardinality overhead. + +== Proposal + +Indexes in Neo4j are able to index _labeled nodes_ only. +These nodes are kept in a separate, persisted data structure which allows lookups based on providing values for the specified indexed properties. + +=== Syntax + +The index syntax is based on the constraint syntax, and is detailed below: + +.Grammar definition for Neo4j index syntax. +[source, ebnf] +---- +index command = create-index | drop-index ; +create-index = "CREATE", "INDEX", [ index-name ], "FOR", index-pattern, "ON", index-key ; +index-pattern = node-pattern +index-name = symbolic-name +index-key = property-expression { ",", property-expression } ; +drop-index = "DROP", "INDEX", index-name ; +---- + +The `index-key` expression defines the key for the index, and consist of one or more property expressions, which refer to the entity defined in the pattern. + +==== Index names + +Just like constraints, indexes have names. +Neo4j does not support user-defined names, but the index will be assigned a system-generated name. + +==== Removing indexes + +An index is removed by referring to its name. 
+ +.Example of dropping an index with name `index-1`: +[source, cypher] +---- +DROP INDEX index-1 +---- + +=== Semantics + +Indexes do not impose any semantics on the graph, or on queries. +They exist solely for performance reasons. + +Any query that is matching for nodes using a label and property/ies that match an index is viable to be planned using the matching index. +The _query key_ is formed by combining the referenced properties, and using this to scan the index for matching entities. +Predicates in which + +==== Domain + +Only nodes with the specified label, and values for _all_ the properties are considered part of the index domain. +This means that only queries that specify _all_ the properties will be able to be planned with index lookups. +Queries that only reference a subset of properties of an index will need the creation of another, smaller index that is defined using those properties only. + +==== Mutability + +Once an index has been created, its definition may not be amended. +Should a user wish to change its definition, the index will have to be dropped and recreated with an updated structure. + +=== Examples + +Creating indexes is straight-forward following the specified syntax. + +.An index with multiple properties +[source, cypher] +---- +CREATE INDEX +FOR (a:Address) +ON a.street, a.city, a.country +---- + +.An index with a single properties +[source, cypher] +---- +CREATE INDEX +FOR (p:Person) +ON p.name +---- + +==== Combination with Neo4j constraints + +In Neo4j, constraints are upheld through the use of indexes. +Neo4j supports three types of constraints: property uniqueness, property existence, and primary key. +These are expressed as exemplified below. 
+ +.A Neo4j property uniqueness constraint +[source, cypher] +---- +CREATE CONSTRAINT +FOR (a:Address) +REQUIRE UNIQUE a.street +---- + +.A Neo4j node property existence constraint +[source, cypher] +---- +CREATE CONSTRAINT +FOR (a:Address) +REQUIRE exists(a.street) +---- + +.A Neo4j relationship property existence constraint +[source, cypher] +---- +CREATE CONSTRAINT +FOR ()-[o:OWNS]->() +REQUIRE exists(o.since) +---- + +.A Neo4j primary key constraint +[source, cypher] +---- +CREATE CONSTRAINT +FOR (a:Address) +REQUIRE PRIMARY KEY a.street, a.city, a.country +---- + +Creating a constraint as outlined above will also create a matching index. +It will not be possible to drop that index without also dropping the constraint. From 41a0c27adcc6d3a931b09421d9e5fcfc4b163ea7 Mon Sep 17 00:00:00 2001 From: Petra Selmer Date: Wed, 17 Jan 2018 16:17:12 +0000 Subject: [PATCH 27/30] Reformatted title --- cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc index f595b0212a..42e333106b 100644 --- a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc +++ b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc @@ -1,4 +1,4 @@ -= CIP2016-12-16 - Neo4j Indexes += CIP2016-12-16 Neo4j Indexes :numbered: :toc: :toc-placement: macro From d92de3ee5b2ac5f44e617e0260754a38720e7a45 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 26 Jul 2019 16:09:31 +0200 Subject: [PATCH 28/30] Update CIP with latest developments - NODE KEY not PRIMARY KEY - Reference to constraints syntax - Properly define domain - Expanded example to explain domain definition and consequences - Error cases - Add names to examples --- .../neo4j/CIP2016-12-14-Neo4j-indexes.adoc | 120 ++++++++++++++---- 1 file changed, 92 insertions(+), 28 deletions(-) diff --git 
a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc index 42e333106b..d2402dbc93 100644 --- a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc +++ b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc @@ -26,7 +26,7 @@ These nodes are kept in a separate, persisted data structure which allows lookup === Syntax -The index syntax is based on the constraint syntax, and is detailed below: +The index syntax is based on the constraint syntax (see the Constraint Syntax CIP), and is detailed below: .Grammar definition for Neo4j index syntax. [source, ebnf] @@ -44,7 +44,7 @@ The `index-key` expression defines the key for the index, and consist of one or ==== Index names Just like constraints, indexes have names. -Neo4j does not support user-defined names, but the index will be assigned a system-generated name. +If the user does not provide a name, a system-generated name will be assigned. ==== Removing indexes @@ -60,21 +60,31 @@ DROP INDEX index-1 Indexes do not impose any semantics on the graph, or on queries. They exist solely for performance reasons. - -Any query that is matching for nodes using a label and property/ies that match an index is viable to be planned using the matching index. -The _query key_ is formed by combining the referenced properties, and using this to scan the index for matching entities. -Predicates in which +In other words, any query on any graph should behave identically in the presence of indexes as it would without them. ==== Domain -Only nodes with the specified label, and values for _all_ the properties are considered part of the index domain. -This means that only queries that specify _all_ the properties will be able to be planned with index lookups. -Queries that only reference a subset of properties of an index will need the creation of another, smaller index that is defined using those properties only. 
+For a node to be considered part of an index domain, it is required that it + +A. has the label referenced in the index pattern +B. [[B]]has a value different from `null` for all properties referenced in the index key + +A consequence of <<B>> is that an index will only partially support queries that project the indexed properties. +However, queries that pose predicates on the indexed properties will still enjoy full support in many cases. +See <<domain-example>> for more details on this difference. + +==== Errors + +The following list describes the situations in which an error will be raised: + +* Attempting to create an index with a name that already exists. +* Attempting to create an index that the underlying engine does not support enforcing. +* Attempting to drop an index referencing a non-existent name. ==== Mutability Once an index has been created, its definition may not be amended. -Should a user wish to change its definition, the index will have to be dropped and recreated with an updated structure. +Should a user wish to change the definition of an index, the index will have to be dropped and recreated with the amended definition. === Examples @@ -83,29 +93,81 @@ Creating indexes is straight-forward following the specified syntax. .An index with multiple properties [source, cypher] ---- -CREATE INDEX +CREATE INDEX addresses FOR (a:Address) ON a.street, a.city, a.country ---- -.An index with a single properties +.An index with a single property [source, cypher] ---- -CREATE INDEX +CREATE INDEX person_names FOR (p:Person) ON p.name ---- +[[domain-example]] +==== Domain example + +Consider a graph of `:Person` nodes with `name`, `email`, and `age` properties. +Not all nodes in this graph have all properties. 
+On this graph we declare the following index on all the properties: + +[source, cypher] +---- +CREATE INDEX person_properties +FOR (p:Person) +ON p.name, p.email, p.age +---- + +Queries that _project_ these properties will be unable to find all nodes for their results in the index domain. +The projection query is required to return all nodes regardless of whether the projected properties contain non-null values or not, and nodes with `null` for any of the referenced properties will not be found in the index domain. + +.Projection query: +[source, cypher] +---- +MATCH (p:Person) +RETURN p.name, p.age, p.email +---- + +Queries that pose _conjunctive predicates_ on the properties will however be able to find all required nodes in the index domain. +The predicate query is only required to return all nodes that pass the predicate, and predicates on non-existing properties will discard the tuple. +This applies even when the predicate does not reference all indexed properties. + +.Conjunctive predicate query: +[source, cypher] +---- +MATCH (p:Person) +WHERE p.email ENDS WITH '@opencypher.org' + AND p.age > 25 +RETURN p.name, p.age, p.email +---- + +[NOTE] +While this example is generally applicable, some predicate constructs behave differently for `null` values and need to be taken into special consideration. + +.Predicate with special `null` semantics: +[source, cypher] +---- +MATCH (p:Person) +WHERE p.email IS NULL + AND p.age > 25 +RETURN p.name, p.age, p.email +---- + +In this query the index domain does not contain all nodes required for the result. +Similar reasoning must be applied to disjunctive predicates which reference expressions other than indexed properties (e.g. `WHERE p.age > 25 OR p.country = 'SWE'`). + ==== Combination with Neo4j constraints -In Neo4j, constraints are upheld through the use of indexes. -Neo4j supports three types of constraints: property uniqueness, property existence, and primary key. 
+In Neo4j, constraints are generally upheld through the use of indexes. +Neo4j supports three types of constraints: property uniqueness, property existence, and node key. These are expressed as exemplified below. .A Neo4j property uniqueness constraint [source, cypher] ---- -CREATE CONSTRAINT +CREATE CONSTRAINT one_address_per_street FOR (a:Address) REQUIRE UNIQUE a.street ---- @@ -113,26 +175,28 @@ REQUIRE UNIQUE a.street .A Neo4j node property existence constraint [source, cypher] ---- -CREATE CONSTRAINT +CREATE CONSTRAINT streets_on_all_addresses FOR (a:Address) REQUIRE exists(a.street) ---- -.A Neo4j relationship property existence constraint -[source, cypher] ----- -CREATE CONSTRAINT -FOR ()-[o:OWNS]->() -REQUIRE exists(o.since) ----- - -.A Neo4j primary key constraint +.A Neo4j node key constraint [source, cypher] ---- -CREATE CONSTRAINT +CREATE CONSTRAINT address_key FOR (a:Address) -REQUIRE PRIMARY KEY a.street, a.city, a.country +REQUIRE NODE KEY a.street, a.city, a.country ---- Creating a constraint as outlined above will also create a matching index. It will not be possible to drop that index without also dropping the constraint. + +An exception to this rule is the relationship existence constraint, which is not upheld by the use of an index. 
+ +.A Neo4j relationship property existence constraint +[source, cypher] +---- +CREATE CONSTRAINT owning_must_have_start_time +FOR ()-[o:OWNS]->() +REQUIRE exists(o.since) +---- From f02ad099f0f8014d1e4f23bd5ab6b4450c17b962 Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 26 Jul 2019 16:12:46 +0200 Subject: [PATCH 29/30] Add detail on relationship to contraints CIP --- cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc index d2402dbc93..4972c8c0cf 100644 --- a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc +++ b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc @@ -24,6 +24,8 @@ This enables queries that reference these label/property combinations to use the Indexes in Neo4j are able to index _labeled nodes_ only. These nodes are kept in a separate, persisted data structure which allows lookups based on providing values for the specified indexed properties. +While not going into exact detail on every aspect, this proposal is intended to comply with all rules stated in the Constraint Syntax CIP, where applicable. 
+ === Syntax The index syntax is based on the constraint syntax (see the Constraint Syntax CIP), and is detailed below: From 2cebf70a90f564151165ea966584408f45a2666b Mon Sep 17 00:00:00 2001 From: Mats Rydberg Date: Fri, 26 Jul 2019 16:15:57 +0200 Subject: [PATCH 30/30] Add result record specification --- .../neo4j/CIP2016-12-14-Neo4j-indexes.adoc | 40 ++++++++++++++++++- 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc index 4972c8c0cf..105a2f65bd 100644 --- a/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc +++ b/cip/vendor-extensions/neo4j/CIP2016-12-14-Neo4j-indexes.adoc @@ -33,7 +33,7 @@ The index syntax is based on the constraint syntax (see the Constraint Syntax CI .Grammar definition for Neo4j index syntax. [source, ebnf] ---- -index command = create-index | drop-index ; +index-command = create-index | drop-index ; create-index = "CREATE", "INDEX", [ index-name ], "FOR", index-pattern, "ON", index-key ; index-pattern = node-pattern index-name = symbolic-name @@ -88,6 +88,44 @@ The following list describes the situations in which an error will be raised: Once an index has been created, its definition may not be amended. Should a user wish to change the definition of an index, the index will have to be dropped and recreated with the amended definition. +[[return-record]] +==== Return record + +Similar to the Constraint Syntax CIP, index commands will yield a single return record. +The result record has a fixed structure, with three string fields: `name`, `definition`, and `details`. + +An index command will always return exactly one record, if successful. +Note that `DROP INDEX` will also return a record. + +===== Name + +This field contains the name of the index, either user- or system-defined. 
+ +===== Definition + +This field contains the index definition, which is the contents of the index creation command following (and including) the `FOR` clause. + +===== Details + +The contents of this field are left unspecified, to be used for implementation-specific messages and/or details. + +.Example: consider the following index: +[source, cypher] +---- +CREATE INDEX myIndex +FOR (n:Node) +ON n.prop1, n.prop2 +---- + +A correct result record for it could be: + +---- +name | definition | details +--------------------------------------- +myIndex | FOR (n:Node) | n/a + | ON n.prop1, n.prop2 | +---- + === Examples Creating indexes is straight-forward following the specified syntax.