cypher neo4j ONgDB - ghdrako/doc_snipets GitHub Wiki
Using Cypher’s CREATE keyword to insert a subgraph.
CREATE (:Place {city:'Berlin', country:'DE'})
<-[:LIVES_IN {since:2020}]-(:Person {name:'Rosa'})
(:Person {name:'Rosa'}) is read as a person named Rosa. Between the nodes we see the boxed arrow syntax -[:LIVES_IN]-> which shows the direction of a relationship and its name (its type). Unlike nodes, which can carry several labels, a relationship has exactly one name, but you can have many relationships with arbitrary names and directions between nodes. We can add properties to relationships too, using the same map syntax, like <-[:LIVES_IN {since:2020}]-.
Using CREATE always inserts new records in the database, even when identical data already exists.
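A minimal sketch of that behavior, reusing the Rosa node from the example above. It assumes the statements are run one after another (for example in Neo4j Browser or cypher-shell, where `;` separates statements):

```cypher
// CREATE never checks for an existing match, so each run adds one more node.
CREATE (:Person {name:'Rosa'});
CREATE (:Person {name:'Rosa'});

// Count the Person nodes named Rosa; the count grows by one per CREATE above.
MATCH (p:Person {name:'Rosa'})
RETURN count(p) AS rosas;
```

This is the duplication problem that MERGE and constraints are meant to address.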
Find any nodes, represented by the parentheses in (n), and bind those matches to the variable n to be returned to the caller.
MATCH (n) RETURN n
Avoiding Duplicates When Enriching a Knowledge Graph
Cypher supports DELETE, which removes the matched records or cleanly aborts if that would leave any relationships dangling. It also has a specialized DETACH DELETE, which removes a matched node together with its associated relationships.
MATCH (n) DELETE n // Deletes nodes
// Cleanly aborts if it leaves relationships dangling.
MATCH ()-[r:LIVES_IN]->() // Deletes LIVES_IN relationships between any nodes
DELETE r
MATCH (n) DETACH DELETE n // Deletes all nodes and any relationships attached,
// effectively deleting whole graph.
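In practice a delete is usually scoped to particular nodes rather than the whole graph. A small sketch, assuming a Person node named Fred exists (the name is only illustrative here):

```cypher
// Remove one person and every relationship attached to that node.
// Other nodes, such as the Place Fred lived in, are left untouched.
MATCH (p:Person {name:'Fred'})
DETACH DELETE p
```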
CREATE CONSTRAINT no_duplicate_cities FOR (p:Place) REQUIRE (p.country,p.city) IS NODE KEY
This statement declares that Place nodes need a unique composite key composed of the city and country properties. In turn, this allows London, UK and London, Ontario to coexist in the database but prevents duplicates of either of them (or of any other city and country combination).
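A sketch of how the constraint is expected to behave, assuming the database holds no London nodes yet. The country code 'CA' for London, Ontario is an assumption made for illustration. Note that node key constraints are a Neo4j Enterprise Edition feature and also require both properties to be present on every Place node.

```cypher
// Both statements succeed: same city name, different country.
CREATE (:Place {city:'London', country:'UK'});
CREATE (:Place {city:'London', country:'CA'});  // 'CA' is an assumed code

// This one should be rejected with a constraint violation,
// because the (country, city) pair ('UK', 'London') already exists.
CREATE (:Place {city:'London', country:'UK'});
```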
MERGE (london:Place {city:'London', country:'UK'}) // Creates or matches a node to represent London, UK
// Binds it to variable "london"
MERGE (fred:Person {name:'Fred'}) // Creates or matches a node to represent Fred
// Binds it to variable "fred"
MERGE (fred)-[:LIVES_IN]->(london) // Create or match a LIVES_IN relationship
// between the fred and london nodes
MERGE (karl:Person {name:'Karl'}) // Creates or matches a node to represent Karl
// Binds it to variable "karl"
MERGE (karl)-[:LIVES_IN]->(london) // Create or match a LIVES_IN relationship
// between the karl and london nodes
MATCH (n) RETURN (n) // Show all
// in larger datasets it’s advisable to add a LIMIT to the end so you don’t wait for millions of nodes and relationships to render
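MERGE can also tell apart the "created" and "matched" cases through its ON CREATE and ON MATCH subclauses. A minimal sketch: the property names `created` and `lastSeen` are hypothetical and not part of the examples above.

```cypher
MERGE (fred:Person {name:'Fred'})
  ON CREATE SET fred.created = timestamp()   // runs only if the node was just created
  ON MATCH  SET fred.lastSeen = timestamp()  // runs only if an existing node was matched
RETURN fred
```

Running this repeatedly leaves a single Fred node, with `lastSeen` refreshed on each later run.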
Updating and Enriching the Data in a Knowledge Graph
MATCH (p:Person) WHERE p.name='Rosa' SET p.dob=19841203 // Add a property that specifies Rosa’s date of birth
MATCH (p:Person) WHERE p.name='Rosa' REMOVE p.dob // Remove a property (without deleting the associated node or relationship)
MATCH (p:Person) WHERE p.name='Rosa' REMOVE p:Person // Strip a label from a node
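The SET example above stores the date of birth as the integer 19841203. As an alternative sketch, Cypher's temporal `date` type keeps the value queryable by component. This assumes Rosa still carries the Person label, that is, the REMOVE p:Person statement has not been run.

```cypher
// Store the date of birth as a temporal value instead of an integer.
MATCH (p:Person {name:'Rosa'})
SET p.dob = date('1984-12-03')
// Components such as the year can then be read directly from the property.
RETURN p.name AS name, p.dob.year AS birthYear
```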
Graph Local Queries
We’ve anchored our queries to specific nodes, for example asking explicitly about Rosa or Berlin. We call these “graph local” queries because they’re bound to a specific part of the graph, using a particular node as their starting point.
MATCH (p:Person)-[:LIVES_IN]->(:Place {city:'Berlin', country:'DE'}) RETURN (p) // Who lives in Berlin?
MATCH (:Person {name:'Rosa'})-[:FRIEND*2..2]->(fof:Person) RETURN (fof) // Naive Friends of Friends
MATCH (rosa:Person {name:'Rosa'})-[:FRIEND*2..2]->(fof:Person) WHERE rosa <> fof RETURN (fof) // Correctly finding friends of friends
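Depending on what "friend of a friend" should mean, a further refinement may be wanted: the query above can still return people who are also Rosa's direct friends, and it can return the same person once per path. A hedged sketch of one way to tighten it:

```cypher
MATCH (rosa:Person {name:'Rosa'})-[:FRIEND*2..2]->(fof:Person)
WHERE rosa <> fof
  AND NOT (rosa)-[:FRIEND]->(fof)  // exclude people Rosa is already direct friends with
RETURN DISTINCT fof                // collapse duplicates reached via several paths
```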
There are more pattern enhancements we can apply through WHERE, including Boolean operations, string matching, path patterns, list operations, property checks, and more, such as:
- WHERE n.name STARTS WITH 'Ka'
- WHERE n.name CONTAINS 'os'
- WHERE NOT n.name ENDS WITH 'y'
- WHERE NOT (p)-[:KNOWS]->(:Person {name:'Karl'})
- WHERE n.name IN ['Rosa', 'Karl'] AND (n)-[:LIVES_IN]->(:Place {city:'Berlin'})
MATCH (:Place {city:'Berlin'})<-[:LIVES_IN]-(f:Person)<-[:FRIEND*1..2]-(p:Person) WHERE f <> p RETURN p // Correctly finding friends and friends of friends who live in Berlin
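The WHERE fragments listed earlier only make sense inside a full query. As an illustrative sketch, two of them combined (the combination itself is an assumption, not a query from the text):

```cypher
// People whose name starts with 'Ka' and who do not live in Berlin.
MATCH (n:Person)
WHERE n.name STARTS WITH 'Ka'
  AND NOT (n)-[:LIVES_IN]->(:Place {city:'Berlin'})
RETURN n.name AS name
```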
Graph Global Queries
What if we want to query the whole graph, as is often the case with knowledge graphs? We call these queries graph global.
Which are the most popular cities to live in?
MATCH (p:Place)<-[l:LIVES_IN]-(:Person)
RETURN p AS place, count(l) AS rels ORDER BY rels DESC
We don’t strictly need the Person label in this query since we know that only Place and Person nodes are joined by LIVES_IN relationships. But when querying graphs, if you know something about your graph, explicitly put it in your query. For example, adding the Person label rather than using an anonymous node () will help the query planner create better query plans by reducing the scope of the search.
Cypher has aggregation functions avg (average), max (maximum), min (minimum), sum, and so forth along with clauses like SKIP (skip some results) and LIMIT (limit the number of results returned). Using these, we can craft queries that run over very large knowledge graphs and still return compact, pertinent aggregate information to the user.
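A sketch of how these pieces might combine, using the Place, Person, and LIVES_IN names and the `since` relationship property from the earlier examples. The page size of 5 is an arbitrary choice.

```cypher
// Top five cities by number of residents.
MATCH (p:Place)<-[l:LIVES_IN]-(:Person)
RETURN p.city AS city, p.country AS country, count(l) AS residents
ORDER BY residents DESC
SKIP 0    // raise to 5, 10, ... to page through further results
LIMIT 5;

// Earliest and latest move-in year per city, from the 'since' property.
// Relationships without 'since' are ignored by min() and max().
MATCH (p:Place)<-[l:LIVES_IN]-(:Person)
RETURN p.city AS city, min(l.since) AS earliest, max(l.since) AS latest;
```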
Supporting Tools for Writing Knowledge Graph Queries
The EXPLAIN and PROFILE keywords help query developers understand the behavior of their queries. These tools provide a deeper understanding of how your queries are performing, which is especially important as your knowledge graphs become larger.
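Both keywords are placed in front of an ordinary query. A short sketch using the "Who lives in Berlin?" query from earlier:

```cypher
// EXPLAIN shows the planned operators without executing the query.
EXPLAIN
MATCH (p:Person)-[:LIVES_IN]->(:Place {city:'Berlin', country:'DE'})
RETURN p;

// PROFILE executes the query and reports rows and database hits per operator.
PROFILE
MATCH (p:Person)-[:LIVES_IN]->(:Place {city:'Berlin', country:'DE'})
RETURN p;
```

Because PROFILE actually runs the statement, EXPLAIN is the safer choice for queries that write data or are expected to be slow.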
- https://neo4j.com/docs/java-reference/current/extending-neo4j/procedures/
- https://neo4j.com/docs/java-reference/current/extending-neo4j/functions/
It’s also possible to write queries in any programming language that runs on the Java Virtual Machine. We tend to drop down into a programming language when we need an algorithm that is more easily coded and tested in that language than in Cypher. It’s not uncommon for Cypher and other languages to be mixed, where Cypher calls procedures and functions coded in the external language that are tuned to some specific part of the problem or graph topology. Calling a procedure from Cypher is simple, with just the procedure name and any arguments. For example, the procedure CALL db.labels() will return the labels currently used in the database while CALL db.labels() YIELD label RETURN count (label) will return the number of labels currently in use. Similarly CALL dbms.security.createUser('example_username', 'example_password', false) is an example of how to call a procedure with arguments.
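Procedure results can be filtered and sorted like any other rows once they are bound with YIELD. A small sketch building on db.labels(); the 'P' prefix filter is just an example:

```cypher
// List the labels in use that start with 'P', in alphabetical order.
CALL db.labels() YIELD label
WHERE label STARTS WITH 'P'
RETURN label
ORDER BY label
```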