Query Scenarios

In BangDB we have a variety of Cypher Query Types. In this document, we listed all the query types with examples.

Movies data

Graph creation

To Create a new Graph in BangDB CLI, run the following command:

CREATE GRAPH Movies

To use an existing graph use the following command:

USE GRAPH Movies

Graph data

To ingest data in Movies Graph, run following commands from BangDB CLI:

CREATE (Movie:Matrix {"title":"The Matrix","released":"864258308"})

CREATE (Movie:cloudAtlas {"title":"Cloud Atlas", "released":"1356119730"})

CREATE (Movie:forrestGump {"title":"Forrest Gump", "released":"776409908"})

CREATE (Person:keanu {"fullname":"Keanu Reeves", "dob":"19640523123415"})

CREATE (Person:robert {"fullname":"Robert Zemeckis", "dob":"19510809041545"})

CREATE (Person:tom {"fullname":"Tom Hanks", "dob":"195601112345"})

CREATE (Person:tom)-[ACTED_IN {"roles":"Forrest"}]->(Movie:forrestGump)

CREATE (Person:tom)-[ACTED_IN {"roles":"Zachry"}]->(Movie:cloudAtlas)

CREATE (Person:keanu)-[ACTED_IN {"roles":"John Wick"}]->(Movie:Matrix)

CREATE (Person:robert)-[DIRECTED]->(Movie:forrestGump)

CREATE (Movie:forrestGump)-[WON_AWARD]->(Award:oscar)

CREATE (Person:david {"fullname":"David Lynch", "dob":"19460120140525"})-[ACTED_IN {"roles": "Morgue Attendant"}]->(Movie:lostHighway {"title": "Lost Highway", "released": "853338330"})

CREATE (Person:david)-[DIRECTED]->(Movie:lostHighway)

CREATE (Person:david)-[DIRECTED]->(Movie:theStraightStory {"title": "The Straight Story", "released": "927323127"})

CREATE (Person:david)-[DIRECTED]->(Movie:sidebyside {"title": "Side By Side", "released": "1329336940"})

CREATE (Person:keanu)-[ACTED_IN]->(Movie:sidebyside)

Scenarios

How to check all unique relations formed between all Subjects (Sub) & Objects (Obj)?

Query:

S{UNIQUE_SELECT}=>(@S *)-[@r *]->(@O *);RETURN S.label AS Sub, r.rel AS rel, O.label AS Obj

Output:

+-----+------+---------+
|Obj  |Sub   |      rel|
+-----+------+---------+
|Movie|Person| ACTED_IN|
+-----+------+---------+
|Movie|Person| DIRECTED|
+-----+------+---------+
|Award|Movie |WON_AWARD|
+-----+------+---------+

Description:

From the output, we can see the Unique relations formed in the Graph. In BangDB, we mention Alias i.e. [@S, @r, @O] to return properties using them.

Here, UNIQUE_SELECT is used in the query to get a Unique set of results. UNIQUE_SELECT can be used when we want to Unique the output based on all the values returned in the RETURN Statement i.e. [Person, ACTED_IN, Movie] is Unique.

How to list all Persons in DB with all their properties?

Query:

S=>(@P Person:*);RETURN *

S=>(Person:*);RETURN *

Output:

+---------------+--------------+------+------+------------------------------+--+
|fullname       |dob           | label|name  |_pk                           |_v|
+---------------+--------------+------+------+------------------------------+--+
|Tom Hanks      |195601112345  |Person|tom   |2471870506:978343320853564008 |1 |
+---------------+--------------+------+------+------------------------------+--+
|David Lynch    |19460120140525|Person|david |2471870506:2719078225930297886|1 |
+---------------+--------------+------+------+------------------------------+--+
|Keanu Reeves   |19640523123415|Person|keanu |2471870506:3769017240657799182|1 |
+---------------+--------------+------+------+------------------------------+--+
|Robert Zemeckis|19510809041545|Person|robert|2471870506:8066861659955905996|1 |
+---------------+--------------+------+------+------------------------------+--+

Description:

RETURN * is a feature provided in BangDB Cypher, which will return all Nodes (Subject/Object) Properties & Relation Properties in one go.

RETURN * is case-sensitive.

How to list all properties for a specified Person named 'Tom'?

Query:

S=>(@P Person:tom);RETURN *

S=>(@P Person:"tom");RETURN *

S=>(@P Person:* {fullname="Tom Hanks"});RETURN *

Output:

+---------+------------+------+----+-----------------------------+--+
|fullname |dob         | label|name|_pk                          |_v|
+---------+------------+------+----+-----------------------------+--+
|Tom Hanks|195601112345|Person|tom |2471870506:978343320853564008|1 |
+---------+------------+------+----+-----------------------------+--+

Description:

The above three queries will return similar output. The only change is the way of writing it.

In the 1st Query, we provided the Node name i.e. tom after colon (:) which means that we are asking the DB to return all Node Properties for the Node = tom.
In the 2nd Query, we provided double quotes (") with the Node name tom, this won't make any change in the query which makes it different from the 1st Query, but we use double quotes when there is a space between the letters of the Node Name. For example: Suppose if instead of tom it would be tom hanks, then to use it for Node Name Filter, we would have provided S=>(@P Person:"tom hanks");RETURN *
In the 3rd Query, instead of using the Node name, we used its property i.e. fullname. Which is Tom Hanks for Node Name tom.
In BangDB, when we want to put a value filter on a property value, we can use the 3rd query syntax or we can use WHERE. Which we will discuss later in this document.

How to get a list of all the movie names with their release dates? Also how to convert Timestamp Data into a normal DateTime Format?

BangDB Cypher provides MATH_EXP (Math Expressions) & DATE_FMT_TS (Date Format from TimeStamp) functionalities for many use cases.

MATH_EXP - Date Calculations can only be described in MATH_EXP while writing in a Cypher Query.
DATE_FMT_TS - Using this terminology, we can provide TimeStamp Data either in seconds/milliseconds/microseconds & can convert it into any specific Date or DateTime format.

Two examples are shown below specifying the Syntax & the supported DateTime formats.

Query 1:

S=>(@M Movie:*);RETURN M.title AS Title, M.released AS released, MATH_EXP("DATE_FMT_TS($released, %Y/%m/%d %H:%M:%S)") AS ReleasedDate

Output:

+------------------+----------+-------------------+
            |Title             |released  |       ReleasedDate|
+------------------+----------+-------------------+
|The Straight Story|927323127 |1999/05/22 03:15:27|
+------------------+----------+-------------------+
|Forrest Gump      |776409908 |1994/08/09 10:55:08|
+------------------+----------+-------------------+
|Cloud Atlas       |1356119730|2012/12/22 01:25:30|
+------------------+----------+-------------------+
|Lost Highway      |853338330 |1997/01/15 19:55:30|
+------------------+----------+-------------------+
|Side By Side      |1329336940|2012/02/16 01:45:40|
+------------------+----------+-------------------+
|The Matrix        |864258308 |1997/05/22 05:15:08|
+------------------+----------+-------------------+

Query 2:

S=>(@M Movie:*);RETURN M.title AS Title, M.released AS released, MATH_EXP("DATE_FMT_TS($released, %F)") AS ReleasedDate

Output:

+------------------+----------+------------+
|Title             |released  |ReleasedDate|
+------------------+----------+------------+
|The Straight Story|927323127 |  1999-05-22|
+------------------+----------+------------+
|Forrest Gump      |776409908 |  1994-08-09|
+------------------+----------+------------+
|Cloud Atlas       |1356119730|  2012-12-22|
+------------------+----------+------------+
|Lost Highway      |853338330 |  1997-01-15|
+------------------+----------+------------+
|Side By Side      |1329336940|  2012-02-16|
+------------------+----------+------------+
|The Matrix        |864258308 |  1997-05-22|
+------------------+----------+------------+

As you can see, BangDB provides us with this feature to format Date and Time. We can use the following Terminologies as well:

%Y - Year (1951)
%y - Year (22)
%m - Month (08)
%d - Day (09)
%H - Hour
%M - mm(Minutes in number format)
%S - ss(seconds)
%p - AM/PM
%P - am/pm
%z - +05:30(Time zone)
%F - YYYY/MM/DD
%A - Day (Wednesday)
%B - Month (January)
%C - YY(starting 2 numbers)
%D - MM/DD/YYYY
%G - YYYY(Year)
%H - HH(Hours)
%I - DD (Day in number format)
%P - AM/PM
%R - HH:MM(Hours:Minutes)
%T and %X- HH:MM:SS(Hours:Minutes:Seconds)
%Y - YYYY(Year)
%Z - Time Zone(IST)

How to calculate Age using DateOfBirth for all the Persons?

Query:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS DOB, MATH_EXP("DATE((CURTIME-$DOB)/31536000000000)") AS Age

Output:

+---------------+--------------+---------+
|Name           |DOB           |      Age|
+---------------+--------------+---------+
|Tom Hanks      |195601112345  |67.012833|
+---------------+--------------+---------+
|David Lynch    |19460120140525|76.994758|
+---------------+--------------+---------+
|Keanu Reeves   |19640523123415|58.644246|
+---------------+--------------+---------+
|Robert Zemeckis|19510809041545|71.442456|
+---------------+--------------+---------+

Description:

$DOB refers to the Person Node Property dob. To provide a values in MATH_EXP, we add prefix $ before variable name i.e. $DOB. Above MATH_EXP will return Age as LONG Type i.e. in decimals as we are subtracting Current Time from DOB.

CURTIME - Current Time in Microseconds

31536000000000 - microseconds in a Year.

Calculation:60 sec * 60 mins * 24 hours * 365 days = 31536000 seconds

Conversion of seconds into microseconds = 31536000 * 1000000 = 31536000000000 microseconds in an Year.

How to calculate all Age(round-off number) using DateOfBirth for all the Persons?

Query:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS DOB, MATH_EXP("ROUND(DATE((CURTIME-$DOB)/31536000000000))") AS Age

Output:

+---------------+--------------+---------+
            |Name           |DOB           |      Age|
+---------------+--------------+---------+
|Tom Hanks      |195601112345  |67.000000|
+---------------+--------------+---------+
|David Lynch    |19460120140525|77.000000|
+---------------+--------------+---------+
|Keanu Reeves   |19640523123415|59.000000|
+---------------+--------------+---------+
|Robert Zemeckis|19510809041545|71.000000|
+---------------+--------------+---------+

Description:

In this Query, we have applied ROUND functionality on top of Age Calculation. Which Rounds Off the Decimal Numbers.

How to extract Year/Month/Day from dates using Cypher Query?

Query:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS DOB, MATH_EXP("DATE_YEAR($DOB+0)") AS Year, MATH_EXP("DATE_MONTH($DOB+0)") AS Month, MATH_EXP("DATE_DAY($DOB+0)") AS Day

Output:

+---------------+--------------+-----------+---------+--------+
|Name           |DOB           |       Year|Day      |Month   |
+---------------+--------------+-----------+---------+--------+
|Tom Hanks      |195601112345  |1956.000000|11.000000|1.000000|
+---------------+--------------+-----------+---------+--------+
|David Lynch    |19460120140525|1946.000000|20.000000|1.000000|
+---------------+--------------+-----------+---------+--------+
|Keanu Reeves   |19640523123415|1964.000000|23.000000|5.000000|
+---------------+--------------+-----------+---------+--------+
|Robert Zemeckis|19510809041545|1951.000000|9.000000 |8.000000|
+---------------+--------------+-----------+---------+--------+

Description:

In BangDB, we have DATE_YEAR, DATE_MONTH & DATE_DAY terminologies to extract Year / Month /Day from Dates. These terminologies can be used inside MATH_EXP.

How to change the DateTime format using Cypher Query?

BangDB Cypher provides MATH_EXP (Math Expressions) & DATE_FMT (Date Format) functionalities for many use cases.

DATE_FMT - Using this terminology, we can provide any format of DateTime & can convert it into any specific Date or DateTime format whichever we want.
SORT_ASC_QLEX/SORT_DESC_QLEX - Using this terminology, we can sort any number which is in string format.

Two examples are shown below specifying the Syntax & the supported DateTime formats.

Query 1:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS DOB, MATH_EXP("DATE_FMT($DOB, %F)") AS DateOfBirth SORT_ASC_QLEX DOB

Output:

+---------------+--------------+-----------+
|Name           |DOB           |DateOfBirth|
+---------------+--------------+-----------+
|David Lynch    |19460120140525| 1946-01-20|
+---------------+--------------+-----------+
|Robert Zemeckis|19510809041545| 1951-08-09|
+---------------+--------------+-----------+
|Tom Hanks      |195601112345  | 1956-01-11|
+---------------+--------------+-----------+
|Keanu Reeves   |19640523123415| 1964-05-23|
+---------------+--------------+-----------+

As you can see, BangDB provides us with this feature to format Date and Time. We can use different Terminologies also:

%F - YYYY/MM/DD
%D - MM/DD/YYYY

Query 2:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS DOB, MATH_EXP("DATE_FMT($DOB, %Y/%m/%d %H:%M:%S)") AS DateOfBirth SORT_DESC_QLEX DOB

Output:

+---------------+--------------+-------------------+
|Name           |DOB           |        DateOfBirth|
+---------------+--------------+-------------------+
|Keanu Reeves   |19640523123415|1964/05/23 12:34:00|
+---------------+--------------+-------------------+
|Tom Hanks      |195601112345  |1956/01/11 23:45:00|
+---------------+--------------+-------------------+
|Robert Zemeckis|19510809041545|1951/08/09 04:15:00|
+---------------+--------------+-------------------+
|David Lynch    |19460120140525|1946/01/20 14:05:00|
+---------------+--------------+-------------------+

Terminologies:

%Y - Year (1951)
%y - Year (22)
%m - Month (08)
%d - Day (09)
%H - Hour
%M - mm(Minutes in number format)
%S - ss(seconds)
%p - AM/PM
%P - am/pm
%z - +05:30(Time zone)
%A - Day (Wednesday)
%B - Month (January)
%C - YY(starting 2 numbers)
%G - YYYY(Year)
%H - HH(Hours)
%I - DD (Day in number format)
%P - AM/PM
%R - HH:MM(Hours:Minutes)
%T and %X- HH:MM:SS(Hours:Minutes:Seconds)
%Y - YYYY(Year)
%Z - Time Zone(IST)

How to return all triples formed in the Graph?

Query:

S=>(*)-[*]->(*)

Output:

+-----------------+---------+----------------------+
|sub              |pred     |                   obj|
+-----------------+---------+----------------------+
|Person:tom       |ACTED_IN |     Movie:forrestGump|
+-----------------+---------+----------------------+
|Person:tom       |ACTED_IN |      Movie:cloudAtlas|
+-----------------+---------+----------------------+
|Person:david     |ACTED_IN |     Movie:lostHighway|
+-----------------+---------+----------------------+
|Person:keanu     |ACTED_IN |      Movie:sidebyside|
+-----------------+---------+----------------------+
|Person:keanu     |ACTED_IN |          Movie:Matrix|
+-----------------+---------+----------------------+
|Person:david     |DIRECTED |Movie:theStraightStory|
+-----------------+---------+----------------------+
|Person:david     |DIRECTED |     Movie:lostHighway|
+-----------------+---------+----------------------+
|Person:david     |DIRECTED |      Movie:sidebyside|
+-----------------+---------+----------------------+
|Person:robert    |DIRECTED |     Movie:forrestGump|
+-----------------+---------+----------------------+
|Movie:forrestGump|WON_AWARD|           Award:oscar|
+-----------------+---------+----------------------+

Description:

In BangDB star(*) represents 'all', i.e. here we have put a star as SUB, OBJ, and Relation. Thus, it will look for all the nodes and their associated relations.

How to get the list of Persons with Movie names they Acted/Directed in?

Query for Persons Acted In Movies:

S=>(@P Person:*)-[@r ACTED_IN]->(@M Movie:*)

Output:

+-----------+-------+------+-------+--------+
|M_name     |M_label|P_name|P_label|r_rel   |
+-----------+-------+------+-------+--------+
|forrestGump|Movie  |   tom|Person |ACTED_IN|
+-----------+-------+------+-------+--------+
|cloudAtlas |Movie  |   tom|Person |ACTED_IN|
+-----------+-------+------+-------+--------+
|lostHighway|Movie  | david|Person |ACTED_IN|
+-----------+-------+------+-------+--------+
|sidebyside |Movie  | keanu|Person |ACTED_IN|
+-----------+-------+------+-------+--------+
|Matrix     |Movie  | keanu|Person |ACTED_IN|
+-----------+-------+------+-------+--------+

Query for Persons who Directed Movies:

S=>(@P Person:*)-[@r DIRECTED]->(@M Movie:*)

Output:

+----------------+-------+------+-------+--------+
|M_name          |M_label|P_name|P_label|r_rel   |
+----------------+-------+------+-------+--------+
|theStraightStory|Movie  | david|Person |DIRECTED|
+----------------+-------+------+-------+--------+
|lostHighway     |Movie  | david|Person |DIRECTED|
+----------------+-------+------+-------+--------+
|sidebyside      |Movie  | david|Person |DIRECTED|
+----------------+-------+------+-------+--------+
|forrestGump     |Movie  |robert|Person |DIRECTED|
+----------------+-------+------+-------+--------+

How to get the list of Persons aged above 60 years?

Query:

S=>(@P Person:* {DATE((CURTIME-$dob)/31536000000000) > 60});RETURN P.fullname AS Name, P.dob AS dob, MATH_EXP("ROUND(DATE((CURTIME-$dob)/31536000000000))") AS Age

Output:

+---------------+--------------+---------+
            |Name           |dob           |      Age|
            +---------------+--------------+---------+
            |Tom Hanks      |195601112345  |67.000000|
            +---------------+--------------+---------+
            |David Lynch    |19460120140525|77.000000|
            +---------------+--------------+---------+
            |Robert Zemeckis|19510809041545|71.000000|
            +---------------+--------------+---------+

Description:

From Scenarios 5 & 6, we already saw how we can calculate the Age for all the Person's in the DB. In this Query, we can see that we have used the same MATH_EXP in Person Node Property to put a filter for Age above 60 years. BangDB Cypher enables us to use the content of MATH_EXP's inside Property Filters instead of WHERE Clause.

We can get the same output using WHERE Clause as well. By using below query:

S=>(@P Person:*);RETURN P.fullname AS Name, P.dob AS dob, MATH_EXP("ROUND(DATE((CURTIME-$dob)/31536000000000))") AS Age WHERE Age > 60

How to Count the Number of Persons in the Database?

Query:

S=>(@P Person:*);RETURN COUNT(P.name) AS Cnt

Output:

+---+
|Cnt|
+---+
|4  |
+---+

Description:

COUNT is one of the Aggregation Terminologies provided by BangDB Cypher. We will see examples for more Aggregation Terminologies going forward in this document.

How to COUNT the Number of Movies done by a Person?

Query:

S=>(@P Person:*)-[@r *]->(@M Movie:*);RETURN P.fullname AS Name, COUNT(M.name) AS Cnt SORT_DESC Cnt

Output:

+---------------+---+
|Name           |Cnt|
+---------------+---+
|David Lynch    |4  |
+---------------+---+
|Keanu Reeves   |2  |
+---------------+---+
|Tom Hanks      |2  |
+---------------+---+
|Robert Zemeckis|1  |
+---------------+---+

Description:

Here if you see we are using star(*) inside the relation. BangDB allows us to use this functionality. It will look for all the possible relations which are linked with the respected Subject & Object.

How to find out, in how many movies an actor and a director worked on together?

Query:

S=>(@P1 Person:*)-[@r1 DIRECTED]->(@M Movie:*)<-[@r2 ACTED_IN]-(@P2 Person:*); RETURN M.title AS Movie, P1.fullname AS Director, P2.fullname AS Actor, COUNT(*) AS Cnt

Output:

+------------+---------------+------------+---+
|Movie       |Director       |       Actor|Cnt|
+------------+---------------+------------+---+
|Forrest Gump|Robert Zemeckis|   Tom Hanks|1  |
+------------+---------------+------------+---+
|Lost Highway|David Lynch    | David Lynch|1  |
+------------+---------------+------------+---+
|Side By Side|David Lynch    |Keanu Reeves|1  |
+------------+---------------+------------+---+

Description:

Here, in the above query, we have mentioned two relations both traveling from Person Node to Movie Node. But only the relation names are different i.e. ACTED_IN & DIRECTED.

How to find Movie details based on a keyword search/pattern in the Value Field?

Query:

S=>(@P1 Person:*)-[@r1 DIRECTED]->(@M Movie:*)<-[@r2 ACTED_IN]-(@P2 Person:*);RETURN P1.fullname AS Director, P2.fullname AS Actor, M.title AS Title, M.released AS released, MATH_EXP("DATE_FMT_TS($released, %D)") AS ReleasedOn WHERE Title= "F$%"

Output:

+------------+---------+---------------+---------+----------+
|Title       |released |       Director|Actor    |ReleasedOn|
+------------+---------+---------------+---------+----------+
|Forrest Gump|776409908|Robert Zemeckis|Tom Hanks|08/09/94  |
+------------+---------+---------------+---------+----------+

Description:

In BangDB Cypher we can use “$%” to search any string from the starting word. Here we are searching for movies whose Title starts with "F"

How to find a Person who is a Director & also an actor?

Query:

S2=>[S1=>(@P Person:*)-[@r1 DIRECTED]->(@M1 Movie:*)]-[@r2 ACTED_IN]->(@M2 Movie:*);RETURN P.fullname AS Person, P.dob AS dob, MATH_EXP("DATE_FMT($dob, %F)") AS DOB, MATH_EXP("ROUND(DATE((CURTIME-$dob)/31536000000000))") AS Age

Description:

To understand the workflow we have to break this query into 2 parts:

In the 1st part, S1, we are making a relationship from Person to Movie as DIRECTED that means it will return us all Person names who directed any movie.
In the 2nd part i.e. S2, DB will take the list of Persons & will check whether they have a relation ACTED_IN with any Movie or not.
By doing this we will get the Person's list who have Directed as well as Acted in some movie.
This type of query building is referred to as NESTED Query where we want the output to fulfill the conditions of both relations.

Output:

+-----------+--------------+---------+----------+
|Person     |dob           |      Age|DOB       |
+-----------+--------------+---------+----------+
|David Lynch|19460120140525|77.000000|1946-01-20|
+-----------+--------------+---------+----------+

So, from the Output, we can see that the Person David Lynch is a Director as well as an Actor.

How to get the Count of Number of Movies released each Year?

Query:

S=>(@M Movie:*);RETURN M.released AS released, MATH_EXP("DATE_FMT_TS($released, %Y)") AS Year, COUNT(M.title) AS Cnt EXCLUDE_GROUPBY(released)

Output:

+----+---+
|Year|Cnt|
+----+---+
|1999|1  |
+----+---+
|2012|2  |
+----+---+
|1997|2  |
+----+---+
|1994|1  |
+----+---+

Description:

In BangDB star(*) represents 'all', i.e. here we have put a star as SUB, OBJ, and Relation. Thus, it will look for all the nodes and their associated relations.

How to get the list of Movies for which we have Director & Actor Data available in DB?

Query:

<CROSS USING movie>S1=>(@P1 Person:*)-[@r1 DIRECTED]->(@M1 Movie:*);RETURN M1.title AS movie, M1.released AS released, MATH_EXP("DATE_FMT_TS($released, %F)") AS ReleasedOn ++ S2=>(@P2 Person)-[@r2 ACTED_IN]->(@M2 Movie:*);RETURN M2.title AS movie

Output:

+------------+----------+----------+
|movie       |released  |ReleasedOn|
+------------+----------+----------+
|Side By Side|1329336940|2012-02-16|
+------------+----------+----------+
|Lost Highway|853338330 |1997-01-15|
+------------+----------+----------+
|Forrest Gump|776409908 |1994-08-09|
+------------+----------+----------+

Description:

BangDB Cypher provided CROSS / ADD / SUBTRACT functionalities in queries.
In CROSS Query types, we can provide the property on which we want two query outputs to be combined but return only the common values.
For Example: In above query, we are providing movie as the property on which we want both relations to combine their outputs & in return provide the Movie names which are common in both.
Here we can return other properties as well like we returned the released property of Movie Node Type which is in TimeStamp format. So, we used MATH_EXP to convert the TimeStamp into a readable DateTime format using DATE_FMT_TS.

Sales data

Graph creation

To create a new graph use the following command:

CREATE GRAPH Sales

To use an existing graph use the following command:

USE GRAPH MyOldGraph

Graph data

Use the following command to create data. 'CREATE' keyword should be used here.

USE GRAPH Sales

Let's take one example and create a relationship as:

Country to State
State to City
City to Groups
And our Sales Amount, we are defining it as a relationship property, between City to Groups

CREATE (Country:"United States")-[HAS]->(State:"Georgia")-[HAS]->(City:"Byron")-[ASSOCIATED { "SalesAmount": 4.99}]->(Groups:"0-10k")

If everything goes well, BangDB shows us output as following:

{
   "errcode" : 0,
   "msg" : [
      "success"
   ]
}

In a similar way we can ingest as many as records in our graph.

CREATE (Country:"Australia")-[HAS]->(State:"South Australia")-[HAS]->(City:"Perth")-[ASSOCIATED { "SalesAmount": 183388.5186}]->(Groups:"150k-200k")

CREATE (Country:"Canada")-[HAS]->(State:"British Columbia")-[HAS]->(City:"Malabar")-[ASSOCIATED { "SalesAmount": 144920.3202}]->(Groups:"100k-150k")

CREATE (Country:"United States")-[HAS]->(State:"Washington")-[HAS]->(City:"Milwaukie")-[ASSOCIATED { "SalesAmount": 118732.8813}]->(Groups:"100k-150k")

CREATE (Country:"United States")-[HAS]->(State:"Texas")-[HAS]->(City:"College Station")-[ASSOCIATED { "SalesAmount": 1597.7}]->(Groups:"0-10k")

CREATE (Country:"United States")-[HAS]->(State:"Washington")-[HAS]->(City:"Downey")-[ASSOCIATED { "SalesAmount": 116147.121}]->(Groups:"100k-150k")

CREATE (Country:"France")-[HAS]->(State:"Essonne")-[HAS]->(City:"Rockhampton")-[ASSOCIATED { "SalesAmount": 181244.7285}]->(Groups:"150k-200k")

CREATE (Country:"United States")-[HAS]->(State:"California")-[HAS]->(City:"Bluffton")-[ASSOCIATED { "SalesAmount": 2400.34}]->(Groups:"0-10k")

CREATE (Country:"United States")-[HAS]->(State:"Washington")-[HAS]->(City:"Langford")-[ASSOCIATED { "SalesAmount": 128693.9269}]->(Groups:"100k-150k")

CREATE (Country:"United Kingdom")-[HAS]->(State:"England")-[HAS]->(City:"Hof")-[ASSOCIATED { "SalesAmount": 58119.8632}]->(Groups:"50k-70k")

CREATE (Country:"United States")-[HAS]->(State:"New York")-[HAS]->(City:"Braintree")-[ASSOCIATED { "SalesAmount": 1735.98}]->(Groups:"0-10k")

CREATE (Country:"United Kingdom")-[HAS]->(State:"England")-[HAS]->(City:"North Sydney")-[ASSOCIATED { "SalesAmount": 34956.7407}]->(Groups:"30k-50k")

CREATE (Country:"Canada")-[HAS]->(State:"British Columbia")-[HAS]->(City:"Coronado")-[ASSOCIATED { "SalesAmount": 126086.6603}]->(Groups:"100k-150k")

CREATE (Country:"Australia")-[HAS]->(State:"New South Wales")-[HAS]->(City:"St. Leonards")-[ASSOCIATED { "SalesAmount": 155010.3774}]->(Groups:"150k-200k")

CREATE (Country:"United States")-[HAS]->(State:"California")-[HAS]->(City:"Lavender Bay")-[ASSOCIATED { "SalesAmount": 160427.2862}]->(Groups:"150k-200k")

CREATE (Country:"Australia")-[HAS]->(State:"New South Wales")-[HAS]->(City:"North Ryde")-[ASSOCIATED { "SalesAmount": 175222.5111}]->(Groups:"150k-200k")

CREATE (Country:"Australia")-[HAS]->(City:"Perth")

CREATE (Country:"Canada")-[HAS]->(City:"Malabar")

CREATE (Country:"United States")-[HAS]->(City:"Milwaukie")

CREATE (Country:"United States")-[HAS]->(City:"College Station")

CREATE (Country:"United States")-[HAS]->(City:"Downey")

CREATE (Country:"France")-[HAS]->(City:"Rockhampton")

CREATE (Country:"United States")-[HAS]->(City:"Bluffton")

CREATE (Country:"United States")-[HAS]->(City:"Langford")

CREATE (Country:"United Kingdom")-[HAS]->(City:"Hof")

CREATE (Country:"United States")-[HAS]->(City:"Braintree")

CREATE (Country:"United Kingdom")-[HAS]->(City:"North Sydney")

CREATE (Country:"Canada")-[HAS]->(City:"Coronado")

CREATE (Country:"Australia")-[HAS]->(City:"St. Leonards")

CREATE (Country:"United States")-[HAS]->(City:"Lavender Bay")

CREATE (Country:"Australia")-[HAS]->(City:"North Ryde")

Scenarios

How to get a COUNT of all States in a Country?

Query:

s=>(@p Country:*)-[@r HAS]->(@c State:*);RETURN p.name AS Country, UCOUNT(c.name) AS "No. of States"

Description:

If we see the schema, we have two separate NODES for Country and States. But we don't know from which Country this State belongs. To know this, we have defined a relation between Country and States as “HAS”. Using the BangDB syntax to use Nodes and Relations we can know the number of States in a Country.

If you wonder, we have an arrow pointing to our 2nd NODE. To which node this arrow is pointing is actually our OBJECT and the origin of this arrow signifies SUBJECT. We have to go through the schema for better understanding.

Syntax: X=>(@alias1 NODE1:*)-[@alia2 RELATION]->((@alias3 NODE2:*);RETURN ….

In the first RETURN, we are returning Country names, and then we are returning a unique count of no.of States using States NODE.

BangDB provides us two features to count no. event. COUNT & UCOUNT. As we know, in this dataset, state names can be repeated to represent all its Cities. That's why we are using UCOUNT, which means a Unique Count of all the States present in a Country.

+--------------+-------------+
|Country       |No. of States|
+--------------+-------------+
|Canada        |1            |
+--------------+-------------+
|Australia     |2            |
+--------------+-------------+
|France        |1            |
+--------------+-------------+
|United Kingdom|1            |
+--------------+-------------+
|United States |5            |
+--------------+-------------+

How to get a COUNT of all Cities in a Country?

Query:

s=>(@p Country:*)-[@r HAS]->(@c City:*);RETURN p.name AS Country, COUNT(c.name) AS "No. of Cities" SORT_DESC "No. of Cities"

Description:

To make a query for this requirement, we want some relation between Country and City and if we see the schema, we have a relation between these two nodes as "HAS".

Here if you see we are using COUNT instead of UCOUNT. Because in our dataset we know all the cities are already unique. So we can use COUNT here.

We are also sorting the result in descending order of "No. of Cities" using SORT_DESC. It is case-sensitive.

We can also sort the result in ascending order using SORT_ASC.

+--------------+-------------+
|Country       |No. of Cities|
+--------------+-------------+
|United States |7            |
+--------------+-------------+
|Australia     |3            |
+--------------+-------------+
|Canada        |2            |
+--------------+-------------+
|United Kingdom|2            |
+--------------+-------------+
|France        |1            |
+--------------+-------------+

How to list down all Cities in a State of a specific Country?

Query:

s=>(@c Country:*)-[@r1 HAS]->(@s State:*)-[@r2 HAS]->(@ci City:*);RETURN c.name AS Country, s.name AS State, ci.name AS City WHERE Country = Canada

Description:

Here we have to use Country, State, and City NodeTypes in the same query. If we see the schema, we have a relation from Country to State, State to City as "HAS".

If we break this query into two parts, we will have

Country to State relationship.
State to City relationship.

In the 1st part, State was our OBJECT and in the 2nd part, State becomes our SUBJECT. This type of query writing is known as Chain query where we have more than one relation mentioned.

In the RETURN, we can use the respected alias names and call their defined properties.

To put filter conditions on Country names we are using the WHERE condition.

Output:

+----------------+--------+-------+
|State           |City    |Country|
+----------------+--------+-------+
|British Columbia|Malabar | Canada|
+----------------+--------+-------+
|British Columbia|Coronado| Canada|
+----------------+--------+-------+

How to get a SUM of Total Sales in the Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, SUM(r3.SalesAmount) AS "Total Sales Amount"

Description:

As we know we have a relation from Country to City, but now we want to show the Sales Amount and we are defining this in the schema as a relation property.

So now we have to add one more relation from City to Groups.

In the RETURN we are showing Country Names and SUM of SalesAmount using the r3 alias name because SalesAmount property is defined inside this relationship.

Output:

+--------------+------------------+
|Country       |Total Sales Amount|
+--------------+------------------+
|Canada        |271006.980500     |
+--------------+------------------+
|Australia     |513621.407100     |
+--------------+------------------+
|France        |181244.728500     |
+--------------+------------------+
|United Kingdom|93076.603900      |
+--------------+------------------+
|United States |529735.235400     |
+--------------+------------------+

How to get an AVERAGE of Total Sales in a Country?

Query:

s=>(@c Country:*)-[@r1 HAS]->(@s State:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, AVG(r3.SalesAmount) AS "Average Sales Amount"

Description:

In a similar way as we did in the last query, now here I want to return the Average of Sales Amount. So we are using AVG aggr. function.

Output:

+--------------+--------------------+
|Country       |Average Sales Amount|
+--------------+--------------------+
|Canada        |135503.490250       |
+--------------+--------------------+
|Australia     |171207.135700       |
+--------------+--------------------+
|France        |181244.728500       |
+--------------+--------------------+
|United Kingdom|46538.301950        |
+--------------+--------------------+
|United States |66217.528175        |
+--------------+--------------------+

How to get a STANDARD DEVIATION of Total Sales in a Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, STD(r3.SalesAmount) AS "Standard Deviation of Sales Amount"

Description:

In a similar way as we did in the last query, now here I want to return the Standard Deviation of Sales Amount. So we are using STD aggr. function.

Output:

+--------------+----------------------------------+
|Country       |Standard Deviation of Sales Amount|
+--------------+----------------------------------+
|Canada        |13317.408630                      |
+--------------+----------------------------------+
|Australia     |14608.975125                      |
+--------------+----------------------------------+
|France        |0.000000                          |
+--------------+----------------------------------+
|United Kingdom|16378.800993                      |
+--------------+----------------------------------+
|United States |70485.851042                      |
+--------------+----------------------------------+

How to get a MAXIMUM Sales amount in a Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, MAX(r3.SalesAmount) AS "Maximum Sales Amount"

Description:

If we look at the dataset, we have a Country that has many States and States which have many cities. So clearly Country names are gonna repeat in the data.

But now we want to return the highest Sales Amount number respective of the Country and BangDB provides us this aggr. function, MAX, to do the same.

Output:

+--------------+--------------------+
|Country       |Maximum Sales Amount|
+--------------+--------------------+
|Canada        |144920.320200       |
+--------------+--------------------+
|Australia     |183388.518600       |
+--------------+--------------------+
|France        |181244.728500       |
+--------------+--------------------+
|United Kingdom|58119.863200        |
+--------------+--------------------+
|United States |160427.286200       |
+--------------+--------------------+

How to get a MINIMUM Sales amount in a Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, MIN(r3.SalesAmount) AS "Minimum Sales Amount"

Description:

In a similar way as we did in the last query, now here I want to return the Lowest Sales Amount respective of the Country. So we are using MIN aggr. function.

Output:

+--------------+--------------------+
|Country       |Minimum Sales Amount|
+--------------+--------------------+
|Canada        |126086.660300       |
+--------------+--------------------+
|Australia     |155010.377400       |
+--------------+--------------------+
|France        |181244.728500       |
+--------------+--------------------+
|United Kingdom|34956.740700        |
+--------------+--------------------+
|United States |1597.700000         |
+--------------+--------------------+

How to get a SKEW result of the Sales amount in a Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, SKEW(r3.SalesAmount) AS "Skew Sales Amount"

Description:

In a similar way as we did in the last query, now here I want to return the Skew value of Sales Amount. So we are using SKEW aggr. function.

Output:

+--------------+-----------------+
|Country       |Skew Sales Amount|
+--------------+-----------------+
|Canada        |0.000000         |
+--------------+-----------------+
|Australia     |-1.143416        |
+--------------+-----------------+
|France        |0.000000         |
+--------------+-----------------+
|United Kingdom|0.000000         |
+--------------+-----------------+
|United States |-0.200120        |
+--------------+-----------------+

How to get a Total Sales amount of a specific Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, SUM(r3.SalesAmount) AS "Total Sales Amount" WHERE Country = Australia

Description:

Country name we can get from Country Node itself.

Sales Amount as we already know is present in the "ASSOCIATED" property. So to make things work, we have to link the Country and City and Groups nodes in a chain query and return the result as per our requirement.

Also, we want to sum the sales amount so we will use SUM aggr. function here and using the WHERE condition we will be able to filter our Country name.

Output:

+---------+------------------+
|Country  |Total Sales Amount|
+---------+------------------+
|Australia|513621.407100     |
+---------+------------------+

How to get a Total Sales amount of a specific Country and a specific State?

Query:

s=>(@c Country:*)-[@r1 HAS]->(@s State:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, s.name AS State, SUM(r3.SalesAmount) AS "Total Sales Amount" WHERE Country = Australia AND State = "New South Wales"

Description:

As we did in the last query, here we will return Country and State names and in the WHERE condition we are filtering the results based on Country name and the State name.

Output:

+---------------+---------+------------------+
|State          |Country  |Total Sales Amount|
+---------------+---------+------------------+
|New South Wales|Australia|     330232.888500|
+---------------+---------+------------------+

Total Sales Amount for all the cities in a specific Country?

Query:

s=>(@c Country:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, ci.name AS City, SUM(r3.SalesAmount) AS "Total Sales Amount" WHERE Country = "United States" SORT_ASC City LIMIT 5

Description:

As we did in the last query, here we will return Country and City names and SUM aggr. function is being used here also.

In the WHERE condition we are filtering the results based on Country name and the City name.

Here in output we are getting 18 rows. We can limit the no. of rows using the LIMIT keyword. It is case-sensitive. Sorting on the alphabet is also possible.

Output:

+---------------+-------------+------------------+
|City           |Country      |Total Sales Amount|
+---------------+-------------+------------------+
|Bluffton       |United States|       2400.340000|
+---------------+-------------+------------------+
|Braintree      |United States|       1735.980000|
+---------------+-------------+------------------+
|College Station|United States|       1597.700000|
+---------------+-------------+------------------+
|Downey         |United States|     116147.121000|
+---------------+-------------+------------------+
|Langford       |United States|     128693.926900|
+---------------+-------------+------------------+

Total Sales Amount for all the States in a specific Country?

Query:

s=>(@c Country:*)-[@r1 HAS]->(@s State:*)-[@r2 HAS]->(@ci City:*)-[@r3 ASSOCIATED]->(@g Groups:*);RETURN c.name AS Country, s.name AS State, SUM(r3.SalesAmount) AS "Total Sales Amount" WHERE Country = "Australia"

Description:

Using the WHERE condition we are here filtering the result based on Country names and using the SUM function we are calculating the sum of the Sales amount for the following States in a specific country.

Output:

+---------------+---------+------------------+
|State          |Country  |Total Sales Amount|
+---------------+---------+------------------+
|New South Wales|Australia|     330232.888500|
+---------------+---------+------------------+
|South Australia|Australia|     183388.518600|
+---------------+---------+------------------+

Patient Schema & Data

Here we have defined a schema and ingested data as shown in below steps. We are defining the following attributes in the schema: patientID, event, event_Date, Weight.

Schema Registration

To register schema from BangDB CLI, run:

register schema "/home/sachin/Patient_Schema.json"

Download the schema file

{
  "schema": "Patient",
  "streams": [
    {
      "name": "Data",
      "inpt": [],
      "swsz": 86400,
      "type": 1,
      "attr": [
        {"name": "patientID","type": "STRING","kysz": 24},
        {"name": "event","type": "STRING","kysz": 24,"sidx": 0},
        {"name": "event_date","type": "STRING","kysz": 64},
        {"name": "weight","type": "LONG"}
      ],
      "catr": [
        {"name": "observationID","kysz": 24,"iatr": ["event","event_date"],"fnr": 1,"seq": 0,"type": "STRING","opnm": "ADD"}
      ],
      "rels": [
        {"sub": "patientID","sub_label": "Patient","obj": "event","obj_label": "Event","rel": "is_associated_with","sub_props": ["event_date"],"obj_props": ["event_date"],"sub_updt": 1},
        {"sub": "event","sub_label": "Event","obj": "observationID","obj_label": "Observation","rel": "is_associated_with","sub_props": ["event_date"],"obj_props": ["weight"]}
      ]
    }
  ]
}

Data

To push the data in BangDB Schema, run below commands from CLI:

insert into Patient.Data values null {"patientID": "P1", "weight": 50, "event": "001", "event_date": 20201024}
          
insert into Patient.Data values null {"patientID": "P1", "weight": 55, "event": "002", "event_date": 20210524}

insert into Patient.Data values null {"patientID": "P1", "weight": 60, "event": "003", "event_date": 20211124}

insert into Patient.Data values null {"patientID": "P1", "weight": 55, "event": "004", "event_date": 20220124}

Scenarios

DB can have various records holding Patient's body weight in Database. How to fetch the latest Body Weight record stored in the Database?

Possible Solution: PIPE query enables a person to pass a list of outputs received from the first query to the second query for a more prominent search.

PIPE Syntax: <PIPE USING `variable_name`>S1=>()-[]->();RETURN..... ++ S2=>()-[]->();RETURN......

Query:

<PIPE USING event_date>S1=>(@P Patient:"P1");RETURN P.event_date AS event_date ++ S2=>(@E Event:*)-[@r is_associated_with]->(@O Observation:*);RETURN E.event_date AS event_date, O.weight AS Weight

Output:

+------ +-------------+
|Weight |latest_event |
+-------+-------------+
|55     |20220124     |
+-------+-------------+

So, as we can see, based on the PIPE Query we are able to fetch the latest Weight for the given PatientID.