THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

Modeling Many-to-Many relationships in SQL Server 2017 Graph Database

Relational databases do not natively support many to many relationships although a junction table is an easy workaround. Graph databases natively support many-to-many relationships. One of the first sample databases Microsoft provided for SQL Server was Northwind. It has three many-to-many relationships as can be seen in the entity relationship diagram shown below. I previously published a list of references for understanding graphs here.

image

Figure 1. Northwind database ER diagram.

I’m going to focus on one junction table, the EmployeeTerritories table.

image

Figure 2. Many-to-many relationship implemented with the EmployeeTerritory junction table. The colors carry through to the code samples that follow.

I always tell people to start with obtaining an understanding the data. Let’s run some queries to find out more about the nature of the data.

SELECT * FROM dbo.Employees WHERE EmployeeID NOT IN (SELECT EmployeeID FROM dbo.EmployeeTerritories);
SELECT * FROM dbo.Territories WHERE TerritoryID NOT IN (SELECT TerritoryID FROM dbo.EmployeeTerritories);

Not all territories have employees. All employees have territories. As I pointed out in this post, directionality of an edge matters. It’s something you have to pay attention to. The following query shows how to display all of the rows in the many-to-many relationship.

SELECT E.FirstName, E.LastName, T.TerritoryDescription
FROM dbo.Employees AS E
INNER JOIN dbo.EmployeeTerritories AS ET
ON E.EmployeeID = ET.EmployeeID
INNER JOIN dbo.Territories AS T
ON T.TerritoryID = ET.TerritoryID;

FirstName LastName  TerritoryDescription
--------- --------- --------------------
Nancy     Davolio   Wilton             
Nancy     Davolio   Neward             
Andrew    Fuller    Westboro           
Andrew    Fuller    Bedford            
Andrew    Fuller    Georgetow          
Andrew    Fuller    Boston             
Andrew    Fuller    Cambridge          
Andrew    Fuller    Braintree          
Andrew    Fuller    Louisville         
Janet     Leverling Atlanta            
Janet     Leverling Savannah           
Janet     Leverling Orlando            
Janet     Leverling Tampa              
Margaret  Peacock   Rockville          
Margaret  Peacock   Greensboro         
Margaret  Peacock   Cary               
Steven    Buchanan  Providence         
Steven    Buchanan  Morristown         
Steven    Buchanan  Edison             
Steven    Buchanan  New York           
Steven    Buchanan  New York           
Steven    Buchanan  Mellvile           
Steven    Buchanan  Fairport           
Michael   Suyama    Phoenix            
Michael   Suyama    Scottsdale         
Michael   Suyama    Bellevue           
Michael   Suyama    Redmond            
Michael   Suyama    Seattle            
Robert    King      Hoffman Estates    
Robert    King      Chicago            
Robert    King      Denver             
Robert    King      Colorado Springs   
Robert    King      Santa Monica       
Robert    King      Menlo Park         
Robert    King      San Francisco      
Robert    King      Campbell           
Robert    King      Santa Clara        
Robert    King      Santa Cruz         
Laura     Callahan  Philadelphia       
Laura     Callahan  Beachwood          
Laura     Callahan  Findlay            
Laura     Callahan  Racine             
Anne      Dodsworth Hollis             
Anne      Dodsworth Portsmouth         
Anne      Dodsworth Southfield         
Anne      Dodsworth Troy               
Anne      Dodsworth Bloomfield Hills   
Anne      Dodsworth Roseville          
Anne      Dodsworth Minneapolis
      

Create and populate a node table of employee data.

CREATE TABLE dbo.EmployeesNode(
    EmployeeID int IDENTITY(1,1) NOT NULL,
    LastName nvarchar(20) NOT NULL,
    FirstName nvarchar(10) NOT NULL,
    Title nvarchar(30) NULL,
    TitleOfCourtesy nvarchar(25) NULL,
    BirthDate datetime NULL,
    HireDate datetime NULL,
    Address nvarchar(60) NULL,
    City nvarchar(15) NULL,
    Region nvarchar(15) NULL,
    PostalCode nvarchar(10) NULL,
    Country nvarchar(15) NULL,
    HomePhone nvarchar(24) NULL,
    Extension nvarchar(4) NULL,
    Photo VARBINARY(MAX) NULL,  -- changed from deprecated IMAGE data type
    Notes NVARCHAR(MAX) NULL,   -- changed from deprecated TEXT data type
    ReportsTo int NULL,
    PhotoPath nvarchar(255) NULL
) AS NODE;
GO

SET IDENTITY_INSERT dbo.EmployeesNode ON

INSERT INTO dbo.EmployeesNode
(EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, BirthDate, HireDate, Address, City, Region, PostalCode, Country, HomePhone, Extension, Photo, Notes, ReportsTo, PhotoPath)
SELECT EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, BirthDate, HireDate, Address, City, Region, PostalCode, Country, HomePhone, Extension, Photo, Notes, ReportsTo, PhotoPath
FROM dbo.Employees;

SET IDENTITY_INSERT dbo.EmployeesNode OFF

Create and populate a node table of territories data.

CREATE TABLE dbo.TerritoriesNode(
    TerritoryID nvarchar(20) NOT NULL,
    TerritoryDescription nchar(50) NOT NULL,
    RegionID int NOT NULL
) AS NODE;
GO

INSERT INTO dbo.TerritoriesNode
(TerritoryID, TerritoryDescription, RegionID)
SELECT TerritoryID, TerritoryDescription, RegionID
FROM dbo.Territories;

It’s necessary to create a query to retrieve node id pairs to populate the edge table. It’s a modification to the junction table query shown above. I suggest running just the SELECT portion of the following query so you can see what the edge table is being populated with.

INSERT INTO belongsTo
SELECT E.$node_id, T.$node_id
FROM dbo.EmployeesNode E
INNER JOIN dbo.EmployeeTerritories ET
ON E.EmployeeID = ET.EmployeeID
INNER JOIN dbo.TerritoriesNode T
ON T.TerritoryID = ET.TerritoryID;

Run a graph query to see the many-to-many relationship represented as 49 rows.

SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
FROM EmployeesNode, belongsTo, TerritoriesNode
WHERE MATCH (EmployeesNode-(belongsTo)->TerritoriesNode);

FirstName LastName TerritoryDescription
--------- -------- --------------------
Andrew    Fuller   Westboro           
Andrew    Fuller   Bedford            
Andrew    Fuller   Georgetow          
Andrew    Fuller   Boston             
Andrew    Fuller   Cambridge          
Andrew    Fuller   Braintree          
Steven    Buchanan Providence         
Anne      Dodswort Hollis             
Anne      Dodswort Portsmouth         
Nancy     Davolio  Wilton             
Steven    Buchanan Morristown         
Steven    Buchanan Edison             
Steven    Buchanan New York           
Steven    Buchanan New York           
Steven    Buchanan Mellvile           
Steven    Buchanan Fairport           
Laura     Callahan Philadelphia       
Nancy     Davolio  Neward             
Margaret  Peacock  Rockville          
Margaret  Peacock  Greensboro         
Margaret  Peacock  Cary               
Janet     Leverlin Atlanta            
Janet     Leverlin Savannah           
Janet     Leverlin Orlando            
Janet     Leverlin Tampa              
Andrew    Fuller   Louisville         
Laura     Callahan Beachwood          
Laura     Callahan Findlay            
Anne      Dodswort Southfield         
Anne      Dodswort Troy               
Anne      Dodswort Bloomfield Hills   
Laura     Callahan Racine             
Anne      Dodswort Roseville          
Anne      Dodswort Minneapolis        
Robert    King     Hoffman Estates    
Robert    King     Chicago            
Robert    King     Denver             
Robert    King     Colorado Springs   
Michael   Suyama   Phoenix            
Michael   Suyama   Scottsdale         
Robert    King     Santa Monica       
Robert    King     Menlo Park         
Robert    King     San Francisco      
Robert    King     Campbell           
Robert    King     Santa Clara        
Robert    King     Santa Cruz         
Michael   Suyama   Bellevue           
Michael   Suyama   Redmond            
Michael   Suyama   Seattle
            

The results are the same, although the order is not. There is no implicit order in a relational database. I didn’t use a SORT to make sure you understand that you should not expect the same order of results from different queries.

As you can see, the graph query has very simple code.

I’m pretty sure somebody is going to ask what if the order of the nodes in the MATCH is changed?

SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
FROM EmployeesNode, belongsTo, TerritoriesNode
WHERE MATCH (TerritoriesNode-(belongsTo)->EmployeesNode);

No rows are returned because of the directionality of the edge. We defined our graph has having employees belonging to territories and not territories belonging to employees. It’s the direction of the arrow that matters, not the order of the nodes in the MATCH. The following query returns all 49 rows.

SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
FROM EmployeesNode, belongsTo, TerritoriesNode
WHERE MATCH (TerritoriesNode<-(belongsTo)-EmployeesNode);

Bidirectional matching is not supported. Original graph query with another arrow added.

SELECT EmployeesNode.FirstName, EmployeesNode.LastName, TerritoriesNode.TerritoryDescription
FROM EmployeesNode, belongsTo, TerritoriesNode
WHERE MATCH (EmployeesNode<-(belongsTo)->TerritoriesNode);

Msg 102, Level 15, State 1, Line 60
Incorrect syntax near '>'.

Graph nodes and edges appear under Graph Tables apart from regular tables.

image

Figure 3. Node and Edge tables appear under Graph Tables.

My next post in this series shows how to migrate Northwind's adjacency list in the Employees table to SQL Server 2017 nodes and edges. 



Published Monday, June 19, 2017 7:06 AM by John Paul Cook

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About John Paul Cook

John Paul Cook is a database and Azure specialist in Houston. He previously worked as a Data Platform Solution Architect in Microsoft's Houston office. Prior to joining Microsoft, he was a SQL Server MVP. He is experienced in SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. John is also a Registered Nurse currently studying to be a psychiatric nurse practitioner. Contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2. Connect on LinkedIn

This Blog

Syndication

Privacy Statement