Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server / SQL-Server-2014

SQL Server: Detect Circular Relations In A Relational Database

4.83/5 (5 votes)
4 Mar 2015CPOL4 min read 20.9K   129  
Table Hierarchy or Generation Finder For Relational Database In SQL Server

Introduction

Managing proper relation between tables have always been a big issue in the relational database. It plays a big role in terms of modeling and normalization. SQL Server lets us do so many crazy things without any prior warning, even when we are doing improper table relations. Applications may be working just fine, but there could be presents of big mass at the table relation levels.

Today I am trying to make a SQL query which will help to find table Hierarchy or Generation for each table at the relational database.

Background

In words, A Good Relation Database means there is:

  • Presence of at least one table that has no dependencies
  • An absence of any circulation flow at table relations

Here is a sample good relational database design:

Image 1

And if you are thinking about the worst one, here it is:

Image 2

If the database table number is small in size, it’s easy to get all the information with the design diagram. But if you have so many tables and even with different schemas, it’s hard to line them all with a single step.

So let’s start rolling over the query I was mentioning.

Table Relations Detail

First, we have to find the relations between different tables of the database. Basically, it is like finding the dependencies using the foreign key relations. To find the relations, I am using this query and the detail would be like:

SQL
--------------- Table relations detail =>
CREATE TABLE #tblRelation (ObjectId VARCHAR(100) NOT NULL,
                            Name VARCHAR(100) NOT NULL,
                            [Schema] VARCHAR(100) NOT NULL,
                            [Column] VARCHAR(100) NOT NULL,
                            FkFromObjectId VARCHAR(100) NOT NULL,                            
                            FkFromTbl VARCHAR(100) NOT NULL,
                            FkFromSchema VARCHAR(100) NOT NULL,
                            FkFromClm VARCHAR(100) NOT NULL)
INSERT
    INTO #tblRelation
    SELECT
        parent_object_id,
        OBJECT_NAME(parent_object_id),
        OBJECT_SCHEMA_NAME(parent_object_id),
        c.NAME,
        referenced_object_id,
        OBJECT_NAME(referenced_object_id),
        OBJECT_SCHEMA_NAME(referenced_object_id),
        cref.NAME
    FROM
        sys.foreign_key_columns fk
    INNER JOIN
        sys.columns c
           ON fk.parent_column_id = c.column_id
              AND fk.parent_object_id = c.object_id
    INNER JOIN
        sys.columns cref
           ON fk.referenced_column_id = cref.column_id
              AND fk.referenced_object_id = cref.object_id
SELECT *
    FROM #tblRelation
    ORDER BY Name, [Schema], FkFromTbl, FkFromSchema

Now if any relation exists, we will see something like this:

Image 3

If there is no relation between tables, there will be no entity at this table.

Find First Generation Tables and Table Schema Detail

The second step would be to find the first generation tables using the relations. By the word first generation, I meant a table:

  • which doesn’t have any dependencies or any foreign key
  • or its dependents on its own or applied foreign keys are from his own columns
SQL
---------- Find first generation tables and schema detail for all table =>
CREATE TABLE #tblDetail (Name VARCHAR(100) NOT NULL,
                         [Schema] VARCHAR(100) NOT NULL,
                         Generation INT NULL)
INSERT
    INTO #tblDetail(Name, [Schema], Generation)
    (SELECT
        TABLE_NAME,
        TABLE_SCHEMA,
        (CASE
            WHEN(
            /*if tbl has no fk, first generation tbl, 0*/
            (SELECT COUNT(*)
                FROM #tblRelation
                WHERE Name = TABLE_NAME AND [Schema] = TABLE_SCHEMA) = 0
            )
            THEN(SELECT 0)
            
            WHEN(
            /*if tbl has fk, but all of them from his own columns, first generation tbl, 0*/
            (SELECT COUNT(*)
                FROM #tblRelation
                WHERE Name = TABLE_NAME AND [Schema] = TABLE_SCHEMA) =
            (SELECT COUNT(*)
                FROM #tblRelation
                WHERE Name = TABLE_NAME
                    AND FkFromTbl = TABLE_NAME  AND [Schema] = TABLE_SCHEMA)        
            )
            THEN(SELECT 0)
            
            /*tbl has fk, from other tbl columns, NULL*/
            ELSE(SELECT NULL)
          END)    
        FROM INFORMATION_SCHEMA.TABLES
            WHERE TABLE_TYPE = 'BASE TABLE'
            AND TABLE_NAME != 'sysdiagrams')
            
SELECT *
    FROM #tblDetail
    WHERE Generation IS NOT NULL
    ORDER BY Generation, Name, [Schema]

As you can see, I am not only comparing the table name but also the schema name. This one is really important because the database could have multiple tables with the same name for different schemes. Now, if we find any first generation table at the database, we are assigning its generation to “0”.

On,select we will get something like this:

Image 4

How To Know If There Is No Independent Table at Database?

It is important that we find at least one first-generation table to continue our work. But if the database doesn’t have any first generation tables, we put a small check for that:

SQL
/*no first generation at relations*/
IF NOT EXISTS(SELECT *
                FROM #tblDetail
                WHERE Generation IS NOT NULL)
BEGIN
    PRINT 'No first generation table found at table relations.';
END

If we don’t find any first generation table, we will find a message. But don’t worry, the process will not throw any error for that. It will try its best to go until the finish line.

Image 5

And no entity would be selected on the previous select statement.

Finding Generations for Other Tables

Finding generations of other tables is simple.

  • Find tables whose generation has not found yet “where Generation is NULL at @tblDetail
  • And all of its dependencies could be 100% resolved, using tables whose generation is already found
  • Apply its generation to currentMaxGeneration + 1;
  • Continue unless generation of all not found.
SQL
/*set new generation using olds*/
WHILE(EXISTS(SELECT *
                FROM #tblDetail
                WHERE Generation IS NULL))
BEGIN
    /*find new generation tables*/
    DECLARE @tblNewGeneration TABLE(Name VARCHAR(100) NOT NULL,
                                    [Schema] VARCHAR(100) NOT NULL)                                
    /*Select tables where all of its foreign key table’s generation were found*/                            
    INSERT INTO @tblNewGeneration
    SELECT Name, [Schema]
        FROM #tblDetail AS dtl
        WHERE Generation IS NULL
        AND NOT EXISTS (
                SELECT DISTINCT FkFromTbl,FkFromSchema
                    FROM #tblRelation AS rel
                    WHERE rel.Name = dtl.Name
                    AND rel.[Schema] = dtl.[Schema]
                EXCEPT
                SELECT Name, [Schema]
                    FROM #tblDetail
                    WHERE Generation IS NOT NULL
        )

    /*no new generation table found from old genertaion tables*/
    IF(NOT EXISTS(SELECT * FROM @tblNewGeneration))
    BEGIN
        PRINT 'Circular flow found at table relations.';
        BREAK;
    END

    /*set generation*/
    DECLARE @crntGeneration INT
    SET @crntGeneration = (SELECT MAX(Generation)  
                            FROM #tblDetail
                            WHERE Generation IS NOT NULL);                            
    UPDATE #tblDetail
        SET Generation = @crntGeneration + 1
        WHERE [Schema]+'.'+Name
        IN (
            SELECT [Schema]+'.'+Name
                FROM @tblNewGeneration
        )

    /*clean the new generation tbls from temp*/    
    DELETE FROM @tblNewGeneration;
END

But there is a chance, existence of a circular flow on the relations. So we use a check inside the loop to break the loop if any circular flow is found in relations.

How Do We Understand If Any Circular Flow Exists on the Relations?

  • There are still tables whose generation need to be found.
  • But no one’s dependencies could be resolved, using tables whose generation was already found.

And if we face a situation like this, a message would be displayed.

Image 6

Result: Success

If we haven’t faced any of those situations, it means all the generation for the tables has been populated. And we will see no "NULL" entry at generation column.

SQL
---- Result =>
SELECT *
    FROM #tblDetail
    ORDER BY Generation, Name, [Schema]

Image 7

Result: Failure

As I mentioned earlier, this query will try its best to reach the finish line without any error. But there are confidences. Sometimes, at the end of the process, we may find some “NULL” entries at the generation column which means the database relation faces some problems like:

  • First generation tables not found
  • Circular flow found in relations

Image 8

Where to Apply This Query?

I basically wrote this one for self-satisfaction and applied over a few of the relational databases I had worked with. Few databases replied well for feedback.

We could apply it for:

  • Quickly review the table hierarchy to understand the system
  • Intending to change the relation between tables or changed, now to find out the effects
  • After adding or eliminating tables to find out the effects

Limitations

  • There could be some errors which I haven’t faced yet. So, if you find any, please let me know.
  • There could be some sort of tool which does this thing for you. If you know any of those, please mention them in the comments

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)