Click here to Skip to main content
15,867,835 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I've been struggling for a while trying to figure it out to put some hierarchical values in a flat table into a specific dictionary format. The main issue is that I couldn't figure it out how to nest each category inside their corresponding key.

I have this table (as a pandas DataFrame) with the column stating the hierarchy as a number: The table has three columns:

Level    Name           Description
  0      Main               ...
  1      Sub main           ...
  2      Sub sub main       ...
  1      Sub main           ...
  2      Sub sub main       ...
  3      Sub sub sub main   ...
  0      Main_2             ...
       .  .  .

And the expected output should be something like this:
{
    "nodes": [
        {
            "name": "main",
            "description": "",
            "owners":{
                "users":["Sandra"]
            },
            "terms":[{
                "name":"",
                "description":""
            }]
            
        },
        
        {
            "nodes": [
                {
                    "name": "sub_main",
                    "description": "",
                    "owners":{
                        "users":[""]
                    },
                    "terms":[{
                        "name":"",
                        "description":"",
                        "inherits":[""]
                    }]

                },
                {
                    "nodes": [
                        {
                            "name": "sub_sub_main",
                            "description": "",
                            "owners":{
                                "users":[""]
                            },
                            "terms":[{
                                "name":"",
                                "description":"",
                                "inherits":[""]
                            }]

                        },
                    ]
                }
            ]
        }
    ]
}


I have a large table with multiple hierarchical levels. Sometimes it's just 2 or 3 levels and in others, more. But, all of them are in order.

The other thing is that in inherits section, there must appear the parents above them.

What I have tried:

So far, I have used a function that iterates through the dataframe and selects the level that should be related to each row:

reference link: python - Pandas convert Dataframe to Nested Json - Stack Overflow[^]
df = pd.read_excel(_path)

def fdrec(df):
    drec = dict()
    ncols = df.values.shape[1]
    for line in df.values:
        d = drec
        for j, col in enumerate(line[:-1]):
            if not col in d.keys():
                if j != ncols-2:
                    d[col] = {}
                    d = d[col]
                else:
                    d[col] = line[-1]
            else:
                if j!= ncols-2:
                    d = d[col]
    return drec


After that, I create a for loop that iterates the list created and stores it in separate dict structures that should be appended once it detects the hierarchy. However, here is where I'm stuck. I don't know how to proceed with this problem from here:
list_dict = []
for k,v in _dict_.items():
    domain_template = OrderedDict({
        "nodes": [
            {
                "name": "",
                "description": "",
                "owners":{
                    "users":["Sandra"]
                },
                "terms":[{
                    "name":"",
                    "description":""
                }]

            }
        ]
    })

    subdomain_template = OrderedDict({
        "nodes": [
            {
                "name": "",
                "description": "",
                "owners":{
                    "users":[""]
                },
                "terms":[{
                    "name":"",
                    "description":"",
                    "inherits":[""]
                }]

            }
        ]
    })

    for name,desc in v.items():
        print(k, name)
        
        if k < 1:
            domain_template["nodes"][0]["name"] = name
            domain_template["nodes"][0]["description"] = desc
            domain_template["nodes"][0]["terms"][0]["name"] = name
            domain_template["nodes"][0]["terms"][0]["description"] = desc
            
            list_dict.append(domain_template)
            
        elif k >= 1:
            subdomain_template["nodes"][0]["name"] = name
            subdomain_template["nodes"][0]["description"] = desc
            subdomain_template["nodes"][0]["terms"][0]["name"] = name
            subdomain_template["nodes"][0]["terms"][0]["description"] = desc
            subdomain_template["nodes"][0]["terms"][0]["inherits"] = domain_template["nodes"][0]["name"] + "." + name
            
            list_dict.append(subdomain_template)


I'm not very confident about what I have developed so far but, I would like to know if anyone knows a better approach to this task.

Thank you all in advance!
Posted
Updated 28-Dec-21 23:28pm

1 solution

 
Share this answer
 
Comments
Member 15482995 29-Dec-21 5:35am    
Hi Richard, thanks for your answer. I have tried it but, the structure of the result does not match the one I would like. For the final result, I would like to populate the structure I stated in the question so they follow the right "nesting". Otherwise, I would just have a transformed dataframe into a dictionary.
Richard MacCutchan 29-Dec-21 5:52am    
Then I guess you will have to stick with your own solution.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900