I've been struggling for a while trying to figure it out to put some hierarchical values in a flat table into a specific dictionary format. The main issue is that I couldn't figure it out how to nest each category inside their corresponding key.
I have this table (as a pandas DataFrame) with the column stating the hierarchy as a number: The table has three columns:
Level Name Description
0 Main ...
1 Sub main ...
2 Sub sub main ...
1 Sub main ...
2 Sub sub main ...
3 Sub sub sub main ...
0 Main_2 ...
. . .
And the expected output should be something like this:
{
"nodes": [
{
"name": "main",
"description": "",
"owners":{
"users":["Sandra"]
},
"terms":[{
"name":"",
"description":""
}]
},
{
"nodes": [
{
"name": "sub_main",
"description": "",
"owners":{
"users":[""]
},
"terms":[{
"name":"",
"description":"",
"inherits":[""]
}]
},
{
"nodes": [
{
"name": "sub_sub_main",
"description": "",
"owners":{
"users":[""]
},
"terms":[{
"name":"",
"description":"",
"inherits":[""]
}]
},
]
}
]
}
]
}
I have a large table with multiple hierarchical levels. Sometimes it's just 2 or 3 levels and in others, more. But, all of them are in order.
The other thing is that in
inherits
section, there must appear the parents above them.
What I have tried:
So far, I have used a function that iterates through the dataframe and selects the level that should be related to each row:
reference link:
python - Pandas convert Dataframe to Nested Json - Stack Overflow[
^]
df = pd.read_excel(_path)
def fdrec(df):
drec = dict()
ncols = df.values.shape[1]
for line in df.values:
d = drec
for j, col in enumerate(line[:-1]):
if not col in d.keys():
if j != ncols-2:
d[col] = {}
d = d[col]
else:
d[col] = line[-1]
else:
if j!= ncols-2:
d = d[col]
return drec
After that, I create a for loop that iterates the list created and stores it in separate dict structures that should be appended once it detects the hierarchy. However, here is where I'm stuck. I don't know how to proceed with this problem from here:
list_dict = []
for k,v in _dict_.items():
domain_template = OrderedDict({
"nodes": [
{
"name": "",
"description": "",
"owners":{
"users":["Sandra"]
},
"terms":[{
"name":"",
"description":""
}]
}
]
})
subdomain_template = OrderedDict({
"nodes": [
{
"name": "",
"description": "",
"owners":{
"users":[""]
},
"terms":[{
"name":"",
"description":"",
"inherits":[""]
}]
}
]
})
for name,desc in v.items():
print(k, name)
if k < 1:
domain_template["nodes"][0]["name"] = name
domain_template["nodes"][0]["description"] = desc
domain_template["nodes"][0]["terms"][0]["name"] = name
domain_template["nodes"][0]["terms"][0]["description"] = desc
list_dict.append(domain_template)
elif k >= 1:
subdomain_template["nodes"][0]["name"] = name
subdomain_template["nodes"][0]["description"] = desc
subdomain_template["nodes"][0]["terms"][0]["name"] = name
subdomain_template["nodes"][0]["terms"][0]["description"] = desc
subdomain_template["nodes"][0]["terms"][0]["inherits"] = domain_template["nodes"][0]["name"] + "." + name
list_dict.append(subdomain_template)
I'm not very confident about what I have developed so far but, I would like to know if anyone knows a better approach to this task.
Thank you all in advance!