I am working on Python code that counts the number of unique commenters in a chat, using the unique id stored in the "_id" field nested inside the "commenter" field. The JSON looks like this.

JSON:
{ 
"_id":"123adfvssw",
"content_type":"video",
"content_id":"12345",
"commenter":{ 
"display_name":"student1",
"name":"student1",
"type":"user"
},
"source":"chat",
"state":"published",
"message":{ 
"body":"Hi",
"fragments":[ 
{ 
"text":"Hi"
}
],
"is_action":false
},
"more_replies":false
}

{ 
"_id":"123adfvssw",
"content_type":"video",
"content_id":"12345",
"commenter":{ 
"display_name":"student2",
"name":"student2",
"type":"user"
},
"source":"chat",
"state":"published",
"message":{ 
"body":"Hey!",
"fragments":[ 
{ 
"text":"Hey"
}
],
"is_action":false
},
"more_replies":false
}

{ 
"_id":"123adfvssw",
"content_type":"video",
"content_id":"12345",
"commenter":{ 
"display_name":"student1",
"name":"student1",
"type":"user"
},
"source":"chat",
"state":"published",
"message":{ 
"body":"How are you?",
"fragments":[ 
{ 
"text":"How are you?"
}
],
"is_action":false
},
"more_replies":false
}


In all, the thread received 3 comments. However, student1 commented more than once, so there are only two unique commenters in this thread. My question is: how do I make sure that I count only the unique commenters, using the _id field in the JSON? The initial code I wrote counts every commenter field in the text and prints 3, but the real answer is 2 since student1 commented twice. I am now trying to put each commenter's _id in an array/list so that I can count the ids that are unique. However, I am having trouble storing multiple values through a loop. Please help if you can.

What I have tried:

Code that prints the number of commenter fields:

import json
import requests
from collections import Counter

files = "/chatinfo.txt"

with open(files) as f:
    commenters = 0
    for line in f:
        jsondata = json.loads(line)
        if "commenter" in jsondata:
            commenters += 1

print(commenters)

Output
3


An attempt at storing each commenter's _id value in an array/list, so that I can compare them and count only the unique ids:
import json
files = "/chatinfo.txt"
with open(files) as f:
	num_with_field = 0
	for line in f:
		jsondata = json.loads(line)
		dictjson = json.dumps(jsondata)
		if "commenter" in jsondata:
			commenterid = []
			commenterid.append(jsondata["commenter"]["_id"])
			print(commenterid)

Output:
			
['193984934']
['157255102']
['100365638']

____________

However, when I then check what's in the array/list after the loop, I get ['100365638'] instead of all three values.

print(commenterid)

Output

['100365638']


Out of the three, it looks like only 1 value was stored in the array/list commenterid.

Problem 1:
Can anyone help me fill my array/list with the three values I need using the loop? The array/list should contain ['193984934', '157255102', '100365638'].
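
A note on Problem 1, offered as a sketch rather than a tested fix for the real file: in the attempt above, `commenterid = []` sits inside the loop, so a fresh empty list is created for every line and only the last id survives. Creating the list once, before the loop, keeps every value. The sample lines and their `_id` values below are made up for illustration (the JSON shown earlier does not include the nested `_id`), and `collect_commenter_ids` is a hypothetical helper name.

```python
import json

# Hypothetical sample lines standing in for /chatinfo.txt (one JSON object per line).
# The "_id" values are invented for this example.
lines = [
    '{"commenter": {"_id": "id-1", "name": "student1"}}',
    '{"commenter": {"_id": "id-2", "name": "student2"}}',
    '{"commenter": {"_id": "id-1", "name": "student1"}}',
]


def collect_commenter_ids(lines):
    """Return every commenter _id in order, duplicates included."""
    commenter_ids = []  # created once, before the loop, so it is never reset
    for line in lines:
        record = json.loads(line)
        commenter = record.get("commenter")
        if commenter and "_id" in commenter:
            commenter_ids.append(commenter["_id"])
    return commenter_ids


all_ids = collect_commenter_ids(lines)
print(all_ids)  # ['id-1', 'id-2', 'id-1']
```

With a real file, the same function should accept the open file object in place of `lines`, since iterating a file also yields one line at a time.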

Problem 2:

In addition, how can I count the unique ids in that array? So far I've only seen how to count the frequency of the ids.

Counter(commenterid).values() # counts the elements' frequency.


Do you think
len(set(commenterid))
would work? Also, if you have a better way of doing this than storing the values I need in an array or list, I would love to see it. Thanks in advance.
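
On the `len(set(...))` idea: a set keeps only distinct values, so its length is the number of unique ids. Here is a minimal sketch that adds each id straight into a set, which avoids the intermediate list altogether. It assumes one JSON object per line, as in the code above. The sample lines, the `_id` values, and the name `count_unique_commenters` are hypothetical.

```python
import json

# Hypothetical sample lines; the "_id" values are invented for this example.
lines = [
    '{"commenter": {"_id": "id-1", "name": "student1"}}',
    '{"commenter": {"_id": "id-2", "name": "student2"}}',
    '{"commenter": {"_id": "id-1", "name": "student1"}}',
]


def count_unique_commenters(lines):
    """Count distinct commenter _id values across JSON lines."""
    unique_ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        commenter = json.loads(line).get("commenter") or {}
        if "_id" in commenter:
            unique_ids.add(commenter["_id"])  # adding a duplicate has no effect
    return len(unique_ids)


print(count_unique_commenters(lines))  # 2
```

If the ids are already in a list, `len(set(commenterid))` gives the same number.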
Updated 25-Sep-18 7:15am
Comments
Richard Deeming 25-Sep-18 13:11pm    
If you want to delete your question, then use the delete button to delete your question.

DO NOT vandalise your question to remove the content. Especially not after someone has kindly taken the time to answer your question!

I have rolled back your vandalism.
amccants 25-Sep-18 14:40pm    
Okay. I'm new to CodeProject and I didn't think it was a big deal, since the question was never truly answered. By not answered, I mean not answered in a way that produces a working solution. No one answered it, which is why I removed the content: I wanted to think of a more concise way of asking the question, and I didn't want the reworded question to be flagged as a duplicate. I don't think I vandalized anything. This is my code.

You need to use a dictionary and check whether the new id is already in it before adding it, or you could just increment a counter for each id. See section 5, Data Structures, in the Python 3.7.0 documentation.
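
The dictionary approach could look roughly like this. It is a sketch under assumptions, not the exact code the answer had in mind: the sample lines and `_id` values are invented, and `first_seen_names` is a hypothetical helper that stores each new id as a key (with the display name as its value) only if the id is not already present.

```python
import json

# Hypothetical sample lines; the "_id" values are invented for this example.
lines = [
    '{"commenter": {"_id": "id-1", "display_name": "student1"}}',
    '{"commenter": {"_id": "id-2", "display_name": "student2"}}',
    '{"commenter": {"_id": "id-1", "display_name": "student1"}}',
]


def first_seen_names(lines):
    """Map each commenter _id to the display name seen on its first comment."""
    seen = {}
    for line in lines:
        commenter = json.loads(line).get("commenter", {})
        commenter_id = commenter.get("_id")
        # Only add the id if it has not been added before.
        if commenter_id is not None and commenter_id not in seen:
            seen[commenter_id] = commenter.get("display_name")
    return seen


seen = first_seen_names(lines)
print(len(seen))  # 2 unique commenters
```

The number of keys is the unique-commenter count, and the values keep something readable to print next to each id.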
 
 
Comments
amccants 25-Sep-18 15:01pm    
Thank you. Incrementing a counter for id will not produce the unique commenters. I believe your first suggestion will work. I will try it out later on today and let you know.
Mehdi Gholam 26-Sep-18 9:31am    
The keys in your dictionary will give you the unique commenters; the count is a bonus that tells you how many times each person commented.
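
To illustrate the point about keys and counts: `collections.Counter` is a dictionary subclass, so the number of keys is the number of unique commenters and each value is that commenter's comment count. The ids below are made up for the example.

```python
from collections import Counter

# Hypothetical ids, one per comment; "id-1" commented twice.
commenter_ids = ["id-1", "id-2", "id-1"]

counts = Counter(commenter_ids)

unique_total = len(counts)      # number of distinct keys
print(unique_total)             # 2
print(counts["id-1"])           # 2 comments from id-1
print(sorted(counts))           # ['id-1', 'id-2']
```

So `Counter(commenterid).values()` gives the per-person frequencies, while `len(Counter(commenterid))` gives the unique count.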
I gave you a suggestion at https://www.codeproject.com/Questions/1260731/How-do-I-count-the-unique-commeters-in-json-text-f but for some reason you deleted it.
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


