Good afternoon,
I am trying to scrape the total number of unread messages on my Instagram profile, using Selenium through Python to access it. I have managed to reach my inbox, where I have 5 unread messages, each marked with the usual blue dot next to it.
The issue I am facing is that BeautifulSoup does not find the relevant div by its classes, so I cannot count the unread messages.
counter = 0
# Count messages
soup = BeautifulSoup(browser.page_source, features='html.parser')
new_message = soup.find_all(lambda tag: tag.name == "div" and tag.get("class") == "Igw0E rBNOH YBx95 ybXk5 _4EzTm soMvl")
for i in new_message:
    counter += 1
print('Unread messages: ', counter)
The element, as shown in the browser console, is as follows. I suspect that Instagram renders its content with JavaScript and that this is why I cannot count the divs. Any ideas?
<div class=" Igw0E rBNOH YBx95 ybXk5 _4EzTm soMvl "><div class=" _41V_T Sapc9 Igw0E IwRSH eGOV_ _4EzTm " style="height: 8px; width: 8px;"></div></div>
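One thing that may be worth checking independently of the JavaScript question: BeautifulSoup treats `class` as a multi-valued attribute, so `tag.get("class")` returns a list of class names rather than a single string. The sketch below runs against the markup quoted above only, not against the live page, and assumes `bs4` is installed. The function names are made up for illustration. It compares the string-equality test with a CSS selector that requires all of the listed classes.

```python
from bs4 import BeautifulSoup

# The markup quoted in the post, used here as a static test string.
SNIPPET = (
    '<div class=" Igw0E rBNOH YBx95 ybXk5 _4EzTm soMvl ">'
    '<div class=" _41V_T Sapc9 Igw0E IwRSH eGOV_ _4EzTm " '
    'style="height: 8px; width: 8px;"></div></div>'
)

# Hypothetical helper: mirrors the lambda from the post.
def count_by_string_compare(html):
    soup = BeautifulSoup(html, features="html.parser")
    # tag.get("class") is a list such as ['Igw0E', 'rBNOH', ...],
    # so comparing it with a single string is always False.
    matches = soup.find_all(
        lambda tag: tag.name == "div"
        and tag.get("class") == "Igw0E rBNOH YBx95 ybXk5 _4EzTm soMvl"
    )
    return len(matches)

# Hypothetical helper: a CSS selector that requires every listed class.
def count_by_selector(html):
    soup = BeautifulSoup(html, features="html.parser")
    return len(soup.select("div.Igw0E.rBNOH.YBx95.ybXk5._4EzTm.soMvl"))

if __name__ == "__main__":
    print("String comparison:", count_by_string_compare(SNIPPET))
    print("CSS selector:", count_by_selector(SNIPPET))
```

This only shows how the matching behaves on static markup. If `browser.page_source` is read before the inbox has finished rendering, the divs may be missing from the source regardless of how they are matched.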
What I have tried:
I have tried numerous variations of new_message, such as:
new_message = soup.find_all("div", {"class" : "Igw0E rBNOH YBx95 ybXk5 _4EzTm soMvl"})
new_message = soup.find_all("div", {"class" : "_41V_T Sapc9 Igw0E IwRSH eGOV_ _4EzTm"})
I also tried matching it by its style, but to no avail:
new_message = soup.find_all("div",{"style" : "height: 8px; width: 8px;"})
I also tried checking whether it locates anything, and it appears to, but I am unsure why the counter is not working:
browser.implicitly_wait(10)
new_message = soup.find_all(lambda tag: tag.name == "div" and tag.get("class") == "_4EzTm")
for message in new_message:
    counter =+ 1
print('Unread messages: ', counter)
if new_message is not None:
    print("Found")
else:
    print("Failed")
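Two details in this last check can be reproduced without Selenium or BeautifulSoup, using plain lists as stand-ins for the result of `find_all`. First, `counter =+ 1` is parsed as `counter = (+1)`, so it assigns 1 on every pass instead of incrementing. Second, `find_all` returns a list-like result even when nothing matches, and an empty list is still `is not None`, so "Found" is printed either way. The helper names below are hypothetical and exist only for this sketch.

```python
# Hypothetical helper: the loop as written in the post.
def count_with_typo(items):
    counter = 0
    for _ in items:
        counter =+ 1  # same as counter = +1: assigns 1, never accumulates
    return counter

# Hypothetical helper: the same loop with the augmented assignment.
def count_with_increment(items):
    counter = 0
    for _ in items:
        counter += 1
    return counter

# Hypothetical helper: the None check from the post.
def check_is_not_none(items):
    return "Found" if items is not None else "Failed"

# Hypothetical helper: a truthiness check, which is False for an empty result.
def check_non_empty(items):
    return "Found" if items else "Failed"

if __name__ == "__main__":
    three_matches = ["a", "b", "c"]
    no_matches = []
    print(count_with_typo(three_matches), count_with_increment(three_matches))
    print(check_is_not_none(no_matches), check_non_empty(no_matches))
```

In practice `len(new_message)` gives the count directly, and `if new_message:` tells an empty result apart from a non-empty one.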