Python
from bs4 import BeautifulSoup
import html5lib
import requests

tw_link = open("TW_Links.txt","r")
im_link = open("IMG_Links.txt","w+")

def get_images(urli):
  rs = requests.Session()
  urls=rs.get(urli)
  soup = BeautifulSoup(urls.text , "html5lib")
  #print(soup.prettify())
  content = soup.find("div", {"class": "tt_article_useless_p_margin"})
  images = content.findAll('img')
  for img in images:
    img_url = img['src']+"?original"
    print(img_url,file=im_link)

def get_links():
  count=1
  for line in tw_link:
    print(line,count)
    count+=1
    get_images(line)
get_links()


What I have tried:

The code seems to work fine when using a single link, but when I pass the URLs to the function I get the following error:

AttributeError                            Traceback (most recent call last)
in ()
     23   count+=1
     24   get_images(line)
---> 25 get_links()

1 frames
in get_links()
     22     print(line,count)
     23     count+=1
---> 24     get_images(line)
     25 get_links()

in get_images(urli)
     12   print(soup.prettify())
     13   content = soup.find("div", {"class": "tt_article_useless_p_margin"})
---> 14   images = content.findAll('img')
     15   for img in images:
     16     img_url = img['src']+"?original"

AttributeError: 'NoneType' object has no attribute 'findAll'


My guess is that I'm triggering some sort of bot detection (because when passing a single link a different page is loaded, not the one being loaded currently). Is there any way to bypass that? I've tried using time.sleep(5), but that also didn't work.
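The traceback means soup.find returned None: the page fetched in the loop does not contain the expected div at all (a 404 or redirect page, not a bot block). A minimal way to see this, using a stand-in HTML snippet rather than the real site, is to guard against the None before calling findAll:

```python
from bs4 import BeautifulSoup

# Stand-in pages: one shaped like the site's article, one like an error page.
html_present = ('<div class="tt_article_useless_p_margin">'
                '<img src="/a.png"></div>')
html_missing = "<html><body><p>404 Not Found</p></body></html>"

def extract_image_urls(page_text):
    soup = BeautifulSoup(page_text, "html.parser")
    content = soup.find("div", {"class": "tt_article_useless_p_margin"})
    if content is None:
        # find() returns None when the tag is absent; calling
        # .findAll on it raises exactly the AttributeError above.
        return None
    return [img["src"] + "?original" for img in content.find_all("img")]

print(extract_image_urls(html_present))
print(extract_image_urls(html_missing))
```

With a check like this the loop can log and skip the failing URL instead of crashing, which makes the underlying cause (the wrong page being served) visible.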
Updated 19-Dec-20 4:15am (v2)
Comments
Richard MacCutchan 18-Dec-20 9:27am    
The error message is telling you that the variable named content does not contain a valid reference to an object. Which in turn probably means that soup.find in the line above, did not find the relevant HTML tag. You will need to do some debugging to find out why it fails.
SHIVAM SAH 19-Dec-20 2:15am    
So far I've tried using encoding='utf-8' while reading the file, yet it still seems to fail.
Richard MacCutchan 19-Dec-20 5:01am    
It is no good randomly changing things in the hope that the problem will go away. Do some proper debugging and find out why the failure occurs. Only then can you reliably modify the code to correct it.

1 solution

Python
from bs4 import BeautifulSoup
import html5lib
import requests
import time

tw_link = open("TW_Links.txt", "r", encoding="utf-8")
im_link = open("DCDN_Links.txt", "w+")
kak_link = open("KCDN_Links.txt", "w+")

def get_images(urlset):
  for x in urlset:
    rs = requests.Session()
    urls = rs.get(x)
    soup = BeautifulSoup(urls.text, "html5lib")
    content = soup.find("div", {"class": "tt_article_useless_p_margin"})
    images = content.findAll('img')
    for img in images:
      img_url = img['src'] + "?original"
      if "blog" in img_url:
        print(img_url, file=kak_link)
        print(img_url)
      print(img_url, file=im_link)
      print(img_url)
    time.sleep(2)  # small delay between requests

def get_links():
  linklist = []
  for line in tw_link:
    # The fix: drop the trailing newline, otherwise requests sends
    # "...page\n" as the URL and the site serves its 404 page.
    line = line.replace("\n", "")
    linklist.append(line)
  get_images(linklist)

get_links()



For those waiting for a solution, it was pretty simple. I was doubtful of the requests module, so I intercepted the program's traffic through a proxy, and voila: requests was including the EOL character in the request as well. While that might work with most sites, this particular site redirected to its 404 page, so simply removing the "\n" from each line read did the trick.
This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
