I am trying to scrape the ESPN website with Python to extract historical NFL game results (scores only) into a CSV file. I can't find a way to add the game dates shown in the desired output. Could someone help me get from the current output to the desired output? The website I am scraping, the current output, and the desired output are below:

NFL Website:
[DELETED]

Current Output:
Week #, Away Team, Away Score, Home Team, Home Score
Week 17, Cowboys, 27, Titans, 13
Week 17, Cardinals, 19, Falcons, 20
Week 17, Bears, 10, Lions, 41

Desired Game Results Output:
Week #, Date, Away Team, Away Score, Home Team, Home Score
Week 17, 12/29/2022, Cowboys, 27, Titans, 13
Week 17, 1/1/2023, Cardinals, 19, Falcons, 20
Week 17, 1/1/2023, Bears, 10, Lions, 41
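
The Date column above uses an unpadded M/D/YYYY format. A minimal sketch of producing that format, assuming the page exposes each game day as long-form text such as "Thursday, December 29, 2022" (this input format is an assumption, not something confirmed from the site), and using a hypothetical helper named `to_short_date`:

```python
from datetime import datetime


def to_short_date(long_date: str) -> str:
    """Convert e.g. 'Thursday, December 29, 2022' to '12/29/2022'.

    The long-form input format is an assumption about how the date
    text might appear; adjust the strptime pattern if it differs.
    """
    parsed = datetime.strptime(long_date, "%A, %B %d, %Y")
    # Build the string by hand so month and day are not zero-padded,
    # which avoids platform-specific strftime flags.
    return f"{parsed.month}/{parsed.day}/{parsed.year}"


print(to_short_date("Thursday, December 29, 2022"))  # 12/29/2022
print(to_short_date("Sunday, January 1, 2023"))      # 1/1/2023
```

If the date text turns out to use a different layout, only the pattern passed to `strptime` should need to change.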

What I have tried:

Python
import requests
import pandas as pd
from bs4 import BeautifulSoup

# Build one scoreboard URL per regular-season week (weeks 1 to 18).
url_list = []
for week_number in range(1, 19):
    url = "https://www.espn.com/nfl/scoreboard/_/week/"+str(week_number)+"/year/2022/seasontype/2"
    url_list.append(url)

away_team = []
home_team = []
away_team_score = []
home_team_score = []
week = []

for j, url in enumerate(url_list, start=1):
    # Fetch each page once (the original fetched it twice, with urlopen and requests).
    response = requests.get(url)
    bs = BeautifulSoup(response.text, 'lxml')
    print(response.url)

    # Collect all team names and scores once per page instead of re-searching on every iteration.
    names = bs.find_all('div', {'class': 'ScoreCell__TeamName ScoreCell__TeamName--shortDisplayName truncate db'})
    scores = bs.find_all('div', {'class': 'ScoreCell__Score h4 clr-gray-01 fw-heavy tar ScoreCell_Score--scoreboard pl2'})

    # zip stops at the shorter list, matching the original loop,
    # which stopped at the first missing name or score.
    for i, (name, score) in enumerate(zip(names, scores)):
        # The code treats even positions as the away team and odd positions as the home team.
        if i % 2 == 0:
            away_team.append(name.get_text())
            away_team_score.append(score.get_text())
        else:
            home_team.append(name.get_text())
            home_team_score.append(score.get_text())
            week.append("Week " + str(j))

# Column order matches the "Current Output" shown above.
web_scraping = list(zip(week, away_team, away_team_score, home_team, home_team_score))
web_scraping_df = pd.DataFrame(web_scraping, columns=['week', 'away_team', 'away_team_score', 'home_team', 'home_team_score'])
web_scraping_df
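
The code above builds a DataFrame but never writes the CSV file mentioned in the question. A minimal sketch of that last step, using only the standard-library `csv` module and three sample rows copied from the current output (the helper name `rows_to_csv` and the file name in the comment are hypothetical):

```python
import csv
import io

header = ["Week #", "Away Team", "Away Score", "Home Team", "Home Score"]
rows = [
    ("Week 17", "Cowboys", "27", "Titans", "13"),
    ("Week 17", "Cardinals", "19", "Falcons", "20"),
    ("Week 17", "Bears", "10", "Lions", "41"),
]


def rows_to_csv(header, rows):
    """Serialize a header and rows to CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(header, rows)
print(csv_text)

# To write a real file instead (hypothetical file name):
# with open("nfl_results.csv", "w", newline="") as f:
#     f.write(csv_text)
```

With the pandas DataFrame from the question, `web_scraping_df.to_csv("nfl_results.csv", index=False)` should give an equivalent file without the index column.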
Updated 27-Jan-23 18:21pm
Comments
Richard MacCutchan 28-Jan-23 4:49am    
Where is the date field on the input data?
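
One possible way to approach that question, as a sketch only: if the page groups games under a per-day heading, the date can be carried forward to every game that follows it in document order. The markup below is entirely hypothetical (the class names `day-title`, `game`, `team`, and `score` are invented for illustration and are not ESPN's), and the standard-library `HTMLParser` is used so the example runs without extra packages. The same carry-forward idea should apply with BeautifulSoup once the real date element has been located in the page source.

```python
from html.parser import HTMLParser

# Hypothetical markup: each day has a heading followed by its games.
SAMPLE_HTML = """
<section>
  <h3 class="day-title">Thursday, December 29, 2022</h3>
  <div class="game"><span class="team">Cowboys</span><span class="score">27</span>
    <span class="team">Titans</span><span class="score">13</span></div>
</section>
<section>
  <h3 class="day-title">Sunday, January 1, 2023</h3>
  <div class="game"><span class="team">Cardinals</span><span class="score">19</span>
    <span class="team">Falcons</span><span class="score">20</span></div>
  <div class="game"><span class="team">Bears</span><span class="score">10</span>
    <span class="team">Lions</span><span class="score">41</span></div>
</section>
"""


class DatedGameParser(HTMLParser):
    """Collects (date, away, away_score, home, home_score) tuples."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.current_date = None  # most recent day heading seen so far
        self._field = None        # kind of text currently being read
        self._game = None         # cells of the game being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "day-title":
            self._field = "date"
        elif css_class == "game":
            self._game = []
        elif css_class in ("team", "score") and self._game is not None:
            self._field = "cell"

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._field == "date":
            self.current_date = text
        elif self._field == "cell":
            self._game.append(text)

    def handle_endtag(self, tag):
        self._field = None
        if tag == "div" and self._game is not None:
            # Attach the most recent date heading to the finished game.
            self.rows.append((self.current_date, *self._game))
            self._game = None


parser = DatedGameParser()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```

The key design point is that the date is kept as parser state and reused for every game until the next heading replaces it, so games never need their own date element.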

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


