I'm trying to scrape tables spread over multiple pages. I had some trouble with this one: at first the loop ran without stopping, but now it stops after two rounds and I get an InvalidIndexError: Reindexing only valid with uniquely valued Index objects. There are 4 pages overall to scrape in this round.

<pre>
import pandas as pd
import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'}

results = pd.DataFrame()
stats = 2018
while stats < 2023:
    goToNextStats = True
    desc = 1
    while goToNextStats == True:
        base_URL = 'https://basketball.realgm.com/nba/stats/{}/Averages/Qualified/points/All/desc/{}/Regular_Season'.format(stats,desc)
        response = requests.get(base_URL, headers)
        if response.status_code == 200:
            temp_df = pd.read_html(base_URL)[2]
            temp_df.columns = list(temp_df.iloc[0,:])
            if len(temp_df) == 0:
                goToNextStats = False
                stats +=1
                continue
            print ('Aquiring Season: %s\tPage: %s' %(stats, desc))
            temp_df['Season'] = '%s-%s' %(stats-1, stats)
            results = results.append(temp_df, sort=False).reset_index(drop=True)
            desc+=1

results.to_csv('/avg.csv', index=False)
</pre>
Quote:
InvalidIndexError Traceback (most recent call last)
<ipython-input-78-2c377d5de3a4> in <module>
     34 temp_df['Season'] = '%s-%s' %(stats-1, stats)
     35
---> 36 results = results.append(temp_df, sort=False).reset_index(drop=True)
     37
     38 desc+=1
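One thing that may be worth checking (a guess, not a confirmed diagnosis): pandas raises this message when it has to align frames whose labels are not unique. Since the code turns the first row of each table into the column labels, any value that repeats in that row would become a duplicated label. Below is a minimal sketch for spotting and renaming such duplicates. The helper names `find_duplicate_labels` and `make_labels_unique` and the sample row are made up for illustration and are not taken from the site.

```python
from collections import Counter


def find_duplicate_labels(labels):
    """Return the labels that occur more than once, in first-seen order."""
    counts = Counter(labels)
    seen = []
    for label in labels:
        if counts[label] > 1 and label not in seen:
            seen.append(label)
    return seen


def make_labels_unique(labels):
    """Append .1, .2, ... to repeated labels so every label is distinct."""
    used = set()
    next_suffix = Counter()
    unique = []
    for label in labels:
        candidate = str(label)
        # Keep bumping the suffix until the candidate has not been used yet.
        while candidate in used:
            next_suffix[label] += 1
            candidate = "%s.%d" % (label, next_suffix[label])
        used.add(candidate)
        unique.append(candidate)
    return unique


# Hypothetical sample: a data row promoted to header, two stats share a value.
row_as_header = ["1", "Some Player", "ABC", "82", "82", "36.1"]

duplicates = find_duplicate_labels(row_as_header)
fixed = make_labels_unique(row_as_header)
print(duplicates)
print(fixed)
```

In the loop from the question, printing `find_duplicate_labels(list(temp_df.columns))` right after the `temp_df.columns = ...` line would show whether the failing page has repeated labels; a non-empty list there would support this guess. If the table already comes with a proper header row, leaving that assignment out may be the simpler option, though I have not tested that against the actual pages.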
This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)