Click here to Skip to main content
15,867,937 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I have a dataframe with 'ProductID'and 'Balance' columns.
Some 'Balance' columns have values =0 (df['Balance']=0).
My dataframe contains different products.
Index ProductID Balance
1 10 100
2 20 0
3 30 200
4 20 150
5 20 240
6 40 100
7 30 200

I want to impute the zero balance at index 2 with average balance based on the product ID=20 ((0 +150 + 240)/3 =130) .

I am using the below code:
1) First getting the balance_average based on the ProductID.

balance_average = df.pivot_table(values='Balance', index='ProductID')

2) Identifying the zero balance rows

zero_bool = (df['Balance'] == 0)

3) Applying balance_average

df.loc[zero_bool,'Balance'] = df.loc[zero_bool,'ProductID'].apply(lambda x: balance_average[x])

I am getting KeyError: '20'.
20 is the corresponding ProductID of the first zero balance column.
Could someone tell me what's wrong with the above code?

What I have tried:

imputing the zero balance with average balance of similar products.
Posted

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900