Hello everybody,
I have the following dataset:
product    Marketplace    product_type
1                  200               X
2                  300               A
2                  400               A
2                  200               A
3                  500               A
3                  400               A
3                  300               B


The desired output looks like this:
product    Marketplace    product_type
1                  200               X
2                  300               A
2                  400               A
2                  200               A
3                  500               B
3                  400               B
3                  300               B


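Basically, I'm changing the product type values if they differ for the same product, so that every row of a product gets the type from the row whose Marketplace has the lowest ranking (200 before 300, then 400, then 500). I tried the following code, but it runs extremely slowly on large amounts of data. Is there anything I can do about this? Any suggestions?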

What I have tried:

Python
# Priority of each Marketplace: a lower ranking wins
mp_correspondence = {200: 1,
                     300: 2,
                     400: 3,
                     500: 4}
df['ranking'] = df['Marketplace'].map(mp_correspondence)
product_list = set(df['product'])
for i in product_list:
    # Count the rows that belong to this product
    df_product_frame = df[df['product'] == i].copy()
    nr_rows = df_product_frame['product'].count()
    if nr_rows > 1:
        # The transform below runs over the whole DataFrame on every iteration
        df['product_type'] = (df.sort_values('ranking')
                                .groupby('product')['product_type']
                                .transform('first'))
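One thing that stands out in the code above: the `groupby(...).transform('first')` call already covers every product in a single pass, so the loop around it only repeats the same work once per product. A minimal sketch without the loop, assuming the sample data from the question (the DataFrame construction here is only for illustration) and assuming every Marketplace value has an entry in `mp_correspondence`:

```python
import pandas as pd

# Sample data copied from the question (built here only for illustration).
df = pd.DataFrame({
    "product": [1, 2, 2, 2, 3, 3, 3],
    "Marketplace": [200, 300, 400, 200, 500, 400, 300],
    "product_type": ["X", "A", "A", "A", "A", "A", "B"],
})

# Priority of each Marketplace: a lower ranking wins.
mp_correspondence = {200: 1, 300: 2, 400: 3, 500: 4}
df["ranking"] = df["Marketplace"].map(mp_correspondence)

# Single pass over the whole frame: sort by ranking, then give every row
# the first product_type found within its product group.
# The assignment aligns on the index, so the original row order is kept.
df["product_type"] = (
    df.sort_values("ranking")
      .groupby("product")["product_type"]
      .transform("first")
)

print(df[["product", "Marketplace", "product_type"]])
```

On the sample data this should give the desired output from the question: product 3 gets type B on all three rows because its 300 row has the lowest ranking. Products with a single row go through the same transform unchanged, so the `nr_rows > 1` check is not needed.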
Updated 15-Mar-22 3:34am

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


