Click here to Skip to main content
15,891,833 members
Please Sign up or sign in to vote.
1.00/5 (2 votes)
See more:
How to loop over 300 000 rows and edit each row string one by one?

I have a list of 11-digit numbers stored in one single column in Excel, and I need to separate the digits according to this pattern: 2-2-1-3-3.

I use the code below to loop to test the solution for only 20 rows and it's working.

Example: 00002451018 becomes 00 00 2 451 018.

priceListTest contains the column Column1 which has these 11 digit numbers. Somehow I need to loop all over these 300 000 rows and use the get_slices to change the pattern for each row like from the example above and store it into the new column New Value.

The for index, row it's working very slowly when I have to use it for 300 000 rows. Maybe there is a better method, but I'm new to python.

Thanks in advance!

What I have tried:

for index, row in priceListTest.iterrows(): 
    #print(index,row)
    def get_slices(n, sizes, n_digits=11):
        for size in sizes:
            n_digits -= size
            
            val, n = divmod(n, 10 ** n_digits)
            yield f'{val:0{size}}'

    n = row['Column1']
    newVar = (' '.join(get_slices(n, [2, 2, 1, 3, 3])))
    priceListTest.at[index,['New Value']] = newVar
Posted
Updated 9-Nov-21 0:06am
Comments
Richard MacCutchan 9-Nov-21 4:19am    
It would most likely be easier to use a formula in the new column. Excel would then do the conversion automatically.

1 solution

You can use this formula in the column where you want the formatted text:
=TEXTJOIN(" ", TRUE, LEFT(TEXT(A1, "00000000000"), 2), MID(TEXT(A1, "00000000000"), 2,2), MID(TEXT(A1, "00000000000"), 4,1), MID(TEXT(A1, "00000000000"), 5,3), RIGHT(TEXT(A1, "00000000000"), 3))

The reference A1 in the formula should be altered to the cell reference containing your number. Once you have pasted the formula in the first cell of the new column, you can just use Excel's auto copy to repeat it for the entire column.

Also note that if the original column contains a text string rather than a numeric value then you can use the simplified formula:
=TEXTJOIN(" ", TRUE, LEFT(A1, 2), MID(A1, 2,2), MID(A1, 4,1), MID(TEXT(A1, 5,3), RIGHT(A1, 3))
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900