I have a Dataframe with lots of floats...

I want to convert all values in the Dataframe to 0 and 1 based on whether the values are greater than a certain number.



Any ideas? Thank you!

What I have tried:

for item in dataframe:
    if item > 100:
        item = 1
    else:
        item = 0


Error I am getting:

'<=' not supported between instances of 'str' and 'float'
Updated 23-Mar-22 23:04pm

1 solution

The only problem I can see in the code you have shown is that item is a string, so you must convert it to a numeric type before comparing it. As the error message says
Quote:
'<=' not supported between instances of 'str' and 'float'
so I suggest converting it to a float. Interestingly, there is no <= in the code you shared, so that code cannot have produced this exact error message.

You could try something like this:
Python
for item in dataframe:
    if float(item) > 100:
        ...
Alternatively, you could convert the entire column of the DataFrame to float (or integer), which is probably the best option if you want the values to be numeric. See "Pandas Convert Column to Float in DataFrame" on Spark by {Examples}.
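As a rough sketch of that column conversion, assuming the values arrived as strings: the sample frame and its column names "a" and "b" below are made up for illustration, and pd.to_numeric is applied column by column.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings, as can happen after loading a file.
df = pd.DataFrame({"a": ["50.5", "150.0"], "b": ["200", "99.9"]})

# Convert every column to a numeric dtype.
# errors="coerce" turns any cell that cannot be parsed into NaN instead of raising.
numeric_df = df.apply(pd.to_numeric, errors="coerce")

print(numeric_df.dtypes)
```

Once the columns are numeric, comparisons against a float no longer involve str values.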

I have just seen the following comment in the documentation for the pandas DataFrame:

Quote:
You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.
You are going to have to find an alternative way to achieve whatever it is you are trying to do, perhaps at the point of populating the dataframe.
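It may also help to see what the loop actually visits. The following is a small sketch, using a made-up frame with columns "a" and "b", which suggests why item turns out to be a string and why assigning to it changes nothing: a plain for-loop over a DataFrame walks the column labels, not the cell values.

```python
import pandas as pd

# Hypothetical sample frame; the column names "a" and "b" are invented for this example.
df = pd.DataFrame({"a": [50.5, 150.0], "b": [200.0, 99.9]})
before = df.copy()

# Iterating over a DataFrame yields its column labels.
items = [item for item in df]
print(items)

# Rebinding the loop variable only changes the local name; nothing is written to the frame.
for item in df:
    item = 1

print(df.equals(before))
```

With string column labels, a comparison such as item > 100 compares a str with a number, which matches the kind of error reported in the question.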
 
Comments
CPallini 24-Mar-22 4:42am    
5.
Jasmine L 24-Mar-22 4:57am    
hm, that did not work either :/ I just need to map the entire dataset to 0 and 1 based on >=
CPallini 24-Mar-22 6:14am    
In order to get better help you should provide full details (for instance an *exact* example of input and expected output).
CHill60 24-Mar-22 7:22am    
"did not work" does not help us to help you. You need to provide exactly what happened / what was the exact wording of any error.
However, in this case, note the amendment I have just made to my solution.
Jasmine L 24-Mar-22 15:59pm    
When I run:

for item in dataframe:
    if float(item) > 100:
        ...

output:

0

Output needed:

entire data frame:
0 0 1 1 0 0
0 0 1 0 0 0
1 0 0 1 0 1

where:
0 = value less than the threshold
1 = value greater than the threshold

All cells are numerical
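If all cells really are numeric, one possible way to get that kind of output is a whole-frame comparison rather than a loop. This is only a sketch: the sample values, the column names "x" and "y", and the threshold of 100 are assumptions, and >= is used because that is the comparison mentioned in the earlier comment.

```python
import pandas as pd

# Hypothetical numeric data and threshold, chosen only for illustration.
df = pd.DataFrame({"x": [12.0, 100.0, 250.5], "y": [99.9, 101.0, 3.0]})
threshold = 100

# df.ge(threshold) compares every cell with >= and returns a boolean frame;
# astype(int) then maps True to 1 and False to 0.
binary_df = df.ge(threshold).astype(int)

print(binary_df)
```

For a strict greater-than test, df.gt(threshold) or (df > threshold) could be used in the same way. The result is a new frame, so the original values stay untouched.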

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


