I am working with a dataset much larger than I am used to (286,212 rows, 19 columns) and I am not sure how to approach my problem. The data consist of a value for each day of the year for 782 grid references, over 15 years. It looks as follows:
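| Month | Day | Grid | x2004 | x2005 | x2006 | x2007 |
|-------|-----|------|-------|-------|-------|-------|
| 1 | 1 | A10 | 0.091 | 0.134 | NA | 0.066 |
| 1 | 2 | A10 | 0.12 | 0.10 | 0.23 | 0.054 |
| 1 | 3 | A10 | 0.55 | NA | NA | 0.08 |
| 1 | 1 | B10 | NA | 0.134 | NA | 0.17 |
| 1 | 2 | B10 | 0.14 | 0.151 | NA | 0.21 |
| 1 | 3 | B10 | 0.43 | 0.162 | 0.24 | NA |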
However, some of the values are missing, and I want to replace each missing value with the mean for that day and that grid, calculated from the other years. For example, if the value for Grid A10 on day 1 is missing in 2006, I want to insert the mean for day 1, Grid A10 from 2004, 2005 and 2007, in this case 0.097.
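To make the target concrete, the following is a minimal sketch in R of the row-wise fill described above, run on the six sample rows. The data frame name `dat` is hypothetical, and the sketch assumes that every year column is named `x` followed by four digits and that each row holds exactly one Month/Day/Grid combination, so the mean across years is simply the mean along the row.

```r
# Sample rows from the table above; `dat` is a hypothetical name
dat <- data.frame(
  Month = c(1, 1, 1, 1, 1, 1),
  Day   = c(1, 2, 3, 1, 2, 3),
  Grid  = c("A10", "A10", "A10", "B10", "B10", "B10"),
  x2004 = c(0.091, 0.12, 0.55, NA, 0.14, 0.43),
  x2005 = c(0.134, 0.10, NA, 0.134, 0.151, 0.162),
  x2006 = c(NA, 0.23, NA, NA, NA, 0.24),
  x2007 = c(0.066, 0.054, 0.08, 0.17, 0.21, NA)
)

# Assumption: year columns are the ones named "x" plus four digits
year_cols <- grep("^x[0-9]{4}$", names(dat))

# Numeric matrix holding only the year columns
yrs <- as.matrix(dat[, year_cols])

# One mean per row (per Month/Day/Grid), ignoring missing values
row_means <- rowMeans(yrs, na.rm = TRUE)

# (row, col) positions of every missing cell
miss <- which(is.na(yrs), arr.ind = TRUE)

# Fill each missing cell with the mean of its own row
yrs[miss] <- row_means[miss[, "row"]]

# Write the filled values back into the data frame
dat[, year_cols] <- yrs
```

On the sample rows this fills the 2006 value for day 1 of Grid A10 with approximately 0.097, matching the hand calculation. A row in which every year is missing has no mean to use, so `rowMeans` returns `NaN` for it and those cells would still need separate handling. Because nothing here loops over rows, the same steps should scale to the full 286,212-row table.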
What I have tried:
I am trying the following code:
x<-for(i in 1:ncol(data)){
data[is.na(data[,i]) ,i] <- mean(data[,i], na.rm = TRUE)
}
However, I think this computes the column mean and inserts that instead. I have also tried changing it to:
x<-for(i in 1:nrow(data)){
data[is.na(data[i,]) ,i] <- mean(data[i,], na.rm = TRUE)
}
That did not work either. I have already asked on Stack Overflow but have not received a solution yet. I am not a computer programmer, and this is the last bit of coding I need for the statistical analysis for my PhD, so I am quite desperate to figure it out, but I am just not sure how to go about it. I know this forum is for other programming languages, but please help if you can.