Imputing Missing Values in R from reference data frame
By : Ugur Celikkiran
Date : March 29 2020, 07:55 AM
I have a data frame 'dat' of dimension 17000 x 3 containing walking data. The interval column holds the 5-minute intervals of each 24-hour period, the date column holds the date, and the steps column holds the number of steps taken in that 5-minute interval on that date. NAs are present.
This is a little clunky, but it works (each NA is filled with the mean number of steps for its date): code :
library(dplyr)

# Example data: two dates, twelve intervals each, with some missing step counts
df1 <- data.frame(date = c(rep("20150101", 12), rep("20150102", 12)), interval = rep(seq(12), 2),
                  steps = c(5, 7, NA, 12, 3, NA, 0, 4, 12, 10, 4, 0, 3, NA, 2, 1, NA, 15, 0, 4, 7, 2, NA, 2),
                  stringsAsFactors = FALSE)

# Compute the mean steps per date, join it back onto df1,
# use it wherever steps is NA, then drop the helper column
df1.1 <- df1 %>%
  group_by(date) %>%
  summarise(avg = mean(steps, na.rm = TRUE)) %>%
  merge(df1, ., all.x = TRUE) %>%
  mutate(steps = ifelse(is.na(steps), avg, steps)) %>%
  select(-avg)
> head(df1)
      date interval steps
1 20150101        1     5
2 20150101        2     7
3 20150101        3    NA
4 20150101        4    12
5 20150101        5     3
6 20150101        6    NA
> head(df1.1)
      date interval steps
1 20150101        1   5.0
2 20150101        2   7.0
3 20150101        3   5.7
4 20150101        4  12.0
5 20150101        5   3.0
6 20150101        6   5.7
> df1 %>% group_by(date) %>% summarise(avg = mean(steps, na.rm = TRUE))
Source: local data frame [2 x 2]
      date avg
1 20150101 5.7
2 20150102 4.0
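
The per-date mean imputation above can also be sketched without the merge step. This is a base-R alternative, not the answer's own method: it uses ave() to compute each row's group mean in place. The column name steps_filled is made up for illustration, and df1 is the same example data as above.

```r
# Same example data as in the answer above
df1 <- data.frame(date = c(rep("20150101", 12), rep("20150102", 12)),
                  interval = rep(seq(12), 2),
                  steps = c(5, 7, NA, 12, 3, NA, 0, 4, 12, 10, 4, 0,
                            3, NA, 2, 1, NA, 15, 0, 4, 7, 2, NA, 2),
                  stringsAsFactors = FALSE)

# ave() returns a vector as long as its input, holding for every row
# the mean of the non-missing steps on that row's date
day_avg <- ave(df1$steps, df1$date, FUN = function(x) mean(x, na.rm = TRUE))

# Keep observed values; use the per-date mean only where steps is NA
# (steps_filled is a hypothetical column name)
df1$steps_filled <- ifelse(is.na(df1$steps), day_avg, df1$steps)

head(df1)
```

Writing the result to a new column keeps the original steps values available for comparison; overwrite steps instead if that is not needed.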

Creating data frame with incremental minutes as rows (R)
By : Melawati
Date : March 29 2020, 07:55 AM
Look up the function seq.POSIXt, which is designed to create sequences of date-times; calling seq() on a date-time object dispatches to it. For your problem: code :
seq(ISOdate(2016,2,02, 14, 00, 00), by = "min", length.out = 5)
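
To get from that sequence to a data frame with one row per minute, a minimal sketch might look like the following. The names start, minutes, df and the placeholder column value are assumptions made for illustration; only the seq() call comes from the answer.

```r
# Starting timestamp; ISOdate() uses the GMT time zone by default
start <- ISOdate(2016, 2, 2, 14, 0, 0)

# Five timestamps, one minute apart
minutes <- seq(start, by = "min", length.out = 5)

# One row per minute; 'value' is a hypothetical placeholder column
df <- data.frame(time = minutes, value = NA_real_)

df
```

To cover a fixed time span instead of a fixed number of rows, seq() also accepts a to argument in place of length.out.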

Imputing values in all columns of data.frame with mice
By : Shawn Debnath
Date : March 29 2020, 07:55 AM
I am trying to impute values with a linear model using mice. My understanding of mice is that it iterates over the rows. For a column with NAs, it uses all other columns as predictors, fits the model, and then samples from this model to fill in the NAs. Here is an example where I generate some data and then introduce missing data using ampute.
I tried to run your code and ended up with the same type of problem: code :
library(mice)
n <- 100
xx <- data.frame(x = 1:n + rnorm(n, 0, 0.1), y = (1:n) * 2 + rnorm(n, 0, 1))
head(xx)
# Introduce missing values, then impute them
res <- ampute(xx)
head(res$amp)
tempData <- mice(res$amp, m = 5, maxit = 50, seed = 500)
summary(tempData)
Multiply imputed data set
Call:
mice(data = res$amp, m = 5, maxit = 50, seed = 500)
Number of multiple imputations: 5
Missing cells per column:
 x  y
21 23
Imputation methods:
    x     y
"pmm" "pmm"
VisitSequence:
x
1
PredictorMatrix:
  x y
x 0 0
y 0 0
Random generator seed value: 500
# Same setup, but y is now quadratic rather than linear in the row index
n <- 100
xx <- data.frame(x = 1:n + rnorm(n, 0, 0.1), y = (1:n)^2 + rnorm(n, 0, 1))
head(xx)
res <- ampute(xx)
head(res$amp)
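
The two data sets above differ only in how y depends on the row index. A quick base-R check, outside of mice, of how strongly x and y are correlated in each case is sketched below. The seed and the names y_linear and y_quadratic are assumptions made for illustration. Whether this correlation explains the all-zero PredictorMatrix is not stated in the output above and would need to be confirmed against the mice documentation.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 100
x <- 1:n + rnorm(n, 0, 0.1)

# y as in the first data set: linear in the row index
y_linear <- (1:n) * 2 + rnorm(n, 0, 1)

# y as in the second data set: quadratic in the row index
y_quadratic <- (1:n)^2 + rnorm(n, 0, 1)

# Pearson correlation between x and each version of y
r_linear <- cor(x, y_linear)
r_quadratic <- cor(x, y_quadratic)

print(c(linear = r_linear, quadratic = r_quadratic))
```

In the linear case x and y are almost perfectly correlated, while the quadratic case gives a noticeably lower correlation.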

Pandas: Imputing Missing Values to Data Frame
By : Timothy Smith
Date : March 29 2020, 07:55 AM
In pandas, a missing value should be NaN rather than the string 'NA'. First replace the string with NaN, then use fillna: code :
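import numpy as np
import pandas as pd

# Turn the 'NA' strings into real missing values
df.Y = df.Y.replace('NA', np.nan)
# Fill the missing positions, in order, with 1 and 2
df.Y = df.Y.fillna(pd.Series([1, 2], index=df.index[df.Y.isnull()]))
df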
Out[1375]:
   W  X    Y  Z
0  1  3  1.0  2
1  0  1  1.0  3
2  1  2  2.0  1
# Alternatively, on the original frame, assign directly to the rows holding the string 'NA'
df.loc[df.Y == 'NA', 'Y'] = [1, 2]
df
Out[1380]:
   W  X  Y  Z
0  1  3  1  2
1  0  1  1  3
2  1  2  2  1

Looping over rows of a data frame to simulate
By : jzsn
Date : March 29 2020, 07:55 AM
This is more of an R programming question than a conceptual one. I tried, but my lack of expertise in R is frustrating me:
The random draw functions are all vectorized:
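
The original example is not included here. As a hypothetical illustration of what vectorized random draws look like, the sketch below assumes a made-up data frame params with one row per simulation setting and columns mu and sigma. None of these names come from the question.

```r
# Hypothetical parameter table: one row per simulation setting
params <- data.frame(mu = c(0, 10, 100), sigma = c(1, 2, 5))

set.seed(42)  # arbitrary seed, only for reproducibility

# rnorm() accepts vectors for mean and sd, so draw i uses mu[i] and sigma[i];
# no explicit loop over the rows is needed
params$draw <- rnorm(nrow(params), mean = params$mu, sd = params$sigma)

# Several draws per row: repeat each row's parameters k times,
# then reshape so that row i of the matrix holds the k draws for setting i
k <- 4
draws <- matrix(rnorm(nrow(params) * k,
                      mean = rep(params$mu, each = k),
                      sd = rep(params$sigma, each = k)),
                nrow = nrow(params), byrow = TRUE)

params
draws
```

The same pattern applies to other draw functions such as runif() and rpois(), whose parameter arguments are also recycled element-wise.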

