
Random Forest for a mixture of categorical, numeric and "unwanted" variables which include missing values



By : smcbird
Date : November 22 2020, 03:01 PM
I am trying to use the randomForest package in R on a data set that includes categorical and numerical variables as well as some "unwanted columns" (columns that I do not want to use as predictor variables). Moreover, some of the variables I do want to use as predictors have missing values. How can I handle that?

I assume your dataset looks something like this.
code :
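library(randomForest)

# Example data: param1 is categorical, param2 and param3 are numeric
mydf <- data.frame(target = 1:100,
                   param1 = factor(c(rep("a", 10), rep("b", 50),
                                     rep("c", 20), rep("a", 15), rep(NA, 5))),
                   param2 = runif(100, 0, 1),
                   param3 = c(runif(20, 1, 10), runif(50, 20, 30), rep(NA, 10),
                              runif(10, 0, 5), runif(10, 70, 80)))
# Keep only the wanted columns, selected by quoted name
mydf2 <- mydf[, c("target", "param1", "param2")]
# randomForest stops on NA by default; na.omit drops the incomplete rows
myrf <- randomForest(target ~ ., data = mydf2, na.action = na.omit)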
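
Besides dropping incomplete rows, missing predictor values can be imputed. The randomForest package provides na.roughfix, which fills numeric columns with the median and factor columns with the most frequent level. The base-R sketch below imitates that idea with a hypothetical helper, impute_simple, on a made-up toy data frame; it is an illustration of the approach, not the package's implementation.

```r
# Hypothetical helper: median for numeric columns,
# most frequent level for factor columns
impute_simple <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!any(is.na(x))) next
    if (is.numeric(x)) {
      x[is.na(x)] <- median(x, na.rm = TRUE)
    } else if (is.factor(x)) {
      x[is.na(x)] <- names(which.max(table(x)))
    }
    df[[col]] <- x
  }
  df
}

# Made-up toy data with one NA in each column
toy <- data.frame(num = c(1, 2, NA, 10),
                  cat = factor(c("a", "b", "b", NA)))
filled <- impute_simple(toy)
print(filled)
```

Imputation keeps every row, which matters when many rows have only one or two missing values; dropping rows is simpler when few rows are affected.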


Using require-js and grunt.js - error missing either a "name", "include" or "modules" optio



By : Kaan Akyuz
Date : March 29 2020, 07:55 AM
Updated the Gruntfile.js to set name in the requirejs options:
code :
module.exports = function (grunt) {
    grunt.initConfig({
        pkg : grunt.file.readJSON('package.json'),
        requirejs : {
            compile: {
                options: {
                    name: "views/app",
                    baseUrl: "public_html/js",
                    mainConfigFile: "public_html/js/config.js",
                    out: "public_html/app.min.js"
                }
            }
        }
    });

    grunt.loadNpmTasks('grunt-contrib-requirejs');

    grunt.registerTask('default', ['requirejs']);
};

// In the application's entry script (a separate file from Gruntfile.js):
require(['views/app'], function (AppView) {
  new AppView();
});
caret - random-forests not working: "Something is wrong; all the Accuracy metric values are missing:"



By : Thomas Williams
Date : March 29 2020, 07:55 AM
When I run the first cforest model, I can see "In addition: There were 31 warnings (use warnings() to see them)". Compare the arguments that cforest and randomForest accept:
code :
> names(formals(cforest))
[1] "formula"  "data"     "subset"   "weights"  "controls" "xtrafo"  
[7] "ytrafo"   "scores"   
> names(formals(randomForest:::randomForest.default))
 [1] "x"           "y"           "xtest"       "ytest"      
 [5] "ntree"       "mtry"        "replace"     "classwt"    
 [9] "cutoff"      "strata"      "sampsize"    "nodesize"   
[13] "maxnodes"    "importance"  "localImp"    "nPerm"      
[17] "proximity"   "oob.prox"    "norm.votes"  "do.trace"   
[21] "keep.forest" "corr.bias"   "keep.inbag"  "..."       
missing values in object - Random Forest Confusion Matrix in R



By : Baryt Monti
Date : March 29 2020, 07:55 AM
As mentioned by @lukeA, I was having the problem because of NA values. Another option that worked for me was to clean my data a little more, keeping only the columns that contain no NA:
code :
training <- training[, colSums(is.na(training)) == 0]
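
Dropping columns is one of two common clean-ups; dropping rows is the other. A small sketch on a made-up data frame follows. The name training is reused from the answer, but its contents here are hypothetical, as are the names by_col and by_row.

```r
# Made-up data: only x1 contains a missing value
training <- data.frame(y  = c(1, 0, 1, 0),
                       x1 = c(2.5, NA, 3.1, 4.0),
                       x2 = c(10, 20, 30, 40))

# Option from the answer: keep only columns without any NA
by_col <- training[, colSums(is.na(training)) == 0]

# Alternative: keep all columns, drop rows that contain an NA
by_row <- training[complete.cases(training), ]

print(names(by_col))
print(nrow(by_row))
```

If only one column survives the column filter, add drop = FALSE to the subsetting call so the result stays a data frame.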
What is the difference between "variance explained" in Random Forest and "merror" in XGBoost



By : Markus Olsen
Date : November 27 2020, 03:01 PM
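Variance explained and XGBoost's merror are not the same; they relate to very different statistical concepts. Variance explained is a regression measure of how much of the target's variance the model accounts for, while merror is the multiclass classification error rate, the fraction of cases assigned to the wrong class.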
code :
library(xgboost)
library(randomForest)
# Model: Random forest
model.rf <- randomForest(
    Species ~ ., data = iris)
cm.rf <- model.rf$confusion
cm.rf
#           setosa versicolor virginica class.error
#setosa         50          0         0        0.00
#versicolor      0         47         3        0.06
#virginica       0          3        47        0.06

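# Model: XGBoost
# The factor labels are coded 1..3 and XGBoost expects labels in
# [0, num_class), so num_class is set to 4 here (class 0 stays empty)
model.xg <- xgboost(
    data = as.matrix(iris[, 1:4]),
    label = as.factor(iris[, 5]),
    nrounds = 10,
    eval.metric = "merror",
    num_class = 4)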
pred <- levels(iris[, 5])[as.integer(predict(model.xg, as.matrix(iris[, 1:4])))]
cm.xg <- table(pred, as.factor(iris[, 5]))
cm.xg
#
#pred         setosa versicolor virginica
#  setosa         50          0         0
#  versicolor      0         48         0
#  virginica       0          2        50

# Classification error from a confusion matrix:
# share of cases that fall off the diagonal
merror <- function(cm)
    1 - sum(diag(cm)) / sum(cm)

# Model: Random forest
merror.rf <- merror(cm.rf[, 1:3])
merror.rf
#[1] 0.04

# Model: XGBoost
merror.xg <- merror(cm.xg)
merror.xg
#[1] 0.01333333

# merror as logged by XGBoost during training
model.xg$evaluation_log
#    iter train_merror
# 1:    1     0.026667
# 2:    2     0.020000
# 3:    3     0.020000
# 4:    4     0.020000
# 5:    5     0.020000
# 6:    6     0.020000
# 7:    7     0.013333
# 8:    8     0.013333
# 9:    9     0.013333
#10:   10     0.013333
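
To make the distinction concrete, here is a base-R sketch of both quantities. The confusion matrix copies the XGBoost table printed above. The regression part uses lm on the built-in cars data purely as a stand-in model, and 1 - MSE / var(y) is the usual definition of variance explained; I am assuming it matches what randomForest reports for regression forests up to the exact variance normalisation.

```r
# merror: share of misclassified cases, read off a confusion matrix
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50, 0, 0,
               0, 48, 2,
               0, 0, 50),
             nrow = 3,
             dimnames = list(pred = classes, truth = classes))
merror_value <- 1 - sum(diag(cm)) / sum(cm)

# Variance explained: a regression measure, 1 - MSE / variance of the target
fit <- lm(dist ~ speed, data = cars)
mse <- mean(residuals(fit)^2)
var_y <- mean((cars$dist - mean(cars$dist))^2)
var_explained <- 1 - mse / var_y

print(merror_value)
print(var_explained)
```

The first number only makes sense for a classifier and the second only for a regression model, which is why the two cannot be compared directly.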
"Unreplaced values treated as NA as .x is not compatible": Recoding numeric variables


By : Huberson
Date : March 29 2020, 07:55 AM
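Just learned it myself. :-)
Might it be that you loaded both the "car" and "dplyr" packages? Both export a function named recode, so the package loaded last masks the other one. Calling car::recode or dplyr::recode explicitly avoids the clash.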