ROC curve from train/test set in caret R package

By : Rupesh Adhikari
Date : November 22 2020, 03:01 PM
It's hard to know for sure without a reproducible example, but presumably your response variable bin.frail isn't numeric. For example, it might be coded using letters (e.g., "Y", "N"), or with numbers that are stored as a factor. You can check this with is.numeric(whas$bin.frail).
As a side note, in your call to roc() it looks like mod1pred is being created from your training data whereas testing$bin.frail is from your test data. You could correct this by adding newdata = testing to your call to predict when creating mod1pred.
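The original code is not shown, so the following is only a minimal sketch of the fix described above. The data frame `dat`, the outcome column `outcome`, and the `glm` model are hypothetical stand-ins (not the asker's `whas` data or model); the point is that the predictions and the response passed to `roc()` both come from the test set.

```r
library(caret)
library(pROC)

set.seed(42)
# Hypothetical stand-in data: a two-level factor outcome and two predictors
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$outcome <- factor(ifelse(dat$x1 + rnorm(n) > 0, "Y", "N"),
                      levels = c("N", "Y"))

# 80/20 stratified train/test split
idx <- createDataPartition(dat$outcome, p = 0.8, list = FALSE)
training <- dat[idx, ]
testing  <- dat[-idx, ]

fit <- train(outcome ~ ., data = training, method = "glm",
             trControl = trainControl(method = "cv", number = 5,
                                      classProbs = TRUE))

# newdata = testing makes the predictions line up with testing$outcome
test_prob <- predict(fit, newdata = testing, type = "prob")[, "Y"]

# ROC curve and AUC on the test set only
roc_test <- roc(response = testing$outcome, predictor = test_prob,
                levels = c("N", "Y"), direction = "<")
auc(roc_test)
```

Without `newdata`, `predict()` returns predictions for the training rows, so their length would not match the test-set response.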
Train test split in `r`'s `caret` package


By : Pavel Cechir
Date : March 29 2020, 07:55 AM
If I understand the question correctly, this can all be done within caret using LGOCV (leave-group-out cross-validation, i.e. repeated train/test splits). Set the training percentage to p = 0.8 and the number of train/test splits to number = 1 if you really want just one model fit per k, each evaluated on the same held-out test set. Setting number > 1 will assess model performance on number different train/test splits.
code :
data(iris)
library(caret)
set.seed(123)
mod <- train(Species ~ ., data = iris, method = "knn", 
             tuneGrid = expand.grid(k=1:20),
             trControl = trainControl(method = "LGOCV", p = 0.8, number = 1,
                                      savePredictions = TRUE))
> head(mod$pred)
    pred    obs rowIndex k  Resample
1 setosa setosa        5 1 Resample1
2 setosa setosa        6 1 Resample1
3 setosa setosa       10 1 Resample1
4 setosa setosa       12 1 Resample1
5 setosa setosa       16 1 Resample1
6 setosa setosa       17 1 Resample1
> tail(mod$pred)
         pred       obs rowIndex  k  Resample
595 virginica virginica      130 20 Resample1
596 virginica virginica      131 20 Resample1
597 virginica virginica      135 20 Resample1
598 virginica virginica      137 20 Resample1
599 virginica virginica      145 20 Resample1
600 virginica virginica      148 20 Resample1 
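As a follow-up sketch (not part of the original answer), the saved predictions can be summarised into a hold-out accuracy per value of k. The helper column `correct` and the object `acc_by_k` are names introduced here for illustration.

```r
library(caret)

data(iris)
set.seed(123)
# Same setup as above: a single 80/20 split, evaluated for k = 1..20
mod <- train(Species ~ ., data = iris, method = "knn",
             tuneGrid = expand.grid(k = 1:20),
             trControl = trainControl(method = "LGOCV", p = 0.8, number = 1,
                                      savePredictions = TRUE))

# Flag each held-out prediction as correct or not, then average within k
pred <- transform(mod$pred, correct = pred == obs)
acc_by_k <- aggregate(correct ~ k, data = pred, FUN = mean)
acc_by_k
```

Because there is only one resample, these per-k accuracies should correspond to the Accuracy column in `mod$results`.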
ROC metric in train(), caret package


By : Des O' Leary
Date : March 29 2020, 07:55 AM
There are two separate issues here.
The first is the error message, which says it all: you have to use something other than "0" and "1" as the values of your dependent factor variable Y.
code :
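# One way: make.names() converts "0"/"1" into syntactically valid names
df$Y <- make.names(df$Y)
df$Y
# "X1" "X1" "X1" "X0" "X0"
# Alternatively, if Y is already a factor, rename its levels directly
levels(df$Y) <- c("X0", "X1")
df$Y
# [1] X1 X1 X1 X0 X0
# Levels: X0 X1
The second is the warning, which appears when metric = "ROC" is requested without a summary function that computes it:
Warning messages:
1: In train.default(x, y, weights = w, ...) :
  The metric "ROC" was not in the result set. Accuracy will be used instead.
Adding summaryFunction = twoClassSummary to trainControl makes the ROC metric available: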
model_nn <- train(
  Y ~ ., df,
  method = "nnet",
  metric="ROC",
  trControl = trainControl(
    method = "cv", number = 10,
    verboseIter = TRUE,
    classProbs=TRUE,
    summaryFunction = twoClassSummary # ADDED
  )
)
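The asker's `df` is not shown, so here is a self-contained sketch of the same fix on simulated data. The data, the object names, and the swap from `nnet` to `glm` (chosen only to keep the example fast) are assumptions made for illustration.

```r
library(caret)

set.seed(1)
# Simulated two-class data with a 0/1 outcome (hypothetical stand-in for df)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y01 <- as.integer(df$x1 - df$x2 + rnorm(n) > 0)

# Class levels must be valid R names when classProbs = TRUE: "X0", "X1"
df$Y <- factor(make.names(y01))

model_glm <- train(
  Y ~ ., data = df,
  method = "glm",
  metric = "ROC",
  trControl = trainControl(
    method = "cv", number = 5,
    classProbs = TRUE,                  # needed to compute class probabilities
    summaryFunction = twoClassSummary   # provides ROC, Sens and Spec
  )
)

model_glm$results[, c("ROC", "Sens", "Spec")]
```

With both `classProbs = TRUE` and `twoClassSummary` in place, the resampling results contain a ROC column, so train no longer falls back to Accuracy.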
How to do train, validation and test using Caret package in R?


By : Edward
Date : March 29 2020, 07:55 AM
train allows you to do validation and much more. You can pass a trainControl object to the trControl argument to specify the details of the training procedure. By default, train resamples with the bootstrap (25 resamples) rather than using a single 75%/25% split; the p = 0.75 training fraction applies only to leave-group-out cross-validation (method = "LGOCV"). You can change all of this in trainControl.
I suggest you check the documentation for train and trainControl to learn more about the details you can specify in your training procedure.
code :
library(caret)
library(datasets)

# Loading the iris dataset
data(iris)

# Specifying an 80-20 train-test split
train_idx = createDataPartition(iris$Species, p = .8, list = FALSE)

# Creating the training and testing sets
train = iris[train_idx, ]
test = iris[-train_idx, ]

# Declaring the trainControl function
train_ctrl = trainControl(
  method  = "cv", #Specifying Cross validation
  number  = 5, # Specifying 5-fold
  verboseIter = TRUE, # So that each iteration you get an update of the progress
  classProbs = TRUE # So that you can obtain the probabilities for each example
)

rf_model = train(
  Species ~., # Specifying the response variable and the feature variables
  method = "rf", # Specifying the model to use
  data = train, 
  trControl = train_ctrl,
  preProcess = c("center", "scale") # Do standardization of the data
)

# Get the predictions of your model in the test set
predictions = predict(rf_model, newdata = test)

# See the confusion matrix of your model in the test set
confusionMatrix(predictions, test$Species)
Using your own model in train (caret package)?


By : Monica Banciu
Date : March 29 2020, 07:55 AM
Apparently, I just had to include the arguments in the function signature even if I never use them.
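The answer's code is not shown, so the following is only a sketch of what a custom model for train can look like, assuming the list-based interface described in the caret documentation. The model `custom_lm` is hypothetical; note that the fit and predict functions declare the full argument list even though most arguments are unused.

```r
library(caret)

# Hypothetical custom model: ordinary least squares wrapped for train()
custom_lm <- list(
  label = "Custom linear model",
  library = NULL,
  type = "Regression",
  # No real tuning parameter, so a single placeholder is declared
  parameters = data.frame(parameter = "parameter",
                          class = "character",
                          label = "parameter"),
  grid = function(x, y, len = NULL, search = "grid") {
    data.frame(parameter = "none")
  },
  # train() passes all of these arguments, so they must appear in the
  # signature even though only x and y are used here
  fit = function(x, y, wts, param, lev, last, weights, classProbs, ...) {
    dat <- as.data.frame(x)
    dat$.outcome <- y
    lm(.outcome ~ ., data = dat)
  },
  predict = function(modelFit, newdata, preProc = NULL, submodels = NULL) {
    predict(modelFit, newdata = as.data.frame(newdata))
  },
  prob = NULL,
  sort = function(x) x
)

set.seed(1)
fit <- train(mpg ~ wt + hp, data = mtcars, method = custom_lm,
             trControl = trainControl(method = "cv", number = 5))
fit$results
```

Dropping an unused argument such as `wts` or `classProbs` from the fit signature, without a `...` to absorb it, leads to an "unused argument" error when train calls the function.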
The train function in R caret package


By : raldje
Date : March 29 2020, 07:55 AM
I've created a reproducible example based on your code snippet. The first thing to notice about your code is that it specifies repeatedcv as the method but doesn't give any repeats, so the number=4 parameter is just telling it to resample 4 times (this is not an answer to your question, but it is important to understand).
mod_fit$finalModel gives you only one set of coefficients because it's the one final model that's derived by aggregating the non-repeated k-fold CV results from each of the 4 folds.
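The reproducible example itself is not included here, so this is a separate minimal sketch on `mtcars` (an assumed stand-in dataset) showing what changes once repeats is actually supplied. The object names are illustrative.

```r
library(caret)

set.seed(1)
# repeatedcv with an explicit number of repeats: 4 folds, repeated 3 times
ctrl_rep <- trainControl(method = "repeatedcv", number = 4, repeats = 3)

fit_rep <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                 trControl = ctrl_rep)

# One row of performance metrics per held-out fold: 4 folds x 3 repeats
nrow(fit_rep$resample)

# Still a single final model, hence a single set of coefficients
coef(fit_rep$finalModel)
```

The resampling settings change how performance is estimated, but `finalModel` remains one fitted model object regardless of how many folds or repeats are used.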