Column splitting in R


By : user2174954
Date : October 16 2020, 08:10 PM
I am using a data frame, constructed from scraped data, which contains a character column. I am trying to split it into two columns: one containing the elements before "|" and another containing the elements after that symbol. , Try something like this:
code :
# Example values; some entries have no "|" separator
x <- c(" 45  cubiertos | 1 . ",
       " 5000  cubiertos ",
       " 45  cubiertos | 1 . ",
       " 60  cubiertos | 2 . ",
       " 57  cubiertos | 1 . ",
       " 35  cubiertos ",
       " 70  cubiertos | 2 . ",
       " 50  cubiertos | 2 . ",
       " 45  cubiertos | 2 . ",
       " 146  cubiertos | 4 . ")

library(stringr)
# "\\|" escapes the pipe; simplify = TRUE returns a character matrix, one column per piece
str_split(x, "\\|", simplify = TRUE)
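For comparison, the same idea can be sketched in plain Python. This is only an illustration, not part of the original answer: `split_on_pipe` is a hypothetical helper name, it splits at the first "|" only (whereas `str_split` splits at every match), and it also trims surrounding whitespace, which the R call above does not do.

```python
def split_on_pipe(values):
    """Split each string at the first "|" into a (before, after) pair.

    Entries without a "|" get an empty string as the second element.
    """
    rows = []
    for value in values:
        before, _sep, after = value.partition("|")
        rows.append((before.strip(), after.strip()))
    return rows


# A few of the values from the example above
x = [" 45  cubiertos | 1 . ", " 5000  cubiertos ", " 146  cubiertos | 4 . "]
for before, after in split_on_pipe(x):
    print(repr(before), repr(after))
```

The pairs can then be written into two separate columns of whatever table structure is in use.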


Splitting a single Excel column into two separate columns, splitting the values?


By : tkl
Date : March 29 2020, 07:55 AM
How do I split a single Excel column whose values are defined as such: , This would work if you're willing to use VBA.
code :
Option Explicit

Sub SplitHyperLinkFormula()
    Dim r As Range
    For Each r In Selection
        If InStr(1, r.Formula, "=hyperlink", vbTextCompare) = 1 Then
            r.Offset(0, 1).Value = GetHyperlink(r.Formula) 'Split URL
            r.Offset(0, 2).Value = r.Value                 'Split Title
        End If
    Next r
End Sub

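Function GetHyperlink(ByVal s As String) As String
    'Requires a =HYPERLINK formula; assumes the URL contains no commas.
    s = Left(s, InStr(s, ",") - 2)        'Keep the text before the closing quote and comma
    GetHyperlink = Right(s, Len(s) - 12)  'Drop the 12-character prefix =HYPERLINK("
End Function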
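
The string arithmetic in GetHyperlink can be checked outside Excel with a small Python sketch. This is only an illustration under the same assumption as the VBA (no commas in the URL). `get_hyperlink` is a hypothetical name, and unlike the VBA check it also requires the opening parenthesis and quote right after `=HYPERLINK`.

```python
def get_hyperlink(formula):
    """Return the URL from a formula like =HYPERLINK("url","title"), else None."""
    prefix = '=hyperlink("'  # 12 characters, matching the VBA offset
    if not formula.lower().startswith(prefix):
        return None
    comma = formula.find(",")
    if comma == -1:
        return None
    # Stop one character before the comma to drop the closing quote
    return formula[len(prefix):comma - 1]


print(get_hyperlink('=HYPERLINK("http://example.com","Example")'))
```

The title is not parsed here; in the VBA it comes from the cell's displayed value (`r.Value`), which has no equivalent in a plain string.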
In RapidMiner after splitting a column how do I change the type of the new column?



By : PauMarin96
Date : March 29 2020, 07:55 AM
Guess Types should work. If it doesn't, it could be that one of the values hasn't been split properly and is still a nominal. Parse Numbers will also work, but again it will not cope with nominals.
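
The diagnosis above (one leftover value that was not split properly keeps the column nominal) can be illustrated outside RapidMiner. The following Python sketch is not a RapidMiner operator; `non_numeric_values` is a hypothetical helper that lists the entries a numeric parser would reject.

```python
def non_numeric_values(column):
    """Return the entries of a column that float() cannot parse."""
    bad = []
    for value in column:
        try:
            float(value.strip())
        except ValueError:
            bad.append(value)
    return bad


# Made-up column in which one value was not split properly
column = ["45", " 60 ", "57 | 1", "35"]
print(non_numeric_values(column))
```

If the list is non-empty, those are the values to fix before converting the column to a numeric type.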
Adding a new column, splitting a column value in two, then inserting part of the value in the new column



By : user3010265
Date : March 29 2020, 07:55 AM
I have a table with a column 'Name', like the following: , Try this:
code :
UPDATE data
SET FirstName = SUBSTRING_INDEX(Name, ' ', 1),
    LastName  = SUBSTRING(SUBSTRING_INDEX(Name, ' ', 2),
                          CHAR_LENGTH(SUBSTRING_INDEX(Name, ' ', 1)) + 2); -- +2 skips the separating space
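
To see which pieces the query picks out, here is a plain-Python sketch of the same rule: the first space-separated token becomes the first name and the second token becomes the last name. `split_name` is a hypothetical helper for illustration and does not talk to a database.

```python
def split_name(name):
    """Return (first token, second token) of a space-separated name.

    The second element is "" when the name has no space.
    """
    parts = name.split(" ")
    first = parts[0]
    last = parts[1] if len(parts) > 1 else ""
    return first, last


for name in ["John Smith", "Cher", "John Ronald Smith"]:
    print(name, "->", split_name(name))
```

Note that a name with three parts yields the middle token as the last name, so names with more than two parts need a different rule.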
Splitting a dataframe column where new column values depend upon original data


By : bpieszko
Date : March 29 2020, 07:55 AM
I often work with dataframes that have columns with character string values that need to be separated. This results from a "select multiple" option in the data entry programme (which I cannot change, unfortunately). I have tried tidyr::separate, but that does not order the results properly. An example: , We can try with separate_rows and dcast:
code :
library(tidyr)
library(reshape2)
library(dplyr)
# Option 1: one row per selected value, then cast to one column per factor level
separate_rows(df, sick) %>%
  mutate(sick = factor(sick, levels = c("diarrhoea", "cough", "malaria")), sick1 = sick) %>%
  dcast(., x~sick, value.var = "sick1", drop=FALSE) %>%
  bind_cols(., df[2]) %>%
  select(x, sick, diarrhoea, cough, malaria)
#  x              sick diarrhoea cough malaria
#1 1              <NA>      <NA>  <NA>    <NA>
#2 2           malaria      <NA>  <NA> malaria
#3 3 diarrhoea malaria diarrhoea  <NA> malaria

# Option 2: splitstackshape with data.table (its dcast() and := are used here)
library(data.table)
library(splitstackshape)
dcast(cSplit(df, "sick", " ", "long")[, sick:= factor(sick, levels = 
    c("diarrhoea", "cough", "malaria"))], x~sick, value.var = "sick", drop = FALSE)[,
       sick := df$sick][]
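
The underlying idea (one fixed column per possible value, in a fixed order, regardless of the order in which values were entered) can be sketched in plain Python. This is only an illustration: `spread_multi_select` is a hypothetical helper, and the input rows are reconstructed from the output shown in the answer, since the original `df` is not given.

```python
LEVELS = ["diarrhoea", "cough", "malaria"]  # fixed column order, as in the factor levels


def spread_multi_select(rows, levels=LEVELS):
    """Turn (x, "a b c") rows into records with one column per level."""
    result = []
    for x, sick in rows:
        chosen = set(sick.split()) if sick else set()
        record = {"x": x, "sick": sick}
        for level in levels:
            # Keep the value in its own column, or None when it was not selected
            record[level] = level if level in chosen else None
        result.append(record)
    return result


rows = [(1, None), (2, "malaria"), (3, "diarrhoea malaria")]
for record in spread_multi_select(rows):
    print(record)
```

Because every record gets a key for each level, unused levels such as "cough" still appear as an empty column, which is what `drop = FALSE` achieves in the R code.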
Splitting rows in Pandas based on column values and mapping column names


By : kate
Date : March 29 2020, 07:55 AM
I have a dataframe with two columns, Person Name and Company Name. I want to create two more columns called Name and Name_Type. Name would be the concatenation of Person Name and Company Name, and Name_Type would indicate whether the name is a Person or a Company type. Some rows have empty strings, which creates four possibilities: , You can use apply, unstack and merge:
code :
import pandas as pd

# Pin the column order so it matches the labels assigned to dff.columns below
df = pd.DataFrame({"Person_name": ["Aaron", "", "Phil", "Joe"],
                   "Company_name": ["", "XYZ Inc", "ABC LLC", ""]},
                  columns=["Company_name", "Person_name"])

def logic(row):
    # Return one [name, type] pair per non-empty name in the row
    if row.Company_name and row.Person_name:
        return pd.Series([[row.Person_name, "Person_name"], [row.Company_name, "Company_name"]])
    else:
        return pd.Series([[row.Person_name, "Person_name"] if row.Person_name else [row.Company_name, "Company_name"]])

df2 = df.apply(logic, 1).unstack().apply(pd.Series).dropna().reset_index().set_index("level_1").sort_index()
dff = pd.merge(df,df2, left_index=True, right_index=True).iloc[:, [0,1,3,4]]
dff.columns = ["Company_name", "Person_name", "Name", "Name_Type"]
# Output shown in the original answer:
#     Company_name    Person_name Name    Name_Type
# 0                   Aaron       Aaron   Person_name
# 1   XYZ Inc                     XYZ Inc Company_name
# 2   ABC LLC         Phil        Phil    Person_name
# 2   ABC LLC         Phil        ABC LLC Company_name
# 3                   Joe         Joe     Person_name
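
The row-expansion rule itself is easier to follow without the apply/unstack chain. The sketch below uses plain Python dictionaries instead of pandas, purely as an illustration. `expand_names` is a hypothetical helper, and unlike the answer's `logic` function it emits no row at all when both names are empty.

```python
def expand_names(records):
    """Emit one output row per non-empty name, tagged with its source column."""
    out = []
    for index, record in enumerate(records):
        for name_type in ("Person_name", "Company_name"):
            name = record[name_type]
            if name:
                out.append({"index": index, **record,
                            "Name": name, "Name_Type": name_type})
    return out


records = [
    {"Person_name": "Aaron", "Company_name": ""},
    {"Person_name": "", "Company_name": "XYZ Inc"},
    {"Person_name": "Phil", "Company_name": "ABC LLC"},
    {"Person_name": "Joe", "Company_name": ""},
]
for row in expand_names(records):
    print(row["index"], row["Name"], row["Name_Type"])
```

Rows with both names filled in appear twice, once per name type, which matches the duplicated index 2 in the output of the answer.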