Is there an R function to select one variable from each group (group_by()) from the dataframe?

By : zhaolin zhu
Date : July 29 2020, 09:00 PM
I have a dataset where two variables interest me: trial and truth. trial numbers the questions people were asked (20 in total), and truth holds the correct answer for each question. I want to calculate the log10() of truth for each question. I came up with this: , Use unique like this:
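code :
Lines <- "trial truth
1   1   34
2   1   34
3   2   321
4   2   321
5   3   78
6   3   78"
# The header line has one field fewer than the data rows, so read.table()
# treats it as a header and uses the first column as row names
mydata <- read.table(text = Lines)

# Keep one row per distinct (trial, truth) pair
unique(mydata)
##   trial truth
## 1     1    34
## 3     2   321
## 5     3    78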
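
To get the log10() the question asks about, one option is to add it as a column once the duplicates are gone. This is only a sketch in base R using the same sample data; the names one_per_trial and log10_truth are made up for illustration.

```r
# Same sample data as in the answer
Lines <- "trial truth
1   1   34
2   1   34
3   2   321
4   2   321
5   3   78
6   3   78"
mydata <- read.table(text = Lines)

# One row per trial, then the base-10 log of truth as a new column
one_per_trial <- unique(mydata)
one_per_trial$log10_truth <- log10(one_per_trial$truth)
one_per_trial
```

This relies on truth being constant within a trial, as it is in the sample data; if it were not, unique() would keep more than one row per trial.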

How to give numbers to each group of a dataframe with dplyr::group_by?

By : Life Elixir
Date : March 29 2020, 07:55 AM
Use mutate to add a column that is simply the integer code of from converted to a factor:
code :
# With the pipe
df %>% mutate(group_no = as.integer(factor(from)))

# Equivalent call without the pipe
mutate(df, group_no = as.integer(factor(from)))

#   from dest group_no
# 1    a    b        1
# 2    a    c        1
# 3    b    d        2
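
One thing to be aware of with the factor approach: factor() sorts its levels, so the group numbers follow the sorted order of from, not the order in which the groups appear. If numbering by first appearance is wanted, match() against unique() is one way to get it. A small base R sketch on hypothetical data (the column names group_sorted and group_first_seen are invented here):

```r
# Hypothetical data; only the `from` column matters here
df <- data.frame(from = c("b", "a", "b", "c"), dest = c("x", "y", "z", "w"))

# factor() sorts its levels, so numbers follow the sorted order of `from`
df$group_sorted <- as.integer(factor(df$from))

# match() against unique() numbers groups in order of first appearance
df$group_first_seen <- match(df$from, unique(df$from))

df
```

When the data is already sorted by the grouping column, as in the answer's example, both approaches give the same numbers.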

dplyr: group_by and using the group value at summarising when creating a new variable

By : A.Aziz M
Date : March 29 2020, 07:55 AM
I am working on a dataframe where I use group_by and summarise from dplyr to get some results. However, one of the variables I want to generate in summarise needs to look up a value in a second dataframe based on the value of the grouping variable, and I cannot work out how to do that. Here's an example. , This should do it...
code :
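by.country <- ExampleData %>% group_by(country) %>% 
                      summarise(km2.country=sum(area)/1000000) %>% 
                      left_join(country.areas) %>% #note this brings in a new variable also called area
                      mutate(PercOfCountry=km2.country/area) #assumed final step: matches the output below, where PercOfCountry equals km2.country/area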

# A tibble: 2 × 4
    country km2.country      area PercOfCountry
      <chr>       <dbl>     <dbl>         <dbl>
1   Bolivia   17243.639 1090353.0    0.01581473
2 Venezuela    6142.899  916560.5    0.00670212

Select only first rows in each h2o dataframe group_by group (for merging)?

By : user2610634
Date : March 29 2020, 07:55 AM
Is there a way to select only the first row in each h2o dataframe group_by group? , here you go:
code :
import h2o

h2o.init()  # connect to (or start) a local H2O cluster before creating frames

df1 = h2o.H2OFrame({'receipt_key': ['a1', 'a2'], 'b': [1, 3], 'c': [2, 4], 'item_id': [1, 1]})
df1['receipt_key'] = df1['receipt_key'].asfactor()
df2 = h2o.H2OFrame({'receipt_key': ['a1', 'a1', 'a2'], 'e': [5, 7, 9], 'f': [6, 8, 10], 'item_id': [1, 2, 1]})
df2['receipt_key'] = df2['receipt_key'].asfactor()

# merge() joins on the columns the two frames share (receipt_key and item_id)
df3 = df1.merge(df2)
df_subset = df3[['receipt_key','b','c','e','f','item_id']]

receipt_key b   c   e   f   item_id
a1          1   2   5   6   1
a2          3   4   9   10  1

Using dplyr to group_by and conditionally mutate a dataframe by group

By : Zangson Zhang
Date : March 29 2020, 07:55 AM
@eipi10's answer works. However, I think you should use case_when instead of ifelse. It is vectorised and may be faster on larger datasets.
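
For reference, a grouped mutate with case_when could look roughly like this. It is a sketch on hypothetical data, not the code from the original question; the columns group, value and flag and the three conditions are invented for illustration.

```r
library(dplyr)

# Hypothetical data: `group` and `value` are illustrative names
df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 5, 10, 20)
)

result <- df %>%
  group_by(group) %>%
  # max() and mean() are evaluated within each group; the first matching
  # condition wins, and TRUE acts as the fallback branch
  mutate(flag = case_when(
    value == max(value) ~ "group max",
    value < mean(value) ~ "below group mean",
    TRUE ~ "other"
  )) %>%
  ungroup()

result
```

Compared with nested ifelse calls, each condition sits on its own line, which tends to be easier to read once there are more than two branches.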

convert group variable to group name after dplyr::group_by in r

By : Vikas
Date : March 29 2020, 07:55 AM
I want to split the data into separate groups and look at each one. , We can use replace to change the values
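
The original code for this answer is not shown, so the following is only a guess at the general idea: base R's replace(x, list, values) returns a copy of x with the selected positions overwritten. The vector group and the label "control" are hypothetical.

```r
# Hypothetical group codes
group <- c("g1", "g2", "g1", "g3")

# Positions where the condition is TRUE receive the new value;
# all other elements are left unchanged
renamed <- replace(group, group == "g1", "control")

renamed
```

The same call can be used inside dplyr::mutate() on a column of a grouped dataframe.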
