Create groups, breaks and conditions using If Else statements in R or Python


By : Armando Luna
Date : July 30 2020, 06:00 PM
I have a large dataset (2 million records), df, that I am trying to group and create breaks within datetimes. I would like to define a group and create these "breaks" if the following conditions apply: (This is a large dataset, and I do not know the contents of the subject, recipients and length columns.)

Using dplyr: keep only the rows where Edit is TRUE, start a new group wherever the kept row numbers stop being consecutive, then take the first and last Date of each group.
code :
library(dplyr)

df %>%
  #Add row number
  mutate(row = row_number(), 
  #Convert to Posixct
         Date = lubridate::mdy_hms(Date)) %>%
  #Keep only TRUE rows
  filter(Edit) %>%
  #Create groups
  group_by(gr = cumsum(c(TRUE, diff(row) > 1))) %>%
  #Get first, last and difference between the dates
  summarise(Start = first(Date), 
            End = last(Date), 
            Duration = difftime(End, Start, units = "secs"), 
            Group = "A", Subject = "hey", Length = 80) %>%
   select(-gr)

# A tibble: 3 x 6
#  Start               End                 Duration Group Subject Length
#  <dttm>              <dttm>              <drtn>   <chr> <chr>    <dbl>
#1 2020-01-02 01:00:01 2020-01-02 01:00:30 29 secs  A     hey         80
#2 2020-01-02 01:02:00 2020-01-02 01:02:05  5 secs  A     hey         80
#3 2020-01-02 01:03:00 2020-01-02 01:03:20 20 secs  A     hey         80
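
Since the title also asks about Python, here is a rough pandas sketch of the same idea. The sample data and the helper name summarise_runs are made up for illustration; only the Date and Edit column names come from the R code above, and the date format is an assumption.

```python
import pandas as pd

# Hypothetical sample data; only the Date and Edit column names come from the R code.
df = pd.DataFrame({
    "Date": ["01/02/2020 01:00:01", "01/02/2020 01:00:30", "01/02/2020 01:01:00",
             "01/02/2020 01:02:00", "01/02/2020 01:02:05", "01/02/2020 01:02:30",
             "01/02/2020 01:03:00", "01/02/2020 01:03:20"],
    "Edit": [True, True, False, True, True, False, True, True],
})

def summarise_runs(frame):
    """Summarise each run of consecutive Edit == True rows (hypothetical helper)."""
    out = frame.copy()
    # Assumed month/day/year layout, matching lubridate::mdy_hms in the R version
    out["Date"] = pd.to_datetime(out["Date"], format="%m/%d/%Y %H:%M:%S")
    out["row"] = range(len(out))
    kept = out[out["Edit"]]
    # A new run starts wherever the kept row numbers jump by more than 1
    run_id = (kept["row"].diff() > 1).cumsum()
    result = (kept.groupby(run_id)["Date"]
                  .agg(Start="first", End="last")
                  .reset_index(drop=True))
    result["Duration"] = (result["End"] - result["Start"]).dt.total_seconds()
    return result

summary = summarise_runs(df)
print(summary)
```

With this sample the three runs last 29, 5 and 20 seconds, the same durations as in the tibble above.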



If you have to put breaks in the if statements, why do we need to bother with conditions in the while operation?


By : keshar gurung
Date : March 29 2020, 07:55 AM
Edit (as the question's title has changed):
The condition of the while/do-while loop is what primarily decides whether the program stays in the loop or leaves it, not the break statement. By De Morgan's law, the two loop conditions below are equivalent:
code :
while (!(Carselect == 1 || Carselect == 2 || Carselect == 3));
while (Carselect != 1 && Carselect != 2 && Carselect != 3);
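
To convince yourself that the two conditions agree, here is a small Python sketch. The function names and the scripted input list are invented for the illustration; Python has no do-while, so the loop is written as a plain for loop over pre-recorded inputs.

```python
def must_repeat_negated_or(car_select):
    # Mirrors: !(Carselect == 1 || Carselect == 2 || Carselect == 3)
    return not (car_select == 1 or car_select == 2 or car_select == 3)

def must_repeat_all_different(car_select):
    # Mirrors: Carselect != 1 && Carselect != 2 && Carselect != 3
    return car_select != 1 and car_select != 2 and car_select != 3

def pick_car(scripted_inputs):
    """Return the first valid choice, or None if every input is rejected."""
    for car_select in scripted_inputs:
        # The loop condition alone decides when to stop asking
        if not must_repeat_all_different(car_select):
            return car_select
    return None

# Both forms give the same answer for every value tried
forms_agree = all(
    must_repeat_negated_or(v) == must_repeat_all_different(v)
    for v in range(-5, 10)
)
print(forms_agree)             # True
print(pick_car([7, 0, 2, 1]))  # 2
```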

How to write if statements , multiple conditions to create new column in data frame


By : J.Doe
Date : March 29 2020, 07:55 AM
You could simply use logical subsetting: first initialize the column, then assign it values based on your conditions. Here DF is my data frame, and TEMP is the parameter I am using to classify the new column Control_Temp.
code :
DF$Control_Temp <- NA
DF$Control_Temp[DF$TEMP <= 50 & DF$TEMP2 == -1] <- 'Y'
DF$Control_Temp[DF$TEMP > 50 & DF$TEMP <= 100 & DF$TEMP2 == -1] <- 'N'
DF$Control_Temp[DF$TEMP > 100 & DF$TEMP2 == -1 ] <- 'Y'
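
For comparison, a pandas sketch of the same initialize-then-assign pattern. The sample values are invented; the column names follow the R snippet above.

```python
import pandas as pd

# Hypothetical sample values; column names follow the R snippet.
df = pd.DataFrame({
    "TEMP":  [20, 50, 75, 100, 150, 30],
    "TEMP2": [-1, -1, -1, -1, -1, 0],
})

# Initialize the column, then fill it in one condition at a time
df["Control_Temp"] = None
in_scope = df["TEMP2"] == -1
df.loc[(df["TEMP"] <= 50) & in_scope, "Control_Temp"] = "Y"
df.loc[(df["TEMP"] > 50) & (df["TEMP"] <= 100) & in_scope, "Control_Temp"] = "N"
df.loc[(df["TEMP"] > 100) & in_scope, "Control_Temp"] = "Y"

print(df)
```

Rows that match none of the conditions (here the last one, where TEMP2 is 0) keep the initial missing value, just like the NA in the R version.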

Create groups/classes based on conditions within columns


By : YellowDev
Date : March 29 2020, 07:55 AM
You can do this without having to loop or iterate through your dataframe. Per Wes McKinney, you can use .apply() with a groupby object and define a function to apply to each group. If you combine this with .shift(), you can get your result without using any loops.
Terse example:
code :
import pandas as pd

# Group by Employee ID
grouped = df.groupby("Employee ID")
# Define function
def get_unique_events(group):
    # Convert to date and sort by date, like @Khris did
    group["Effective Date"] = pd.to_datetime(group["Effective Date"])
    group = group.sort_values("Effective Date")
    # Flag gaps of more than 365 days, then count the flags cumulatively
    event_series = (group["Effective Date"] - group["Effective Date"].shift(1) > pd.Timedelta('365 days')).astype(int).cumsum() + 1
    return event_series

event_df = pd.DataFrame(grouped.apply(get_unique_events).rename("Unique Event")).reset_index(level=0)
df = pd.merge(df, event_df[['Unique Event']], left_index=True, right_index=True)
df['Output'] = df['Unique Event'].apply(lambda x: "Unique Leave Event " + str(x))
df['Match'] = df['Desired Output'] == df['Output']

print(df)
  Employee ID Effective Date        Desired Output  Unique Event  \
3         100     2013-01-01  Unique Leave Event 1             1
2         100     2014-07-01  Unique Leave Event 2             2
1         100     2015-06-05  Unique Leave Event 2             2
0         100     2016-01-01  Unique Leave Event 2             2
6         200     2013-01-01  Unique Leave Event 1             1
5         200     2015-01-01  Unique Leave Event 2             2
4         200     2016-01-01  Unique Leave Event 2             2
7         300        2014-01  Unique Leave Event 1             1

                 Output Match
3  Unique Leave Event 1  True
2  Unique Leave Event 2  True
1  Unique Leave Event 2  True
0  Unique Leave Event 2  True
6  Unique Leave Event 1  True
5  Unique Leave Event 2  True
4  Unique Leave Event 2  True
7  Unique Leave Event 1  True
Full example with comments:
code :
import pandas as pd

data = {'Employee ID': ["100", "100", "100","100","200","200","200","300"],
        'Effective Date': ["2016-01-01","2015-06-05","2014-07-01","2013-01-01","2016-01-01","2015-01-01","2013-01-01","2014-01"],
        'Desired Output': ["Unique Leave Event 2","Unique Leave Event 2","Unique Leave Event 2","Unique Leave Event 1","Unique Leave Event 2","Unique Leave Event 2","Unique Leave Event 1","Unique Leave Event 1"]}
df = pd.DataFrame(data, columns=['Employee ID','Effective Date','Desired Output'])

# Group by Employee ID
grouped = df.groupby("Employee ID")

# Define a function to get the unique events
def get_unique_events(group):
    # Convert to date and sort by date, like @Khris did
    group["Effective Date"] = pd.to_datetime(group["Effective Date"])
    group = group.sort_values("Effective Date")
    # Define a series of booleans to determine whether the time between dates is over 365 days
    # Use .shift(1) to look back one row
    is_year = group["Effective Date"] - group["Effective Date"].shift(1) > pd.Timedelta('365 days')
    # Convert booleans to integers (0 for False, 1 for True)
    is_year_int = is_year.astype(int)
    # Use the cumulative sum function in pandas to get the cumulative adjustment from the first date.
    # Add one to start the first event as 1 instead of 0
    event_series = is_year_int.cumsum() + 1
    return event_series

# Run function on df and put results into a new dataframe
# Convert Employee ID back from an index to a column with .reset_index(level=0)
event_df = pd.DataFrame(grouped.apply(get_unique_events).rename("Unique Event")).reset_index(level=0)

# Merge the dataframes
df = pd.merge(df, event_df[['Unique Event']], left_index=True, right_index=True)

# Add string to match desired format
df['Output'] = df['Unique Event'].apply(lambda x: "Unique Leave Event " + str(x))

# Check to see if output matches desired output
df['Match'] = df['Desired Output'] == df['Output']

print(df)
  Employee ID Effective Date        Desired Output  Unique Event  \
3         100     2013-01-01  Unique Leave Event 1             1
2         100     2014-07-01  Unique Leave Event 2             2
1         100     2015-06-05  Unique Leave Event 2             2
0         100     2016-01-01  Unique Leave Event 2             2
6         200     2013-01-01  Unique Leave Event 1             1
5         200     2015-01-01  Unique Leave Event 2             2
4         200     2016-01-01  Unique Leave Event 2             2
7         300        2014-01  Unique Leave Event 1             1

                 Output Match
3  Unique Leave Event 1  True
2  Unique Leave Event 2  True
1  Unique Leave Event 2  True
0  Unique Leave Event 2  True
6  Unique Leave Event 1  True
5  Unique Leave Event 2  True
4  Unique Leave Event 2  True
7  Unique Leave Event 1  True
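
If the .apply() call turns out to be slow on a large frame, a variant that stays fully vectorized is to take the per-employee date difference with groupby(...).diff() and then a grouped cumulative sum. This is only a sketch of an alternative, not the method above; it reuses the sample data but leaves out employee 300, whose "2014-01" value has a different date format.

```python
import pandas as pd

# Same sample data as above, minus employee 300 (its date has a different format)
df = pd.DataFrame({
    "Employee ID": ["100", "100", "100", "100", "200", "200", "200"],
    "Effective Date": ["2016-01-01", "2015-06-05", "2014-07-01", "2013-01-01",
                       "2016-01-01", "2015-01-01", "2013-01-01"],
})
df["Effective Date"] = pd.to_datetime(df["Effective Date"])
df = df.sort_values(["Employee ID", "Effective Date"])

# Gap to the previous date of the same employee (NaT on each employee's first row)
gap = df.groupby("Employee ID")["Effective Date"].diff()

# A gap of more than 365 days starts a new event; NaT compares as False
new_event = gap > pd.Timedelta(days=365)

# Running count of new events per employee, starting at 1
df["Unique Event"] = new_event.astype(int).groupby(df["Employee ID"]).cumsum() + 1
df["Output"] = "Unique Leave Event " + df["Unique Event"].astype(str)

print(df)
```

For these rows the event numbers come out as 1, 2, 2, 2 for employee 100 and 1, 2, 2 for employee 200, matching the output shown above.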

How to create groups based on conditions


By : user7253776
Date : March 29 2020, 07:55 AM
A variation on @ycw's answer, using data.table:
code :
library(data.table)
setDT(df)

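# z is TRUE where out is 0 on this row or on the previous row.
# rleid(z) numbers the runs of z; NA^(!z) is 1 where z is TRUE and NA where it is FALSE,
# so only the TRUE runs keep their run number.
df[, g := rleid( z <- out==0 | shift(out==0) )*NA^(!z) ]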

    group size      int      out  g
 1:     A 1000 5.585529 0.000000  1
 2:     A 1000 5.709466 0.000000  1
 3:     A 1000 4.890697 0.000000  1
 4:     A 1000 0.000000 0.000000  1
 5:     A 1000 0.000000 0.000000  1
 6:     A    0 0.000000 4.080678  1
 7:     A    0 0.000000 4.883752 NA
 8:     A    0 0.000000 6.817312 NA
 9:     A 2000 4.546503 0.000000  3
10:     A 2000 5.605887 0.000000  3
11:     A 2000 3.182044 0.000000  3
12:     A 2000 0.000000 0.000000  3
13:     A 2000 0.000000 0.000000  3
14:     A 2000 0.000000 0.000000  3
15:     A 2000 0.000000 0.000000  3
16:     A    0 0.000000 5.370628  3
17:     A    0 0.000000 5.520216 NA
18:     A    0 0.000000 4.249468 NA
19:     A 5000 5.630099 0.000000  5
20:     A 5000 4.723816 0.000000  5
21:     A 5000 4.715840 0.000000  5
22:     A 5000 0.000000 0.000000  5
23:     A 5000 0.000000 0.000000  5
24:     A    0 0.000000 5.816900  5
25:     A    0 0.000000 4.113642 NA
26:     A    0 0.000000 4.668422 NA
    group size      int      out  g
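# Renumber the non-NA run ids as 1, 2, 3, ...
df[, g := match(g, unique(na.omit(g)))]
# w holds the row number of the first row of each group
w = df[.(unique(na.omit(g))), on=.(g), which=TRUE, mult="first"]
# g2 counts group starts cumulatively, so the NA rows inherit the preceding group
df[, g2 := cumsum(.I %in% w)]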
    group size      int      out  g g2
 1:     A 1000 5.585529 0.000000  1  1
 2:     A 1000 5.709466 0.000000  1  1
 3:     A 1000 4.890697 0.000000  1  1
 4:     A 1000 0.000000 0.000000  1  1
 5:     A 1000 0.000000 0.000000  1  1
 6:     A    0 0.000000 4.080678  1  1
 7:     A    0 0.000000 4.883752 NA  1
 8:     A    0 0.000000 6.817312 NA  1
 9:     A 2000 4.546503 0.000000  2  2
10:     A 2000 5.605887 0.000000  2  2
11:     A 2000 3.182044 0.000000  2  2
12:     A 2000 0.000000 0.000000  2  2
13:     A 2000 0.000000 0.000000  2  2
14:     A 2000 0.000000 0.000000  2  2
15:     A 2000 0.000000 0.000000  2  2
16:     A    0 0.000000 5.370628  2  2
17:     A    0 0.000000 5.520216 NA  2
18:     A    0 0.000000 4.249468 NA  2
19:     A 5000 5.630099 0.000000  3  3
20:     A 5000 4.723816 0.000000  3  3
21:     A 5000 4.715840 0.000000  3  3
22:     A 5000 0.000000 0.000000  3  3
23:     A 5000 0.000000 0.000000  3  3
24:     A    0 0.000000 5.816900  3  3
25:     A    0 0.000000 4.113642 NA  3
26:     A    0 0.000000 4.668422 NA  3
    group size      int      out  g g2
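
For readers coming from Python, the run-length id trick can be sketched in pandas roughly as follows. The out values are invented, and the last step uses a forward fill instead of the cumsum over first rows shown above, so treat it as an approximation of the idea and not a line-by-line port (a leading NA run would stay NA here instead of becoming 0).

```python
import pandas as pd

# Hypothetical "out" column: zeros mark the rows that belong to a group
out = pd.Series([0, 0, 0, 4.1, 4.9, 6.8, 0, 0, 5.4, 5.5, 4.2])

is_zero = out.eq(0)
# TRUE where out is 0 on this row or on the previous row
z = is_zero | is_zero.shift(1, fill_value=False)

# Equivalent of rleid(z): the id goes up by one every time z changes
run_id = z.astype(int).diff().ne(0).cumsum()

# Keep the id only for the TRUE runs
g = run_id.where(z)

# Renumber the surviving ids as 1, 2, 3, ... (factorize gives -1 for missing values)
codes = pd.factorize(g)[0]
g = pd.Series(codes + 1, index=out.index).where(codes >= 0)

# Let the NA rows inherit the preceding group (forward fill, a swapped-in technique)
g2 = g.ffill()

print(pd.DataFrame({"out": out, "g": g, "g2": g2}))
```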

How do I create a JOIN between 2 SELECT Statements with GROUPs?


By : Sathya Moorthy
Date : March 29 2020, 07:55 AM
I think you can do what you want by joining them like this (updated):
code :
strSQL = "INSERT INTO Weekly (Symbol, WeekEnd, WeeklyHi, WeeklyLow, AdjClose) " &
    "Select A.Symbol, A.WeekEnd, A.WeeklyHi, A.WeeklyLow, ISNULL(B.AdjClose, 0) as AdjClose " &
    "FROM " &
        "(SELECT Symbol, WeekEnd, MAX(DailyHi) as WeeklyHi, MIN(DailyLow) as WeeklyLow " &
        "FROM TempDaily " &
        "GROUP BY Symbol, WeekEnd ) A " &
    "LEFT JOIN " &
        "(Select wdata.WeekEnd, MAX(wdata.AdjClose) as AdjClose " &
        "FROM " &
            "(Select CloseDate, WeekEnd, " &
            "FIRST_VALUE(AdjClose) OVER (PARTITION BY WeekEnd ORDER BY CloseDate " &
            "DESC ROWS UNBOUNDED PRECEDING) As AdjClose " &
            "FROM TempDaily) wdata " &
        "GROUP BY wdata.WeekEnd) B ON A.WeekEnd = B.WeekEnd "
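
To see what the join of the two grouped selects produces, here is the same shape sketched in pandas with invented rows. The column names follow the query above; the data, and the frame names a, b and weekly, are hypothetical. Like the query, it joins on WeekEnd only, so with more than one symbol in TempDaily you would probably want Symbol in the second grouping and in the join key as well.

```python
import pandas as pd

# Hypothetical daily rows; column names follow the query above
temp_daily = pd.DataFrame({
    "Symbol":    ["ABC", "ABC", "ABC", "ABC"],
    "CloseDate": pd.to_datetime(["2020-03-02", "2020-03-06", "2020-03-09", "2020-03-13"]),
    "WeekEnd":   pd.to_datetime(["2020-03-06", "2020-03-06", "2020-03-13", "2020-03-13"]),
    "DailyHi":   [10.0, 12.0, 11.0, 9.5],
    "DailyLow":  [8.0, 9.0, 7.5, 8.5],
    "AdjClose":  [9.0, 11.5, 10.0, 9.0],
})

# Subquery A: weekly high and low per symbol and week
a = (temp_daily.groupby(["Symbol", "WeekEnd"], as_index=False)
               .agg(WeeklyHi=("DailyHi", "max"), WeeklyLow=("DailyLow", "min")))

# Subquery B: adjusted close of the latest CloseDate in each week
b = (temp_daily.sort_values("CloseDate")
               .groupby("WeekEnd", as_index=False)
               .agg(AdjClose=("AdjClose", "last")))

# LEFT JOIN on WeekEnd, with missing closes replaced by 0 like ISNULL(B.AdjClose, 0)
weekly = a.merge(b, on="WeekEnd", how="left")
weekly["AdjClose"] = weekly["AdjClose"].fillna(0)

print(weekly)
```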