Pandas - return x columns based on difference in dates


By : jsgbrl
Date : October 14 2020, 02:24 PM
Use NumPy broadcasting to subtract the values, convert the timedeltas to days, and build the DataFrame with the constructor:
code :
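import pandas as pd

df = pd.DataFrame({'A': ['3/31/2018', '6/22/2018', '7/5/2018']})
df['A'] = pd.to_datetime(df.A)

# 36 month-end dates; recent pandas versions spell this frequency 'ME'
rng = pd.date_range('1/31/2019', periods=36, freq='M')

# 36 end dates minus a (3, 1) column of start dates broadcasts to a (3, 36) array
df = pd.DataFrame((rng.values - df['A'].values[:, None])
                  .astype("timedelta64[D]").astype(int), columns=rng)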
print (df)
   2019-01-31  2019-02-28  2019-03-31  2019-04-30  2019-05-31  2019-06-30  \
0         306         334         365         395         426         456   
1         223         251         282         312         343         373   
2         210         238         269         299         330         360   

   2019-07-31  2019-08-31  2019-09-30  2019-10-31     ...      2021-03-31  \
0         487         518         548         579     ...            1096   
1         404         435         465         496     ...            1013   
2         391         422         452         483     ...            1000   

   2021-04-30  2021-05-31  2021-06-30  2021-07-31  2021-08-31  2021-09-30  \
0        1126        1157        1187        1218        1249        1279   
1        1043        1074        1104        1135        1166        1196   
2        1030        1061        1091        1122        1153        1183   

   2021-10-31  2021-11-30  2021-12-31  
0        1310        1340        1371  
1        1227        1257        1288  
2        1214        1244        1275  

[3 rows x 36 columns]
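
As a smaller illustration of the same broadcasting idea, here is a minimal sketch. The three end dates are picked by hand (an assumption, to keep the result short), and floor division by a one-day timedelta is used here in place of the double astype from the answer above.

```python
import numpy as np
import pandas as pd

# Start dates from the answer above, written in ISO format
starts = pd.to_datetime(pd.Series(["2018-03-31", "2018-06-22", "2018-07-05"]))
# Hand-picked end dates (assumption: only the first three month ends)
ends = pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31"])

# (1, 3) end dates minus (3, 1) start dates broadcast to a (3, 3) timedelta array
deltas = ends.values[None, :] - starts.values[:, None]

# Floor-divide by one day to get whole-day counts as integers
days = deltas // np.timedelta64(1, "D")

out = pd.DataFrame(days, columns=ends)
print(out)
```

Each row of out belongs to one start date and each column to one end date, so the first row should agree with the first three columns of the output above.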



Python Pandas - replace values with NaN in multiple columns based on multiple dates?


By : paul sophola
Date : March 29 2020, 07:55 AM
Maybe not that fast either, but a cleaner approach based on pandas:
code :
df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))
In [1]: import pandas as pd

In [2]: from io import StringIO  # Python 3; the Python 2 module was named StringIO

In [3]: s = """,ACTION,ACTION
   ...: ,111,222
   ...: ,1/7/2010,1/5/2010
   ...: DATE,,
   ...: 1/1/2010,    10,                          5
   ...: 1/2/2010,    10,                          5
   ...: 1/3/2010,    10,                          5
   ...: 1/4/2010,    15,                          5
   ...: 1/5/2010,    10,                          5
   ...: 1/6/2010,    10,                          5
   ...: 1/7/2010,    10,                          5
   ...: 1/8/2010,    10,                          5"""

In [4]: df = pd.read_csv(StringIO(s), header=[0,1,2], index_col=0, parse_dates=True)

In [5]: df.where(df.apply(lambda x: x.index < pd.Timestamp(x.name[2])))
Out[5]:
              ACTION
                 111       222
            1/7/2010  1/5/2010
DATE
2010-01-01        10         5
2010-01-02        10         5
2010-01-03        10         5
2010-01-04        15         5
2010-01-05        10       NaN
2010-01-06        10       NaN
2010-01-07       NaN       NaN
2010-01-08       NaN       NaN
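
The same masking can also be sketched without apply, by comparing the index against the cutoff dates in one broadcast step. This is an alternative to the lambda above, not the answer's own method; the frame is rebuilt by hand from the sample data, and the names cutoffs, keep and masked are illustrative.

```python
import pandas as pd

# Rebuild the sample frame: the third column level holds each column's cutoff date
index = pd.date_range("2010-01-01", periods=8, freq="D", name="DATE")
columns = pd.MultiIndex.from_tuples(
    [("ACTION", "111", "1/7/2010"), ("ACTION", "222", "1/5/2010")]
)
values = [[10, 5], [10, 5], [10, 5], [15, 5], [10, 5], [10, 5], [10, 5], [10, 5]]
df = pd.DataFrame(values, index=index, columns=columns)

# Parse the cutoff dates stored in the third column level (month/day/year)
cutoffs = pd.to_datetime(df.columns.get_level_values(2), format="%m/%d/%Y")

# (8, 1) row dates compared with (1, 2) cutoffs give an (8, 2) boolean mask
keep = df.index.values[:, None] < cutoffs.values[None, :]

# Values on or after a column's cutoff date become NaN
masked = df.where(keep)
print(masked)
```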

Pandas return columns for one dataframe based on multiple columns from another


By : Eirin
Date : March 29 2020, 07:55 AM
A better example would help, but if I am following correctly, the following works:
code :
df3 = pd.merge(df2, df, on='HIGH', how='inner', suffixes=['', 'r'])
df4 = pd.merge(df2, df, on='MID', how='inner', suffixes=['', 'r'])
df5 = pd.merge(df2, df, on='LOW', how='inner', suffixes=['', 'r'])
df6 = pd.concat([df3, df4, df5]).drop(['HIGHr', 'MIDr', 'LOWr'], axis=1)

df6

    HIGH    LOW     MID     VALUE1  VALUE2
0   abC     Abc-4   aBc*2   1       bb
1   bcD$22  Bcd-1   bCd     2       dd
2   cdE#2   CdE     cDe&3   3       ee
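
Since the question's input frames are not shown here, the following is a minimal sketch with made-up data: df2 is assumed to be the lookup table and df the frame whose rows each match on one of the three key columns. The three merges are written as a loop, which is a restructuring of the code above rather than a different technique.

```python
import pandas as pd

# Hypothetical lookup table (values borrowed from the output above)
df2 = pd.DataFrame({
    "HIGH": ["abC", "bcD$22", "cdE#2"],
    "MID": ["aBc*2", "bCd", "cDe&3"],
    "LOW": ["Abc-4", "Bcd-1", "CdE"],
    "VALUE1": [1, 2, 3],
    "VALUE2": ["bb", "dd", "ee"],
})
# Hypothetical query frame: each row matches df2 on a different key column
df = pd.DataFrame({
    "HIGH": ["abC", "x", "y"],
    "MID": ["p", "bCd", "q"],
    "LOW": ["r", "s", "CdE"],
})

keys = ["HIGH", "MID", "LOW"]
# One inner merge per key; non-key columns coming from df get an "r" suffix
parts = [pd.merge(df2, df, on=key, how="inner", suffixes=("", "r")) for key in keys]

# Stack the matches and drop the suffixed leftovers from df
result = pd.concat(parts, ignore_index=True).drop(columns=[k + "r" for k in keys])
print(result)
```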

Pandas Sum values from different columns based on dates


By : Nereo
Date : March 29 2020, 07:55 AM
You can join on an auxiliary column that manages the conversion of your inputs:
code :
import pandas as pd

# Auxiliary column: the timestamp one month before each period
df['prev'] = df.Period.apply(lambda x: x.to_timestamp()) - pd.DateOffset(months=1)
# Self-join so each row can be paired with the previous period's row
aux = df.merge(df, how='left', left_on='prev', right_on='Period')
df['sum'] = aux.Value_x + aux.Value_y
df = df.drop('prev', axis=1)
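
The question's data is not shown here, so the following is a minimal sketch under assumptions: a monthly Period column with period dtype, a numeric Value column, and a default integer index. It converts both sides of the join to timestamps so the keys compare cleanly; the helper names ts, prev and Value_prev are illustrative.

```python
import pandas as pd

# Hypothetical input: four consecutive monthly periods
df = pd.DataFrame({
    "Period": pd.period_range("2020-01", periods=4, freq="M"),
    "Value": [10, 20, 30, 40],
})

# Auxiliary columns: each period's start timestamp and the one a month earlier
df["ts"] = df["Period"].dt.to_timestamp()
df["prev"] = df["ts"] - pd.DateOffset(months=1)

# Right side of the self-join: every row's value, keyed by its own timestamp
prev_values = df[["ts", "Value"]].rename(columns={"ts": "prev", "Value": "Value_prev"})

# A left join keeps every row; the first period has no predecessor and gets NaN
aux = df.merge(prev_values, how="left", on="prev")
df["sum"] = aux["Value"] + aux["Value_prev"]

df = df.drop(columns=["ts", "prev"])
print(df)
```

The assignment to df["sum"] relies on the left merge returning rows in the same order as df with a fresh 0..n-1 index, which is why the default index is assumed.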

Calculating difference between dates based on grouping one or more columns


By : user3180683
Date : March 29 2020, 07:55 AM
We can use diff to subtract consecutive Date values and keep the groups where at least one difference is less than or equal to 5 days.
code :
library(dplyr)

df %>%
  group_by(id, Buyer) %>%
  filter(any(diff(Date) <= 5))

#      id Date       Buyer  
#   <dbl> <date>     <chr>  
# 1     9 2018-11-29 Jenny  
# 2     9 2018-11-29 Jenny  
# 3     9 2018-11-29 Jenny  
# 4     5 2018-05-25 Chunfei
# 5     5 2019-02-13 Chunfei
# 6     5 2019-02-16 Chunfei
# 7     5 2019-02-16 Chunfei
# 8     5 2019-02-23 Chunfei
# 9     5 2019-02-25 Chunfei
#10     8 2019-02-28 Chunfei
#11     8 2019-02-28 Chunfei

# Alternatively, keep only the rows within 5 days of the previous row, plus that previous row
df %>%
  group_by(id, Buyer) %>%
  mutate(diff = c(NA, diff(Date))) %>%
  slice({i1 <- which(diff <= 5); unique(c(i1, i1-1))}) %>%
  select(-diff)

#      id Date       Buyer  
#   <dbl> <date>     <chr>  
# 1     5 2019-02-16 Chunfei
# 2     5 2019-02-16 Chunfei
# 3     5 2019-02-25 Chunfei
# 4     5 2019-02-13 Chunfei
# 5     5 2019-02-23 Chunfei
# 6     8 2019-02-28 Chunfei
# 7     8 2019-02-28 Chunfei
# 8     9 2018-11-29 Jenny  
# 9     9 2018-11-29 Jenny  
#10     9 2018-11-29 Jenny  

# data
df <- structure(list(id = c(9, 9, 9, 4, 4, 4, 5, 5, 5, 5, 5, 5, 8, 
8), Date = structure(c(17864, 17864, 17864, 17681, 17716, 17760, 
17676, 17940, 17943, 17943, 17950, 17952, 17955, 17955), class = "Date"), 
Buyer = c("Jenny", "Jenny", "Jenny", "Chang", "Chang", "Chang", 
"Chunfei", "Chunfei", "Chunfei", "Chunfei", "Chunfei", "Chunfei", 
"Chunfei", "Chunfei")), row.names = c(NA, -14L), class = "data.frame")
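
For readers working in pandas rather than dplyr, roughly the same group filter as the first snippet can be sketched as follows. This is a translation under assumptions, not part of the original answer: the dates are rebuilt from the day numbers in the data above, and has_close_pair is an illustrative helper name.

```python
import pandas as pd

# Same data as the R structure above; dates are days since 1970-01-01
day_numbers = [17864, 17864, 17864, 17681, 17716, 17760,
               17676, 17940, 17943, 17943, 17950, 17952, 17955, 17955]
df = pd.DataFrame({
    "id": [9, 9, 9, 4, 4, 4, 5, 5, 5, 5, 5, 5, 8, 8],
    "Date": pd.to_datetime(day_numbers, unit="D"),
    "Buyer": ["Jenny"] * 3 + ["Chang"] * 3 + ["Chunfei"] * 8,
})

def has_close_pair(group, max_days=5):
    """True if any two consecutive dates in the group are at most max_days apart."""
    gaps = group["Date"].diff()  # the first gap is NaT and compares as False
    return bool((gaps <= pd.Timedelta(days=max_days)).any())

# Keep every row of each (id, Buyer) group that has at least one close pair
close = df.groupby(["id", "Buyer"]).filter(has_close_pair)
print(close)
```

Unlike the dplyr output, groupby(...).filter keeps the rows in their original order rather than sorted by group.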

Difference between dates between two csv files based on a unique number in Pandas


By : Tell Me How
Date : March 29 2020, 07:55 AM
Use merge_asof, which works only when both frames are sorted by their datetime columns, then get the difference with Series.sub, and finally, if necessary, replace missing values in Last_Visit_Date with Series.fillna:
code :
csv_1['App_Date'] = pd.to_datetime(csv_1['App_Date'])
csv_2['Last_Visit_Date'] = pd.to_datetime(csv_2['Last_Visit_Date'])
csv_1 = csv_1.sort_values('App_Date')
csv_2 = csv_2.sort_values('Last_Visit_Date')

df = pd.merge_asof(csv_1, 
                   csv_2, 
                   left_on='App_Date', 
                   right_on='Last_Visit_Date', 
                   by=['Category','Mobile_Number'])
df['Difference'] = df['App_Date'].sub(df['Last_Visit_Date']).dt.days
df['Last_Visit_Date'] = df['Last_Visit_Date'].fillna(df['App_Date'])
print (df)
  Category  Mobile_Number   App_Date Last_Visit_Date  Difference
0        A      503487338 2018-10-13      2018-10-13         NaN
1        B      503477334 2018-10-16      2018-10-09         7.0
2        C      501022162 2018-10-16      2018-10-11         5.0
3        E      503427339 2018-10-17      2018-10-17         NaN
4        A      503477334 2018-10-18      2018-10-08        10.0
5        C      506012887 2018-10-21      2018-10-14         7.0

# Variant: pass allow_exact_matches=False so a visit on the same date is not matched
csv_1['App_Date'] = pd.to_datetime(csv_1['App_Date'])
csv_2['Last_Visit_Date'] = pd.to_datetime(csv_2['Last_Visit_Date'])
csv_1 = csv_1.sort_values('App_Date')
csv_2 = csv_2.sort_values('Last_Visit_Date')

df = pd.merge_asof(csv_1, 
                   csv_2, 
                   left_on='App_Date', 
                   right_on='Last_Visit_Date', 
                   by=['Category','Mobile_Number'],
                   allow_exact_matches=False)
print (df)
  Category  Mobile_Number   App_Date Last_Visit_Date
0        A      503487338 2018-10-13      2018-10-11
1        B      503477334 2018-10-16      2018-10-09
2        C      501022162 2018-10-16      2018-10-14
3        E      503427339 2018-10-17             NaT
4        A      503477334 2018-10-18      2018-10-08
5        C      506012887 2018-10-21      2018-10-15

df['Difference'] = df['App_Date'].sub(df['Last_Visit_Date']).dt.days
df['Last_Visit_Date'] = df['Last_Visit_Date'].fillna(df['App_Date'])
print (df)
  Category  Mobile_Number   App_Date Last_Visit_Date  Difference
0        A      503487338 2018-10-13      2018-10-11         2.0
1        B      503477334 2018-10-16      2018-10-09         7.0
2        C      501022162 2018-10-16      2018-10-14         2.0
3        E      503427339 2018-10-17      2018-10-17         NaN
4        A      503477334 2018-10-18      2018-10-08        10.0
5        C      506012887 2018-10-21      2018-10-15         6.0
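
The CSV files themselves are not shown, so here is a minimal, self-contained sketch of the two variants on made-up data. The function name days_since_last_visit and the two tiny frames are assumptions for illustration; only the column names follow the answer above.

```python
import pandas as pd

def days_since_last_visit(apps, visits, allow_exact=True):
    """Match each application to the latest earlier visit of the same customer."""
    apps = apps.sort_values("App_Date")                # merge_asof needs sorted keys
    visits = visits.sort_values("Last_Visit_Date")
    merged = pd.merge_asof(
        apps, visits,
        left_on="App_Date", right_on="Last_Visit_Date",
        by=["Category", "Mobile_Number"],
        allow_exact_matches=allow_exact,
    )
    # Whole days between the application and the matched visit (NaN if no match)
    merged["Difference"] = merged["App_Date"].sub(merged["Last_Visit_Date"]).dt.days
    return merged

# Hypothetical data: customer A visited on the application date itself
apps = pd.DataFrame({
    "Category": ["A", "B"],
    "Mobile_Number": [1, 2],
    "App_Date": pd.to_datetime(["2018-10-13", "2018-10-16"]),
})
visits = pd.DataFrame({
    "Category": ["A", "B", "B"],
    "Mobile_Number": [1, 2, 2],
    "Last_Visit_Date": pd.to_datetime(["2018-10-13", "2018-10-09", "2018-10-20"]),
})

with_exact = days_since_last_visit(apps, visits)
without_exact = days_since_last_visit(apps, visits, allow_exact=False)
print(with_exact)
print(without_exact)
```

With exact matches allowed, customer A pairs with the same-day visit and gets a difference of 0; with allow_exact=False that visit is skipped and the difference is missing. The later visit for customer B is never used because the default search direction only looks backward.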