Make a new dataframe by groupby and apply own function


By : Andrii M
Date : August 01 2020, 04:00 PM
Here is one way of doing it; a custom function is not needed.
code :
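import pandas as pd
from datetime import timedelta

a = '2020-01-04'
# The date two days before `a`
b = pd.to_datetime(a, format = '%Y-%m-%d')-timedelta(days=2)
# Total budget per id
consolidated = df.groupby('id')['budget'].sum().reset_index(name='total_budget')
# Budget per id on the date two days before `a`
days_ago = df.loc[pd.to_datetime(df['date'], format = '%Y-%m-%d')== b].groupby('id')['budget'].sum().reset_index(name='budget_2_days_ago')
# A left merge keeps ids that have no row on that date (their value becomes NaN)
consolidated.merge(days_ago, on='id', how='left')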
    id  total_budget    budget_2_days_ago
0   1   100             NaN
1   2   600             200.0
2   3   400             200.0
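
The answer above does not show its input, so here is a minimal self-contained sketch of the same groupby-then-merge idea. The sample rows are made up for illustration and are only an assumption about what the asker's dataframe might look like; the column names id, date and budget are taken from the code above.

```python
import pandas as pd

# Hypothetical sample data (not from the original question)
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 3],
    "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-02", "2020-01-04"],
    "budget": [100, 200, 400, 200, 200],
})

# The date two days before the reference date
target = pd.Timestamp("2020-01-04") - pd.Timedelta(days=2)
dates = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Total budget per id
total = df.groupby("id")["budget"].sum().reset_index(name="total_budget")

# Budget per id restricted to the target date
two_days_ago = (
    df.loc[dates == target]
      .groupby("id")["budget"].sum()
      .reset_index(name="budget_2_days_ago")
)

# Left merge so that ids without a row on the target date are kept
result = total.merge(two_days_ago, on="id", how="left")
print(result)
```

With this sample, id 1 has no row on the target date, so its budget_2_days_ago is NaN after the left merge.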



Apply function to Dataframe GroupBy Object and return dataframe


By : user3173653
Date : March 29 2020, 07:55 AM
It is better not to try to modify the DataFrame from inside the grouping operation. Instead, use the grouping operation to create the new IPs, then use map to map the old IPs to the new ones, and then (if you want) assign the new ones back into the DataFrame:
code :
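import numpy as np
import pandas

def randomIP():
    # Four random octets (0-255) joined with dots; randint's upper bound is exclusive
    return ".".join(str(np.random.randint(0, 256)) for _ in range(4))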

>>> d = pandas.DataFrame({'IP': ['1.2.3.4', '5.6.7.8', '1.2.3.4', '5.6.7.8', '9.10.11.12', '13.14.15.16'], 'Other': ['blah']*6})
>>> d
            IP Other
0      1.2.3.4  blah
1      5.6.7.8  blah
2      1.2.3.4  blah
3      5.6.7.8  blah
4   9.10.11.12  blah
5  13.14.15.16  blah
>>> d.groupby('IP').apply(lambda g: randomIP())
IP
1.2.3.4           4.183.193.46
13.14.15.16    186.124.189.188
5.6.7.8          152.24.105.42
9.10.11.12      188.140.91.209
>>> d.IP.map(d.groupby('IP').apply(lambda g: randomIP()))
0    47.227.125.190
1      164.86.98.48
2    47.227.125.190
3      164.86.98.48
4     44.150.90.127
5     71.111.59.115
Name: IP, dtype: object
>>> d['IP'] = d.IP.map(d.groupby('IP').apply(lambda g: randomIP()))
>>> d
               IP Other
0  238.227.204.61  blah
1   13.201.160.89  blah
2  238.227.204.61  blah
3   13.201.160.89  blah
4    69.33.243.79  blah
5  164.120.13.218  blah
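
As a rough alternative sketch of the same "build a mapping, then map" idea, the lookup can also be built as a plain dict over the unique IPs instead of going through groupby and apply. This is a swapped-in variant, not the answer's exact method; the names random_ip, mapping and new_IP, and the fixed seed, are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so that reruns give the same values

def random_ip():
    # Four random octets in 0..255 joined with dots
    return ".".join(str(rng.integers(0, 256)) for _ in range(4))

d = pd.DataFrame({
    "IP": ["1.2.3.4", "5.6.7.8", "1.2.3.4", "5.6.7.8", "9.10.11.12", "13.14.15.16"],
    "Other": ["blah"] * 6,
})

# One new IP per distinct old IP
mapping = {old: random_ip() for old in d["IP"].unique()}

# Rows that shared an old IP share the same new IP
d["new_IP"] = d["IP"].map(mapping)
print(d)
```

Keeping the mapping in a dict also makes it easy to reuse or save the old-to-new correspondence later.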

Apply function to 2nd column in pandas dataframe groupby


By : Kushal Byatnal
Date : March 29 2020, 07:55 AM
Groupby can accept any combination of labels and Series/arrays (as long as the array has the same length as your DataFrame), so you can map the function over your column and pass the result to groupby, like this:
code :
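# Option 1: pass the mapped Series directly as a grouping key
df.groupby(['name', df[1].map(foo)])
# Option 2: store the condition in a helper column and group by its label
df['>0'] = df[1] > 0
group_sum = df.groupby(['name', '>0'])['tickets'].sum()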
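
To make the two variants above concrete, here is a small sketch on made-up data. The sample values and the body of foo are assumptions for illustration (the original does not define foo); the column names name, 1 and tickets follow the snippet above.

```python
import pandas as pd

# Hypothetical sample data; the second column is labelled with the integer 1
df = pd.DataFrame({
    "name": ["a", "a", "b", "b"],
    1: [-1, 2, 3, -4],
    "tickets": [10, 20, 30, 40],
})

def foo(x):
    # Hypothetical stand-in for the asker's function
    return x > 0

# Variant 1: mix a column label with a mapped Series as grouping keys
by_series = df.groupby(["name", df[1].map(foo)])["tickets"].sum()

# Variant 2: materialise the condition as a helper column first
df[">0"] = df[1] > 0
by_column = df.groupby(["name", ">0"])["tickets"].sum()

print(by_series)
print(by_column)
```

Both variants should give the same sums; the second one simply leaves the helper column in the dataframe.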

Pandas Dataframe groupby: double groupby & apply function


By : Nexus
Date : March 29 2020, 07:55 AM
Here's one way: take the mean of c within each (a, b) pair, then take the mean of those means within each a.
code :
In [101]: (df.groupby(['a', 'b'], as_index=False)['c'].mean()
             .groupby('a', as_index=False)['c'].mean()
             .rename(columns={'c': 'new col'}))
Out[101]:
   a  new col
0  1       30
1  2      100
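
The input for the session above is not shown, so the following is only a sketch with made-up numbers. It illustrates that the mean of per-(a, b) means can differ from a plain per-a mean when the subgroups have different sizes.

```python
import pandas as pd

# Hypothetical sample data (not from the original question)
df = pd.DataFrame({
    "a": [1, 1, 1, 2, 2],
    "b": ["x", "x", "y", "x", "y"],
    "c": [10, 30, 40, 50, 150],
})

# Step 1: mean of c per (a, b); step 2: mean of those means per a
result = (
    df.groupby(["a", "b"], as_index=False)["c"].mean()
      .groupby("a", as_index=False)["c"].mean()
      .rename(columns={"c": "new col"})
)

# For comparison: a single groupby weights every row equally
plain = df.groupby("a")["c"].mean()

print(result)
print(plain)
```

For a = 1 the two-step result is (20 + 40) / 2 = 30, while the plain mean is (10 + 30 + 40) / 3, so the two approaches answer different questions.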

pandas DataFrame.groupby and apply custom function


By : Matthias Rauscher
Date : March 29 2020, 07:55 AM
First use reset_index to get unique indices, then use groupby with idxmax to get the index of the maximum value in each group and select those rows with loc; finally, set_index restores the MultiIndex:
code :
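# Option 1: idxmax gives the row label of the largest Pos in each group
df = df.reset_index()
df = (df.loc[df.groupby(['Type','StrikePrice'])['Pos'].idxmax()]
        .set_index(['Type','StrikePrice']))
# Option 2 (alternative, applied to the original df): sort by Pos and keep the last row of each group
df = (df.reset_index()
       .sort_values(['Type','StrikePrice', 'Pos'])
       .drop_duplicates(['Type','StrikePrice'], keep='last')
       .set_index(['Type','StrikePrice']))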
print (df)

                  Pos  AskPrice
Type StrikePrice               
C    1500.0        12     281.7
P    1400.0        31    1250.2
# Alternative: a custom function that keeps the rows with the largest Pos in each group
def f(x):
    return x[x['Pos'] == x['Pos'].max()]

df = df.groupby(level=[0,1], group_keys=False).apply(f)
print (df)
                  Pos  AskPrice
Type StrikePrice               
C    1500.0        12     281.7
P    1400.0        31    1250.2
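
Since the original input is not shown, here is a minimal runnable sketch of the idxmax approach on made-up rows. The sample values are an assumption; the column names Type, StrikePrice, Pos and AskPrice follow the output above.

```python
import pandas as pd

# Hypothetical sample data with a (Type, StrikePrice) MultiIndex
df = pd.DataFrame({
    "Type": ["C", "C", "P", "P"],
    "StrikePrice": [1500.0, 1500.0, 1400.0, 1400.0],
    "Pos": [11, 12, 30, 31],
    "AskPrice": [280.0, 281.7, 1249.0, 1250.2],
}).set_index(["Type", "StrikePrice"])

# reset_index gives every row a unique integer label that idxmax can return
flat = df.reset_index()

# Row label of the largest Pos within each (Type, StrikePrice) group
labels = flat.groupby(["Type", "StrikePrice"])["Pos"].idxmax()

# Select those rows and restore the MultiIndex
best = flat.loc[labels].set_index(["Type", "StrikePrice"])
print(best)
```

Note that idxmax returns a single label per group, so if several rows tie for the maximum only the first of them is kept.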

How to apply a function that returns a dataframe to groupby objects


By : user3632848
Date : March 29 2020, 07:55 AM
Question: The function I want to apply to each group is the ATR function; it takes in three ndarrays and returns a dataframe object. I also want to put the result in a new column of the original dataframe. Answer: You can try creating the new column inside the function:
code :
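# `ta` is assumed here to be TA-Lib, imported as: import talib as ta
def get_atr(x):
    # Compute the indicator from this group's columns and store it in a new column
    x['atr'] = ta.ATR(x['col1'].values, x['col2'].values, x['col3'].values)
    return x

df = df.groupby('group').apply(get_atr)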
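
The following is a sketch of the same per-group, new-column pattern that runs without TA-Lib. The function running_range is a hypothetical stand-in and is not a real ATR; the sample data is made up. It also iterates over the groups and concatenates the pieces rather than calling apply, which is a swapped-in technique chosen here because the handling of the grouping column inside apply has changed between pandas versions.

```python
import numpy as np
import pandas as pd

def running_range(high, low, close):
    # Hypothetical stand-in for an indicator such as ATR (NOT the real formula):
    # running mean of (high - low); close is accepted only to mirror the three-array call
    spread = high - low
    return np.cumsum(spread) / np.arange(1, len(spread) + 1)

def add_indicator(x):
    # Work on a copy so the original group slice is not modified in place
    x = x.copy()
    x["atr"] = running_range(x["col1"].values, x["col2"].values, x["col3"].values)
    return x

# Hypothetical sample data; col1/col2/col3 play the role of high/low/close
df = pd.DataFrame({
    "group": ["g1", "g1", "g2", "g2"],
    "col1": [10.0, 12.0, 20.0, 23.0],
    "col2": [8.0, 9.0, 18.0, 19.0],
    "col3": [9.0, 11.0, 19.0, 22.0],
})

# Process each group separately, then stitch the pieces back in the original row order
parts = [add_indicator(g) for _, g in df.groupby("group")]
out = pd.concat(parts).sort_index()
print(out)
```

Because each group is processed on its own, the running mean restarts at the first row of every group.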