Get column value corresponding to the idxmax from another column per group


By : user2174790
Date : October 17 2020, 08:10 AM
Question: I have a dataframe consisting of 3 columns and n rows. Answer: the comments are inlined in the code.
code :
# Initialise the grouper.
grouper = df.Index // 100
# Get list of indices corresponding to the max using `apply`.
idx = df.groupby(grouper).apply(
          lambda x: x.set_index('Index')['Max_Mass (kg/m)'].idxmax())
# Compute the max and update the other columns based on `idx` computed previously.
v = df.groupby(grouper, as_index=False)['Max_Mass (kg/m)'].max()
v['Index'] = idx.values
v['Max_Diameter (m)'] = df.set_index('Index').loc[v['Index'], 'Max_Diameter (m)'].values
print(v)
   Max_Mass (kg/m)  Index  Max_Diameter (m)
0               30      3                 3
1               60    201                 3
2               90    300                 1
3              100    400                 1
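
A shorter variant of the same idea, as a minimal sketch: `idxmax` on the grouped column returns the row label of each group's maximum, and `df.loc` then pulls the whole row, so the other columns come along without a second lookup. The sample data below is hypothetical; only the column names are taken from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the answer above.
df = pd.DataFrame({
    'Index': [1, 2, 3, 101, 102, 201, 202],
    'Max_Mass (kg/m)': [10, 20, 30, 5, 15, 60, 40],
    'Max_Diameter (m)': [1, 2, 3, 4, 5, 6, 7],
})

# Bucket rows into groups of 100 by their 'Index' value.
grouper = df['Index'] // 100

# Row label of the largest mass within each group.
best_rows = df.groupby(grouper)['Max_Mass (kg/m)'].idxmax()

# Select those rows in full, so every column stays aligned.
result = df.loc[best_rows].reset_index(drop=True)
print(result)
```

This assumes the dataframe's row labels are unique; with duplicate labels, `df.loc` would return more than one row per group.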



Pandas idxmax() get column name corresponding to max value in row


By : Lee Twito
Date : March 29 2020, 07:55 AM
If I understand correctly, you can pass axis=1 to idxmax:
code :
policy = Q_df.idxmax(axis=1)

(0, 0)    (1, 0)
(0, 1)    (0, 2)
(0, 2)    (2, 2)
(1, 0)    (1, 2)
(1, 1)    (0, 1)
(1, 2)    (2, 2)
(2, 0)    (1, 0)
(2, 1)    (0, 1)
(2, 2)    (0, 2)
dtype: object
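
To make the row-wise behaviour concrete, here is a small sketch with a hypothetical `Q_df` (the state and action names are invented for illustration). With `axis=1`, `idxmax` returns, for each row, the label of the column holding that row's maximum.

```python
import pandas as pd

# Hypothetical Q-table: rows are states, columns are actions.
Q_df = pd.DataFrame(
    {'left': [0.1, 0.7, 0.2], 'right': [0.9, 0.3, 0.2]},
    index=['s0', 's1', 's2'],
)

# For each row, the column label of the largest value.
policy = Q_df.idxmax(axis=1)
print(policy)
```

When a row contains ties (as `s2` does here), `idxmax` reports the first column that reaches the maximum.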

Python Pandas Group Dataframe by Column / Sum Integer Column by String Column


By : Whily Bobs
Date : March 29 2020, 07:55 AM
Question: I have been stuck all day, have been through numerous SO articles, and am still stuck on the final piece. I imported a CSV into a massive dataframe, then eventually got the smaller dataframe below. (Note: my df is currently indexed on 'Name', which is what I need to base the group or sum on.)
Option 1
set_index, then groupby
code :
df.set_index('Classification', append=True) \
    .groupby(level=[0, 1]).sum().reset_index(1)

                  Classification  Value 1  Value 2
Name                                              
Company 1  Classification Code 1    11000    10000
Company 2  Classification Code 1     3000     7500
Company 3  Classification Code 2    35000    42000
Company 4  Classification Code 3    14500    11500
Option 2
groupby on the index level, then agg with a dict
df.groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})

                  Classification  Value 1  Value 2
Name                                              
Company 1  Classification Code 1    11000    10000
Company 2  Classification Code 1     3000     7500
Company 3  Classification Code 2    35000    42000
Company 4  Classification Code 3    14500    11500
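# If the value columns hold strings, convert them to numbers before aggregating.
# (errors='ignore' is deprecated in recent pandas versions.)
df.apply(pd.to_numeric, errors='ignore').groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})
# Alternatively, cast the value columns explicitly.
df['Value 1'] = df['Value 1'].astype(int)
df['Value 2'] = df['Value 2'].astype(int)
# To restore the original column order, index the result with df.columns.
d1 = df.apply(pd.to_numeric, errors='ignore').groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})

d1[df.columns]
# The same with reindex (reindex_axis has been removed from pandas).
d1 = df.apply(pd.to_numeric, errors='ignore').groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})

d1.reindex(columns=df.columns)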
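
The fragments above can be combined into one runnable sketch. The data here is hypothetical (company names and numbers are invented), and it converts only the value columns with `pd.to_numeric` rather than applying it to the whole frame, which is an assumption about what the asker needs rather than the answer's exact code.

```python
import pandas as pd

# Hypothetical data, indexed on 'Name', with numbers stored as strings.
df = pd.DataFrame(
    {
        'Value 1': ['1000', '2000', '500'],
        'Classification': ['Code 1', 'Code 1', 'Code 2'],
        'Value 2': ['10', '20', '5'],
    },
    index=pd.Index(['Company 1', 'Company 1', 'Company 2'], name='Name'),
)

# Convert only the numeric columns; 'Classification' stays as text.
value_cols = ['Value 1', 'Value 2']
df[value_cols] = df[value_cols].apply(pd.to_numeric)

# Sum the values per company and keep the first classification seen.
d1 = df.groupby(level=0).agg(
    {'Classification': 'first', 'Value 1': 'sum', 'Value 2': 'sum'})

# Put the columns back in the original order.
d1 = d1.reindex(columns=df.columns)
print(d1)
```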

python pandas - getting a column value after running idxmax / argmax


By : Faceless
Date : March 29 2020, 07:55 AM
Question: I am trying to go through some data to find which category of products had the highest revenue. Answer, using coldspeed's data :-)
code :
(df.groupby('item_category_id').total_sales.sum()).loc[lambda x : x==x.max()]


Out[11]: 
item_category_id
1    440
Name: total_sales, dtype: int64
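
If only the single top category is needed, `idxmax` on the grouped totals is a possible alternative, sketched here with hypothetical sales figures. Unlike the boolean mask above, which keeps every category tied for the maximum, `idxmax` returns only the first one.

```python
import pandas as pd

# Hypothetical sales data; column names follow the answer above.
df = pd.DataFrame({
    'item_category_id': [1, 1, 2, 3, 3],
    'total_sales': [200, 240, 100, 50, 60],
})

# Revenue per category.
totals = df.groupby('item_category_id')['total_sales'].sum()

# Label of the category with the highest revenue, and that revenue.
top_category = totals.idxmax()
top_value = totals.max()
print(top_category, top_value)
```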

Pandas/Pythonic way to groupby a column X, within each group, return value in column Y based on value in column Z


By : trollking
Date : March 29 2020, 07:55 AM
Use the following:
code :
import datetime as dt
import pandas as pd

date_fill = dt.datetime.strptime('2015-12-15', '%Y-%m-%d')
df['Y'] = pd.to_datetime(df['Y'], format='%Y-%m-%d')

df_g = df.loc[df.groupby(['X'])['Y'].idxmin()]
df2 = df[df['Y']==date_fill]
target_map = pd.Series(df2['Z'].tolist(),index=df2['X']).to_dict()
df_g.index = range(1, 2*len(df_g)+1, 2)
df_g = df_g.reindex(index=range(2*len(df_g)))
df_g['Y'] = df_g['Y'].fillna(date_fill)
df_g = df_g.bfill()
df_g.loc[df_g['Y']==date_fill, 'Z'] = df_g[df_g['Y']==date_fill]['X'].map(target_map)
df_g = df_g.bfill()
print(df_g)
     X          Y     Z
0  1.0 2015-12-15  10.0
1  1.0 2015-12-15  10.0
2  2.0 2015-12-15  19.0
3  2.0 2015-12-11  22.0
4  3.0 2015-12-15  34.0
5  3.0 2015-12-12  31.0
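
The first step of that answer, picking each group's earliest row with `idxmin`, is easier to follow in isolation. This is a minimal sketch on hypothetical data that reuses the column names `X`, `Y`, and `Z`; it does not reproduce the later fill logic.

```python
import pandas as pd

# Hypothetical data: several dated rows per group X.
df = pd.DataFrame({
    'X': [1, 1, 2, 2],
    'Y': pd.to_datetime(
        ['2015-12-10', '2015-12-15', '2015-12-15', '2015-12-11']),
    'Z': [10, 12, 19, 22],
})

# Row label of the earliest date in each group, then the full rows.
earliest = df.loc[df.groupby('X')['Y'].idxmin()]
print(earliest)
```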

Why does `idxmax` throw an error when I try to find the column name which has the maximum value for each row?


By : user3726689
Date : March 29 2020, 07:55 AM
Question: I'm working with the UK parliamentary election dataset from researchbriefings.parliament.uk, which for some reason includes the vote share as a proportion as well as a number, but not which party actually won in each constituency. Answer: because some columns were strings, not floats:
code :
>>> parties = pd.Series(['con','lib','lab','natSW','oth'])
>>> elections[parties + '_votes'].dtypes

con_votes      float64
lib_votes       object
lab_votes       object
natSW_votes    float64
oth_votes      float64
dtype: object
>>> for p in parties:
...     elections[p + '_votes'] = pd.to_numeric(elections[p + '_votes'], errors='coerce')
...
>>> elections[parties + '_votes'].idxmax(axis=1)

0        oth_votes
1        con_votes
2        con_votes
3        oth_votes
4        oth_votes
           ...    
17043    lab_votes
17044    lab_votes
17045    con_votes
17046    lab_votes
17047    lab_votes
Length: 17048, dtype: object
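
The same fix on a tiny, self-contained frame, as a sketch: the vote counts below are hypothetical, and only the `*_votes` naming follows the answer. `errors='coerce'` turns unparseable strings into NaN, which `idxmax` skips by default.

```python
import pandas as pd

# Hypothetical results: two columns were read in as strings.
elections = pd.DataFrame({
    'con_votes': [100.0, 50.0],
    'lab_votes': ['80', '120'],
    'lib_votes': ['n/a', '30'],
})

vote_cols = ['con_votes', 'lab_votes', 'lib_votes']

# Convert every vote column to numbers; bad values become NaN.
elections[vote_cols] = elections[vote_cols].apply(
    pd.to_numeric, errors='coerce')

# Column with the most votes in each row.
winner = elections[vote_cols].idxmax(axis=1)
print(winner)
```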