Python pandas: group a DataFrame by a column (such as name) and get the values of some columns in each group
By : Tickers
Date : March 29 2020, 07:55 AM
I'll elaborate a bit more than in the comment. The problem is that extract_text can only handle individual strings. However, when you groupby and then apply, the function receives all the strings in the group at once (a list-like Series), not one string at a time. There are two solutions: the first is the one I indicated (sending individual strings); the second is to rewrite extract_text so that it loops over every string in the group: code :
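# Solution 1: keep extract_text working on a single string and call it per element
index_num = df.groupby('age')['text'].apply(lambda x: [extract_text(_) for _ in x])

# Solution 2: make extract_text accept all the strings of a group
def extract_text(list_texts):
    list_index = []
    for text in list_texts:
        index_n = None  # stays None when the text contains no 'I'
        for i in range(len(text)):
            if text[i] == 'I':
                index_n = i  # no break, so the last 'I' wins
        list_index.append(index_n)
    return list_index

index_num = df.groupby('age')['text'].apply(extract_text)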
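To make the difference between the two solutions concrete, here is a minimal sketch on made-up data. The frame contents and the helper names `last_index_of_I` and `last_index_per_group` are assumptions for illustration, not from the question; `str.rfind` stands in for the manual loop and returns the same "last 'I'" position.

```python
import pandas as pd

def last_index_of_I(text):
    """Return the index of the last 'I' in one string, or None if absent."""
    pos = text.rfind('I')
    return pos if pos != -1 else None

def last_index_per_group(texts):
    """Same lookup, but for every string of a group at once."""
    return [last_index_of_I(t) for t in texts]

# Hypothetical sample data
df = pd.DataFrame({
    'age': [20, 20, 30],
    'text': ['I am', 'No match', 'It Is'],
})

# Solution 1: the function handles one string, the lambda walks the group
per_string = df.groupby('age')['text'].apply(
    lambda group: [last_index_of_I(t) for t in group]
)

# Solution 2: the function itself receives the whole group
per_group = df.groupby('age')['text'].apply(last_index_per_group)

print(per_string)
print(per_group)
```

Both calls should give one list per age group; which one you pick mostly depends on whether you need the single-string function elsewhere.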

pandas update dataframe by another dataframe with group by columns
By : Hila Levy
Date : March 29 2020, 07:55 AM
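Given two DataFrames like the ones in the session, I think this will be more space efficient (edited to add): align df2 to df1's index, then overwrite C and D only in the rows where both frames hold the same value of A at the same index. The merge-based version follows it: code :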
In [22]: df1,df2 = df1.align(df2,join='left',axis=0)
In [23]: df1
Out[23]:
   A  B   C   D  E
0  1  1   A  B0  A
1  2  1  A1  B1  A
2  3  1  A2  B2  S
3  4  1  A3  B3  S
4  5  1  A4  B4  S
In [24]: df2
Out[24]:
     A    C    D
0    1    c   d1
1    6   c1   d1
2    9   c2   d2
3    4   c3   d3
4  NaN  NaN  NaN
In [26]: equal_rows = df1.A == df2.A
In [27]: df1.loc[equal_rows]
Out[27]:
   A  B   C   D  E
0  1  1   A  B0  A
3  4  1  A3  B3  S
In [28]: df1.loc[equal_rows,['C','D']] = df2.loc[equal_rows,['C','D']]
In [29]: df1
Out[29]:
   A  B   C   D  E
0  1  1   c  d1  A
1  2  1  A1  B1  A
2  3  1  A2  B2  S
3  4  1  c3  d3  S
4  5  1  A4  B4  S
In [30]: df2.dropna(how='all',axis=0, inplace=True)
In [31]: df2
Out[31]:
   A   C   D
0  1   c  d1
1  6  c1  d1
2  9  c2  d2
3  4  c3  d3
In [13]: merged = pd.merge(df1,df2,how='left', on=['A'])
In [14]: merged
Out[14]:
   A  B  C_x  D_x  E  C_y  D_y
0  1  1    A   B0  A    c   d1
1  2  1   A1   B1  A  NaN  NaN
2  3  1   A2   B2  S  NaN  NaN
3  4  1   A3   B3  S   c3   d3
4  5  1   A4   B4  S  NaN  NaN
In [15]: merged.fillna({'C_y':df1.C,'D_y':df1.D},inplace=True)
Out[15]:
   A  B  C_x  D_x  E  C_y  D_y
0  1  1    A   B0  A    c   d1
1  2  1   A1   B1  A   A1   B1
2  3  1   A2   B2  S   A2   B2
3  4  1   A3   B3  S   c3   d3
4  5  1   A4   B4  S   A4   B4
In [16]: merged.drop(['C_x','D_x'],axis=1,inplace=True)
In [17]: merged
Out[17]:
   A  B  E  C_y  D_y
0  1  1  A    c   d1
1  2  1  A   A1   B1
2  3  1  S   A2   B2
3  4  1  S   c3   d3
4  5  1  S   A4   B4
In [20]: merged.rename(columns={"C_y":'C','D_y':'D'},inplace=True)
In [21]: merged
Out[21]:
   A  B  E   C   D
0  1  1  A   c  d1
1  2  1  A  A1  B1
2  3  1  S  A2  B2
3  4  1  S  c3  d3
4  5  1  S  A4  B4
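For anyone who wants to run the merge-based update outside an interactive session, here is a compact sketch that rebuilds the two frames from the printed output. The `_new` suffix, the loop over columns and the name `updated` are my own choices rather than part of the answer, and the sketch assumes the values of A are unique in df2 (otherwise the left merge would duplicate rows).

```python
import pandas as pd

# Frames rebuilt from the session output above
df1 = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [1, 1, 1, 1, 1],
    'C': ['A', 'A1', 'A2', 'A3', 'A4'],
    'D': ['B0', 'B1', 'B2', 'B3', 'B4'],
    'E': ['A', 'A', 'S', 'S', 'S'],
})
df2 = pd.DataFrame({
    'A': [1, 6, 9, 4],
    'C': ['c', 'c1', 'c2', 'c3'],
    'D': ['d1', 'd1', 'd2', 'd3'],
})

# Left merge keeps every row of df1; df2's columns arrive as C_new / D_new
merged = df1.merge(df2, how='left', on='A', suffixes=('', '_new'))

# Prefer the value from df2 and fall back to df1 where df2 had no match
for col in ['C', 'D']:
    merged[col] = merged[col + '_new'].fillna(merged[col])

updated = merged.drop(columns=['C_new', 'D_new'])
print(updated)
```

Keeping the left-hand names unsuffixed preserves df1's column order, so no rename step is needed at the end.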

When using python pandas dataframe, how do you group columns?
By : Satish Jadhav
Date : March 29 2020, 07:55 AM
The input Excel (xlsx) file loads into a frame like the one shown with the code. IIUC, use split on the column names, then group the columns on the first part before the '.': code :
# input
df
   mz  n  n.1  n.2  n.3  g_1  g_1.1  g_2  g_2.1  g_2.2
0   1  2    3    4    5    6      7    8      8      8
1   1  2    3    4    5    6      7    8      8      8
2   1  2    3    4    5    6      7    8      8      8
# result
df.groupby(df.columns.str.split('.').str[0], axis=1).sum()
   g_1  g_2  mz   n
0   13   24   1  14
1   13   24   1  14
2   13   24   1  14
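As far as I know, passing `axis=1` to `DataFrame.groupby` has been deprecated since pandas 2.1, so on newer versions the one-liner may warn or fail. A sketch of the same grouping via a transpose, with the frame rebuilt from the printed input (the variable names `prefixes` and `grouped` are mine):

```python
import pandas as pd

columns = ['mz', 'n', 'n.1', 'n.2', 'n.3', 'g_1', 'g_1.1', 'g_2', 'g_2.1', 'g_2.2']
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8, 8, 8]] * 3, columns=columns)

# Group key per column: the part of the name before the first '.'
prefixes = df.columns.str.split('.').str[0]

# Transpose so columns become rows, group those rows by prefix, transpose back
grouped = df.T.groupby(prefixes).sum().T
print(grouped)
```

The transpose is cheap for a frame like this one; for very wide frames with mixed dtypes it may be worth checking that the dtypes survive the round trip.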

How to perform group by on multiple columns in a pandas dataframe and predict future values using fbProphet in python?
By : Nikhil Jadhav
Date : March 29 2020, 07:55 AM

How to group a pandas.DataFrame's columns based on the index and values of another pandas.Series?
By : user3551437
Date : March 29 2020, 07:55 AM
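After further research, I have found a solution. Here's what I came up with: map the DataFrame's columns through the classification Series (the columns' map method looks each column name up in the Series index), then group the relabelled columns. A second version that transposes and merges instead is included after it: code :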
# Version 1: relabel each column with its classification, then sum columns sharing a label
def sum_weights_by_classification_labels(security_weights, security_classification):
    classification_weights = security_weights.copy()
    classification_weights.columns = classification_weights.columns.map(security_classification)
    classification_weights = classification_weights.groupby(classification_weights.columns, axis=1).sum()
    return classification_weights

# Version 2 (same name, use one or the other): transpose, merge in the labels,
# then group by the name of the classification Series
def sum_weights_by_classification_labels(security_weights, security_classification):
    security_weights_transposed = security_weights.transpose()
    merged_data = security_weights_transposed.merge(security_classification, how='left', left_index=True,
                                                    right_index=True)
    classification_weights = merged_data.groupby(security_classification.name).sum().transpose()
    return classification_weights
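Here is a small runnable sketch of the map-based idea on made-up data. The tickers, dates, sector labels and the shorter function name `sum_weights_by_label` are all hypothetical. It groups on the transposed frame rather than with `axis=1`, since, as far as I know, `axis=1` in `DataFrame.groupby` is deprecated from pandas 2.1 on.

```python
import pandas as pd

def sum_weights_by_label(security_weights, security_classification):
    """Sum the weight columns that share a classification label."""
    # Look up each column name in the classification Series
    labels = security_weights.columns.map(security_classification)
    # Group the transposed rows (one per security) by label, then transpose back
    return security_weights.T.groupby(labels).sum().T

# Hypothetical weights: rows are dates, columns are securities
security_weights = pd.DataFrame(
    {'AAA': [0.5, 0.25], 'BBB': [0.25, 0.5], 'CCC': [0.25, 0.25]},
    index=['2020-01-31', '2020-02-29'],
)
# Hypothetical classification: index holds the securities, values the labels
security_classification = pd.Series(
    {'AAA': 'Tech', 'BBB': 'Energy', 'CCC': 'Tech'}, name='sector'
)

result = sum_weights_by_label(security_weights, security_classification)
print(result)
```

Securities missing from the classification Series would map to NaN and be dropped by the default groupby, so it may be worth checking for unmapped columns first.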

