BEN CHEN's Homepage

pandas groupby and get list

Group by columns, and then get a list of all values from another column for each group.

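Retain the same granularity, i.e. repeat the list of all values on every row of its group.

The reindex is the key here: it looks up each row's group key in the grouped result, so the per-group list lines up with the original rows.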

import pandas as pd

# Sample DataFrame with two groupby columns
data = {'A1': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
        'A2': ['A', 'B', 'A', 'B', 'B', 'A'],
        'B': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)

# Group by 'A1' and 'A2' and collect the 'B' values of each group into a list
b_lists = df.groupby(['A1', 'A2'])['B'].apply(list)

# Reindex with an explicit MultiIndex of each row's (A1, A2) pair,
# so the group's list is repeated on every row of that group
row_keys = pd.MultiIndex.from_frame(df[['A1', 'A2']])
df['B_list'] = b_lists.reindex(row_keys).values

print(df)
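
An alternative to the reindex, as a minimal sketch rather than the method used above: build the per-group lists as their own table and left-merge them back onto the original rows. The names per_group and merged are only illustrative. A left merge keeps every original row in its original order, so the result should match the reindex version.

```python
import pandas as pd

df = pd.DataFrame({
    'A1': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
    'A2': ['A', 'B', 'A', 'B', 'B', 'A'],
    'B': [1, 2, 3, 4, 5, 6],
})

# One row per (A1, A2) group, with the group's 'B' values collected into a list
per_group = (
    df.groupby(['A1', 'A2'])['B']
      .apply(list)
      .rename('B_list')
      .reset_index()
)

# A left merge keeps every original row and attaches its group's list
merged = df.merge(per_group, on=['A1', 'A2'], how='left')
print(merged)
```

The merge is a bit more verbose, but it avoids building the lookup index by hand and reads the same way for one or many key columns.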

If you only need one list per group (group granularity), it is simpler: skip the reindex and call reset_index on the grouped result.

import pandas as pd

data = {'A1': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
        'A2': ['A', 'B', 'A', 'B', 'B', 'A'],
        'B': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)

# One row per (A1, A2) group; column 'B' now holds the list of values
df2 = df.groupby(['A1', 'A2'])['B'].apply(list).reset_index()

print(df2)
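
A quick sanity check on the group-level result, sketched under the assumption that your pandas version has DataFrame.explode (0.25 or later): exploding the lists back into rows should give back exactly the original rows. The names restored, original_rows and restored_rows are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A1': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
    'A2': ['A', 'B', 'A', 'B', 'B', 'A'],
    'B': [1, 2, 3, 4, 5, 6],
})
df2 = df.groupby(['A1', 'A2'])['B'].apply(list).reset_index()

# explode() turns each list element back into its own row
restored = df2.explode('B').reset_index(drop=True)

# Compare the rows as sorted tuples, since the grouping changes row order
original_rows = sorted(df.itertuples(index=False, name=None))
restored_rows = sorted(restored.itertuples(index=False, name=None))
print(original_rows == restored_rows)
```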
