pandas - Group by of a Column and Sum Contents of another column with python -

- May 15, 2012

I have a DataFrame merged_df_energy:

merged_df_energy.head()
act_time_aerateur_1_f1  act_time_aerateur_1_f3  act_time_aerateur_1_f5  class_energy
             63.333333               63.333333               63.333333           low
                     0                       0                       0          high
                 45.67                       0                   55.94          high
                     0                       0                   23.99           low
                     0                      20                   23.99        medium

For each act_time_aerateur_1_fx column (act_time_aerateur_1_f1, act_time_aerateur_1_f3, act_time_aerateur_1_f5) I want to create a DataFrame which contains these columns: class_energy, sum_time

For example, the DataFrame corresponding to act_time_aerateur_1_f1:

class_energy    sum_time
low             63.333333
medium          0
high            45.67

I think I should use groupby like this:

data.groupby(by=['class_energy'])['sum_time'].sum()

Any ideas, please?

You can select the columns to aggregate with [] after the groupby (pass them as a list):

print (df.groupby(by=['class_energy'])[['act_time_aerateur_1_f1', 'act_time_aerateur_1_f3', 'act_time_aerateur_1_f5']].sum())
              act_time_aerateur_1_f1  act_time_aerateur_1_f3  \
class_energy
high                       45.670000                0.000000
low                        63.333333               63.333333
medium                      0.000000               20.000000

              act_time_aerateur_1_f5
class_energy
high                       55.940000
low                        87.323333
medium                     23.990000

You can use the parameter as_index=False to keep class_energy as a regular column instead of the index:

print (df.groupby(by=['class_energy'], as_index=False)[['act_time_aerateur_1_f1', 'act_time_aerateur_1_f3', 'act_time_aerateur_1_f5']].sum())
  class_energy  act_time_aerateur_1_f1  act_time_aerateur_1_f3  \
0         high               45.670000                0.000000
1          low               63.333333               63.333333
2       medium                0.000000               20.000000

   act_time_aerateur_1_f5
0               55.940000
1               87.323333
2               23.990000

If you need to aggregate the first 3 columns:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:3]].sum())
  class_energy  act_time_aerateur_1_f1  act_time_aerateur_1_f3  \
0         high               45.670000                0.000000
1          low               63.333333               63.333333
2       medium                0.000000               20.000000

   act_time_aerateur_1_f5
0               55.940000
1               87.323333
2               23.990000

...or all columns except the last:

print (df.groupby(by=['class_energy'], as_index=False)[df.columns[:-1]].sum())
  class_energy  act_time_aerateur_1_f1  act_time_aerateur_1_f3  \
0         high               45.670000                0.000000
1          low               63.333333               63.333333
2       medium                0.000000               20.000000

   act_time_aerateur_1_f5
0               55.940000
1               87.323333
2               23.990000
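The question asks for one DataFrame per act_time_aerateur_1_fx column, each with the columns class_energy and sum_time. A minimal sketch of one way to get that is below. The sample data is copied from the head() output in the question; the dictionary name per_column is hypothetical and only used for illustration, and it assumes class_energy is the last column.

```python
import pandas as pd

# Sample rows copied from the question's merged_df_energy.head() output.
df = pd.DataFrame({
    'act_time_aerateur_1_f1': [63.333333, 0, 45.67, 0, 0],
    'act_time_aerateur_1_f3': [63.333333, 0, 0, 0, 20],
    'act_time_aerateur_1_f5': [63.333333, 0, 55.94, 23.99, 23.99],
    'class_energy': ['low', 'high', 'high', 'low', 'medium'],
})

# per_column (hypothetical name) maps each source column to its own
# two-column DataFrame: class_energy, sum_time.
per_column = {
    # Sum one column per class, then turn the class_energy index back into
    # a column and name the summed values 'sum_time'.
    col: df.groupby('class_energy')[col].sum().reset_index(name='sum_time')
    for col in df.columns[:-1]  # assumes class_energy is the last column
}

print(per_column['act_time_aerateur_1_f1'])
```

Each value in per_column is an ordinary DataFrame, so per_column['act_time_aerateur_1_f3'] and per_column['act_time_aerateur_1_f5'] can be used the same way. Note that groupby sorts the classes alphabetically (high, low, medium) by default.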
