python - Plot average on subplots (pandas) -

- April 15, 2015

I've managed to plot subplots from a groupby. I have two columns, 'a' and 'b', and I want to plot them on subplots (one per value in 'b') together with their respective averages. I prepare my data by counting, dropping duplicates, and then summing (if there is a more elegant way to do it, please let me know!).

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame([[1, 'cat1'], [1, 'cat1'], [4, 'cat2'], [3, 'cat1'], [5, 'cat1'], [1, 'cat2']], columns=['a', 'b'])
df = df[['a', 'b']]
# count how many times each (a, b) pair occurs
df['count'] = df.groupby(['a', 'b'])['a'].transform('count')
df = df.drop_duplicates(['a', 'b'])
df = df.groupby(['a', 'b']).sum()
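On the "more elegant way" part: one possible shortcut, sketched here on the same sample data, is to let groupby(...).size() do the counting in one step. The names long_way and short_way are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'cat1'], [1, 'cat1'], [4, 'cat2'], [3, 'cat1'], [5, 'cat1'], [1, 'cat2']],
    columns=['a', 'b'],
)

# Count-transform / drop-duplicates / sum approach from the question.
long_way = df.copy()
long_way['count'] = long_way.groupby(['a', 'b'])['a'].transform('count')
long_way = long_way.drop_duplicates(['a', 'b']).groupby(['a', 'b']).sum()

# One-step alternative: size() counts the rows in each (a, b) group.
short_way = df.groupby(['a', 'b']).size().to_frame('count')

print(short_way)
```

Both frames carry a single 'count' column indexed by ('a', 'b'), so either one can be unstacked and plotted the same way.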

Then I unstack it and plot the subplots:

plot = df.unstack().plot(kind='bar', subplots=True, sharex=True, sharey=True, layout=(3, 3), legend=False)
plt.show(block=True)

I would like to add the mean of each category, but I don't know:
1. How to calculate the mean. If I calculate it on the unstacked groupby, I get the mean of the count rather than the mean of the value 'a'.
2. Once I have the mean value, how to plot it on the same subplot.
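For the first point, a minimal sketch on the same sample data: the mean of 'a' per category can be taken from the raw rows before any counting, or recovered from the aggregated counts as a count-weighted mean. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'cat1'], [1, 'cat1'], [4, 'cat2'], [3, 'cat1'], [5, 'cat1'], [1, 'cat2']],
    columns=['a', 'b'],
)

# Mean of 'a' per category, taken directly from the raw rows.
mean_raw = df.groupby('b')['a'].mean()

# Same quantity recovered from the aggregated counts: a count-weighted mean.
counts = df.groupby(['a', 'b']).size().rename('count').reset_index()
counts['weighted'] = counts['a'] * counts['count']
totals = counts.groupby('b')[['weighted', 'count']].sum()
mean_from_counts = totals['weighted'] / totals['count']

print(mean_raw)
print(mean_from_counts)
```

With this particular sample both categories happen to have the same mean, so the two subplots would show the line at the same position.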

Any help is welcome :)

Edit, following Nickil Maveli's answer: what I'm trying to achieve is to plot bars of the grouped values of 'a', and to plot a vertical line at the mean value for each 'b'. Using the graphs from Nickil Maveli, it would be:

From what I've found on StackExchange, I think I should be using plt.axvline(mean, color='r', linestyle='--'). However, I don't know how to call it so that each plot gets its own mean.
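One way this could work, as a sketch rather than a definitive answer: loop over the categories and call axvline on each subplot's own Axes object instead of on plt. The sketch below assumes the bars are drawn with matplotlib's ax.bar on a numeric x axis (the values of 'a'), so that the line lands at the right position. With pandas' kind='bar' the x axis is categorical and the mean would not line up directly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    [[1, 'cat1'], [1, 'cat1'], [4, 'cat2'], [3, 'cat1'], [5, 'cat1'], [1, 'cat2']],
    columns=['a', 'b'],
)

counts = df.groupby(['b', 'a']).size()   # number of rows per (b, a) pair
means = df.groupby('b')['a'].mean()      # mean of 'a' per category

# One subplot per category, sharing both axes.
fig, axes = plt.subplots(1, len(means), sharex=True, sharey=True, squeeze=False)
for ax, (cat, mean) in zip(axes[0], means.items()):
    per_cat = counts.loc[cat]                     # Series indexed by the values of 'a'
    ax.bar(per_cat.index, per_cat.values)         # numeric x axis
    ax.axvline(mean, color='r', linestyle='--')   # this category's own mean
    ax.set_title(cat)
```

Each Axes gets exactly one dashed line, positioned at the mean of 'a' for that value of 'b'.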

IIUC, you can use agg with mean and count to compute the averages and counts beforehand.

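# named aggregation (pandas 0.25+); the dict-renaming form agg({'counts': 'count'}) was removed in pandas 1.0
df_1 = df.groupby(['a', 'b'])['a'].agg(counts='count').reset_index()
df_2 = df.groupby('b')['a'].agg(average='mean').reset_index()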

This is followed by DataFrame.merge on column 'b', which is the column common to both groupby operations. Then the duplicated entries among columns 'a' and 'b' can be removed.

df = df_1.merge(df_2, on='b').drop_duplicates(['a', 'b'])
df.drop('average', axis=1, inplace=True)
df = df.groupby(['a', 'b']).sum()
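To see what the merge step produces before the 'average' column is dropped, here is a small sketch on the question's sample data. It assumes a pandas version with named aggregation (0.25 or later), and merged is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'cat1'], [1, 'cat1'], [4, 'cat2'], [3, 'cat1'], [5, 'cat1'], [1, 'cat2']],
    columns=['a', 'b'],
)

# Counts per (a, b) pair and mean of 'a' per category.
df_1 = df.groupby(['a', 'b'])['a'].agg(counts='count').reset_index()
df_2 = df.groupby('b')['a'].agg(average='mean').reset_index()

# Merging on 'b' attaches each category's average to every (a, b) row.
merged = df_1.merge(df_2, on='b')
print(merged)
```

Every row of merged carries the average of its own category, which is why the column can be dropped again once only the counts are needed for the bar plot.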

Modify the second dataframe so that column 'a' takes the mean values.

df_2['a'] = df_2['average']
df_2 = df_2.groupby(['a', 'b']).sum()

Set up the layout and target multiple axes.

fig, ax = plt.subplots(2, 2, figsize=(8, 8))
target1 = [ax[0][0], ax[0][1]]
target2 = [ax[1][0], ax[1][1]]

Plot the count groupby.

df.unstack().plot(kind='bar', subplots=True, rot=0, xlim=(0, 5), ax=target1,
                  ylim=(0, 3), layout=(2, 2), legend=False)

Plot the mean groupby.

df_2.unstack().plot(kind='bar', width=0.005, subplots=True, rot=0, xlim=(0, 5), ax=target2,
                    ylim=(0, 3), layout=(2, 2), legend=False, color='k')

Adjust the spacing between subplots.

plt.subplots_adjust(wspace=0.5, hspace=0.5)
plt.show()
