pandas - Inconsistent behavior of `dataframe.groupby(allcolumns).agg(len)`


For demonstration purposes, first define a couple of simple dataframes, df0 and df1:

    >>> import pandas as pd
    >>> import collections as co
    >>> data = [['a',  1],
    ...         ['b',  2],
    ...         ['a',  3],
    ...         ['b',  1],
    ...         ['a',  2],
    ...         ['a',  3],
    ...         ['b',  1]]
    >>> colnames = tuple('xy')
    >>> df0 = pd.DataFrame(co.OrderedDict([(colnames[i],
    ...                                     [row[i] for row in data])
    ...                                    for i in range(len(colnames))]))
    >>> df0
       x  y
    0  a  1
    1  b  2
    2  a  3
    3  b  1
    4  a  2
    5  a  3
    6  b  1
    >>>
    >>> df1 = df0.iloc[:, [0]]
    >>> df1
       x
    0  a
    1  b
    2  a
    3  b
    4  a
    5  a
    6  b

Now, here's the result of grouping on all the columns of df0 and aggregating with the len aggregator function:

    >>> df0.groupby(['x', 'y']).agg(len)
    x  y
    a  1    1
       2    1
       3    2
    b  1    2
       2    1
    dtype: int64

Based on this result, I expected the analogous operation on df1, namely df1.groupby(['x']).agg(len), to give this:

    x
    a    4
    b    3
    dtype: int64

But that's not what happens:

    >>> df1.groupby(['x']).agg(len)
    Empty DataFrame
    Columns: []
    Index: [a, b]

My questions are:

  1. Is this difference in behavior something I should have expected on the basis of the pandas documentation, or is it a bug in pandas? (If the former is the case, please point me to the relevant documentation.)
  2. What's the simplest way to get the expected output (as shown above) from df1.groupby(['x']).agg(len)?

See the note at the bottom of the Aggregation section: http://pandas.pydata.org/pandas-docs/stable/groupby.html#aggregation. pandas 'eats' the aggregator column, so you are left with nothing to aggregate.
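To see the grouping column being consumed, here is a minimal sketch that rebuilds df0 from the question and groups on x alone. The name `counts` is introduced here for illustration only. The grouping key moves into the index of the result, so y is the only column left to aggregate; when every column is a grouping key, no data columns remain.

```python
import pandas as pd

# Same data as df0 in the question.
df0 = pd.DataFrame({'x': ['a', 'b', 'a', 'b', 'a', 'a', 'b'],
                    'y': [1, 2, 3, 1, 2, 3, 1]})

# Group on x only: x becomes the index of the result and is not aggregated.
counts = df0.groupby('x').agg(len)

print(list(counts.columns))  # the grouping column x is not among the data columns
print(counts)
```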

You have a Series at this point, so do this:

    In [63]: s = df1['x']

    In [64]: s.groupby(s).agg(len)
    Out[64]:
    x
    a    4
    b    3
    Name: x, dtype: int64

pandas doesn't do this automatically because it is hard to figure out what you want, and it makes the logic more complicated. I suppose you could call this a bug (in that it should raise), but technically it is valid.
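If all you need is the number of rows per group, another option, not mentioned in the answer above, is `size()`. It counts rows per group and does not depend on any columns being left over after grouping. The sketch below rebuilds df0 and df1 from the question; `sizes0` and `sizes1` are illustrative names.

```python
import pandas as pd

# Same data as df0 and df1 in the question.
df0 = pd.DataFrame({'x': ['a', 'b', 'a', 'b', 'a', 'a', 'b'],
                    'y': [1, 2, 3, 1, 2, 3, 1]})
df1 = df0[['x']]

# size() returns a Series of group sizes, even when every column is a key.
sizes1 = df1.groupby('x').size()
sizes0 = df0.groupby(['x', 'y']).size()

print(sizes1)
print(sizes0)
```

Because `size()` returns a Series in both cases, the single-column and all-columns results have the same shape, which is the consistency the question was after.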

