python - Ambiguity in Pandas DataFrame / NumPy array "axis" definition


I've been confused about how Python axes are defined, and whether they refer to a DataFrame's rows or columns. Consider the code below:

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
...                   columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3

So if we call df.mean(axis=1), we'll get a mean across the rows:

>>> df.mean(axis=1)
0    1
1    2
2    3

However, if we call df.drop(name, axis=1), we actually drop a column, not a row:

>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3

Can someone help me understand what is meant by an "axis" in pandas/numpy/scipy?

As a side note, DataFrame.mean just might be defined wrong. The documentation for DataFrame.mean says that axis=1 is supposed to mean a mean over the columns, not the rows...

It's perhaps simplest to remember it as 0=down and 1=across.

This means:

  • Use axis=0 to apply a method down each column, or to the row labels (the index).
  • Use axis=1 to apply a method across each row, or to the column labels.
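As a rough illustration of the two bullets above, here is a minimal sketch using the DataFrame from the question. The names col_means and row_means are made up for this example. Note which labels end up on each result: axis=0 leaves one value per column, axis=1 leaves one value per row.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: move down the rows, so there is one result per column.
col_means = df.mean(axis=0)

# axis=1: move across the columns, so there is one result per row.
row_means = df.mean(axis=1)

print(col_means.index.tolist())  # the column labels
print(row_means.index.tolist())  # the row labels (the index)
```

In other words, the axis you pass is the one that gets collapsed, and the labels of the other axis are what survive in the result.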

Here's a picture to show the parts of a DataFrame that each axis refers to:

It's also useful to remember that pandas follows NumPy's use of the word axis. The usage is explained in NumPy's glossary of terms:

Axes are defined for arrays with more than one dimension. A 2-dimensional array has two corresponding axes: the first running vertically downwards across rows (axis 0), and the second running horizontally across columns (axis 1). [my emphasis]
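The same convention can be checked directly in NumPy. This is only a small sketch with the question's values placed in a plain array; the names arr, down and across are chosen here for illustration.

```python
import numpy as np

# Shape (3, 4): 3 rows along axis 0, 4 columns along axis 1.
arr = np.array([[1, 1, 1, 1],
                [2, 2, 2, 2],
                [3, 3, 3, 3]])

# Summing along axis 0 runs down the rows and collapses them,
# leaving one value per column.
down = arr.sum(axis=0)

# Summing along axis 1 runs across the columns and collapses them,
# leaving one value per row.
across = arr.sum(axis=1)

print(down.shape, across.shape)
```

The shape of each result shows which axis was removed: the axis named in the call is the one that disappears.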

So, concerning the method in the question, df.mean(axis=1), it seems to be correctly defined. It takes the mean of entries horizontally across columns, that is, along each individual row. On the other hand, df.mean(axis=0) would be an operation acting vertically downwards across rows.

Similarly, df.drop(name, axis=1) refers to an action on column labels, because they intuitively go across the horizontal axis. Specifying axis=0 would make the method act on rows instead.
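A short sketch of the drop case, again with the question's DataFrame. The variable names are illustrative, and the columns= and index= keywords assume a reasonably recent pandas (as far as I know they were added in 0.21); they can be easier to read than a bare axis number.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=1: "col4" is looked up among the column labels.
without_col = df.drop("col4", axis=1)

# axis=0: 0 is looked up among the row labels (the index).
without_row = df.drop(0, axis=0)

# Keyword forms that name the labels explicitly instead of using an axis number.
same_col = df.drop(columns="col4")
same_row = df.drop(index=0)

print(without_col.shape, without_row.shape)
```

For label-based methods like drop, the axis argument says which set of labels to search, rather than which direction a calculation runs in.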

