Python pandas select all but rows

Contents

pandas: Select rows/columns in DataFrame by indexing «[]»
Select columns of pandas.DataFrame
[Column name] : Get a single column as pandas.Series
[List of column names] : Get single or multiple columns as pandas.DataFrame
Select rows of pandas.DataFrame
[Slice of row name/number] : Get single or multiple rows as pandas.DataFrame
[Boolean array/Series] : Get True rows as pandas.DataFrame
Select elements of pandas.Series
[Label/position] : Get the value of a single element
[List of labels/positions] : Get single or multiple elements as pandas.Series
[Slice of label/position] : Get single or multiple elements as pandas.Series
[Boolean array/Series] : Get True elements as pandas.Series
Select elements of pandas.DataFrame
Note that row and column names are integer

pandas: Select rows/columns in DataFrame by indexing «[]»

You can select rows, columns, and elements of pandas.DataFrame and pandas.Series with the indexing operator (square brackets) [].

This article describes the following contents.

- Select columns of pandas.DataFrame
  - [Column name] : Get a single column as Series
  - [List of column names] : Get single or multiple columns as DataFrame
- Select rows of pandas.DataFrame
  - [Slice of row name/number] : Get single or multiple rows as DataFrame
  - [Boolean array/Series] : Get True rows as DataFrame
- Select elements of pandas.Series
  - [Label/position] : Get the value of a single element
  - [List of labels/positions] : Get single or multiple elements as Series
  - [Slice of label/position] : Get single or multiple elements as Series
  - [Boolean array/Series] : Get True elements as Series
- Select elements of pandas.DataFrame
- Note that row and column names are integer

You can also select columns by slice, and rows by a single name/number or a list of them, with loc and iloc.

The following CSV file is used in this sample code.
```
import pandas as pd

print(pd.__version__)
# 1.4.1

df = pd.read_csv('data/src/sample_pandas_normal.csv', index_col=0)
print(df)
#          age state  point
# name
# Alice     24    NY     64
# Bob       42    CA     92
# Charlie   18    CA     70
# Dave      68    TX     70
# Ellen     24    CA     88
# Frank     30    NY     57
```
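If the CSV file is not at hand, an equivalent DataFrame can be built inline. The following is a minimal sketch that assumes the file contains exactly the six rows printed above. The values are copied from that output, not read from the original file.

```python
import pandas as pd

# Assumption: these values are reconstructed from the printed output,
# not loaded from data/src/sample_pandas_normal.csv.
df = pd.DataFrame(
    {
        'age': [24, 42, 18, 68, 24, 30],
        'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
        'point': [64, 92, 70, 70, 88, 57],
    },
    index=pd.Index(
        ['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'], name='name'
    ),
)

print(df.shape)
# (6, 3)
```

Passing index_col=0 to read_csv is what turns the first CSV column into the row labels. The pd.Index(..., name='name') argument plays the same role here.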
Select columns of pandas.DataFrame

[Column name] : Get a single column as pandas.Series

You can get a column as pandas.Series by specifying the column name (label) in [].
```
print(df['age'])
print(type(df['age']))
# name
# Alice      24
# Bob        42
# Charlie    18
# Dave       68
# Ellen      24
# Frank      30
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
You may also specify a column name as an attribute, like .<column_name> (here, df.age). Note that if the column name conflicts with an existing method name, the method takes precedence.
```
print(df.age)
print(type(df.age))
# name
# Alice      24
# Bob        42
# Charlie    18
# Dave       68
# Ellen      24
# Frank      30
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
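The name conflict can be seen with a column called count, which is also a DataFrame method. This is a small sketch using a hypothetical frame (df_conflict is not part of the sample data).

```python
import pandas as pd

# Hypothetical frame: 'count' is both a column name and a DataFrame method.
df_conflict = pd.DataFrame({'count': [1, 2, 3], 'age': [24, 42, 18]})

# Attribute access resolves to the method, not the column.
attr = df_conflict.count
print(callable(attr))
# True

# Bracket access always returns the column.
col = df_conflict['count']
print(col.tolist())
# [1, 2, 3]
```

For this reason, bracket access is the safer choice whenever column names are not known in advance.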
[List of column names] : Get single or multiple columns as pandas.DataFrame

You can get multiple columns as pandas.DataFrame by specifying a list of column names in []. The columns will be in the order of the specified list.
```
print(df[['point', 'age']])
print(type(df[['point', 'age']]))
#          point  age
# name
# Alice       64   24
# Bob         92   42
# Charlie     70   18
# Dave        70   68
# Ellen       88   24
# Frank       57   30
# <class 'pandas.core.frame.DataFrame'>
```
If you specify a list with one element, a single-column pandas.DataFrame is returned, not pandas.Series.
```
print(df[['age']])
print(type(df[['age']]))
#          age
# name
# Alice     24
# Bob       42
# Charlie   18
# Dave      68
# Ellen     24
# Frank     30
# <class 'pandas.core.frame.DataFrame'>
```
You may also specify a slice of column names with loc or column numbers with iloc.
```
print(df.loc[:, 'age':'state'])
print(type(df.loc[:, 'age':'state']))
#          age state
# name
# Alice     24    NY
# Bob       42    CA
# Charlie   18    CA
# Dave      68    TX
# Ellen     24    CA
# Frank     30    NY
# <class 'pandas.core.frame.DataFrame'>

print(df.iloc[:, [2, 0]])
print(type(df.iloc[:, [2, 0]]))
#          point  age
# name
# Alice       64   24
# Bob         92   42
# Charlie     70   18
# Dave        70   68
# Ellen       88   24
# Frank       57   30
# <class 'pandas.core.frame.DataFrame'>
```
Select rows of pandas.DataFrame

[Slice of row name/number] : Get single or multiple rows as pandas.DataFrame

You can get multiple rows as a pandas.DataFrame by specifying a slice in [].
```
print(df[1:4])
print(type(df[1:4]))
#          age state  point
# name
# Bob       42    CA     92
# Charlie   18    CA     70
# Dave      68    TX     70
# <class 'pandas.core.frame.DataFrame'>
```
You may specify negative values and a step (start:stop:step) as in a normal slice. For example, you can use slices to extract odd or even rows.
```
print(df[:-3])
print(type(df[:-3]))
#          age state  point
# name
# Alice     24    NY     64
# Bob       42    CA     92
# Charlie   18    CA     70
# <class 'pandas.core.frame.DataFrame'>

print(df[::2])
print(type(df[::2]))
#          age state  point
# name
# Alice     24    NY     64
# Charlie   18    CA     70
# Ellen     24    CA     88
# <class 'pandas.core.frame.DataFrame'>

print(df[1::2])
print(type(df[1::2]))
#        age state  point
# name
# Bob     42    CA     92
# Dave    68    TX     70
# Frank   30    NY     57
# <class 'pandas.core.frame.DataFrame'>
```
An error is raised if a row number is specified alone instead of a slice.
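The error can be reproduced on a small frame. The sketch below uses a hypothetical three-row DataFrame (df_small is not part of the sample data) and shows iloc as the position-based alternative.

```python
import pandas as pd

# Hypothetical frame with string row labels and a single column.
df_small = pd.DataFrame({'age': [24, 42, 18]}, index=['Alice', 'Bob', 'Charlie'])

# A bare integer in [] is looked up as a column name, so this fails
# because there is no column called 1.
try:
    df_small[1]
    raised = False
except KeyError:
    raised = True
print(raised)
# True

# iloc selects the row by position and returns it as a Series.
row = df_small.iloc[1]
print(row.name)
# Bob
```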

If only one row is selected, pandas.DataFrame is returned, not pandas.Series.
```
print(df[1:2])
print(type(df[1:2]))
#       age state  point
# name
# Bob    42    CA     92
# <class 'pandas.core.frame.DataFrame'>
```
You may also specify a slice of row names (labels) instead of row numbers (positions). With a slice of row names, the stop row is included.
```
print(df['Bob':'Ellen'])
print(type(df['Bob':'Ellen']))
#          age state  point
# name
# Bob       42    CA     92
# Charlie   18    CA     70
# Dave      68    TX     70
# Ellen     24    CA     88
# <class 'pandas.core.frame.DataFrame'>
```
You can specify a single row name/number, or a list of them, with loc or iloc.
```
print(df.loc['Bob'])
print(type(df.loc['Bob']))
# age      42
# state    CA
# point    92
# Name: Bob, dtype: object
# <class 'pandas.core.series.Series'>

print(df.loc[['Bob', 'Ellen']])
print(type(df.loc[['Bob', 'Ellen']]))
#        age state  point
# name
# Bob     42    CA     92
# Ellen   24    CA     88
# <class 'pandas.core.frame.DataFrame'>

print(df.iloc[[1, 4]])
print(type(df.iloc[[1, 4]]))
#        age state  point
# name
# Bob     42    CA     92
# Ellen   24    CA     88
# <class 'pandas.core.frame.DataFrame'>
```
[Boolean array/Series] : Get True rows as pandas.DataFrame

By specifying a boolean array (list or numpy.ndarray) in [], you can extract the True rows as pandas.DataFrame.
```
l_bool = [True, False, False, True, True, False]

print(df[l_bool])
#        age state  point
# name
# Alice   24    NY     64
# Dave    68    TX     70
# Ellen   24    CA     88
```
An error is raised if the number of elements does not match.
```
# print(df[[True, False, False]])
# ValueError: Item wrong length 3 instead of 6.
```
You can also specify a boolean pandas.Series. Rows are extracted based on labels, not order.
```
s_bool = pd.Series(l_bool, index=reversed(df.index))
print(s_bool)
# Frank       True
# Ellen      False
# Dave       False
# Charlie     True
# Bob         True
# Alice      False
# dtype: bool

print(df[s_bool])
#          age state  point
# name
# Bob       42    CA     92
# Charlie   18    CA     70
# Frank     30    NY     57
```
An error is raised if the number of elements or labels does not match.
```
s_bool_wrong = pd.Series(l_bool, index=['A', 'B', 'C', 'D', 'E', 'F'])

# print(df[s_bool_wrong])
# IndexingError: Unalignable boolean Series provided as indexer
# (index of the boolean Series and of the indexed object do not match).
```
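In practice, a boolean Series is usually produced by a comparison on a column, which guarantees matching labels. The following sketch uses a hypothetical three-row frame (df_mask and is_ca are illustrative names) and also shows ~ for selecting all rows except the matching ones.

```python
import pandas as pd

# Hypothetical frame, smaller than the sample data.
df_mask = pd.DataFrame(
    {'age': [24, 42, 18], 'state': ['NY', 'CA', 'CA']},
    index=['Alice', 'Bob', 'Charlie'],
)

# A comparison returns a boolean Series that shares the frame's row labels.
is_ca = df_mask['state'] == 'CA'

selected = df_mask[is_ca]
print(selected.index.tolist())
# ['Bob', 'Charlie']

# ~ inverts the mask, selecting every row except the True ones.
others = df_mask[~is_ca]
print(others.index.tolist())
# ['Alice']
```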
Select elements of pandas.Series

Use the following pandas.Series as an example.
```
s = df['age']
print(s)
# name
# Alice      24
# Bob        42
# Charlie    18
# Dave       68
# Ellen      24
# Frank      30
# Name: age, dtype: int64
```
[Label/position] : Get the value of a single element

You can get the value of the element by specifying the label/position (index) alone. When specifying by position (index), a negative value can be used to specify the position from the end. -1 is the tail.

You may also specify the label name as an attribute, like .<label_name> (here, s.Dave). Note that if the label name conflicts with an existing method name, the method takes precedence.
```
print(s[3])
print(type(s[3]))
# 68
# <class 'numpy.int64'>

print(s['Dave'])
print(type(s['Dave']))
# 68
# <class 'numpy.int64'>

print(s[-1])
print(type(s[-1]))
# 30
# <class 'numpy.int64'>

print(s.Dave)
print(type(s.Dave))
# 68
# <class 'numpy.int64'>
```
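The outputs above were produced with pandas 1.4.1. Later releases are understood to deprecate positional lookups such as s[3] and s[-1] on a Series whose index is not integer-based, so the explicit accessors may be the safer choice. The sketch below uses a hypothetical four-element Series (s_age is an illustrative name).

```python
import pandas as pd

# Hypothetical Series mirroring the first four rows of the 'age' column.
s_age = pd.Series(
    [24, 42, 18, 68], index=['Alice', 'Bob', 'Charlie', 'Dave'], name='age'
)

# loc / at always interpret the key as a label.
print(s_age.loc['Dave'])
# 68
print(s_age.at['Dave'])
# 68

# iloc / iat always interpret the key as a position; -1 is the last element.
print(s_age.iloc[3])
# 68
print(s_age.iat[-1])
# 68
```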
[List of labels/positions] : Get single or multiple elements as pandas.Series

You can select multiple values as pandas.Series by specifying a list of labels/positions. The elements will be in the order of the specified list.
```
print(s[[1, 3]])
print(type(s[[1, 3]]))
# name
# Bob     42
# Dave    68
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>

print(s[['Bob', 'Dave']])
print(type(s[['Bob', 'Dave']]))
# name
# Bob     42
# Dave    68
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
If a list with one element is specified, pandas.Series is returned.
```
print(s[[1]])
print(type(s[[1]]))
# name
# Bob    42
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>

print(s[['Bob']])
print(type(s[['Bob']]))
# name
# Bob    42
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
[Slice of label/position] : Get single or multiple elements as pandas.Series

You can also select multiple values as pandas.Series by specifying a slice of label/position. In the case of a label name, the stop element is included.
```
print(s[1:3])
print(type(s[1:3]))
# name
# Bob        42
# Charlie    18
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>

print(s['Bob':'Dave'])
print(type(s['Bob':'Dave']))
# name
# Bob        42
# Charlie    18
# Dave       68
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
If one element is selected, pandas.Series is returned.
```
print(s[1:2])
print(type(s[1:2]))
# name
# Bob    42
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>

print(s['Bob':'Bob'])
print(type(s['Bob':'Bob']))
# name
# Bob    42
# Name: age, dtype: int64
# <class 'pandas.core.series.Series'>
```
[Boolean array/Series] : Get True elements as pandas.Series

By specifying a boolean array (list or numpy.ndarray) in [], you can extract the True elements as pandas.Series.
```
l_bool = [True, False, False, True, True, False]

print(s[l_bool])
# name
# Alice    24
# Dave     68
# Ellen    24
# Name: age, dtype: int64
```
An error is raised if the number of elements does not match.
```
# print(s[[True, False, False]])
# IndexError: Boolean index has wrong length: 3 instead of 6
```
You can also specify a boolean pandas.Series. Elements are extracted based on labels, not order.
```
s_bool = pd.Series(l_bool, index=reversed(df.index))
print(s_bool)
# Frank       True
# Ellen      False
# Dave       False
# Charlie     True
# Bob         True
# Alice      False
# dtype: bool

print(s[s_bool])
# name
# Bob        42
# Charlie    18
# Frank      30
# Name: age, dtype: int64
```
An error is raised if the number of elements or labels does not match.
```
s_bool_wrong = pd.Series(l_bool, index=['A', 'B', 'C', 'D', 'E', 'F'])

# print(s[s_bool_wrong])
# IndexingError: Unalignable boolean Series provided as indexer
# (index of the boolean Series and of the indexed object do not match).
```
Select elements of pandas.DataFrame

You can get the value of an element from pandas.DataFrame by first extracting a pandas.Series from it and then getting the value from that pandas.Series.

You may also extract any group by slices or lists.
```
print(df['Bob':'Dave'][['age', 'point']])
#          age  point
# name
# Bob       42     92
# Charlie   18     70
# Dave      68     70
```
However, this approach ([...][...]) is called chained indexing and may result in a SettingWithCopyWarning when assigning values.

You can select rows and columns at once with at, iat, loc, or iloc.
```
print(df.at['Alice', 'age'])
# 24

print(df.loc['Bob':'Dave', ['age', 'point']])
#          age  point
# name
# Bob       42     92
# Charlie   18     70
# Dave      68     70
```
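The difference matters most when assigning. As a minimal sketch on a hypothetical frame (df_scores is an illustrative name), a single loc call names both the rows and the column, so the assignment targets the original object instead of an intermediate one.

```python
import pandas as pd

# Hypothetical frame, smaller than the sample data.
df_scores = pd.DataFrame(
    {'age': [24, 42, 18], 'point': [64, 92, 70]},
    index=['Alice', 'Bob', 'Charlie'],
)

# One indexing operation: rows 'Bob' through 'Charlie' (stop label included),
# column 'point'. The assignment is applied to df_scores itself.
df_scores.loc['Bob':'Charlie', 'point'] = 0

print(df_scores['point'].tolist())
# [64, 0, 0]
```

Writing the same assignment as df_scores['Bob':'Charlie']['point'] = 0 is the chained form that can trigger the warning and may leave df_scores unchanged.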
Note that row and column names are integer

Be careful when row and column names are integers.

Use the following pandas.DataFrame as an example.
```
df = pd.DataFrame([[0, 10, 20], [30, 40, 50], [60, 70, 80]],
                  index=[2, 0, 1], columns=[1, 2, 0])
print(df)
#     1   2   0
# 2   0  10  20
# 0  30  40  50
# 1  60  70  80
```
With [scalar value] or [list], the specified value is treated as a column name.
```
print(df[0])
# 2    20
# 0    50
# 1    80
# Name: 0, dtype: int64

print(df[[0, 2]])
#     0   2
# 2  20  10
# 0  50  40
# 1  80  70
```
With [slice], the specified values are treated as row numbers, not row names. Negative values are also allowed.
```
print(df[:2])
#     1   2   0
# 2   0  10  20
# 0  30  40  50

print(df[-2:])
#     1   2   0
# 0  30  40  50
# 1  60  70  80
```
Use loc or iloc to clearly specify whether it is a name (label) or a number (position).
```
print(df.loc[:2])
#    1   2   0
# 2  0  10  20

print(df.iloc[:2])
#     1   2   0
# 2   0  10  20
# 0  30  40  50
```
```
s = df[2]
print(s)
# 2    10
# 0    40
# 1    70
# Name: 2, dtype: int64
```
In a pandas.Series with integer labels, a value specified in [] is treated as a label, not a position.

Use at or iat to clearly specify whether it is a label or an index.
```
print(s.at[0])
# 40

print(s.iat[0])
# 10
```
Note that [-1] is treated as a label named -1, not the tail. Use iat to get the last element.
```
# print(s[-1])
# KeyError: -1

print(s.iat[-1])
# 70
```
Thus, it is better to use at, iat, loc, or iloc when the row or column names are integers.
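As a compact sketch of that recommendation, the four accessors can be compared side by side on a frame with the same integer labels as the example above (df_int is an illustrative name, chosen so it does not overwrite df).

```python
import pandas as pd

# Same integer-labelled layout as the example above.
df_int = pd.DataFrame(
    [[0, 10, 20], [30, 40, 50], [60, 70, 80]],
    index=[2, 0, 1], columns=[1, 2, 0],
)

# Label-based: the row labelled 0 and the column labelled 0.
print(df_int.at[0, 0])
# 50
print(df_int.loc[0, 0])
# 50

# Position-based: the first row and the first column.
print(df_int.iat[0, 0])
# 0
print(df_int.iloc[0, 0])
# 0
```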
