- pandas.DataFrame.count
- How to Count the Number of Rows in a Pandas DataFrame
- Example: Counting the Number of Rows in a Pandas DataFrame
- Pandas: Number of Rows in a Dataframe (6 Ways)
- Loading a Sample Dataframe
- Pandas – Number of Rows in a Dataframe
- Pandas Len Function to Count Rows
- Pandas Shape Attribute to Count Rows
- Pandas Count Method to Count Rows in a Dataframe
- Number of Rows Containing a Value in a Pandas Dataframe
- Number of Rows Matching a Condition in a Pandas Dataframe
- Pandas Number of Rows in each Group
pandas.DataFrame.count
Count non-NA cells for each column or row.
The values None, NaN, NaT, and optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na) are considered NA.
Parameters
axis : {0 or 'index', 1 or 'columns'}, default 0
If 0 or 'index', counts are generated for each column. If 1 or 'columns', counts are generated for each row.
numeric_only : bool, default False
Include only float, int or boolean data.
Returns
Series or DataFrame
For each column/row the number of non-NA/null entries. If level is specified returns a DataFrame.
See also
Series.count : Number of non-NA elements in a Series.
DataFrame.value_counts : Count unique combinations of columns.
DataFrame.shape : Number of DataFrame rows and columns (including NA elements).
DataFrame.isna : Boolean same-sized DataFrame showing places of NA elements.
Examples
Constructing DataFrame from a dictionary:
>>> df = pd.DataFrame({"Person":
...                    ["John", "Myla", "Lewis", "John", "Myla"],
...                    "Age": [24., np.nan, 21., 33, 26],
...                    "Single": [False, True, True, True, False]})
>>> df
   Person   Age  Single
0    John  24.0   False
1    Myla   NaN    True
2   Lewis  21.0    True
3    John  33.0    True
4    Myla  26.0   False
Notice the uncounted NA values:
>>> df.count()
Person    5
Age       4
Single    5
dtype: int64
Counts for each row:
>>> df.count(axis='columns')
0    3
1    2
2    3
3    3
4    3
dtype: int64
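The numeric_only parameter described above has no example in this reference, so here is a minimal sketch of it using the same data. It also contrasts count() with isna().sum(), which counts the missing entries rather than the present ones. The variable names numeric_counts and missing_counts are our own, chosen for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the reference example above.
df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24.0, np.nan, 21.0, 33, 26],
    "Single": [False, True, True, True, False],
})

# Restrict the count to float, int and boolean columns;
# the text column "Person" is left out of the result.
numeric_counts = df.count(numeric_only=True)
print(numeric_counts)

# Complementary view: number of NA entries per column.
missing_counts = df.isna().sum()
print(missing_counts)
```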
How to Count the Number of Rows in a Pandas DataFrame
There are three methods you can use to quickly count the number of rows in a pandas DataFrame:
#count number of rows in index column of data frame
len(df.index)
#find length of data frame
len(df)
#find number of rows in data frame
df.shape[0]
Each method returns the same answer.
For small datasets, the difference in speed between these three methods is negligible.
For extremely large datasets, len(df.index) is recommended, as it has been shown to be the fastest method.
The following example shows how to use each of these methods in practice.
Example: Counting the Number of Rows in a Pandas DataFrame
The following code shows how to use the three methods mentioned earlier to count the number of rows in a pandas DataFrame:
import pandas as pd

#create DataFrame (values reconstructed from the printed output below)
df = pd.DataFrame({'y': [8, 12, 15, 14, 19, 23, 25, 29, 31, 30, 31, 31],
                   'x1': [5, 7, 7, 9, 12, 9, 9, 4, 5, 4, 7, 7],
                   'x2': [11, 8, 10, 6, 6, 5, 9, 12, 8, 8, 9, 9],
                   'x3': [2, 2, 3, 2, 5, 5, 7, 9, 11, 7, 7, 8]})

#view DataFrame
df

     y  x1  x2  x3
0    8   5  11   2
1   12   7   8   2
2   15   7  10   3
3   14   9   6   2
4   19  12   6   5
5   23   9   5   5
6   25   9   9   7
7   29   4  12   9
8   31   5   8  11
9   30   4   8   7
10  31   7   9   7
11  31   7   9   8

#count number of rows in index column of data frame
len(df.index)

12

#find length of data frame
len(df)

12

#find number of rows in data frame
df.shape[0]

12
Notice that each method returns the same result: the DataFrame has 12 rows.
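The speed comparison mentioned earlier can be checked on your own machine. The sketch below is one possible way to do that with the standard-library timeit module; the one-million-row DataFrame named big and the 10,000 repetitions are arbitrary choices of ours, and the timings will vary by machine and pandas version, so no particular ranking is asserted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical large DataFrame used only for timing.
big = pd.DataFrame({"a": np.arange(1_000_000), "b": np.arange(1_000_000)})

# The three row-count expressions from the text.
candidates = {
    "len(df.index)": lambda: len(big.index),
    "len(df)": lambda: len(big),
    "df.shape[0]": lambda: big.shape[0],
}

# Total time in seconds for 10,000 calls of each expression.
timings = {name: timeit.timeit(fn, number=10_000) for name, fn in candidates.items()}

for name, seconds in timings.items():
    print(f"{name:15s} {seconds:.4f} s per 10,000 calls")
```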
Pandas: Number of Rows in a Dataframe (6 Ways)
In this post, you’ll learn how to count the number of rows in a Pandas Dataframe, including counting the rows containing a value or matching a condition. You’ll learn why to use and why not to use certain methods (looking at you, .count() !) and which methods are the fastest.
Loading a Sample Dataframe
To follow along with the tutorial below, feel free to copy and paste the code below into your favourite text editor to load a sample Pandas Dataframe that we’ll use to count rows!
import pandas as pd

data = {
    'Level': ['Beginner', 'Intermediate', 'Advanced', 'Beginner', 'Intermediate', 'Advanced',
              'Beginner', 'Intermediate', 'Advanced', 'Beginner', 'Intermediate', 'Advanced',
              'Beginner', 'Intermediate', 'Advanced', 'Beginner', 'Intermediate', 'Advanced'],
    'Students': [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0]
}

df = pd.DataFrame.from_dict(data)
print(df.head())
This returns the following dataframe:
          Level  Students
0      Beginner      10.0
1  Intermediate      20.0
2      Advanced      10.0
3      Beginner      40.0
4  Intermediate      20.0
Pandas – Number of Rows in a Dataframe
Pandas provides a lot of different ways to count the number of rows in its dataframe.
Below you’ll learn about Python’s built-in len() function, the Pandas .shape property, and the Pandas .count() method.
Pandas Len Function to Count Rows
Python’s built-in len() function, applied to a dataframe, returns the number of rows (go figure!). The safest way to determine the number of rows in a dataframe is to count the length of the dataframe’s index.
To return the length of the index, write the following code:
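A minimal sketch, rebuilding the sample dataframe in compact form so that it runs on its own (the list-multiplication shortcut for the Level column is our shorthand and yields the same values as the sample data above):

```python
import pandas as pd

# Compact rebuild of the sample dataframe: 18 rows, one missing Students value.
df = pd.DataFrame({
    "Level": ["Beginner", "Intermediate", "Advanced"] * 6,
    "Students": [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0],
})

# Length of the index is the number of rows.
n_rows = len(df.index)
print(n_rows)  # 18
```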
Pandas Shape Attribute to Count Rows
The Pandas .shape attribute returns a tuple that contains the number of rows and columns, in the format (rows, columns). If you’re only interested in the number of rows (say, for a condition in a for loop), you can take the first element of that tuple (index 0).
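As a short sketch of the above, again rebuilding the sample dataframe in compact form so the snippet is self-contained (the names n_rows and n_cols are our own):

```python
import pandas as pd

# Compact rebuild of the sample dataframe: 18 rows, 2 columns.
df = pd.DataFrame({
    "Level": ["Beginner", "Intermediate", "Advanced"] * 6,
    "Students": [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0],
})

# .shape is a (rows, columns) tuple.
print(df.shape)  # (18, 2)

# Take only the row count, or unpack both values at once.
n_rows = df.shape[0]
n_rows, n_cols = df.shape
print(n_rows)  # 18
```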
Pandas Count Method to Count Rows in a Dataframe
The Pandas .count() method is, unfortunately, the slowest of the three methods listed here. The .shape attribute and the len() function simply read the length of the index, so they take about the same time regardless of how large a dataframe is. The .count() method has to inspect every value for missing data, so it takes significantly longer with larger dataframes.
One of the benefits of the .count() method is that it can ignore missing values.
>>> print(df.count())
Level       18
Students    17
dtype: int64
The above output indicates that there are 18 values in the Level column, and only 17 in the Students column. This, really, counts the number of values, rather than the number of rows.
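If what you actually want is the number of rows with no missing values at all, one possible approach (not part of the original walkthrough) is to drop the incomplete rows first and then take the length. The sketch below rebuilds the sample dataframe compactly; the names rows_total and rows_complete are our own.

```python
import pandas as pd

# Compact rebuild of the sample dataframe: 18 rows, one missing Students value.
df = pd.DataFrame({
    "Level": ["Beginner", "Intermediate", "Advanced"] * 6,
    "Students": [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0],
})

# All rows, whether or not they contain missing values.
rows_total = len(df)

# Rows that survive dropna(), i.e. rows with no missing value in any column.
rows_complete = len(df.dropna())

print(rows_total, rows_complete)  # 18 17
```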
Number of Rows Containing a Value in a Pandas Dataframe
To count the rows containing a value, we can apply a boolean mask to the Pandas series (column) and see how many rows match this condition. What makes this even easier is that True counts as 1 and False as 0, so we can simply add up the mask.
For an example, let’s count the number of rows where the Level column is equal to ‘Beginner’:
>>> print(sum(df['Level'] == 'Beginner'))
6
Number of Rows Matching a Condition in a Pandas Dataframe
Similar to the example above, if we wanted to count the number of rows matching a particular condition, we could create a boolean mask for this.
In the example below, we count the number of rows where the Students column is equal to or greater than 20:
>>> print(sum(df['Students'] >= 20))
10
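The same idea extends to several conditions at once. The sketch below is one way to do it, assuming the sample dataframe from earlier (rebuilt compactly here): it uses the Series .sum() method in place of the built-in sum() and combines two masks with &. Note that each condition needs its own parentheses, and that the missing Students value compares as False.

```python
import pandas as pd

# Compact rebuild of the sample dataframe: 18 rows, one missing Students value.
df = pd.DataFrame({
    "Level": ["Beginner", "Intermediate", "Advanced"] * 6,
    "Students": [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0],
})

# Single condition, summed with the Series method.
beginners = (df["Level"] == "Beginner").sum()

# Two conditions combined element-wise with & (logical AND).
busy_beginners = ((df["Level"] == "Beginner") & (df["Students"] >= 20)).sum()

print(beginners, busy_beginners)  # 6 3
```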
Pandas Number of Rows in each Group
To use Pandas to count the number of rows in each group created by the Pandas .groupby() method, we can use the .size() method. This returns a series with the number of rows belonging to each group.
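A minimal sketch of that call, grouping the sample dataframe (rebuilt compactly here so the snippet runs on its own) by the Level column; the name group_sizes is our own:

```python
import pandas as pd

# Compact rebuild of the sample dataframe: 18 rows, one missing Students value.
df = pd.DataFrame({
    "Level": ["Beginner", "Intermediate", "Advanced"] * 6,
    "Students": [10.0, 20.0, 10.0, 40.0, 20.0, 10.0, None, 20.0, 20.0,
                 40.0, 10.0, 30.0, 30.0, 10.0, 10.0, 10.0, 40.0, 20.0],
})

# .size() counts rows per group, including rows with missing values.
group_sizes = df.groupby("Level").size()
print(group_sizes)
```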
This returns the following series:
Level
Advanced        6
Beginner        6
Intermediate    6
dtype: int64