Row-wise mean in Python

Contents

numpy.mean
How to calculate the mean of selected columns in pandas
Method 1: calculate the row mean across all columns
Method 2: calculate the row mean for specific columns
The mean() method of a pandas DataFrame
Example 2
Example 3: row-wise
The mean() method in Python
Using the mean() function
mean() with the NumPy module
mean() with the pandas module

numpy.mean

Compute the arithmetic mean along the specified axis.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters:

a array_like

Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.

axis None or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.

dtype data-type, optional

Type to use in computing the mean. For integer inputs, the default is float64 ; for floating point inputs, it is the same as the input dtype.

out ndarray, optional

Alternate output array in which to place the result. The default is None ; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See Output type determination for more details.

keepdims bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray , however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

where array_like of bool, optional

Elements to include in the mean. See reduce for details.

If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.
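The axis and keepdims parameters described above can be sketched as follows. The arrays a and b are made up for illustration and are not part of the NumPy documentation.

```python
import numpy as np

# Hypothetical 3-D array with shape (2, 3, 4), holding the values 0..23
a = np.arange(24).reshape(2, 3, 4)

# Tuple axis: reduce axes 0 and 1 together, leaving only the last axis
pair_mean = np.mean(a, axis=(0, 1))
print(pair_mean)            # [10. 11. 12. 13.]

# keepdims=True keeps the reduced axis as a dimension of size one ...
b = np.array([[1.0, 2.0], [3.0, 4.0]])
row_means = np.mean(b, axis=1, keepdims=True)
print(row_means.shape)      # (2, 1)

# ... so the result broadcasts against the input, e.g. to center each row
centered = b - row_means
print(centered)
```

Without keepdims=True the row means would have shape (2,), and the subtraction would broadcast along the wrong axis for this purpose.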

The arithmetic mean is the sum of the elements along the axis divided by the number of elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

By default, float16 results are computed using float32 intermediates for extra precision.

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis=0)
array([2., 3.])
>>> np.mean(a, axis=1)
array([1.5, 3.5])

In single precision, mean can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.mean(a)
0.54999924

Computing the mean in float64 is more accurate:

>>> np.mean(a, dtype=np.float64)
0.55000000074505806 # may vary

Specifying a where argument:

>>> a = np.array([[5, 9, 13], [14, 10, 12], [11, 15, 19]])
>>> np.mean(a)
12.0
>>> np.mean(a, where=[[True], [False], [False]])
9.0
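The where argument can also be combined with axis to get a row-wise mean over selected elements only. The sketch below reuses the array from the example above; the condition a > 10 is an arbitrary choice made for illustration.

```python
import numpy as np

a = np.array([[5, 9, 13], [14, 10, 12], [11, 15, 19]])

# Row-wise mean that counts only the elements greater than 10
# (the threshold 10 is an assumption chosen for this sketch)
row_means = np.mean(a, axis=1, where=a > 10)
print(row_means)  # [13. 13. 15.]
```

A row in which no element satisfies the condition would produce nan, so the mask should leave at least one element per row.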


How to calculate the mean of selected columns in pandas

You can use the following methods to calculate row means for selected columns of a pandas DataFrame:

Method 1: calculate the row mean across all columns, with df.mean(axis=1)

Method 2: calculate the row mean for specific columns, by selecting those columns first and then calling mean(axis=1)

The following examples show how to use each method in practice with this pandas DataFrame:

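import pandas as pd

# create DataFrame (values reconstructed from the printed output below)
df = pd.DataFrame({'points': [14, 19, 9, 21, 25, 29, 20, 11],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# view DataFrame
df

   points  assists  rebounds
0      14        5        11
1      19        7         8
2       9        7        10
3      21        9         6
4      25       12         6
5      29        9         5
6      20        9         9
7      11        4        12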

Method 1: calculate the row mean across all columns

The following code shows how to create a new column in the DataFrame that holds the mean of each row across all columns:

# define new column that shows the average row value for all columns
df['average_all'] = df.mean(axis=1)

# view updated DataFrame
df

   points  assists  rebounds  average_all
0      14        5        11    10.000000
1      19        7         8    11.333333
2       9        7        10     8.666667
3      21        9         6    12.000000
4      25       12         6    14.333333
5      29        9         5    14.333333
6      20        9         9    12.666667
7      11        4        12     9.000000

Here is how to interpret the output:

The mean of the first row is calculated as (14+5+11) / 3 = 10.

The mean of the second row is calculated as (19+7+8) / 3 = 11.33.
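The arithmetic above can be checked against mean(axis=1) directly. This is a minimal sketch that uses only the first three rows of the example data; the variable names manual and builtin are chosen for illustration.

```python
import pandas as pd

# First three rows of the example DataFrame
df = pd.DataFrame({'points': [14, 19, 9],
                   'assists': [5, 7, 7],
                   'rebounds': [11, 8, 10]})

# Row mean written out by hand: sum of the three columns divided by 3
manual = (df['points'] + df['assists'] + df['rebounds']) / 3

# Row mean computed by pandas
builtin = df.mean(axis=1)

print(manual.round(2).tolist())   # [10.0, 11.33, 8.67]
print(builtin.round(2).tolist())  # [10.0, 11.33, 8.67]
```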

Method 2: calculate the row mean for specific columns

The following code shows how to calculate the row mean for the points and rebounds columns only:

# define new column that shows average of row values for points and rebounds columns
df['avg_points_rebounds'] = df[['points', 'rebounds']].mean(axis=1)

# view updated DataFrame
df

   points  assists  rebounds  avg_points_rebounds
0      14        5        11                 12.5
1      19        7         8                 13.5
2       9        7        10                  9.5
3      21        9         6                 13.5
4      25       12         6                 15.5
5      29        9         5                 17.0
6      20        9         9                 14.5
7      11        4        12                 11.5

Here is how to interpret the output:

The mean of points and rebounds in the first row is calculated as (14+11) / 2 = 12.5.

The mean of points and rebounds in the second row is calculated as (19+8) / 2 = 13.5.
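One practical reason to select columns explicitly: once a derived column such as average_all has been added, a later df.mean(axis=1) would include it in the calculation. The sketch below, which uses the first two rows of the example data and a hypothetical list name stat_cols, keeps both averages based on the original columns only.

```python
import pandas as pd

# First two rows of the example DataFrame
df = pd.DataFrame({'points': [14, 19],
                   'assists': [5, 7],
                   'rebounds': [11, 8]})

# Hypothetical helper list naming the original statistic columns
stat_cols = ['points', 'assists', 'rebounds']

# Both means are computed from explicitly selected columns, so the
# derived columns never feed into each other
df['average_all'] = df[stat_cols].mean(axis=1)
df['avg_points_rebounds'] = df[['points', 'rebounds']].mean(axis=1)

print(df['avg_points_rebounds'].tolist())  # [12.5, 13.5]
```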


The mean() method of a pandas DataFrame

In this example, we calculate the mean of each column to find the average marks that students obtained in each subject.

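import pandas as pd

# data reconstructed from the printed output below
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

# calculate mean of each numeric column
# (numeric_only=True skips the text column 'names', which recent pandas versions would otherwise reject)
mean = df_marks.mean(numeric_only=True)
print('\nMean\n------')
print(mean)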

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
physics      74.25
chemistry    70.50
algebra      83.75
dtype: float64

By default, mean() computes the mean of each column and returns the result as a pandas Series, so in this case no axis argument is needed. To request column-wise calculation explicitly, pass axis=0.
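As a small sketch of that equivalence, the code below compares the default call with an explicit axis=0 on the first two students from the example. The numeric_only=True argument is an addition here: it tells pandas to skip the text column names.

```python
import pandas as pd

# First two students and two subjects from the example data
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku'],
                         'physics': [68, 74],
                         'chemistry': [84, 56]})

# Default behaviour: one mean per column
default_mean = df_marks.mean(numeric_only=True)

# Same calculation with the axis stated explicitly
explicit_mean = df_marks.mean(axis=0, numeric_only=True)

print(type(explicit_mean).__name__)  # Series
print(explicit_mean['physics'])      # 71.0
print(default_mean.equals(explicit_mean))  # True
```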

Example 2

In this example, we calculate a single mean over all numeric values in the DataFrame.

As the previous example showed, mean() by default returns the mean of each column. Calling mean() again on that result averages the column means, which equals the mean of all values because every column has the same number of entries.

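import pandas as pd

# data reconstructed from the printed output below
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

# calculate mean of the whole DataFrame
# (numeric_only=True skips the text column 'names')
mean = df_marks.mean(numeric_only=True).mean()
print('\nMean\n------')
print(mean)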

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
76.16666666666667
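The mean of the column means matches the mean of all values here only because every column has the same number of entries. A rough way to confirm this is to flatten the numeric data with to_numpy() and average it directly; the variable names below are illustrative.

```python
import pandas as pd

# Numeric columns of the example data
df = pd.DataFrame({'physics': [68, 74, 77, 78],
                   'chemistry': [84, 56, 73, 69],
                   'algebra': [78, 88, 82, 87]})

# Mean of the three column means
mean_of_means = df.mean().mean()

# Mean of all twelve values taken at once
overall_mean = df.to_numpy().mean()

print(round(mean_of_means, 4))  # 76.1667
print(round(overall_mean, 4))   # 76.1667
```

If some columns contained missing values, the two results could differ, because each column mean would then be based on a different count.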

Example 3: row-wise

In this example, we calculate the mean across all columns for each row, that is, along axis=1. Here the row-wise mean gives the average mark (or percentage) obtained by each student.

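import pandas as pd

# data reconstructed from the printed output below
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
                'physics': [68, 74, 77, 78],
                'chemistry': [84, 56, 73, 69],
                'algebra': [78, 88, 82, 87]}

# create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)

# calculate mean along rows
# (numeric_only=True skips the text column 'names')
mean = df_marks.mean(axis=1, numeric_only=True)
print('\nMean\n------')
print(mean)

# display names and average marks
print('\nAverage marks or percentage for each student')
print(pd.concat([df_marks['names'], mean], axis=1))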

DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87

Mean
------
0    76.666667
1    72.666667
2    77.333333
3    78.000000
dtype: float64

Average marks or percentage for each student
  names          0
0  Somu  76.666667
1  Kiku  72.666667
2  Amol  77.333333
3  Lini  78.000000
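In the output above, the column of averages is labelled 0 because the Series returned by mean() has no name. One possible alternative, sketched here with the first two students and a column name (average) chosen for illustration, is to attach the row means with assign():

```python
import pandas as pd

# First two students from the example data
df_marks = pd.DataFrame({'names': ['Somu', 'Kiku'],
                         'physics': [68, 74],
                         'chemistry': [84, 56],
                         'algebra': [78, 88]})

# Row-wise mean over the numeric columns only
row_mean = df_marks.mean(axis=1, numeric_only=True)

# Keep the names and add the means under a descriptive column label
result = df_marks[['names']].assign(average=row_mean)

print(list(result.columns))                  # ['names', 'average']
print(result['average'].round(2).tolist())   # [76.67, 72.67]
```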

In this pandas tutorial, we learned how to calculate the mean of an entire DataFrame, of each column, and of each row.


The mean() method in Python

The mean is a single value that represents an entire set of values. It is regarded as the central value of a set of numbers.

The mean is calculated by dividing the sum of all the values by the number of values.

Formula: (sum of values) / (number of values)

Now let's see how the mean() function calculates the mean.
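The formula can be written out directly with the built-in sum() and len() functions. This is only a sketch of the arithmetic on a made-up list, before any library function is introduced.

```python
# Sample values chosen for illustration
data = [10, 20, 30, 40, 50]

# (sum of values) / (number of values)
mean_value = sum(data) / len(data)

print(mean_value)  # 30.0
```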

Using the mean() function

The mean() function calculates the mean of the set of values passed to it.

Python's statistics module provides functions for common statistical calculations. The module must be imported before use, as in the example below.

The statistics.mean() function takes the data values as its argument and returns their mean.

import statistics

data = [10, 20, 30, 40, 50]
res_mean = statistics.mean(data)
print(res_mean)

mean() with the NumPy module

The NumPy module represents a set of values as an array. We can calculate the mean of the array elements with the numpy.mean() function.

For a one-dimensional input, numpy.mean() computes the same arithmetic mean as statistics.mean(). It also accepts multidimensional arrays and an axis argument.

import numpy as np

data = np.arange(1, 10)
res_mean = np.mean(data)
print(res_mean)

In the example above, we used numpy.arange(start, stop) to create evenly spaced values in the given range. The stop value is excluded, so arange(1, 10) produces the integers 1 through 9. We then used numpy.mean() to calculate the mean of all the array elements.
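To connect this with row-wise means, the same nine values can be reshaped into a 3x3 array and averaged along axis=1. The reshape step and the name grid are additions for illustration and are not part of the original example.

```python
import numpy as np

# Integers 1 through 9 (the stop value 10 is excluded)
data = np.arange(1, 10)

# Mean of all elements
print(np.mean(data))          # 5.0

# Arrange the values as 3 rows of 3 and take the mean of each row
grid = data.reshape(3, 3)
row_means = np.mean(grid, axis=1)
print(row_means)              # [2. 5. 8.]
```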

mean() with the pandas module

The pandas module represents tabular datasets, including large ones, as DataFrames. The mean of such data can be calculated with the pandas.DataFrame.mean() function.

By default, pandas.DataFrame.mean() returns the mean of the values in each column.

import numpy as np
import pandas as pd

data = np.arange(1, 10)
df = pd.DataFrame(data)
res_mean = df.mean()
print(res_mean)

In the example above, we created a NumPy array with numpy.arange() and converted it to a DataFrame with pandas.DataFrame(). We then calculated the mean of the DataFrame values with pandas.DataFrame.mean().

import pandas as pd

data = pd.read_csv("C:/mtcars.csv")
res_mean = data['qsec'].mean()
print(res_mean)

In the example above, we loaded the mtcars dataset from a CSV file and calculated the mean of all the values in the qsec column.
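The same pattern can be tried without a file on disk by reading CSV text from memory. In the sketch below the rows are made-up placeholder values, not real mtcars data, and the mpg column is included only to show a row-wise mean over two selected columns.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file (not actual mtcars values)
csv_text = """model,mpg,qsec
car_a,21.0,16.0
car_b,22.5,17.0
car_c,18.0,18.0
"""

data = pd.read_csv(io.StringIO(csv_text))

# Mean of a single column, as in the example above
res_mean = data['qsec'].mean()
print(res_mean)  # 17.0

# Row-wise mean over two selected numeric columns
row_mean = data[['mpg', 'qsec']].mean(axis=1)
print(row_mean.tolist())  # [18.5, 19.75, 18.0]
```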
