Поиск локального максимума python

Как найти экстремумы в списке данных?

Есть ряд данных. После смены знака числа до следующей смены знака, нужно найти минимум и максимум ряда значений. Вот пример данных:

 8 -1 -41 -71 -97 -124 -126 -117 -107 -78 -26 10 46 63 94 100 88 87 105 109 81 39 7 -12 

После первой смены знака, значения стали отрицательными и их минимум стал -126. А после смены отрицательных значений на положительные, их максимум стал 109.

Нужна формула, как из такого списка вычленять таки значения минимумов-максимумов. Также нужна формула, как определить кол-во положительных и отрицательных значений в ряд. Например с начало было 1 положительное, затем 10 отрицательных, после 11 положительных и 1 отрицательное.

import numpy as np import pandas as pd fn = r'C:/Users/Timm/Desktop/0.txt' s = pd.read_csv(fn, header=None, squeeze=True) grp = s.groupby((np.sign(s).diff().fillna(0).ne(0)).cumsum()) extremums = grp.apply(lambda x: x.abs().max() * np.sign(x[x.abs().idxmax()])) sizes = grp.count()
OSError Traceback (most recent call last) in () 3 4 fn = r'C:/Users/Тimm/Desktop/0.txt' ----> 5 s = pd.read_csv(fn, header=None, squeeze=True) 6 7 grp = s.groupby((np.sign(s).diff().fillna(0).ne(0)).cumsum()) C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision) 644 skip_blank_lines=skip_blank_lines) 645 --> 646 return _read(filepath_or_buffer, kwds) 647 648 parser_f.__name__ = name C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds) 387 388 # Create the parser. --> 389 parser = TextFileReader(filepath_or_buffer, **kwds) 390 391 if (nrows is not None) and (chunksize is not None): C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, f, engine, **kwds) 728 self.options['has_index_names'] = kwds['has_index_names'] 729 --> 730 self._make_engine(self.engine) 731 732 def close(self): C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in _make_engine(self, engine) 921 def _make_engine(self, engine='c'): 922 if engine == 'c': --> 923 self._engine = CParserWrapper(self.f, **self.options) 924 else: 925 if engine == 'python': C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, src, **kwds) 1388 kwds['allow_leading_cols'] = self.index_col is not False 1389 -> 1390 self._reader = _parser.TextReader(src, **kwds) 1391 1392 # XXX pandas\parser.pyx in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4184)() pandas\parser.pyx in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8471)() OSError: Initializing from file failed

Источник

Читайте также:  Изменение прав файла php

How to Find Local Minima in 1D and 2D NumPy Arrays?

Be on the Right Side of Change

The extreme location of a function is the value x at which the value of the function is the largest or smallest over a given interval or the entire interpretation range.

There is a local maximum of f(x) in x0 if there is an environment such that for any x in this environment f(x) ≥ f(x0) . There is a local minimum of f(x) in x0 if there is an environment such that for any x in this environment f(x) ≤ f(x0) .

For a function f(x) can have several local extrema. All global extreme values are also local, but vice versa of course this is not true.

These values are often called “peaks” because they are the tops of the “hills” and “valleys” that appear on the graph of the function.

The practical application of these points is essential, for example in image processing and signal processing, for example in detecting extraterrestrial messages! 🙂

In this article, we’ll look at some simple ways to find dips in a NumPy array with only NumPy‘s built-in functions, or using scipy library.

Find All the Dips in a 1D NumPy Array

Local minimum dips are points surrounded by larger values on both sides.

Of course, there are many ways to solve the problem, you can use pure numpy, or you can loop through the data, or you can use the SciPy library.

I found a great solution on Github, a work by Benjamin Schmidt, which shows the definition of the local extremum very well:

Method 1: Using numpy’s numpy.where

import numpy as np import matplotlib.pyplot as plt x = np.linspace (0, 50, 1000) y = 0.75 * np.sin(x) peaks = np.where((y[1:-1] > y[0:-2]) * (y[1:-1] > y[2:]))[0] + 1 dips = np.where((y[1:-1] < y[0:-2]) * (y[1:-1] < y[2:]))[0] + 1 plt.plot (x, y) plt.plot (x[peaks], y[peaks], 'o') plt.plot (x[dips], y[dips], 'o') plt.show()

The above makes a list of all indices where the value of y[i] is greater than both of its neighbors. It does not check the endpoints, which only have one neighbor each.

The extra +1 at the end is necessary because where finds the indices within the slice y[1:-1] , not the full array y . The index [0] is necessary because where returns a tuple of arrays, where the first element is the array we want.

We use the Matplotlib library to make a graph. For a complete guide, see the Finxter Computer Science Academy course on Matplotlib.

Method 2: Using numpy.diff

Another simple approach of the solution is using NumPy’s built-in function numpy.diff .

import numpy as np # define an array arr = np.array([1, 3, 7, 1, 2, 6, 0, 1, 6, 0, -2, -5, 18]) # What "numpy.diff" does, and what is the type of the result? #print((np.diff(arr), type(np.diff(arr)))) # What "numpy.sign" does? #print(np.sign(np.diff(arr))) peaks = np.diff(np.sign(np.diff(arr))) #print(peaks) local_minima_pos = np.where(peaks == 2)[0] + 1 print(local_minima_pos)

We define a 1D array “ arr ” and then comes a one-liner. In it, first, we calculate the differences between each element with np.diff(arr) .

After that, we use the „ numpy.sign ” function on that array, hence we get the sign of the differences, i.e., -1 or 1.

Then we use the function “ numpy.sign ” on the array to get the sign of the differences, i.e., -1 or 1.

The whole thing is passed into another function “ numpy.diff ” which returns +2 or -2 or 0. A value of 0 indicates a continuous decrease or increase, -2 indicates a maximum, and +2 indicates a minimum.

Now, we have the peaks and numpy.where tells us the positions. Note the +1 at the end of the line, this is because the 0th element of the peaks array is the difference between the 0th and 1st element of the arr array. The index [0] is necessary because „ numpy.where ” returns a tuple of arrays—the first element is the array we want.

Method 3: Solution with scipy

In the last example, I want to show how the SciPy library can be used to solve the problem in a single line.

import numpy as np from scipy.signal import argrelextrema x = np.linspace (0, 50, 100) y = np.sin(x) #print(y) print(argrelextrema(y, np.less))

Find all the Dips in a 2D NumPy Array

2D arrays can be imagined as digital (grayscale) images built up from pixels. For each pixel, there is a value representing the brightness. These values are represented in a matrix where the pixel positions are mapped on the two axes. If we want to find the darkest points, we need to examine the neighbors of each point.

Method 1: Naive Approach

This can be done by checking for each cell by iterating over each cell [i,j] . If all the neighbours are higher than the actual cell, then it is a local minima. This approach is not optimal, since redundant calculations are performed on the neighbouring cells.

Method 2: Using the „and” Operator

The first step is to add the global maximum of the array to the surrounding of the array so that when the element-wise test reaches the “edge” of the array, the comparison works. Since the current cell is compared to the previous and next cells, an error message would be generated at the end of the axes. The local minimum will always be less than the global maximum, so our comparison will work fine. We then go through the cells of the array, performing the comparison.

To reduce redundant computations, we use a logical “and” operation, so that if one of the conditions is not met, the algorithm will not perform the further comparison.

The indices of elements matching the criteria are collected in a tuple list. Note that the indices of the original array are different from the indices of the expanded array! Therefore we reduce the indices by -1.

import numpy as np data = np.array([[-210, 2, 0, -150], [4, -1, -60, 10], [0, 0, -110, 10], [-50, -11, -2, 10]]) a = np.pad(data, (1, 1), mode='constant', constant_values=(np.amax(data), np.amax(data))) loc_min = [] rows = a.shape[0] cols = a.shape[1] for ix in range(0, rows - 1): for iy in range(0, cols - 1): if a[ix, iy] < a[ix, iy + 1] and a[ix, iy] < a[ix, iy - 1] and \ a[ix, iy] < a[ix + 1, iy] and a[ix, iy] < a[ix + 1, iy - 1] and \ a[ix, iy] < a[ix + 1, iy + 1] and a[ix, iy] < a[ix - 1, iy] and \ a[ix, iy] < a[ix - 1, iy - 1] and a[ix, iy] < a[ix - 1, iy + 1]: temp_pos = (ix-1, iy-1) loc_min.append(temp_pos) print(loc_min) # [(0, 0), (0, 3), (2, 2), (3, 0)]

Method 3: Using scipy’s ndimage minimum filter

Fortunately, there is also a SciPy function for this task:

scipy.ndimage.minimum_filter(input, size = None, footprint = None, output = None, mode = 'reflect', cval = 0.0, origin = 0)

It creates a Boolean array, where local minima are ’ True ’.

  • The ‘ size ‘ parameter is used, which specifies the shape taken from the input array at each element position, to define the input to the filter function.
  • The footprint is a Boolean array that specifies (implicitly) a shape, but also which of the elements within the shape are passed to the filter function.
  • The mode parameter determines how the input array is extended when the filter overlaps a border. By passing a sequence of modes of length equal to the number of dimensions of the input array, different modes can be specified along each axis.
  • Here, the ‘ constant ‘ parameter is specified, so the input is expanded by filling all values beyond the boundary with the same constant value specified by the ‘ cval ‘ parameter. The value of ‘ cval ‘ is set to the global maximum, ensuring that any local minimum on the boundary is also found.

See the SciPy documentation for more detailed information on how the filters work. We then check which cells are “True”, so that all the local minimum indices are in an array.

import numpy as np from scipy.ndimage.filters import minimum_filter data = np.array([[2, 100, 1000, -5], [-10, 9, 1800, 0], [112, 10, 111, 100], [50, 110, 50, 140]]) minima = (data == minimum_filter(data, 3, mode='constant', cval=0.0)) # print(data) # print(minima) res = np.where(1 == minima) print(res)

Performance Evaluation

To learn about which method is faster, I’ve evaluated the runtime performance in an experimental setting. The result shows that the minimum filter method is significantly faster than the “for loop” method:

Here’s the code generating this output:

import numpy as np import time as tm import matplotlib.pyplot as plt from scipy.ndimage.filters import minimum_filter data = np.random.rand(50, 50) a = np.pad(data, (1, 1), mode='constant', constant_values=(np.amax(data), np.amax(data))) loc_min = [] rows = a.shape[0] cols = a.shape[1] begin1 = tm.time() for ix in range(0, rows - 1): for iy in range(0, cols - 1): if (a[ix, iy] < a[ix, iy + 1] and a[ix, iy] < a[ix, iy - 1] and a[ix, iy] < a[ix + 1, iy] and a[ix, iy] < a[ix + 1, iy - 1] and a[ix, iy] < a[ix + 1, iy + 1] and a[ix, iy] < a[ix - 1, iy] and a[ix, iy] < a[ix - 1, iy - 1] and a[ix, iy] < a[ix - 1, iy + 1]): temp_pos = (ix-1, iy-1) loc_min.append(temp_pos) end1 = tm.time() dt1 = (end1 - begin1) print(f"Total runtime of the program is ") begin2 = tm.time() minima = (data == minimum_filter(data, 50, mode='constant', cval=(np.amax(data)))) end2 =tm.time() dt2 = (end2 - begin2) xpoints = 0.1 ypoints1 = dt1 ypoints2 = dt2 plt.plot(xpoints, ypoints1, 'o', xpoints, ypoints2, 's') plt.title('Comparison of "for loop" and minimum_filter') plt.ylabel('Time (sec)') plt.legend(['for loop', "minimum_filter"], loc ="lower right") plt.show()

You can play with the code yourself in our interactive, browser-based, Jupyter Notebook (Google Colab).

Summary

As we have seen above, there are several ways to find the local minima in a numpy array, but as usual, it is useful to use external libraries for this task.

Be on the Right Side of Change 🚀

  • The world is changing exponentially. Disruptive technologies such as AI, crypto, and automation eliminate entire industries. 🤖
  • Do you feel uncertain and afraid of being replaced by machines, leaving you without money, purpose, or value? Fear not! There a way to not merely survive but thrive in this new world!
  • Finxter is here to help you stay ahead of the curve, so you can keep winning as paradigms shift.

Learning Resources 🧑‍💻

⭐ Boost your skills. Join our free email academy with daily emails teaching exponential with 1000+ tutorials on AI, data science, Python, freelancing, and Blockchain development!

Join the Finxter Academy and unlock access to premium courses 👑 to certify your skills in exponential technologies and programming.

New Finxter Tutorials:

Finxter Categories:

Источник

Оцените статью