Python NumPy cheat sheet

NumPy Cheat Sheet

A quick guide to the basics of the Python NumPy library, including code samples.

NumPy is the library that gives Python its ability to work with data at speed. NumPy offers several advantages for data cleaning and manipulation. It allows for efficient operations on the data structures often used in machine learning: vectors, matrices, and tensors.

When I first learned NumPy, I had trouble remembering all the functions and methods that I needed. So I put together the most frequently used NumPy operations. I sometimes come back to this note to refresh my memory, and I would be glad if it helps you on your journey too.

The structure of this note:

  1. N-dimensional arrays
  2. Array shape manipulations
  3. Numerical operations on arrays
  4. Array manipulation routines (select and split)
  5. Statistical operations

This is a long note, so make yourself a cup of tea, and let’s get started!

As always, we need to import the NumPy library:
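A minimal sketch of that import, assuming the conventional alias np (the np.-prefixed calls elsewhere on this page depend on it):

```python
import numpy as np  # "np" is the community-standard alias

# Quick sanity check that the library is installed and importable
print(np.__version__)
```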

1. N-Dimensional Array (Ndarray)

What are Arrays?

Arrays are a data structure for storing elements of the same type. Each item stored in an array is called an element. Each location of an element in an array has a numerical index, which is used to identify the element.

1D vs 2D Array

A 1D array (i.e., a single-dimensional array) stores a sequence of values of the same data type. Each value can be accessed using its index.

A 2D array (i.e., a two-dimensional array, the simplest multi-dimensional array) stores data in a format consisting of rows and columns.
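The difference can be illustrated with a short sketch. The variable names and values below are made up for illustration only:

```python
import numpy as np

# 1D array: a single sequence of elements, accessed with one index
one_d = np.array([10, 20, 30, 40])
first = one_d[0]     # first element
last = one_d[-1]     # negative indices count from the end

# 2D array: rows and columns, accessed with [row, column]
two_d = np.array([[1, 2, 3],
                  [4, 5, 6]])
item = two_d[1, 2]   # row 1, column 2

print(one_d.ndim, one_d.shape)   # 1 (4,)
print(two_d.ndim, two_d.shape)   # 2 (2, 3)
```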

NumPy Cheat Sheet — Python for Data Science

NumPy is the library that gives Python its ability to work with data at speed. Originally launched in 1995 as ‘Numeric’, NumPy is the foundation on which many important Python data science libraries are built, including Pandas, SciPy, and scikit-learn. It’s common when first learning NumPy to have trouble remembering all the functions and methods that you need. While at Dataquest we advocate getting used to consulting the NumPy documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out! If you’re interested in learning NumPy, you can consult our NumPy tutorial blog post, or you can sign up for free and start learning NumPy through our interactive Python data science course.

Key and Imports

In this cheat sheet, we use the following shorthand:

arr | A NumPy array object

You’ll also need to import numpy to get started:

import numpy as np

Importing/exporting

np.loadtxt('file.txt') | From a text file
np.genfromtxt('file.csv',delimiter=',') | From a CSV file
np.savetxt('file.txt',arr,delimiter=' ') | Writes to a text file
np.savetxt('file.csv',arr,delimiter=',') | Writes to a CSV file
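As a rough sketch of a save-and-load round trip with these functions: the file name example.csv and the sample values are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

arr = np.array([[1.5, 2.0, 3.25],
                [4.0, 5.5, 6.0]])

# Write to a temporary directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "example.csv")  # hypothetical file name
    np.savetxt(csv_path, arr, delimiter=",")          # write as CSV
    loaded = np.genfromtxt(csv_path, delimiter=",")   # read it back
    loaded_txt = np.loadtxt(csv_path, delimiter=",")  # loadtxt also accepts a delimiter

print(np.allclose(arr, loaded))
```

np.genfromtxt is the more forgiving of the two readers because it can handle missing values, while np.loadtxt expects every row to be complete.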

Creating Arrays

np.array([1,2,3]) | One dimensional array
np.array([(1,2,3),(4,5,6)]) | Two dimensional array
np.zeros(3) | 1D array of length 3 with all values 0
np.ones((3,4)) | 3 x 4 array with all values 1
np.eye(5) | 5 x 5 array of 0 with 1 on the diagonal (identity matrix)
np.linspace(0,100,6) | Array of 6 evenly spaced values from 0 to 100
np.arange(0,10,3) | Array of values from 0 to less than 10 with step 3 (e.g. [0,3,6,9])
np.full((2,3),8) | 2 x 3 array with all values 8
np.random.rand(4,5) | 4 x 5 array of random floats between 0 and 1
np.random.rand(6,7)*100 | 6 x 7 array of random floats between 0 and 100
np.random.randint(5,size=(2,3)) | 2 x 3 array with random ints between 0 and 4
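A few of these constructors in action, as a small sketch with illustrative variable names:

```python
import numpy as np

identity = np.eye(3)               # 3 x 3 identity matrix
evenly = np.linspace(0, 100, 6)    # 0, 20, 40, 60, 80, 100 (endpoint included)
stepped = np.arange(0, 10, 3)      # 0, 3, 6, 9 (stop value excluded)
eights = np.full((2, 3), 8)        # 2 x 3 array filled with 8

random_floats = np.random.rand(4, 5)              # floats in [0, 1)
random_ints = np.random.randint(5, size=(2, 3))   # ints from 0 to 4

print(evenly)
print(stepped)
```

Note that np.linspace includes the end value by default, whereas np.arange stops before it.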

Inspecting Properties

arr.size | Returns the number of elements in arr
arr.shape | Returns the dimensions of arr (rows, columns)
arr.dtype | Returns the type of the elements in arr
arr.astype(dtype) | Converts arr elements to type dtype
arr.tolist() | Converts arr to a Python list
np.info(np.eye) | Views the documentation for np.eye
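A short sketch of inspecting a small example array (the values are arbitrary):

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr.size)    # total number of elements
print(arr.shape)   # (rows, columns)
print(arr.dtype)   # integer type; the exact width depends on the platform

as_float = arr.astype(np.float64)   # returns a new array; arr is unchanged
as_list = arr.tolist()              # nested Python list
```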

Copying/sorting/reshaping

np.copy(arr) | Copies arr to new memory
arr.view(dtype) | Creates a view of arr elements with type dtype
arr.sort() | Sorts arr in place (along the last axis)
arr.sort(axis=0) | Sorts arr in place along a specific axis
two_d_arr.flatten() | Flattens 2D array two_d_arr to 1D
arr.T | Transposes arr (rows become columns and vice versa)
arr.reshape(3,4) | Reshapes arr to 3 rows, 4 columns without changing data
arr.resize((5,6)) | Changes arr shape to 5 x 6 in place and fills new values with 0
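The sketch below, using a made-up 2 x 3 array, shows how copying, sorting, and reshaping interact. The key assumption to keep in mind is that arr.sort() modifies the array in place, while np.copy and flatten return independent data.

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [9, 7, 8]])

independent = np.copy(arr)   # separate memory, unaffected by later changes

arr.sort()                   # in place, along the last axis (within each row)

reshaped = arr.reshape(3, 2)   # same six elements, arranged as 3 rows x 2 columns
flat = arr.flatten()           # 1D copy of the data
transposed = arr.T             # rows become columns

print(arr)
print(independent)
```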

Adding/removing Elements

np.append(arr,values) | Returns a copy of arr with values appended to the end (flattened unless axis is given)
np.insert(arr,2,values) | Returns a copy of arr with values inserted before index 2
np.delete(arr,3,axis=0) | Returns a copy of arr without the row at index 3
np.delete(arr,4,axis=1) | Returns a copy of arr without the column at index 4
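A small sketch with an illustrative 3 x 3 array. Note that all three functions return new arrays and leave the input unchanged:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

appended = np.append(arr, [10, 11])       # no axis given, so arr is flattened first
inserted = np.insert(arr, 1, 0, axis=0)   # row of zeros inserted before row 1
no_row = np.delete(arr, 0, axis=0)        # copy without row 0
no_col = np.delete(arr, 2, axis=1)        # copy without column 2

print(arr.shape)   # still (3, 3)
```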

Combining/splitting

np.concatenate((arr1,arr2),axis=0) | Adds arr2 as rows to the end of arr1
np.concatenate((arr1,arr2),axis=1) | Adds arr2 as columns to the end of arr1
np.split(arr,3) | Splits arr into 3 equal sub-arrays
np.hsplit(arr,5) | Splits arr horizontally (column-wise) into 5 equal sub-arrays; np.hsplit(arr,[5]) splits at column index 5
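A brief sketch of combining and splitting, using small made-up arrays. It also shows the difference between passing an integer (number of equal pieces) and a list (split positions) to np.hsplit:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

stacked_rows = np.concatenate((arr1, arr2), axis=0)   # shape (4, 2)
stacked_cols = np.concatenate((arr1, arr2), axis=1)   # shape (2, 4)

wide = np.arange(12).reshape(2, 6)
thirds = np.hsplit(wide, 3)           # integer: 3 equal pieces of shape (2, 2)
left, right = np.hsplit(wide, [4])    # list: split at column index 4
parts = np.split(np.arange(9), 3)     # 3 equal 1D pieces
```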

Indexing/slicing/subsetting

arr[5] | Returns the element at index 5
arr[2,5] | Returns the 2D array element on index [2][5]
arr[1]=4 | Assigns array element on index 1 the value 4
arr[1,3]=10 | Assigns array element on index [1][3] the value 10
arr[0:3] | Returns the elements at indices 0,1,2 (on a 2D array: returns rows 0,1,2)
arr[0:3,4] | Returns the elements on rows 0,1,2 at column 4
arr[:2] | Returns the elements at indices 0,1 (on a 2D array: returns rows 0,1)
arr[:,1] | Returns the elements at index 1 on all rows
arr<5 | Returns an array with boolean values
(arr1<3) & (arr2>5) | Returns an array with boolean values
~arr | Inverts a boolean array
arr[arr<5] | Returns array elements smaller than 5
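The following sketch applies basic indexing, slicing, and boolean masks to an illustrative 3 x 4 array (the threshold 5 is arbitrary):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)   # rows: 0..3, 4..7, 8..11

element = arr[2, 1]      # row 2, column 1
first_rows = arr[0:2]    # rows 0 and 1
column = arr[:, 1]       # column 1 of every row

mask = arr < 5           # boolean array with the same shape as arr
small = arr[mask]        # 1D array of the elements where mask is True
inverted = ~mask         # True where arr >= 5

arr[0, 0] = 100          # assignment modifies arr in place
```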

Scalar Math

np.add(arr,1) | Add 1 to each array element
np.subtract(arr,2) | Subtract 2 from each array element
np.multiply(arr,3) | Multiply each array element by 3
np.divide(arr,4) | Divide each array element by 4 (dividing by zero does not raise: it gives inf, or nan for 0/0, with a RuntimeWarning)
np.power(arr,5) | Raise each array element to the 5th power
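A quick sketch of scalar math on a small float array. The np.errstate context manager is used here only to silence the warnings that division by zero would otherwise emit:

```python
import numpy as np

arr = np.array([1.0, 2.0, 4.0])

plus_one = np.add(arr, 1)           # same as arr + 1
times_three = np.multiply(arr, 3)   # same as arr * 3
quarters = np.divide(arr, 4)        # same as arr / 4
squared = np.power(arr, 2)          # same as arr ** 2

# Division by zero does not raise an exception for float arrays
with np.errstate(divide="ignore", invalid="ignore"):
    by_zero = np.divide(np.array([1.0, 0.0]), 0)   # inf for 1/0, nan for 0/0

print(by_zero)
```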

Vector Math

np.add(arr1,arr2) | Elementwise add arr2 to arr1
np.subtract(arr1,arr2) | Elementwise subtract arr2 from arr1
np.multiply(arr1,arr2) | Elementwise multiply arr1 by arr2
np.divide(arr1,arr2) | Elementwise divide arr1 by arr2
np.power(arr1,arr2) | Elementwise raise arr1 to the power of arr2
np.array_equal(arr1,arr2) | Returns True if the arrays have the same elements and shape
np.sqrt(arr) | Square root of each element in the array
np.sin(arr) | Sine of each element in the array
np.log(arr) | Natural log of each element in the array
np.abs(arr) | Absolute value of each element in the array
np.ceil(arr) | Rounds up to the nearest int
np.floor(arr) | Rounds down to the nearest int
np.round(arr) | Rounds to the nearest int (halves round to the nearest even value)
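A sketch of elementwise operations on two same-shaped arrays, plus the rounding functions. The sample values are chosen to show that np.round sends exact halves to the nearest even value:

```python
import numpy as np

arr1 = np.array([1.0, 4.0, 9.0])
arr2 = np.array([2.0, 2.0, 2.0])

total = np.add(arr1, arr2)          # elementwise sum
powers = np.power(arr1, arr2)       # each element of arr1 squared
roots = np.sqrt(arr1)               # elementwise square root
same = np.array_equal(arr1, arr2)   # single bool, not an array

values = np.array([-1.5, 0.5, 2.5])
up = np.ceil(values)         # results stay floats, e.g. -1.0
down = np.floor(values)
nearest = np.round(values)   # halves go to the nearest even value
```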

Statistics

np.mean(arr,axis=0) | Returns the mean along a specific axis
arr.sum() | Returns the sum of arr
arr.min() | Returns the minimum value of arr
arr.max(axis=0) | Returns the maximum value along a specific axis
np.var(arr) | Returns the variance of the array
np.std(arr,axis=1) | Returns the standard deviation along a specific axis
np.corrcoef(arr) | Returns the correlation coefficient matrix of the array (a function, not an array method)
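A short sketch of these statistics on an illustrative 2 x 3 array. Keep in mind that axis=0 aggregates down the columns and axis=1 across the rows, and that np.corrcoef treats each row as one variable:

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 6.0, 8.0]])

col_means = np.mean(arr, axis=0)   # one mean per column
total = arr.sum()                  # sum of all elements
col_max = arr.max(axis=0)          # one maximum per column
row_std = np.std(arr, axis=1)      # one (population) standard deviation per row
variance = np.var(arr)             # population variance over all elements

# corrcoef is a module-level function; ndarray has no corrcoef method
corr = np.corrcoef(arr)            # 2 x 2 matrix, one row/column per variable
```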


NumPy Cheat Sheet

Jim Hugunin originally developed Numeric, the predecessor to NumPy, with contributions from several other developers. In 2005, Travis Oliphant created NumPy by heavily modifying Numeric to incorporate features of the competing Numarray. NumPy is an open-source library that provides multidimensional array objects and a collection of routines for operating on them. Its robust n-dimensional array speeds up data processing, and it interoperates easily with other Python packages and with languages such as C and C++. It is the foundational library for scientific computing in Python and is what makes fast data manipulation in Python possible.

NumPy targets CPython, the Python reference implementation, which is a non-optimizing bytecode interpreter. Because there is no compiler optimization, mathematical algorithms written in plain Python often run much slower than their compiled equivalents. NumPy addresses this slowness in part by providing multidimensional arrays together with functions and operators that work efficiently on them. Using them requires rewriting some code, mostly inner loops, in terms of NumPy.
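To make the idea of rewriting an inner loop concrete, here is a minimal sketch that computes a sum of squares twice: once with a plain Python loop and once with NumPy array operations. The array size and variable names are arbitrary choices for illustration.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Plain Python: every iteration runs in the bytecode interpreter
loop_total = 0.0
for v in values:
    loop_total += v * v

# NumPy: the same computation expressed as whole-array operations,
# so the loop itself runs in compiled code
vector_total = np.sum(values * values)

print(loop_total, vector_total)
```

Both approaches give the same result; the vectorized form is the one that typically scales better as the array grows.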

What is NumPy?

NumPy is the core Python library for scientific computing. It provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and a variety of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and more. The ndarray object is the core of the NumPy package. It encapsulates n-dimensional arrays of homogeneous data types, with many operations performed in compiled code for speed.

Why Use NumPy?

  • It provides a powerful N-dimensional array object
  • It offers sophisticated broadcasting functions
  • It includes tools for integrating C/C++ and Fortran code
  • It has useful linear algebra, Fourier transform, and random number capabilities
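Several of the points above can be sketched in a few lines. The matrix and vector values below are invented for illustration, and the seed is an arbitrary choice to make the random draw reproducible:

```python
import numpy as np

matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
row = np.array([10.0, 20.0])

# Broadcasting: the 1D row is applied to every row of the 2D matrix
shifted = matrix + row

# Linear algebra: solve matrix @ x = row for x
x = np.linalg.solve(matrix, row)

# Random numbers: a seeded generator gives reproducible draws
rng = np.random.default_rng(seed=0)
samples = rng.random(3)   # three floats in [0, 1)

print(shifted)
print(x)
```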

Limitations Of NumPy

Elements cannot be inserted into or appended to an array as easily as with a Python list. When extending an array, the np.pad(...) function actually creates a new array with the required shape and padding values, copies the existing array into it, and then returns the new array.

Likewise, np.concatenate([a1,a2]) does not actually link the two arrays; it returns a new array filled with the items of both given arrays in sequence. An array’s dimensionality can be changed with np.reshape(...) only when the number of elements stays the same. These restrictions stem from the requirement that NumPy arrays be views on contiguous memory buffers. Blaze, an alternative package, attempts to overcome this limitation.
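These behaviours can be checked directly. The sketch below uses np.shares_memory to confirm that padding and concatenation produce new arrays, and shows reshape failing when the element count would change (the array values are illustrative):

```python
import numpy as np

a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])

padded = np.pad(a1, (1, 2))         # one zero before, two zeros after
joined = np.concatenate([a1, a2])   # new array holding both sequences

# Both results live in new memory; the inputs are untouched
print(np.shares_memory(padded, a1))
print(np.shares_memory(joined, a1))

# reshape only works when the element count is preserved
grid = joined.reshape(2, 3)         # 6 elements fit a 2 x 3 shape
try:
    joined.reshape(4, 2)            # 6 elements cannot fill a 4 x 2 shape
except ValueError as err:
    print("reshape failed:", err)
```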

Many contemporary large-scale scientific computing applications have requirements that exceed the capabilities of NumPy arrays. For instance, NumPy arrays are usually loaded into a computer’s memory, which might not have enough capacity to process very large datasets. Furthermore, NumPy operations are executed on a single CPU. However, many linear algebra operations can be accelerated by running them on clusters of CPUs or on specialized hardware such as GPUs and TPUs, which many deep learning applications rely on.
