Dumping An Ndarray Object
Converting an ndarray into bytes:
- The tobytes() method of numpy.ndarray returns the raw contents of an array as a sequence of bytes. The older tostring() method does the same thing despite its name, but it has been deprecated since NumPy 1.19, so prefer tobytes().
- Both methods return a Python bytes object, which is an immutable sequence of bytes.
- The returned bytes can be laid out contiguously in 'C' (row-major) order or in 'F' (Fortran, column-major) order, depending on the value passed to the order parameter.
Example:
# Convert a 2d array into a string of bytes
import numpy as np

array_2d = np.arange(9).reshape(3, 3)
print("2-dimensional array of type ndarray:")
print(array_2d)
print("2-dimensional array from an ndarray object converted to a string of bytes:")
print(array_2d.tobytes())
Output:
2-dimensional array of type ndarray:
2-dimensional array from an ndarray object converted to a string of bytes:
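To make the effect of the order parameter concrete, here is a minimal sketch. The uint8 dtype and the 2x3 shape are assumptions chosen for illustration, so that every element occupies exactly one byte and the result does not depend on the platform's integer size or byte order.

```python
import numpy as np

# Assumption for illustration: uint8 keeps every element to a single byte.
array_2d = np.arange(6, dtype=np.uint8).reshape(2, 3)

c_bytes = array_2d.tobytes(order="C")  # row by row
f_bytes = array_2d.tobytes(order="F")  # column by column

print(list(c_bytes))  # [0, 1, 2, 3, 4, 5]
print(list(f_bytes))  # [0, 3, 1, 4, 2, 5]

# The bytes carry neither shape nor dtype, so both must be supplied again.
restored = np.frombuffer(c_bytes, dtype=np.uint8).reshape(2, 3)
print(np.array_equal(restored, array_2d))  # True
```

Because the byte string does not record its layout, whoever reads it back has to know which order was used when it was written.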
Writing ndarray into a file:
- The sep parameter selects the output mode: a non-empty sep writes the array as text with that string between elements, while the default empty sep writes raw binary data.
- The tofile() method does not take care of preserving endianness while writing the ndarray object to the file.
Reading an ndarray from a file:
- The contents of an ndarray object written to a file using the method tofile() can be read into a ndarray object using the method numpy.fromfile().
- When reading such a file, specify the element type with dtype and, for a text file, the separator character with sep. The file stores neither the shape nor the dtype, so numpy.fromfile() always returns a 1-dimensional array.
Example:
# Write a 2d array to a text file and a binary file, then read both back
import numpy as np

array_2d = np.arange(9).reshape(3, 3)
print("2-dimensional array of type ndarray:")
print(array_2d)
# The file paths are not shown in the original listing; these names are placeholders.
filePath1 = "array_text.txt"
filePath2 = "array_binary.bin"
print("Writing the ndarray object to a file in text format:")
array_2d.tofile(filePath1, sep="|")
print("Done writing in text format")
print("Write the ndarray object to a file in binary format:")
array_2d.tofile(filePath2)
print("Done writing in binary format")
print("Reading ndarray from file1:")
array1 = np.fromfile(filePath1, dtype="int", sep="|")
print("Type of array1 is %s" % (type(array1)))
print("Reading ndarray from file2:")
array2 = np.fromfile(filePath2, dtype="int")
print("Type of array2 is %s" % (type(array2)))
Output:
2-dimensional array of type ndarray:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Writing the ndarray object to a file in text format:
Done writing in text format
Write the ndarray object to a file in binary format:
Done writing in binary format
Reading ndarray from file1:
Type of array1 is <class 'numpy.ndarray'>
Reading ndarray from file2:
Type of array2 is <class 'numpy.ndarray'>
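Since tofile() records neither shape nor byte order, a round trip only works if the reader supplies both. The following is a minimal sketch of one way to do that; the explicit "<i4" (little-endian 32-bit integer) dtype, the temporary directory, and the file name are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Assumption: an explicit little-endian int32 dtype, so writer and reader agree.
array_2d = np.arange(9, dtype="<i4").reshape(3, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array.bin")  # hypothetical file name
    array_2d.tofile(path)
    flat = np.fromfile(path, dtype="<i4")

print(flat.shape)  # (9,)

# The shape is not stored in the file, so it has to be restored by hand.
restored = flat.reshape(3, 3)
print(np.array_equal(restored, array_2d))  # True
```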
Pickling ndarray object into a string:
- A numpy.ndarray object can be pickled into a Python bytes object using the dumps() method of numpy.ndarray.
- The pickled bytes can be converted back into an ndarray object with pickle.loads(). The older numpy.loads() function did the same thing but has been deprecated since NumPy 1.15.
Example:
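import pickle
import numpy as np
# Create a 3-dimensional array (the 2x2x2 shape is chosen here for illustration)
array_3d = np.arange(8).reshape(2, 2, 2)
print("Original ndarray object:", array_3d, sep="\n")
# Pickle the 3-dimensional array
pickled = array_3d.dumps()
print("Pickle of the ndarray:", pickled, sep="\n")
# Load back the 3-dimensional array
restored = pickle.loads(pickled)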
Output:
NumPy Input and Output: ndarray.tofile() function
The ndarray.tofile() function writes an array to a file as text or binary (default).
Data is always written in ‘C’ order, independent of the order of a.
ndarray.tofile(fid, sep="", format="%s")
Version: 1.15.0
Name | Description | Type | Required / Optional |
---|---|---|---|
fid | An open file object, or a string containing a filename. | file or str | Required |
sep | Separator between array items for text output. If "" (empty), a binary file is written, equivalent to file.write(a.tobytes()). | str | Optional |
format | Format string for text file output. Each entry in the array is formatted to text by first converting it to the closest Python type, and then using "format" % item. | str | Optional |
Notes:
This is a convenience function for quick storage of array data.
Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness.
Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.
When fid is a file object, array contents are directly written to the file, bypassing the file object’s write method.
As a result, tofile cannot be used with file objects supporting compression (e.g., GzipFile) or file-like objects that do not support fileno() (e.g., BytesIO).
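For file-like objects that tofile cannot handle, one possible workaround is to call tobytes() and pass the result to the object's own write method. This is a sketch rather than an official recipe; the int16 dtype and the in-memory buffers are assumptions made for illustration.

```python
import gzip
import io

import numpy as np

a = np.arange(5, dtype=np.int16)

# BytesIO has no usable fileno(), so write the raw bytes through its write method.
buffer = io.BytesIO()
buffer.write(a.tobytes())

# The same approach works for a compressed stream.
compressed = io.BytesIO()
with gzip.GzipFile(fileobj=compressed, mode="wb") as gz:
    gz.write(a.tobytes())

# Reading back: the dtype must be supplied again, as with fromfile().
from_buffer = np.frombuffer(buffer.getvalue(), dtype=np.int16)
from_gzip = np.frombuffer(gzip.decompress(compressed.getvalue()), dtype=np.int16)

print(np.array_equal(from_buffer, a), np.array_equal(from_gzip, a))
```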
ndarray.tofile() Example:
>>> import numpy as np
>>> np.array([1,300],np.int32).tofile('test1')
>>> with open('test1','rb') as x:
...     print(x.read())
Reading and writing files
This page tackles common applications; for the full collection of I/O routines, see Input and output.
Reading text and CSV files
With no missing values
Use numpy.loadtxt.
With missing values
Use numpy.genfromtxt, which will either
- return a masked array masking out missing values (if usemask=True), or
- fill in the missing value with the value specified in filling_values (default is np.nan for float, -1 for int).
With non-whitespace delimiters
>>> with open("csv.txt", "r") as f:
...     print(f.read())
1, 2, 3
4,, 6
7, 8, 9
Masked-array output
>>> np.genfromtxt("csv.txt", delimiter=",", usemask=True)
masked_array(
  data=[[1.0, 2.0, 3.0],
        [4.0, --, 6.0],
        [7.0, 8.0, 9.0]],
  mask=[[False, False, False],
        [False,  True, False],
        [False, False, False]],
  fill_value=1e+20)
Array output
>>> np.genfromtxt("csv.txt", delimiter=",")
array([[ 1.,  2.,  3.],
       [ 4., nan,  6.],
       [ 7.,  8.,  9.]])
Array output, specified fill-in value
>>> np.genfromtxt("csv.txt", delimiter=",", dtype=np.int8, filling_values=99)
array([[ 1,  2,  3],
       [ 4, 99,  6],
       [ 7,  8,  9]], dtype=int8)
Whitespace-delimited
numpy.genfromtxt can also parse whitespace-delimited data files that have missing values if
- Each field has a fixed width: Use the width as the delimiter argument.
# File with width=4. The data does not have to be justified (for example,
# the 2 in row 1), the last column can be less than width (for example, the 6
# in row 2), and no delimiting character is required (for instance 8888 and 9
# in row 3)
>>> with open("fixedwidth.txt", "r") as f:
...     data = (f.read())
>>> print(data)
1   2      3
44      6
7   88889
>>> np.genfromtxt("fixedwidth.txt", delimiter=4)
array([[1.000e+00, 2.000e+00, 3.000e+00],
       [4.400e+01,       nan, 6.000e+00],
       [7.000e+00, 8.888e+03, 9.000e+00]])
- A special value (for example "x") indicates a missing field: Use it as the missing_values argument.
>>> with open("nan.txt", "r") as f:
...     print(f.read())
1 2 3
44 x 6
7  8888 9
>>> np.genfromtxt("nan.txt", missing_values="x")
array([[1.000e+00, 2.000e+00, 3.000e+00],
       [4.400e+01,       nan, 6.000e+00],
       [7.000e+00, 8.888e+03, 9.000e+00]])
- You want to skip the rows with missing values: Set invalid_raise=False.
>>> with open("skip.txt", "r") as f:
...     print(f.read())
1 2   3
44    6
7 888 9
>>> np.genfromtxt("skip.txt", invalid_raise=False)
__main__:1: ConversionWarning: Some errors were detected !
    Line #2 (got 2 columns instead of 3)
array([[  1.,   2.,   3.],
       [  7., 888.,   9.]])
- The delimiter whitespace character is different from the whitespace that indicates missing data. For instance, if columns are delimited by \t, missing data is recognized if it consists of one or more spaces.
>>> with open("tabs.txt", "r") as f:
...     data = (f.read())
>>> print(data)
1       2       3
44              6
7       888     9
>>> np.genfromtxt("tabs.txt", delimiter="\t", missing_values=" +")
array([[  1.,   2.,   3.],
       [ 44.,  nan,   6.],
       [  7., 888.,   9.]])
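Read a file in .npy or .npz format
Use numpy.load. It can read files generated by any of numpy.save, numpy.savez, or numpy.savez_compressed.
Write to a file to be read back by NumPy
Binary
Use numpy.save, or to store multiple arrays numpy.savez or numpy.savez_compressed.
For security and portability, set allow_pickle=False unless the dtype contains Python objects, which requires pickling.
Masked arrays can’t currently be saved, nor can other arbitrary array subclasses.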
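A minimal sketch of the binary round trip follows. The array contents, the temporary directory, and the file names are assumptions made for illustration; allow_pickle=False is passed to numpy.save and numpy.load as recommended above.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.array([1, 2, 3], dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "a.npy")       # hypothetical file name
    npz_path = os.path.join(tmp_dir, "arrays.npz")  # hypothetical file name

    # One array per .npy file; shape and dtype are stored in the file header.
    np.save(npy_path, a, allow_pickle=False)
    loaded = np.load(npy_path, allow_pickle=False)

    # Several named arrays in one .npz archive.
    np.savez(npz_path, a=a, b=b)
    with np.load(npz_path, allow_pickle=False) as archive:
        loaded_a = archive["a"]
        loaded_b = archive["b"]

print(loaded.shape, loaded.dtype)
```

Unlike tofile, the .npy format restores shape and dtype without any help from the caller.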
Human-readable
numpy.save and numpy.savez create binary files. To write a human-readable file, use numpy.savetxt. The array can only be 1- or 2-dimensional, and there’s no savetxtz for multiple files.
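As a rough illustration of the human-readable route, the sketch below writes a small 2-dimensional array with numpy.savetxt and reads it back with numpy.loadtxt. The delimiter, the "%.2f" format, and the file name are assumptions chosen for the example.

```python
import os
import tempfile

import numpy as np

a = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "a.csv")  # hypothetical file name
    np.savetxt(path, a, delimiter=",", fmt="%.2f")

    with open(path, "r") as f:
        text = f.read()

    restored = np.loadtxt(path, delimiter=",")

print(text)
print(np.array_equal(restored, a))
```

Note that the text format keeps only as many digits as fmt asks for, so values that need more precision would not survive the round trip unchanged.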
Large arrays
See Write or read large arrays.
Read an arbitrarily formatted binary file (“binary blob”)
Use a structured array.
Example:
The .wav file header is a 44-byte block preceding data_size bytes of the actual sound data:
chunk_id         "RIFF"
chunk_size       4-byte unsigned little-endian integer
format           "WAVE"
fmt_id           "fmt "
fmt_size         4-byte unsigned little-endian integer
audio_fmt        2-byte unsigned little-endian integer
num_channels     2-byte unsigned little-endian integer
sample_rate      4-byte unsigned little-endian integer
byte_rate        4-byte unsigned little-endian integer
block_align      2-byte unsigned little-endian integer
bits_per_sample  2-byte unsigned little-endian integer
data_id          "data"
data_size        4-byte unsigned little-endian integer
The .wav file header as a NumPy structured dtype:
wav_header_dtype = np.dtype([
    ("chunk_id", (bytes, 4)),   # flexible-sized scalar type, item size 4
    ("chunk_size", "<u4"),      # little-endian unsigned 32-bit integer
    ("format", "S4"),           # 4-byte string, alternate spelling of (bytes, 4)
    ("fmt_id", "S4"),
    ("fmt_size", "<u4"),
    ("audio_fmt", "<u2"),       #
    ("num_channels", "<u2"),    # .. more of the same ...
    ("sample_rate", "<u4"),     #
    ("byte_rate", "<u4"),
    ("block_align", "<u2"),
    ("bits_per_sample", "<u2"),
    ("data_id", "S4"),
    ("data_size", "<u4"),
    #
    # the sound data itself cannot be represented here:
    # it does not have a fixed size
])
# f is a file object for the .wav file, opened in binary mode ("rb")
header = np.fromfile(f, dtype=wav_header_dtype, count=1)[0]
This .wav example is for illustration; to read a .wav file in real life, use Python’s built-in module wave.
(Adapted from Pauli Virtanen, Advanced NumPy, licensed under CC BY 4.0.)
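The structured-dtype approach can be tried end to end with a small self-contained sketch. It assumes a mono, 16-bit, 8000 Hz file of 100 silent samples, written with Python's wave module purely to have something to parse; the file name and these audio parameters are illustrative choices, not part of the original example.

```python
import os
import tempfile
import wave

import numpy as np

wav_header_dtype = np.dtype([
    ("chunk_id", "S4"), ("chunk_size", "<u4"), ("format", "S4"),
    ("fmt_id", "S4"), ("fmt_size", "<u4"), ("audio_fmt", "<u2"),
    ("num_channels", "<u2"), ("sample_rate", "<u4"), ("byte_rate", "<u4"),
    ("block_align", "<u2"), ("bits_per_sample", "<u2"),
    ("data_id", "S4"), ("data_size", "<u4"),
])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "silence.wav")  # hypothetical file name

    # Write a tiny test file: 100 samples of 16-bit mono silence at 8000 Hz.
    samples = np.zeros(100, dtype="<i2")
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(samples.tobytes())

    with open(path, "rb") as f:
        # First the fixed-size header, then the variable-size sound data.
        header = np.fromfile(f, dtype=wav_header_dtype, count=1)[0]
        n_samples = int(header["data_size"]) // 2  # 2 bytes per sample
        data = np.fromfile(f, dtype="<i2", count=n_samples)

print(header["chunk_id"], header["sample_rate"], data.shape)
```

The second fromfile call continues from where the first one stopped, which is why the header read uses count=1.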
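Write or read large arrays
Arrays too large to fit in memory can be treated like ordinary in-memory arrays using memory mapping.
Raw array data written with numpy.ndarray.tofile or numpy.ndarray.tobytes can be read with numpy.memmap:
array = np.memmap("mydata/myarray.arr", mode="r", dtype=np.int16, shape=(1024, 1024))
Files written by numpy.save (that is, in the .npy format) can be memory-mapped by passing the mmap_mode keyword argument to numpy.load:
large_array[some_slice] = np.load("path/to/small_array", mmap_mode="r")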
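Both memory-mapping routes can be sketched on a small array. The 4x4 shape, the int16 dtype, and the file names below are assumptions chosen to keep the example quick; real use would involve arrays far larger than memory.

```python
import os
import tempfile

import numpy as np

source = np.arange(16, dtype=np.int16).reshape(4, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    raw_path = os.path.join(tmp_dir, "myarray.arr")  # hypothetical file name
    npy_path = os.path.join(tmp_dir, "myarray.npy")  # hypothetical file name

    # Raw data: dtype and shape must be given again when mapping.
    source.tofile(raw_path)
    mapped = np.memmap(raw_path, mode="r", dtype=np.int16, shape=(4, 4))
    row_sum = int(mapped[1].sum())  # only the touched part is read from disk

    # .npy data: dtype and shape come from the file header.
    np.save(npy_path, source)
    lazy = np.load(npy_path, mmap_mode="r")
    corner = int(lazy[3, 3])

    # Drop the maps before the directory is removed (required on Windows).
    del mapped, lazy

print(row_sum, corner)  # 22 15
```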
Memory mapping lacks features like data chunking and compression; more full-featured formats and libraries usable with NumPy include HDF5 (through h5py or PyTables), Zarr, and NetCDF (through scipy.io.netcdf_file).
For tradeoffs among memmap, Zarr, and HDF5, see pythonspeed.com.
Write files for reading by other (non-NumPy) tools
Formats for exchanging data with other tools include HDF5, Zarr, and NetCDF (see Write or read large arrays).
Write or read a JSON file
NumPy arrays are not directly JSON serializable.
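One simple way around this, sketched below, is to convert the array to nested Python lists with tolist() before encoding and to rebuild it with np.array() after decoding. Storing the dtype name next to the data is an assumption of this sketch, not a NumPy convention, and the key names are hypothetical.

```python
import json

import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

# tolist() produces plain Python ints in nested lists, which json can encode.
payload = json.dumps({"dtype": str(a.dtype), "data": a.tolist()})
print(payload)

decoded = json.loads(payload)
restored = np.array(decoded["data"], dtype=decoded["dtype"])
print(np.array_equal(restored, a), restored.dtype)
```

The shape survives because it is implied by the nesting of the lists; the dtype does not, which is why this sketch stores it explicitly.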
Save/restore using a pickle file
Avoid when possible; pickles are not secure against erroneous or maliciously constructed data.
Use numpy.save and numpy.load. Set allow_pickle=False, unless the array dtype includes Python objects, in which case pickling is required.
Convert from a pandas DataFrame to a NumPy array
See pandas.DataFrame.to_numpy.
Save/restore using tofile and fromfile
In general, prefer numpy.save and numpy.load.
numpy.ndarray.tofile and numpy.fromfile lose information on endianness and precision and so are unsuitable for anything but scratch storage.