- How to convert Python string to bytes?
- Python String to Bytes:
- Method to convert strings to bytes:
- Using bytes():
- Using encode():
- Limitations and Caveats — Python String to Byte
- Python Convert String to Byte
- How to convert string to byte arrays?
- 3 Answers
- How to convert string to byte array in Python
- 9 Answers
How to convert Python string to bytes?
In this tutorial, we look at how to convert Python string to bytes. We look at all the various methods along with their limitations and caveats.
Python String to Bytes:
Converting Python strings to bytes has become a common task since the release of Python 3, largely because many file-handling and machine learning APIs expect bytes rather than text. Before we dive into how to convert them, let us first understand what they are and how they differ.
In Python 2, str and bytes were the same type. Python 3 separates them: a bytes object is a sequence of bytes, while a string is a sequence of Unicode characters. In essence, strings are human-readable, and in order for them to become machine-readable they must be converted to bytes objects. This conversion also enables the data to be stored directly on disk.
The process of converting string objects to byte objects is called encoding and the inverse is called decoding. We look at methods to achieve this below.
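As a minimal sketch of the encoding and decoding round trip described above (the sample string "café" is only an illustration, not taken from the tutorial), the following shows that a string counts characters while the encoded result counts bytes:

```python
# Encoding turns text (str) into raw bytes; decoding reverses it.
text = "café"                     # 4 characters
data = text.encode("utf-8")       # str -> bytes

print(type(text), len(text))      # a str with 4 characters
print(type(data), len(data))      # a bytes object with 5 bytes ("é" needs two bytes in UTF-8)

restored = data.decode("utf-8")   # bytes -> str
print(restored == text)           # the round trip gives back the original text
```

The two lengths differ because UTF-8 uses more than one byte for characters outside ASCII.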
Method to convert strings to bytes:
There are several ways to convert a Python string to bytes. Here we look at the two most common and simplest ones.
Using bytes():
bytes() is a built-in constructor that can be used to convert objects to bytes objects.
It takes an object (a string in our case) and the encoding to use, and returns a bytes object. It also accepts an optional third argument, errors, that controls how encoding errors are handled.
Let us look at the code to convert a Python string to bytes. The encoding type we use here is “UTF-8”.
# Using the bytes() constructor
# initializing string
str_1 = "Join our freelance network"
str_1_encoded = bytes(str_1, 'UTF-8')

# printing the encoded string
print(str_1_encoded)

# printing individual bytes
# (the loop variable is named "byte" so it does not shadow the built-in bytes)
for byte in str_1_encoded:
    print(byte, end=' ')
b'Join our freelance network'
74 111 105 110 32 111 117 114 32 102 114 101 101 108 97 110 99 101 32 110 101 116 119 111 114 107
As you can see this method has converted the string into a sequence of bytes.
Note: bytes() returns an immutable bytes object. If you need a mutable sequence of bytes, use bytearray() instead.
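The difference between the immutable and mutable variants can be sketched as follows. The variable names are illustrative only, and the string is the one used in the example above:

```python
text = "Join our freelance network"

immutable = bytes(text, "utf-8")      # bytes: cannot be changed after creation
mutable = bytearray(text, "utf-8")    # bytearray: supports in-place changes

mutable[0] = ord("j")                 # replace the first byte in place
print(mutable)

try:
    immutable[0] = ord("j")           # item assignment is not supported on bytes
except TypeError as exc:
    print("bytes is immutable:", exc)
```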
Using encode():
The encode() method is the most commonly used and recommended method to convert Python strings to bytes. A major reason is that it is more readable.
string.encode(encoding=encoding, errors=errors)
Here, string refers to the string you are looking to convert.
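- encoding: Optional. The encoding to use. In Python 3, the default is UTF-8.
- errors: Optional. A string naming the error-handling scheme to apply when a character cannot be encoded, such as 'strict' (the default, which raises a UnicodeEncodeError), 'ignore', or 'replace'.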
# Using the encode() method
# initializing string
str_1 = "Join our freelance network"
str_1_encoded = str_1.encode(encoding='UTF-8')

# printing the encoded string
print(str_1_encoded)

# printing individual bytes
# (the loop variable is named "byte" so it does not shadow the built-in bytes)
for byte in str_1_encoded:
    print(byte, end=' ')
b'Join our freelance network'
74 111 105 110 32 111 117 114 32 102 114 101 101 108 97 110 99 101 32 110 101 116 119 111 114 107
# Using the encode() method
# initializing string
str_1 = "Join our freelance network"
str_1_encoded = str_1.encode(encoding='UTF-8')

# printing the encoded string
print(str_1_encoded)

# decoding the string
str_1_decoded = str_1_encoded.decode()
print(str_1_decoded)
b'Join our freelance network'
Join our freelance network
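The optional errors argument only matters when the chosen encoding cannot represent a character. A small sketch, assuming an ASCII target and a sample string that is not from the tutorial, shows a few of the standard handlers:

```python
text = "naïve café"   # contains two non-ASCII characters

# 'strict' is the default: unencodable characters raise UnicodeEncodeError
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict failed:", exc.reason)

ignored = text.encode("ascii", errors="ignore")             # drops the characters
replaced = text.encode("ascii", errors="replace")           # substitutes "?"
escaped = text.encode("ascii", errors="backslashreplace")   # writes escape sequences

print(ignored)    # b'nave caf'
print(replaced)   # b'na?ve caf?'
print(escaped)    # b'na\\xefve caf\\xe9'
```

Note that 'ignore' and 'replace' lose information, so the result can no longer be decoded back to the original text.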
Limitations and Caveats — Python String to Byte
- Both methods solve the same problem efficiently, and choosing between them comes down to personal preference. However, I would recommend the second method for beginners.
- bytes() returns an immutable object. Hence, consider using bytearray() if you are looking for a mutable object.
- When bytes() is given an iterable of integers, each integer must satisfy 0 <= x < 256.
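The integer-related behaviour of bytes() can be illustrated with a short sketch; the values are arbitrary examples:

```python
# An iterable of integers in range(256) becomes the corresponding bytes.
from_ints = bytes([74, 111, 105, 110])
print(from_ints)          # b'Join'

# A single integer n creates n zero bytes; it is not converted to one byte.
zeros = bytes(3)
print(zeros)              # b'\x00\x00\x00'

# Any integer outside 0 <= x < 256 is rejected.
try:
    bytes([255, 256])
except ValueError as exc:
    print("out of range:", exc)
```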
Python Convert String to Byte
I am trying to do some serial input and output operations, and one of those is to send an 8×8 array to an external device (Arduino). The pySerial library requires that the information that I send be a byte. However, in my Python code, the 8×8 matrix is made up of elements of type str. Here’s my sending function:
import serial
import Matrix

width = 8
height = 8
portName = 'COM3'

def sendMatrix(matrix):
    try:
        port = serial.Serial(portName, 9600, timeout = 1000000)
        port.setDTR(0)
        print("Opened port: \"%s\"." % (portName))
        receivedByte = port.read()
        print(int(receivedByte))
        if (receivedByte == '1'):
            port.write('1')
        bytesWritten = 0
        for row in range(8):
            for col in range(8):
                value = matrix.getPoint(col, row)
                bytesWritten += port.write(value)  # ERROR HERE!
        print(int(port.read()));
        port.close()
        print("Data (%d) sent to port: \"%s\"." % (bytesWritten, portName))
    except:
        print("Unable to open the port \"%s\"." % (portName))

def main():
    matrix = Matrix.Matrix.readFromFile('framefile', 8, 8)
    matrix.print()
    print(type(matrix.getPoint(0, 0)))
    print(matrix.getPoint(1, 1))
    sendMatrix(matrix)

main()
Now, I have a class Matrix , which contains a field map , which is the array in question, and I will include that code here too, but the problem I’m having is that each element in the array is of type str , but I need to convert it to a byte. I can disregard possible loss of data, since in practice, I only use 0’s and 1’s. My Matrix Class:
class Matrix(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.map = [[0 for x in range(width)] for y in range(height)]

    def setPoint(self, x, y, value):
        if ((x >= 0) and (x < self.width) and (y >= 0) and (y < self.height)):
            self.map[y][x] = value

    def getPoint(self, x, y):
        if ((x >= 0) and (x < self.width) and (y >= 0) and (y < self.height)):
            return self.map[y][x]

    def print(self):
        for row in range(self.height):
            for col in range(self.width):
                print(str(self.map[row][col])+" ", end="")
            print()

    def save(self, filename):
        f = open(filename, 'w')
        for row in range(self.height):
            for col in range(self.width):
                f.write(str(self.map[row][col]))
            f.write('\n')
        f.close()

    def toByteArray(self):
        matrixBytes = bytearray(self.width * self.height)
        for row in range(self.height):
            for col in range(self.width):
                matrixBytes.append(int(self.map[row][col]))
        return matrixBytes

    def getMap(self):
        return self.map

    def readFromFile(filename, width, height):
        f = open(filename, 'r')
        lines = list(f)
        matrix = Matrix(width, height)
        f.close()
        for row in range(len(lines)):
            matrix.map[row] = lines[row].strip('\n')
        return matrix
matrix is an object, and it contains a width, height, and a 2 dimensional array inside of that. It's not simply an iterable structure.
Sure, but the code for your matrix.toByteArray method looks like it serializes the matrix data correctly into a bytearray, and the built-in bytes function will produce a bytes object from that bytearray.
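A rough sketch of the conversion being discussed, assuming the matrix rows are strings of '0' and '1' characters as the question describes. The helper name rows_to_bytes and the sample rows are hypothetical, and the serial port is left out so the snippet runs on its own. One detail worth checking: bytearray(n) creates n zero bytes, so this sketch starts from an empty bytearray before appending.

```python
def rows_to_bytes(rows):
    """Hypothetical helper: turn rows such as ['01', '10'] into bytes with values 0 and 1."""
    out = bytearray()               # start empty; bytearray(n) would pre-fill n zero bytes
    for row in rows:
        for cell in row:
            out.append(int(cell))   # '0' -> 0, '1' -> 1
    return bytes(out)

rows = ["01", "10"]                 # stand-in for the 8x8 matrix rows

payload = rows_to_bytes(rows)
print(payload)                      # b'\x00\x01\x01\x00'

# Alternative, if the receiving device expects the ASCII characters '0' and '1':
ascii_payload = "".join(rows).encode("ascii")
print(ascii_payload)                # b'0110'
```

Either result is a bytes object, which is the type the question says pySerial requires. Which of the two forms is appropriate depends on what the device on the other end expects.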
How to convert string to byte arrays?
How can I convert a string to its byte value? I have a string "hello" and I want to change it to something like "/x68..." .
You realize that it's all just bits and bytes at the lowest level and that the strings "hello" and "\x68\x65\x6C\x6C\x6F" are identical (unless you escape the backslashes instead of using them for hex escapes)?
This makes no sense, what do you actually want to do? (This is just some intermediate step you think you need to do)
What are you trying to do, exactly? It's worth noting that the str type in Python basically is just a set of bytes (meaning that it doesn't have a representation, like Unicode, attached and can just be an arbitrary sequence of bytes, despite its name).
We should add: if you're using Python 3, str is unicode. To convert it to bytes, do s.encode() (you can also specify what character encoding you want, otherwise it will use UTF-8).
For those who are wondering why you would want to do this: I found this to be useful for parsing binary data read using pySerial.
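To make the comments above concrete, here is a small Python 3 sketch. It shows that the encoded form of "hello" and the hex-escaped literal are the same bytes, and how a hex-escaped text representation could be produced if that is what is wanted (the variable names are illustrative):

```python
s = "hello"
b = s.encode()                          # UTF-8 by default in Python 3

print(b)                                # b'hello'
print(list(b))                          # [104, 101, 108, 108, 111]

# The hex-escaped literal denotes exactly the same bytes.
print(b == b"\x68\x65\x6c\x6c\x6f")     # True

# Building a text representation with visible \x escapes:
hex_escaped = "".join(f"\\x{byte:02x}" for byte in b)
print(hex_escaped)                      # \x68\x65\x6c\x6c\x6f
```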
3 Answers
Python 2.6 and later have a bytearray type which may be what you're looking for. Unlike strings, it is mutable, i.e., you can change individual bytes "in place" rather than having to create a whole new string. It has a nice mix of the features of lists and strings. And it also makes your intent clear, that you are working with arbitrary bytes rather than text.
quote "I want to change all what is in file (String) into byte array. " .. @kindall's answer does exactly that. +1 for bytearray()
Perhaps you want this (Python 2):
>>> map(ord,'hello')
[104, 101, 108, 108, 111]
For a Unicode string this would return Unicode code points:
>>> map(ord,u'Hello, 马克')
[72, 101, 108, 108, 111, 44, 32, 39532, 20811]
But encode it to get byte values for the encoding:
>>> map(ord,u'Hello, 马克'.encode('chinese'))
[72, 101, 108, 108, 111, 44, 32, 194, 237, 191, 203]
>>> map(ord,u'Hello, 马克'.encode('utf8'))
[72, 101, 108, 108, 111, 44, 32, 233, 169, 172, 229, 133, 139]
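The session above is Python 2. A possible Python 3 equivalent is sketched below: map() returns an iterator there, and iterating over a bytes object already yields integers, so list() is enough. The codec name "gb2312" is used here on the assumption that it is what the 'chinese' alias resolves to.

```python
text = "Hello, 马克"

# Unicode code points of each character
code_points = [ord(ch) for ch in text]
print(code_points)

# Byte values depend on the encoding chosen
utf8_values = list(text.encode("utf-8"))
gb_values = list(text.encode("gb2312"))
print(utf8_values)
print(gb_values)
```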
How to convert string to byte array in Python
Say that I have a 4 character string, and I want to convert this string into a byte array where each character in the string is translated into its hex equivalent. e.g.
What you want is not possible, at least not in this exact form. A bytearray of type B contains 1-byte integers, and they are always represented in decimal.
9 Answers
encode function can help you here, encode returns an encoded version of the string
In [44]: str = "ABCD"
In [45]: [elem.encode("hex") for elem in str]
Out[45]: ['41', '42', '43', '44']
or you can use array module
In [49]: import array
In [50]: print array.array('B', "ABCD")
array('B', [65, 66, 67, 68])
However, as you can see, the array module gives the ASCII values of the string's elements, which doesn't match your expected output.
This is the accepted answer and does not work in Python3. Could you please add the python3 version as pointed in other answers?
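For reference, a Python 3 sketch of the same idea follows. The "hex" codec for str.encode and passing a str to array.array('B', ...) are Python 2 only, so this version encodes to bytes first. It is one possible translation, not the original answerer's code.

```python
import array

s = "ABCD"
data = s.encode("ascii")                             # b'ABCD'

# Per-character hex strings, similar to the Python 2 output above
hex_pairs = [format(byte, "02x") for byte in data]
print(hex_pairs)                                     # ['41', '42', '43', '44']

# The whole string as one hex string
print(data.hex())                                    # 41424344

# array('B', ...) accepts a bytes object in Python 3
arr = array.array("B", data)
print(arr)                                           # array('B', [65, 66, 67, 68])
```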
Just use a bytearray() which is a list of bytes.
Python 2:
s = "ABCD"
b = bytearray()
b.extend(s)
Python 3:
s = "ABCD"
b = bytearray()
b.extend(map(ord, s))
By the way, don't use str as a variable name since that is builtin.
map(ord, s) will return values > 255 unless your strings are strictly ASCII. Please update your answer to include something like s.encode('utf-8') . (Note that UTF-8 is a strict superset of ASCII, so it does not alter ASCII strings in any way.)
@9000 it is incorrect to use .encode() as well as .encode('utf-8') . Use map(ord, ...) if you don't want your bytes to be transformed. repl.it/repls/MistySubtleVisitors just press run and see the result.
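The disagreement in these two comments can be illustrated with a short sketch (the sample characters are arbitrary). map(ord, s) yields code points, which match the Latin-1 encoding for characters up to U+00FF and fail for anything above 255, while UTF-8 uses multi-byte sequences for every non-ASCII character:

```python
s = "é"                                   # U+00E9

code_points = list(map(ord, s))
utf8_bytes = list(s.encode("utf-8"))
latin1_bytes = list(s.encode("latin-1"))

print(code_points)                        # [233]
print(utf8_bytes)                         # [195, 169]
print(latin1_bytes)                       # [233]

# Code points above 255 do not fit in a single byte.
try:
    bytearray(map(ord, "马"))
except ValueError as exc:
    print("cannot store code point in one byte:", exc)
```

Which approach is appropriate depends on whether the string holds text to be encoded or is being used as a container for raw byte values.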