Python byte string to bytes

Converting between bytes and strings

Working with bytes cannot be avoided. For example, when working with the network or the file system, results most often come back as bytes.

So you need to know how to convert bytes to a string and back. This is what an encoding is for.

An encoding can be thought of as an encryption key that specifies:

  • how to "encrypt" a string into bytes (str -> bytes). The encode method is used (similar to encrypt)
  • how to "decrypt" bytes into a string (bytes -> str). The decode method is used (similar to decrypt)

This analogy makes it clear that the string-to-bytes and bytes-to-string conversions must use the same encoding.
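As a rough illustration of why both directions need the same encoding, the sketch below encodes a string as UTF-8 and then decodes it three different ways. The variable names are illustrative only.

```python
# Encode once, then decode with matching and mismatching encodings.
text = 'привет'
data = text.encode('utf-8')

# Same encoding in both directions: the original string comes back.
same = data.decode('utf-8')

# Different encoding: decoding "succeeds" but the text is garbled,
# because latin-1 maps every byte to some character.
garbled = data.decode('latin-1')

# An encoding that cannot interpret these bytes at all: decoding fails.
try:
    data.decode('ascii')
    failed = False
except UnicodeDecodeError:
    failed = True

print(same == text)     # True
print(garbled == text)  # False
print(failed)           # True
```

A mismatch does not always raise an error, which is why garbled text can go unnoticed until it is displayed.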

encode, decode

To convert a string to bytes, use the encode method:

In [1]: hi = 'привет'

In [2]: hi.encode('utf-8')
Out[2]: b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [3]: hi_bytes = hi.encode('utf-8')

To get a string from bytes, use the decode method:

In [4]: hi_bytes
Out[4]: b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [5]: hi_bytes.decode('utf-8')
Out[5]: 'привет'

str.encode, bytes.decode

The encode method can also be called through the str class (like the other string methods):

In [6]: hi
Out[6]: 'привет'

In [7]: str.encode(hi, encoding='utf-8')
Out[7]: b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

And the decode method can be called through the bytes class (like its other methods):

In [8]: hi_bytes
Out[8]: b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [9]: bytes.decode(hi_bytes, encoding='utf-8')
Out[9]: 'привет'

In these methods the encoding can be passed as a keyword argument (as in the examples above) or as a positional one:

In [10]: hi_bytes
Out[10]: b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

In [11]: bytes.decode(hi_bytes, 'utf-8')
Out[11]: 'привет'
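The call forms shown in this section are interchangeable. The following sketch compares them side by side and also relies on the fact that, in Python 3, encode and decode fall back to UTF-8 when no encoding is given. The variable names are illustrative.

```python
hi = 'привет'

# encode() and decode() default to UTF-8 when no encoding is given.
default_bytes = hi.encode()
positional_bytes = hi.encode('utf-8')
keyword_bytes = hi.encode(encoding='utf-8')
via_class = str.encode(hi, 'utf-8')

all_equal = default_bytes == positional_bytes == keyword_bytes == via_class
print(all_equal)               # True
print(default_bytes.decode())  # привет
```

Spelling the encoding out explicitly is still a reasonable habit, since it documents the intent at the call site.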

How to work with Unicode and bytes

There is a very simple rule that helps avoid at least some of the problems. It is called the "Unicode sandwich":

  • bytes that the program reads should be converted to Unicode (a string) as early as possible
  • inside the program, work with Unicode
  • Unicode should be converted to bytes as late as possible, right before output
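The three layers of the sandwich can be sketched roughly as follows. The helper name count_words is hypothetical, and io.BytesIO is used here only as a stand-in for a socket or a file opened in binary mode.

```python
import io


def count_words(raw: bytes, encoding: str = 'utf-8') -> bytes:
    """Hypothetical helper: bytes in, bytes out, str in the middle."""
    # 1. Decode as early as possible.
    text = raw.decode(encoding)
    # 2. Work with str inside the program.
    report = f'{len(text.split())} words: {text.upper()}'
    # 3. Encode as late as possible, right before output.
    return report.encode(encoding)


# io.BytesIO stands in for a socket or a file opened in binary mode.
source = io.BytesIO('привет мир'.encode('utf-8'))
sink = io.BytesIO()
sink.write(count_words(source.read()))

print(sink.getvalue().decode('utf-8'))  # 2 words: ПРИВЕТ МИР
```

Keeping the decode and encode calls at the edges means string operations such as split and upper always see characters, never raw bytes.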


How to convert a Python string to bytes?

In this tutorial, we look at how to convert a Python string to bytes. We cover the common methods along with their limitations and caveats.


Python String to Bytes:

Converting Python strings to bytes became a common task after the release of Python 3, largely because file handling and many machine learning methods require it. Before we dive into how to convert them, let us first understand what they are and how they differ.

In Python 2, strings and bytes were the same type. In Python 3, bytes objects are sequences of bytes, while strings are sequences of characters. In essence, strings are human-readable, and for them to become machine-readable they must be converted to bytes objects. This conversion also allows the data to be stored directly on disk.
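The difference between a sequence of characters and a sequence of bytes can be seen in a small sketch. The sample text is an arbitrary choice, picked because its characters take more than one byte each in UTF-8.

```python
text = 'привет'
data = text.encode('utf-8')

print(len(text))  # 6 characters
print(len(data))  # 12 bytes: each Cyrillic letter takes 2 bytes in UTF-8
print(text[0])    # п (indexing a str gives a one-character str)
print(data[0])    # 208 (indexing bytes gives an int, here 0xd0)
```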

The process of converting string objects to byte objects is called encoding and the inverse is called decoding. We look at methods to achieve this below.

Methods to convert strings to bytes:

There are many ways to convert a Python string to bytes. Here we look at the two most common and simple ones.

Using bytes():

bytes() is a built-in function that converts objects to bytes objects.

It takes an object (a string in our case) and the required encoding, and converts the object into a bytes object. bytes() also accepts a third argument that specifies how to handle errors.

Let us look at the code to convert a Python string to bytes. The encoding type we use here is “UTF-8”.

# Using the bytes() function

# initializing string
str_1 = "Join our freelance network"

str_1_encoded = bytes(str_1, 'UTF-8')

# printing the encoded string
print(str_1_encoded)

# printing individual bytes (named "byte" to avoid shadowing the built-in bytes)
for byte in str_1_encoded:
    print(byte, end=' ')

b'Join our freelance network'
74 111 105 110 32 111 117 114 32 102 114 101 101 108 97 110 99 101 32 110 101 116 119 111 114 107

As you can see this method has converted the string into a sequence of bytes.

Note: bytes() returns an immutable object. If you need a mutable one, use bytearray() instead.
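A short sketch of the difference in mutability, using an arbitrary sample string:

```python
immutable = bytes('Join', 'UTF-8')
mutable = bytearray('Join', 'UTF-8')

# bytes does not support item assignment.
try:
    immutable[0] = ord('C')
    bytes_changed = True
except TypeError:
    bytes_changed = False

# bytearray does: replace 'J' (74) with 'C' (67) in place.
mutable[0] = ord('C')

print(bytes_changed)   # False
print(mutable)         # bytearray(b'Coin')
print(bytes(mutable))  # b'Coin'
```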

Using encode():

The encode() method is the most commonly used and recommended method to convert Python strings to bytes. A major reason is that it is more readable.

string.encode(encoding=encoding, errors=errors) 

Here, string refers to the string you are looking to convert.

  1. encoding: Optional. The encoding to use. In Python 3, the default is UTF-8.
  2. errors: Optional. A string naming the error-handling scheme, for example 'strict' (the default), 'ignore' or 'replace'.
# Using the encode method

# initializing string
str_1 = "Join our freelance network"

str_1_encoded = str_1.encode(encoding='UTF-8')

# printing the encoded string
print(str_1_encoded)

# printing individual bytes (named "byte" to avoid shadowing the built-in bytes)
for byte in str_1_encoded:
    print(byte, end=' ')

b'Join our freelance network'
74 111 105 110 32 111 117 114 32 102 114 101 101 108 97 110 99 101 32 110 101 116 119 111 114 107
# Using the encode method

# initializing string
str_1 = "Join our freelance network"

str_1_encoded = str_1.encode(encoding='UTF-8')

# printing the encoded string
print(str_1_encoded)

# decoding the bytes back to a string
str_1_decoded = str_1_encoded.decode()
print(str_1_decoded)

b'Join our freelance network'
Join our freelance network
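The effect of the errors argument is easiest to see with a character the target encoding cannot represent. The sketch below uses an arbitrary sample string and encodes it as ASCII under several of the standard error-handling schemes.

```python
text = 'caf\u00e9'  # 'café'; the last character is not representable in ASCII

# The default scheme, 'strict', raises on unencodable characters.
try:
    text.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

ignored = text.encode('ascii', errors='ignore')
replaced = text.encode('ascii', errors='replace')
escaped = text.encode('ascii', errors='backslashreplace')

print(strict_failed)  # True
print(ignored)        # b'caf'
print(replaced)       # b'caf?'
print(escaped)        # b'caf\\xe9'
```

Note that 'ignore' and 'replace' lose information, so the original string cannot be recovered from the result.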

Limitations and Caveats — Python String to Byte

  • Both methods solve the same problem efficiently, and choosing between them comes down to personal preference. However, I would recommend the second method, encode(), for beginners.
  • bytes() returns an immutable object. Consider using bytearray() if you need a mutable one.
  • When bytes() is given an iterable of integers, each value must be in the range 0 <= x < 256.
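The following sketch shows how bytes() behaves with non-string arguments, including the range check on integer values. The sample values are arbitrary.

```python
# An iterable of ints in range(256) becomes the corresponding bytes.
from_ints = bytes([74, 111, 105, 110])

# A single int n creates n zero bytes, not one byte with value n.
zeros = bytes(3)

# Values outside 0 <= x < 256 are rejected.
try:
    bytes([256])
    out_of_range_ok = True
except ValueError:
    out_of_range_ok = False

# A str without an encoding is rejected as well.
try:
    bytes('Join')
    missing_encoding_ok = True
except TypeError:
    missing_encoding_ok = False

print(from_ints)            # b'Join'
print(zeros)                # b'\x00\x00\x00'
print(out_of_range_ok)      # False
print(missing_encoding_ok)  # False
```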


Python String to bytes, bytes to String


In this article, we look at converting a Python String to bytes and Python bytes to a String. Type conversion matters because the same data is often needed in a different form by different operations.

Converting between String and bytes is necessary, for example, when handling files.

Python String to bytes

Either of the following ways can be used to convert Python String to bytes:

1. Python String to bytes using bytes() method

Python's built-in bytes() function converts a String to bytes.

Note: The UTF-8 format is used for the purpose of encoding.

inp = "Engineering Discipline"

print("Input String:\n")
print(inp)

# convert the string to bytes using UTF-8
opt = bytes(inp, 'utf-8')

print("String after getting converted to bytes:\n")
print(opt)
print(type(opt))

Input String:

Engineering Discipline
String after getting converted to bytes:

b'Engineering Discipline'
<class 'bytes'>

2. Python String to bytes using encode() method

Python’s encode() method can also be used to convert a String to byte format.

inp = "Engineering Discipline"

print("Input String:\n")
print(inp)

# convert the string to bytes using UTF-8
opt = inp.encode('utf-8')

print("String after getting converted to bytes:\n")
print(opt)
print(type(opt))

Input String:

Engineering Discipline
String after getting converted to bytes:

b'Engineering Discipline'
<class 'bytes'>

Python bytes to String

Python's bytes class has a built-in decode() method that converts bytes to a String.

inp = "Engineering Discipline"

print("Input String:\n")
print(inp)

# convert the string to bytes using UTF-8
opt = inp.encode('utf-8')

print("String after getting converted to bytes:\n")
print(opt)
print(type(opt))

# convert the bytes back to a string
original = opt.decode('utf-8')

print("The decoded String i.e. byte to converted string:\n")
print(original)

In the above example, we first convert the input string to bytes using the encode() method. The decode() method then converts the encoded input back to the original string.

Input String:

Engineering Discipline
String after getting converted to bytes:

b'Engineering Discipline'
<class 'bytes'>
The decoded String i.e. byte to converted string:

Engineering Discipline
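decode() accepts the same kind of errors argument as encode(), which matters when the input bytes are not valid in the chosen encoding. The sketch below uses a made-up input containing one invalid byte.

```python
# b'\xff' is never valid in UTF-8, so this input cannot be decoded strictly.
raw = b'Engineering \xff Discipline'

try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 'replace' substitutes U+FFFD, the Unicode replacement character.
replaced = raw.decode('utf-8', errors='replace')

# 'ignore' silently drops the undecodable byte.
ignored = raw.decode('utf-8', errors='ignore')

print(strict_ok)       # False
print(repr(replaced))  # 'Engineering \ufffd Discipline'
print(repr(ignored))   # 'Engineering  Discipline'
```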

Pandas bytes to String

The pandas module provides the Series.str.decode() method to convert encoded data, that is, data in bytes format, to String format.

input_string.decode(encoding = 'UTF-8')

import pandas

inp = pandas.Series([b"b'Jim'", b"b'Jonny'", b"b'Shawn'"])

print("Encoded String:")
print(inp)

# decode every bytes element of the Series to a string
opt = inp.str.decode(encoding = 'UTF-8')

print("\n")
print("Decoded String:")
print(opt)

In the above example, we assume the data arrives in encoded (bytes) form and decode it. Further manipulations can then be performed on the decoded data.

Encoded String:
0      b"b'Jim'"
1    b"b'Jonny'"
2    b"b'Shawn'"
dtype: object


Decoded String:
0      b'Jim'
1    b'Jonny'
2    b'Shawn'
dtype: object

Conclusion

In this article, we covered converting a Python String to bytes and back, along with the underlying concepts of encoding and decoding.

References

Python String to bytes, bytes to String – JournalDev
