Python sys stdout encoding

Encoding of Python stdout

Python determines the encoding of stdout and stderr based on the value of the LC_CTYPE variable, but only if the stdout is a tty. So if I just output to the terminal, LC_CTYPE (or LC_ALL) define the encoding. However, when the output is piped to a file or to a different process, the encoding is not defined, and defaults to 7-bit ASCII.

Example

#!/usr/bin/env python2 import sys print(sys.stdout.encoding)

If the output is a terminal, the output is self-explanatory:

% LC_ALL=C % ./encoding.py US-ASCII
% LC_ALL=en_GB.UTF-8 % ./encoding.py UTF-8

However, as soon as the output is piped to a logfile, you get a different behaviour in Python 2:

% ./encoding.py > encoding.out ; cat encoding.out None

This poses a problem when writing Unicode to log files:

#!/usr/bin/env python # encoding: utf-8 import sys sys.stderr.write(u"Convert currency ($1.00 = €0.75)\n")
% ./writelog.py Convert currency ($1.00 = €0.75)
% ./writelog.py 2> error.log ; cat error.log Traceback (most recent call last): File "./encoding.py", line 5, in sys.stderr.write(u"Convert currency ($1.00 = €0.75)\n") UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 26: ordinal not in range(128)

Overriding the Encoding of stdout or stderr

Basically, Python makes an educated guess about the encoding of stdout and stderr. Unfortunately, it is not possible to simple set the encoding of stdout if Pythons is wrong, like this:

TypeError: readonly attribute

My personal opinion is that Python fails it’s own guidelines, and I’m not the first to complain.

Читайте также:  Telegram api php class

The good news is that Python 3 no longer exhibits this behaviour.

The general solution is to explicitly feed binary data to stdout and stderr. This can be done by doing the encoding manually:

#!/usr/bin/env python # encoding: utf-8 import sys sys.stderr.write(u"Convert currency ($1.00 = €0.75)\n".encode('utf-8'))

Fixing the output on a given encoding may seem ugly if the output is always send to a terminal (this would even send UTF-8 to a non-UTF-8 terminal), but may be desirable if the output is often piped to other programs or to a file. It makes sure that the output is always consistent, regardless of the environment. This is a tremendous benefit when debugging.

Instead of calling encode() in each sys.stderr.write() , it is also possible to redirect stdout and/or stderr through a StreamWriter at the start of the program:

I recommend not to call sys.stdout.write, print, or sys.stderr.write directly, but use a wrapper function and call that:

def write(line): print(line.encode('utf-8'))

StreamWriter Wrapper around Stdout

Instead of calling encode() in each sys.stderr.write() , or using a wrapper function, it is also possible to redirect stdout and/or stderr through a StreamWriter at the start of the program:

if sys.stdout.encoding != 'UTF-8': sys.stdout = codecs.getwriter('utf-8')(sys.stdout, 'strict') if sys.stderr.encoding != 'UTF-8': sys.stderr = codecs.getwriter('utf-8')(sys.stderr, 'strict')
if sys.stdout.encoding != 'UTF-8': sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict') if sys.stderr.encoding != 'UTF-8': sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')

StreamWriter Wrapper with Dynamic Encoding

The above examples use a fixed encoding, which is highly recommended if the output is written to file, piped to another program or otherwise logged or processed.

If the output is never processed nor logged, but only for interactive debugging, it is also possible to let the output encoding depend on the environment. For example, sys.stdout.encoding , sys.stderr.encoding , locale.getpreferredencoding() , sys.getdefaultencoding() or -if none of these are known- US-ASCII. This is roughly what the print() function does, only slightly more robust, because it uses xml chars if a character can’t be displayed instead of raising a UnicodeEncodingError.

The following code snippet sets the encoding according to these variables. It also makes sure to replace non-ASCII characters with their XML-encoded counterpart if the desired output is ASCII.

#!/usr/bin/env python import sys,codecs,locale print(sys.stdout.encoding) print(sys.stderr.encoding) print(locale.getpreferredencoding()) if sys.stdout.encoding.upper() != 'UTF-8': encoding = sys.stdout.encoding or locale.getpreferredencoding() try: encoder = codecs.getwriter(encoding) except LookupError: sys.stdout.write("Warning: unknown encoding %s specified in locale().\n" % encoding) encoder = codecs.getwriter('UTF-8') if encoding.upper() != 'UTF-8': sys.stdout.write("Warning: stdout in %s formaat. Diacritical signs are represented in XML-coded format." % encoding) try: sys.stdout = encoder(sys.stdout.buffer, 'xmlcharrefreplace') except AttributeError: sys.stdout = encoder(sys.stdout, 'xmlcharrefreplace') if sys.stderr.encoding.upper() != 'UTF-8': encoding = sys.stderr.encoding or locale.getpreferredencoding() try: encoder = codecs.getwriter(encoding) except LookupError: sys.stderr.write("Warning: unknown encoding %s specified in locale().\n" % encoding) encoder = codecs.getwriter('UTF-8') if encoding.upper() != 'UTF-8': sys.stderr.write("Warning: stderr in %s formaat. Diacritical signs are represented in XML-coded format." % encoding) try: sys.stderr = encoder(sys.stderr.buffer, 'xmlcharrefreplace') except AttributeError: sys.stderr = encoder(sys.stderr, 'xmlcharrefreplace')

Further Reading

Источник

Python sys.stdout Method

Python sys.stdout Method

  1. Syntax of Python sys.stdout Method
  2. Example Codes: Use the sys.stdout Method in Python
  3. Example Codes: sys.stdout.write() vs print() Method
  4. Example Codes: Use the sys.stdout.write() Method to Display a List
  5. Example Codes: Use the sys.stdout.flush() Method in Python
  6. Example Codes: Use the sys.stdout.encoding() Method in Python
  7. Example Codes: Duplicate sys.stdout to a Log File

One of the methods in the sys module in Python is stdout , which uses its argument to show directly on the console window.

These kinds of output can be different, like a simple print statement, an expression, or an input prompt. The print() method, which has the same behavior, first converts to the sys.stdout() method and then shows the result on the console.

Syntax of Python sys.stdout Method

Parameters

No parameter is involved. We use sys.stdout for the output file object.

Return

This method doesn’t return any value but only shows output directly on the console.

Example Codes: Use the sys.stdout Method in Python

# import the sys module to use methods import sys  sys.stdout.write('This is my first line') sys.stdout.write('This is my second line') 
This is my first line This is my second line 

It will return the passed argument in the sys.stdout.write() method and display it on the screen.

Example Codes: sys.stdout.write() vs print() Method

import sys # print shows new line at the end print("First line ") print("Second line ") # displays output directly on console without space or newline sys.stdout.write('This is my first line ') sys.stdout.write('This is my second line ') # for inserting new line sys.stdout.write("\n") sys.stdout.write('In new line ') # writing string values to file.txt print('Hello', 'World', 2+3, file=open('file.txt', 'w')) 
First line Second line This is my first line This is my second line In new line # file.txt will be created with text "Hello World 5" as a string 

We use the sys.stdout.write() method to show the contents directly on the console, and the print() statement has a thin wrapper of the stdout() method that also formats the input. So, by default, it leaves a space between arguments and enters a new line.

After Python version 3.0, print() method not only takes stdout() method but also takes a file argument. To give a line space, we pass the «\n» to the stdout.write() method.

Example Codes: Use the sys.stdout.write() Method to Display a List

import sys  # sys.stdout assigned to "carry" variable carry = sys.stdout my_array = ['one', 'two', 'three']  # printing list items here for index in my_array:  carry.write(index) 
# prints a list on a single line without spaces onetwothree 

The output shows that the stdout.write() method gives no space or new line to the provided argument.

Example Codes: Use the sys.stdout.flush() Method in Python

import sys # import for the use of the sleep() method import time # wait for 5 seconds and suddenly shows all output for index in range(5):  print(index, end =' ')  time.sleep(1) print() # print one number per second till 5 seconds for index in range(5):  # end variable holds /n by default  print(index, end =' ')  sys.stdout.flush()  time.sleep(1) 
0 1 2 3 4 # no buffer 0 1 2 3 4 # use buffer 

The sys.stdout.flush() method flushes the buffer. It means that it will write things from buffer to console, even if it would wait before it does that.

Example Codes: Use the sys.stdout.encoding() Method in Python

# import sys module for stdout methods import sys # if the received value is not None, then the function prints repr(receivedValue) to sys.stdout def display(receivedValue):  if receivedValue is None:  return  mytext = repr(receivedValue)  # exception handling  try:  sys.stdout.write(mytext)  # handles two exceptions here  except UnicodeEncodeError:  bytes = mytext.encode(sys.stdout.encoding, 'backslashreplace')  if hasattr(sys.stdout, 'buffer'):  sys.stdout.buffer.write(bytes)  else:  mytext = bytes.decode(sys.stdout.encoding, 'strict')  sys.stdout.write(mytext)  sys.stdout.write("\n")  display("my name") 

The method sys.stdout.encoding() is used to change the encoding for the sys.stdout . In the method display() , we use it to evaluate an expression inserted in an interactive Python session.

There is an exception handler with two options: if the argument value is encodable, then encode with the backslashreplace error handler. Otherwise, if it is not encodable, it should encode with the sys.std.errors error handler.

Example Codes: Duplicate sys.stdout to a Log File

import sys # method for multiple log saving in txt file class multipleSave(object):  def __getattr__(self, attr, *arguments):  return self._wrap(attr, *arguments)  def __init__(self, myfiles):  self._myfiles = myfiles  def _wrap(self, attr, *arguments):  def g(*a, **kw):  for f in self._myfiles:  res = getattr(f, attr, *arguments)(*a, **kw)  return res  return g  sys.stdout = multipleSave([ sys.stdout, open('file.txt', 'w') ]) # all print statement works here print ('123') print (sys.stdout, 'this is second line') sys.stdout.write('this is third line\n') 
# file.txt will be created on the same directory with multiple logs in it. 123 __main__.multipleSave object at 0x00000226811A0048> this is second line this is third line 

To store output console results in a file, we can use the open() method to store it. We store all the console outputs in the same log file.

This way, we can store any output printed to the console and save it to the log file.

Related Article — Python Sys

Источник

Оцените статью