How to convert a PDF from base64 string to a file?
Can you tell me what is a change you have done in that because I have the facing same issue for that type of string.
6 Answers 6
From my understanding base64decode only takes in a base64 string and looks like you have some headers on your string that are not encoded.
I would remove «data:application/pdf;base64,»
When I’ve used it in the past, I have only used the encoded string.
Does writing it by using the codecs.decode function work? also as Mark stated, you can try to remove the data:application/pdf;base64, portion of the string as this section of the string is not to be decoded.:
import codecs base64String = "JVBERi0xLjQKJeHp69MKMSAwIG9iago8PC9Qcm9kdWNlciAoU2tpYS9. " with open("test.pdf", "wb") as f: f.write(codecs.decode(base64string, "base64"))
import base64 base64String = "JVBERi0xLjQKJeHp69MKMSAwIG9iago8PC9Qcm9kdWNlciAoU2tpYS9. " with open("test.pdf", "wb") as f: f.write(base64.b64decode(base64string))
This is not just base64 encoded data, but data-uri encoded:
There is another post on stack overflow asking how to parse such strings in Python:
The gist of it is to remove the header (everything up to and including the first comma):
theFile.write(base64.b64decode(base64String.split(",")[1:2]))
NOTE: I use [1:2] instead of [1] because it won’t throw an exception if there is only 1 element in the list because nothing follows the comma (empty data).
from base64 import b64decode def base64_to_pdf(file): file_bytes = b64decode(file, validate=True) if file_bytes[0:4] != b"%PDF": raise ValueError("Missing the PDF file signature") with open("file.pdf", "wb") as f: return f.write(file_bytes)
for some reason above code didnt work for me but below worked. import base64 base64String = "JVBERi0xLjQKJeHp69MKMSAwIG9iago8PC9Qcm9kdWNlciAoU2tpYS9. " data = base64.b64decode(base64string) with open("test.pdf", "wb") as f: f.write(data)
yeah if am using f.write(base64.b64decode(base64string)) am not getting any error but getting an empty pdf file. adding additional step to decode in another variable and then writing it to file is working fine.. if i am using b64decode(file, validate=True) it is giving me error like non base64 character in your string. probably its because of pdf has an embedded image.
Convert Base64 To PDF In Python
Last updated June 2, 2023 by Jarvis Silva In this tutorial we will see how to convert base64 to pdf using python, base64 is used to encode binary data so inorder to convert base64 to pdf we will use the python library Base64, it will allows us to decode base64.
Python Code To Convert Base64 To PDF
import base64 b64 = 'enter base64 code here' bytes = base64.b64decode(b64, validate=True) f = open('sample.pdf', 'wb') f.write(bytes) f.close()
- The code imports the base64 module for working with Base64 encoding and decoding.
- There is a variable named b64 which should contain the Base64-encoded string you want to decode.
- The code uses base64.b64decode() to decode the Base64 string and stores the result in the bytes variable.
- It opens a file named sample.pdf in write-binary mode (‘wb’) using open() .
- The decoded binary data is written to the file using f.write(bytes) .
- Finally, the file is closed using f.close() .
After running the program you will see that it printed the base64 version of the pdf file so now if you want to convert it back to base64 then refer this tutorial: Convert pdf to base64 in python.
Here are some more python programs you will find helpful:
- Convert GPX to CSV file in python.
- Convert IPYNB file to python file.
- Convert Wav to mp3 format in python.
- Convert Miles to feet in python.
- Convert Hex To RGB In Python.
- Convert RGB To Hex In Python.
- Convert HTML To Docx In Python.
I hope you found what you were looking for from this python tutorial, and if you want more python tutorials like this, do join our Telegram channel to get updated.
Thanks for reading, have a nice day 🙂
How to Convert base64 to PDF in Python?
Now, let’s see post of how to convert base64 to pdf in python. I explained simply about python convert base64 to pdf. I’m going to show you about how to convert base64 string to pdf in python. This example will help you how to decode base64 string to pdf in python.
If you are looking to convert base64 string to an pdf in python then I will help you with that. I will give you a very simple example, we will take one variable «pdfData» with base64 string and then convert it into the pdf. we will use base64 library. So, without further ado, let’s see simple examples: You can use these examples with python3 (Python 3) version.
# Import base64 Module import base64 # PDF Base64 String pdfData = "JVBERi0xLjMNCiXi48/TDQoNCjEgMCBvYmoNCjw8DQovVHlw. "; # Decode base64 String Data decodedData = base64.b64decode((pdfData)) # Write PDF from Base64 File pdfFile = open('sample.pdf', 'wb') pdfFile.write(decodedData) pdfFile.close()
Hardik Savani
I’m a full-stack developer, entrepreneur and owner of Aatman Infotech. I live in India and I love to write tutorials and tips that can help to other artisan. I am a big fan of PHP, Laravel, Angular, Vue, Node, Javascript, JQuery, Codeigniter and Bootstrap from the early stage. I believe in Hardworking and Consistency.
We are Recommending you
- How to Get Last Value in Dictionary Python?
- How to Get First Value in Dictionary Python?
- How to Convert Dictionary to JSON in Python?
- How to Check if Python Dictionary is Empty or Not?
- How to Change Value in Python Dictionary?
- Python Pandas Read Excel File Multiple Sheets Example
- How to Create Excel File in Python?
- How to Open and Read Xlsx File in Python?
- Python Remove Empty String from List Example
- Python Delete Folder and Files Recursively Example
- Python Delete File if Exists Example
- How to Add Seconds to DateTime in Python?
- Python PATCH Request with Parameters Example
How to base64 encode a PDF file in Python
Actually, that fact turned out to be the answer. I just didn’t know about it when I asked the question.
4 Answers 4
If you don’t want to use the xmlrpclib’s Binary class, you can just use the .encode() method of strings:
a = open("pdf_reference.pdf", "rb").read().encode("base64")
Doesn’t this read the entire file by calling read() before encoding it? Is that how it’s supposed to work? I can’t imagine encoding a multi-MB file or larger with this.
@Shurane it’s a one-line example of using the encode method of strings. Optimising performance will be application specific and is left as an exercise.
this no longer works in Python 3, try this: base64.b64encode(open(‘path/to/your.pdf’, ‘rb’).read()) credit: stackoverflow.com/a/43084065/5125264
Actually, after some more digging, it looks like the xmlrpclib module may have the piece I need with it’s Binary helper class:
binary_obj = xmlrpclib.Binary( open('foo.pdf').read() )
import xmlrpclib server = xmlrpclib.ServerProxy("http://athomas:password@localhost:8080/trunk/login/xmlrpc") server.wiki.putAttachment('WikiStart/t.py', xmlrpclib.Binary(open('t.py').read()))
You can do it with the base64 library, legacy interface.
Looks like you might be able to use the binascii module
binascii.b2a_base64(data)
Convert binary data to a line of ASCII characters in base64 coding. The return value is the converted line, including a newline char. The length of data should be at most 57 to adhere to the base64 standard.