Convert pages to pdf python

How to convert a webpage into a PDF using Python

There are several ways to convert a webpage into a PDF using Python. Three common methods are explained here:

Method 1: Using the «pdfkit» library

  • Step 1: Install the «pdfkit» library by running the following command in your command prompt or terminal: pip install pdfkit

  • Step 2: Import the library with «import pdfkit» and call the «pdfkit.from_url()» function. The function takes two parameters: the URL of the webpage and the path to save the PDF file. For example, the following line converts the webpage «https://www.google.com» and saves the PDF file as «google.pdf» in the current working directory: pdfkit.from_url('https://www.google.com', 'google.pdf')

Method 2: Using the «reportlab» library

  • Step 1: Install the «reportlab» library by running the following command in your command prompt or terminal: pip install reportlab

  • Step 2: Import the «canvas» module: from reportlab.pdfgen import canvas

  • Step 3: Create a new PDF document by creating an instance of the «canvas.Canvas» class and passing the path to save the PDF file as a parameter.

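For example, the following line creates a new PDF document that will be saved as «google.pdf» in the current working directory: c = canvas.Canvas("google.pdf")

  • Step 4: Add content with the canvas drawing methods, then call «c.save()» to write the file. Coordinates are measured in points from the bottom-left corner of the page. For example, the following line adds the text «Hello World» to the PDF document at the coordinates (100, 750): c.drawString(100, 750, "Hello World")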
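Putting the reportlab steps together, a minimal sketch might look like the following. The helper names y_from_top and write_hello, the US Letter page size, and the 42-point offset are illustrative assumptions, not part of the steps above. Because reportlab counts y upward from the bottom edge, y = 750 lands near the top of a 792-point-tall Letter page.

```python
LETTER_HEIGHT_PT = 792  # US Letter: 11 in x 72 pt/in (assumed page size)


def y_from_top(offset_pt, page_height=LETTER_HEIGHT_PT):
    """Convert a distance from the top edge into reportlab's bottom-up y."""
    return page_height - offset_pt


def write_hello(path):
    """Create a one-page PDF at `path` containing the text "Hello World"."""
    # Third-party imports (pip install reportlab), kept inside the function so
    # the coordinate helper above works without reportlab installed.
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    c = canvas.Canvas(path, pagesize=letter)
    c.drawString(100, y_from_top(42), "Hello World")  # same point as (100, 750)
    c.showPage()  # finish the current page
    c.save()      # nothing is written to disk until save() is called
```

Calling write_hello("google.pdf") would then produce the file in the current working directory.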

Both of these methods produce a PDF from Python, but keep their limits in mind. The first method (pdfkit) mostly captures the static version of the webpage, so content loaded later by JavaScript may be missing. The second method (reportlab) does not fetch or render a webpage at all; it only adds the text and graphics that you draw yourself.

Other libraries can also render a webpage to PDF, for example WeasyPrint, described next. PyPDF2, by contrast, works on existing PDF files (merging, splitting, extracting text) and does not render HTML.

Method 3: Using the «WeasyPrint» library

  • Step 1: Install the «WeasyPrint» library by running the following command in your command prompt or terminal: pip install weasyprint

  • Step 2: Import the library with «import weasyprint», create an instance of the «weasyprint.HTML» class, and call its «write_pdf()» method. The class takes the URL of the webpage as a parameter and the method takes the path to save the PDF file as a parameter. For example, the following line converts the webpage «https://www.google.com» and saves the PDF file as «google.pdf» in the current working directory: weasyprint.HTML('https://www.google.com').write_pdf('google.pdf')

This method applies the page's CSS when rendering, but WeasyPrint does not execute JavaScript, so content generated by scripts will be missing from the PDF. You can also convert a local HTML file by passing the path to the file instead of the URL.
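As a rough sketch of the WeasyPrint method, the function below accepts either a URL or a local file path. The names html_source_kwargs and to_pdf are hypothetical helpers introduced for illustration; HTML(url=...), HTML(filename=...) and write_pdf() are WeasyPrint's own API.

```python
from urllib.parse import urlparse


def html_source_kwargs(source):
    """Pick the keyword argument for weasyprint.HTML() based on `source`."""
    if urlparse(source).scheme in ("http", "https"):
        return {"url": source}
    return {"filename": source}  # anything else is treated as a local path


def to_pdf(source, output_path):
    """Render a webpage or a local HTML file to `output_path`."""
    from weasyprint import HTML  # third-party: pip install weasyprint

    HTML(**html_source_kwargs(source)).write_pdf(output_path)
```

Naming the argument explicitly avoids relying on WeasyPrint to guess whether a string is a URL or a file name.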

It’s worth noting that some web pages may require additional headers or cookies to be passed in order to be properly rendered as a PDF.
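With pdfkit, one way to do this is through the options dictionary, whose keys mirror wkhtmltopdf command-line flags; repeatable flags such as custom-header and cookie take a list of (name, value) tuples. The sketch below assumes that approach. The helper names build_options and url_to_pdf are invented for illustration.

```python
def build_options(headers=None, cookies=None):
    """Build a pdfkit options dict from header and cookie mappings."""
    options = {}
    if headers:
        # Passed to wkhtmltopdf as: --custom-header <name> <value>
        options["custom-header"] = list(headers.items())
    if cookies:
        # Passed to wkhtmltopdf as: --cookie <name> <value>
        options["cookie"] = list(cookies.items())
    return options


def url_to_pdf(url, output_path, headers=None, cookies=None):
    """Fetch `url` with extra headers and cookies and save it as a PDF."""
    import pdfkit  # third-party: pip install pdfkit (needs wkhtmltopdf)

    return pdfkit.from_url(url, output_path,
                           options=build_options(headers, cookies))
```

A call might look like url_to_pdf('https://www.google.com', 'google.pdf', headers={'Accept-Language': 'en'}), where the header value is only an example.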

You can also convert a webpage into a PDF with a browser extension such as Save as PDF, or with online tools such as web2pdfconvert.com.

Conclusion

In conclusion, there are several ways to get a PDF of a webpage: the «pdfkit» library, the «WeasyPrint» library, or, outside Python, browser extensions and online tools. The «reportlab» library does not convert webpages, but it is good for building PDFs from text that you supply. Each method has its own advantages and disadvantages, so choose the one that best suits your needs: «pdfkit» is good for mostly static webpages, «WeasyPrint» handles HTML and CSS but not JavaScript, and browser extensions or online tools are good for quick, one-off conversions.

Python recipes: converting HTML and URLs to PDF and PS

To convert HTML and URLs to PDF and PS we need Python itself, the htmldoc generator, and the pyhtmldoc plugin. (I link to my own forks because I made some changes that I have not yet managed to get merged into the original repositories. You can also use the ready-made image.)

First, import the plugin.

To convert HTML and URLs to PDF and PS, use the following commands

pdf = file2pdf('file.html'.encode(), None)  # convert a FILE to PDF
ps = file2ps('file.html'.encode(), None)  # convert a FILE to PS
file2pdf('file.html'.encode(), 'file.pdf')  # convert a FILE to PDF and save the result to a file
file2ps('file.html'.encode(), 'file.ps')  # convert a FILE to PS and save the result to a file
pdf = file2pdf(['file1.html'.encode(), 'file2.html'.encode()], None)  # convert several FILEs to PDF
ps = file2ps(['file1.html'.encode(), 'file2.html'.encode()], None)  # convert several FILEs to PS
file2pdf(['file1.html'.encode(), 'file2.html'.encode()], 'file.pdf')  # convert several FILEs to PDF and save the result to a file
file2ps(['file1.html'.encode(), 'file2.html'.encode()], 'file.ps')  # convert several FILEs to PS and save the result to a file
pdf = html2pdf('Hello, world!'.encode(), None)  # convert HTML to PDF
ps = html2ps('Hello, world!'.encode(), None)  # convert HTML to PS
html2pdf('Hello, world!'.encode(), 'file.pdf')  # convert HTML to PDF and save the result to a file
html2ps('Hello, world!'.encode(), 'file.ps')  # convert HTML to PS and save the result to a file
pdf = html2pdf(['Hello, world!'.encode(), 'Goodbye, world!'.encode()], None)  # convert several HTML strings to PDF
ps = html2ps(['Hello, world!'.encode(), 'Goodbye, world!'.encode()], None)  # convert several HTML strings to PS
html2pdf(['Hello, world!'.encode(), 'Goodbye, world!'.encode()], 'file.pdf')  # convert several HTML strings to PDF and save the result to a file
html2ps(['Hello, world!'.encode(), 'Goodbye, world!'.encode()], 'file.ps')  # convert several HTML strings to PS and save the result to a file
pdf = url2pdf('https://google.com'.encode(), None)  # convert a URL to PDF
ps = url2ps('https://google.com'.encode(), None)  # convert a URL to PS
url2pdf('https://google.com'.encode(), 'file.pdf')  # convert a URL to PDF and save the result to a file
url2ps('https://google.com'.encode(), 'file.ps')  # convert a URL to PS and save the result to a file
pdf = url2pdf(['https://google.com'.encode(), 'https://google.ru'.encode()], None)  # convert several URLs to PDF
ps = url2ps(['https://google.com'.encode(), 'https://google.ru'.encode()], None)  # convert several URLs to PS
url2pdf(['https://google.com'.encode(), 'https://google.ru'.encode()], 'file.pdf')  # convert several URLs to PDF and save the result to a file
url2ps(['https://google.com'.encode(), 'https://google.ru'.encode()], 'file.ps')  # convert several URLs to PS and save the result to a file


Python – Convert HTML Page to PDF

PDF is one of the most widely used digital formats for saving and transferring documents. In this article, we will learn how to convert an HTML page to PDF.

What additional libraries or software do we need?

We will use the pdfkit library and wkhtmltopdf.

Install pdfkit

To install pdfkit, run the following pip command.

pip install pdfkit

Install wkhtmltopdf

Ubuntu or Debian users can install wkhtmltopdf using the apt-get command below.

sudo apt-get install wkhtmltopdf

Provide the password if prompted.

Windows users can download the wkhtmltopdf installer from the official wkhtmltopdf GitHub repository. The file is around 25 MB, so the download takes a moment.

Once it has downloaded, double-click the installer and complete the installation. It is usually installed at C:\Program Files\wkhtmltopdf. Add its bin folder to the system PATH variable in Environment Variables, for example C:\Program Files\wkhtmltopdf\bin.


If you run the Python program from the command prompt, restart the command prompt so that the PATH change takes effect.
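To confirm that the PATH change is visible to Python, a small check with the standard library's shutil.which can help. This is only a sketch; the function names find_wkhtmltopdf and describe_wkhtmltopdf are made up for illustration.

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path of wkhtmltopdf if it is on PATH, else None."""
    return shutil.which("wkhtmltopdf")


def describe_wkhtmltopdf():
    """Return a one-line status message about the wkhtmltopdf lookup."""
    path = find_wkhtmltopdf()
    if path is None:
        return "wkhtmltopdf not found on PATH"
    return f"wkhtmltopdf found at {path}"
```

Printing describe_wkhtmltopdf() in a freshly opened command prompt shows whether pdfkit will be able to locate the executable without extra configuration.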

Example 1: HTML to PDF using URL

Now that the environment is set up, here is a simple example that converts HTML to PDF, where the HTML is downloaded from a URL. We use the from_url() function.

import pdfkit
pdfkit.from_url('https://www.google.com/', 'sample.pdf')

The converted PDF file is saved to the current path in the command prompt or terminal.


Example 2: Convert HTML to PDF from Local File

If your HTML file is stored locally, you can use the from_file() function to convert the local HTML file to PDF.

import pdfkit
pdfkit.from_file('local.html', 'sample.pdf')

Example 3: Convert HTML String to PDF

If your HTML data is stored in a Python variable, you can use the from_string() function to convert the HTML string to PDF.

import pdfkit
htmlstr = '<h2>Heading 2</h2><p>Sample paragraph.</p>'
pdfkit.from_string(htmlstr, 'sample.pdf')


Summary

We have successfully converted HTML data to PDF, taking the HTML from a URL, a local file, or a string.

Convert HTML to PDF using Python

In this tutorial we will explore how to convert HTML files to PDF using Python.


    Convert HTML file to PDF using Python

    Let’s start with converting an HTML file to PDF using Python.

    The sample.html file is located in the same directory as the main.py file with the code:

    First, we need to find the path to the wkhtmltopdf executable file, wkhtmltopdf.exe.

    Recall that we installed it in C:\Program Files\wkhtmltopdf, meaning that the .exe file is in that folder. Navigating to it, you should see that the path to the executable file is: C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe

    Now we have everything we need and can convert the HTML file to PDF using Python:
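A minimal sketch of this step, assuming the install location and file names mentioned in the text: the executable path is handed to pdfkit through pdfkit.configuration() and then passed to from_file(). The function names wkhtmltopdf_exe and html_file_to_pdf are hypothetical.

```python
from pathlib import PureWindowsPath


def wkhtmltopdf_exe(install_dir=r"C:\Program Files\wkhtmltopdf"):
    """Return the path to wkhtmltopdf.exe inside a Windows install directory."""
    return str(PureWindowsPath(install_dir) / "bin" / "wkhtmltopdf.exe")


def html_file_to_pdf(html_path="sample.html", pdf_path="sample.pdf"):
    """Convert a local HTML file to PDF using an explicit wkhtmltopdf path."""
    import pdfkit  # third-party: pip install pdfkit

    # Tell pdfkit where the executable lives instead of relying on PATH.
    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_exe())
    return pdfkit.from_file(html_path, pdf_path, configuration=config)
```

The same configuration object can be passed to pdfkit.from_url() when the input is a webpage rather than a file.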

    You should then see sample.pdf created in the same directory.

    Convert Webpage to PDF using Python

    Using the pdfkit library you can also convert webpages into PDF using Python.

    In this section we will reuse most of the code from the previous section, except that instead of an HTML file we will use the URL of a webpage and the from_url() function of the pdfkit module.

    You should then see webpage.pdf created in the same directory.

    Conclusion

    In this article we explored how to convert HTML to PDF using Python and wkhtmltopdf.
