- Use Python to remove image watermarks
- Gadesme/watermark-remover
- akash-rajak/Image-Watermark-Remover
Use Python to remove image watermarks
This article removes watermarks from PDF pages using Python. Both the idea and the code are simple.
First we look at how to remove a watermark from a single image in Python, then reuse the same idea for a PDF.
The sample image is a screenshot from a PDF on data structures and algorithms, carrying the watermark of an official account.
In the screenshot, the watermark is light in color so that it does not interfere with reading the text. We can use this color difference to remove it: read the color of every pixel in the image with Python and turn the light-colored pixels white.
PIL (Python Imaging Library) is not part of the Python standard library, and the original PIL only supports Python 2. On Python 3, install Pillow, the maintained fork of PIL, which is still imported under the name PIL:

    pip install pillow

Once the installation is complete, open the picture and read its dimensions (width and height):

    from PIL import Image

    img = Image.open('watermark_pic.png')
    width, height = img.size
Before we move on, a few words about how computers represent color. The three primary colors of light are red, green, and blue (RGB). They cannot be decomposed further, and other colors can be produced by mixing them. All three at full, equal intensity give white; no light at all gives black.
In a computer, an RGB color can be stored in three bytes, one per channel. The maximum value of one byte is 255, so (255, 0, 0) represents red, (0, 255, 0) represents green, and (0, 0, 255) represents blue. Accordingly, (255, 255, 255) represents white and (0, 0, 0) represents black. Every combination from (0, 0, 0) to (255, 255, 255) represents a different color.
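To make the byte representation concrete, here is a minimal sketch that converts between hexadecimal color strings and RGB triples. The helper names `hex_to_rgb` and `rgb_to_hex` are hypothetical and introduced only for illustration; they are not part of Pillow.

```python
def hex_to_rgb(hex_color):
    """Convert a hex color such as '#D9D9D9' to an (R, G, B) tuple."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_color!r}")
    # Each pair of hex digits is one byte: 00..FF maps to 0..255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (R, G, B) tuple back to an uppercase hex string."""
    return "#" + "".join(f"{channel:02X}" for channel in rgb)


print(hex_to_rgb("#FF0000"))        # pure red
print(hex_to_rgb("#D9D9D9"))        # a light gray
print(rgb_to_hex((255, 255, 255)))  # white
```

Because each channel is one byte, two hex digits per channel are always enough to describe it.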
Next, we can read the RGB value of every pixel in the picture with the following code:

    for i in range(width):
        for j in range(height):
            pos = (i, j)
            # Keep only the first three values (R, G, B).
            print(img.getpixel(pos)[:3])
In this picture, the color at each position is a 4-tuple. The first three values are RGB and the fourth is the alpha (transparency) channel, which we can ignore here.
Once we have the RGB values, we can modify them.
Inspecting the image shows that the watermark color is #D9D9D9 in hexadecimal, which is (217, 217, 217) in decimal.
The closer each of these values gets to 255, the lighter the color, and when all three reach 255 the color is white. So any pixel at least as light as the watermark can be turned white. As a simple test, we treat a pixel as light when the sum of its three RGB values is greater than or equal to 651 (217 × 3):

    if sum(img.getpixel(pos)[:3]) >= 651:
        img.putpixel(pos, (255, 255, 255))
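The sum-based test is a shortcut, and it is slightly looser than requiring every channel to be at least 217. The following sketch compares the two rules on a few sample pixels. The function names and the sample values are assumptions made for illustration, and the per-channel rule is an alternative, not the method used in the article.

```python
WATERMARK_LEVEL = 217                 # from the article: #D9D9D9 is (217, 217, 217)
SUM_THRESHOLD = 3 * WATERMARK_LEVEL   # 651, the threshold used in the article


def is_light_by_sum(pixel):
    """The article's rule: the R + G + B sum reaches the threshold."""
    # pixel[:3] works for both RGB triples and RGBA 4-tuples.
    return sum(pixel[:3]) >= SUM_THRESHOLD


def is_light_per_channel(pixel):
    """A stricter alternative: every channel is at least as light as the watermark."""
    return all(channel >= WATERMARK_LEVEL for channel in pixel[:3])


samples = {
    "watermark gray (RGBA)": (217, 217, 217, 255),
    "black text": (0, 0, 0),
    "pale yellow": (255, 255, 141),   # sums to 651, but blue is well below 217
    "slightly darker gray": (216, 217, 217),
}

for name, pixel in samples.items():
    print(name, is_light_by_sum(pixel), is_light_per_channel(pixel))
```

For a black-and-white page with a gray watermark the two rules behave the same. On pages with colored content, the sum rule can also whiten saturated light colors such as the pale yellow above.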
The complete code is as follows:

    from PIL import Image

    img = Image.open('watermark_pic.png')
    width, height = img.size

    for i in range(width):
        for j in range(height):
            pos = (i, j)
            # Whiten any pixel at least as light as the watermark (217 * 3 = 651).
            if sum(img.getpixel(pos)[:3]) >= 651:
                img.putpixel(pos, (255, 255, 255))

    img.save('watermark_removed_pic.png')
With this foundation, removing the watermark from a PDF is straightforward: convert each PDF page into an image, whiten the watermark pixels, and save the image.
Install the PyMuPDF library to work with PDF files:

    pip install pymupdf

Open the PDF and render each page to an image:

    import fitz  # PyMuPDF

    doc = fitz.open("Manual of Data Structures and Algorithms @ public code.pdf")
    for page in doc:
        pix = page.get_pixmap()
The PDF has 480 pages, so we iterate over every page and render each one to an image, pix. The pix object is similar to the img object above: its RGB values can be read and modified.
Rendering with page.get_pixmap() is one-way: it converts a PDF page to an image, but after the image's RGB values are changed, the result cannot be written back into the PDF and can only be saved as an image.
Modify the watermark RGB values as before. The difference is that each pixel here is an RGB triple with no alpha channel. The code is as follows:

    from itertools import product

    for pos in product(range(pix.width), range(pix.height)):
        if sum(pix.pixel(pos[0], pos[1])) >= 651:
            pix.set_pixel(pos[0], pos[1], (255, 255, 255))
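The `itertools.product` call replaces the two nested loops used for the single image. The sketch below checks that equivalence and applies the same whitening rule to a plain nested list that stands in for pixel data, so it runs without PyMuPDF. The function `whiten_light_pixels` and the `grid[y][x]` layout are assumptions for illustration only.

```python
from itertools import product

# product(range(w), range(h)) yields the same (x, y) pairs, in the same
# order, as a nested loop with x outside and y inside.
width, height = 3, 2
nested = [(x, y) for x in range(width) for y in range(height)]
flat = list(product(range(width), range(height)))
print(nested == flat)


def whiten_light_pixels(grid, threshold=651):
    """Whiten every pixel whose R + G + B sum reaches the threshold.

    grid is a list of rows, each row a list of (R, G, B) tuples,
    indexed as grid[y][x]. The grid is modified in place and returned.
    """
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    for x, y in product(range(cols), range(rows)):
        if sum(grid[y][x]) >= threshold:
            grid[y][x] = (255, 255, 255)
    return grid


page = [
    [(0, 0, 0), (217, 217, 217)],
    [(230, 230, 230), (100, 100, 100)],
]
print(whiten_light_pixels(page))
```

Dark pixels (the text) are left untouched, while pixels at or above the watermark's lightness become white.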
The complete code is as follows:

    from itertools import product

    import fitz  # PyMuPDF

    doc = fitz.open("Manual of Data Structures and Algorithms @ public code.pdf")

    page_no = 0
    for page in doc:
        pix = page.get_pixmap()
        for pos in product(range(pix.width), range(pix.height)):
            if sum(pix.pixel(pos[0], pos[1])) >= 651:
                pix.set_pixel(pos[0], pos[1], (255, 255, 255))
        # The pdf_pics directory must already exist.
        pix.pil_save(f"pdf_pics/page_{page_no}.png", dpi=(30000, 30000))
        print(f'Page {page_no} watermark removal done')
        page_no += 1
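With 480 pages, file names built from a bare page number do not sort in page order, because "page_10.png" sorts before "page_2.png". A small optional refinement is to zero-pad the page number. The helper `page_filename` below is hypothetical and assumes zero-based page numbers, as in the loop above.

```python
def page_filename(page_no, total_pages, out_dir="pdf_pics"):
    """Build a zero-padded output path so files sort in page order."""
    # Width of the largest zero-based page index, e.g. 479 needs 3 digits.
    digits = len(str(max(total_pages - 1, 0)))
    return f"{out_dir}/page_{page_no:0{digits}d}.png"


# Unpadded names sort incorrectly as plain strings.
print(sorted(["page_2.png", "page_10.png"]))

# Padded names keep their page order when sorted.
names = [page_filename(n, 480) for n in range(480)]
print(names[7], names[479])
print(sorted(names) == names)
```

The returned string can be passed to `pix.pil_save` in place of the unpadded name.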
This approach has drawbacks. First, the output is not a PDF. Second, the output images are blurry. It needs further optimization; ideally, the watermark would be removed from the PDF directly.
Gadesme/watermark-remover
This is a deep learning project that removes watermarks, based on a modified version of noise2noise. The original project is at https://github.com/yu4u/noise2noise
Detailed instructions for use
Traditional image watermark removal is efficient, but it damages image detail. Removing a watermark is easy to describe and hard to do. Some watermarks can be removed in a few seconds with a repair stamp tool, while others take a couple of hours and still cannot be removed completely.
For images that are not very rich in detail, image processing software such as Photoshop can fill the watermarked area with nearby pixels to cover it up, with near perfect results.
This is the result of 9 hours of training on a 1050 Ti. It may still look a little dirty, but in theory 20 hours or more of training should be enough to reach a usable level (left is the original image, right is the de-watermarked image).
The result is claimed to be better than that of professional-grade software such as Photoshop.
Python script to remove watermark from any image with watermark in it.
akash-rajak/Image-Watermark-Remover
- Image Watermark Remover is an application written in Python with a tkinter GUI and the OpenCV library.
- The user selects any image that contains a watermark, and the application removes the watermark from it.
- The application shows both the image with the watermark and the image without it as output.
- The user can also save the resulting image anywhere on the local system with the save command.
- The watermark removal is implemented with the OpenCV library.
- python 3
- cv2 module
- tkinter module
- filedialog from tkinter
- messagebox
- from PIL import Image, ImageTk
- Download the file and run image_watermark_remover.py on your local system.
- A GUI window appears. Click the START button to begin.
- A new GUI window opens with SELECT and EXIT buttons.
- Use the SELECT button to choose an image file with a watermark from the local system.
- The application then shows both the image with the watermark and the image without it.
- The image without the watermark can be saved anywhere on the local system with the save command.
- Install tkinter, PIL, and cv2.
- Then download the code file and run image_watermark_remover.py on your local system.
- The script starts, and you can select any image with a watermark and remove it.