Python selenium img src

Download image with selenium python


9 Answers

Here’s a complete example (using Google’s reCAPTCHA as the target):

    import urllib.request

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get('http://www.google.com/recaptcha/demo/recaptcha')

    # get the image source
    img = driver.find_element(By.XPATH, '//div[@id="recaptcha_image"]/img')
    src = img.get_attribute('src')

    # download the image (urlretrieve lives in urllib.request on Python 3)
    urllib.request.urlretrieve(src, "captcha.png")

    driver.close()

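The problem with dynamically generated images is that a new image is generated each time you request it, so downloading the src again gives you a different image from the one shown in the browser. In that case you have several options; one is to take a screenshot of the page: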

    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get('https://moscowsg.megafon.ru/ps/scc/php/cryptographp.php?PHPSESSID=mfc540jkbeme81qjvh5t0v0bnjdr7oc6&ref=114&w=150')
    driver.save_screenshot("screenshot.png")
    driver.close()

urllib was reorganized in Python 3. Instead of urllib.urlretrieve, you now need to use urllib.request.urlretrieve.
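A minimal sketch of the Python 3 form, assuming `src` is the string returned by `get_attribute('src')`. The helper name `download_image` and the file names are illustrative and do not come from the answers above.

```python
import urllib.request


def download_image(src, dest):
    """Save the resource at `src` to the local path `dest` (Python 3)."""
    # On Python 3, urlretrieve lives in urllib.request and returns
    # a (local_filename, headers) tuple.
    filename, _headers = urllib.request.urlretrieve(src, dest)
    return filename
```

With a live driver this would be called as `download_image(img.get_attribute("src"), "captcha.png")`.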

It’s fine to save a screenshot of the whole page and then crop the image out of it, but you can also use one of the webdriver’s find methods to locate the image you want to save and write its screenshot_as_png property to a file, as below:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get('https://www.webpagetest.org/')

    # locate the element and write its PNG screenshot to a file
    element = driver.find_element(By.XPATH, '/html/body/div[1]/div[5]/div[2]/table[1]/tbody/tr/td[1]/a/div')
    with open('filename.png', 'wb') as file:
        file.write(element.screenshot_as_png)

Sometimes this raises an error when the element is scrolled out of view, but depending on your needs it’s a good way to get the image.
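For the other approach mentioned above (saving the whole page and cutting the image out), here is a small sketch of how the crop rectangle could be computed. It assumes the dictionaries have the shape of Selenium's `element.location` (`x`, `y`) and `element.size` (`width`, `height`); the function name `crop_box` and the `scale` parameter (for high-DPI screenshots) are hypothetical.

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) pixel bounds for an element.

    `location` and `size` mirror element.location and element.size;
    `scale` is an assumed device pixel ratio (1.0 on standard displays).
    """
    left = int(location["x"] * scale)
    top = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    bottom = int((location["y"] + size["height"]) * scale)
    return left, top, right, bottom
```

The resulting tuple is in the order that Pillow's `Image.crop` expects, so something like `Image.open("screenshot.png").crop(crop_box(img.location, img.size))` should work, provided the element is inside the captured area.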


@Ramon this does not work. I am trying to get the image of your profile picture from this page and I get the error selenium.common.exceptions.WebDriverException: Message: unknown command: session/5734e4b0f8d6171317af42ddf0979562/element/0.9586500909174849-1/screenshot

The problem with using save_screenshot is that we cannot save an image in its original quality and cannot restore the image's alpha channel. Therefore, I propose another solution. Here is a complete example using the selenium-wire library suggested by @codam_hsmits. It is possible to download images via ChromeDriver.

I have defined the following function to parse each request and save the request body to a file when necessary.

    from seleniumwire import webdriver  # Import from seleniumwire
    from urllib.parse import urlparse
    import os
    from mimetypes import guess_extension
    import time
    import datetime


    def download_assets(requests,
                        asset_dir="temp",
                        default_fname="unnamed",
                        skip_domains=["facebook", "google", "yahoo", "agkn", "2mdn"],
                        exts=[".png", ".jpeg", ".jpg", ".svg", ".gif", ".pdf", ".bmp", ".webp", ".ico"],
                        append_ext=False):
        asset_list = {}
        for req_idx, request in enumerate(requests):
            # request.headers
            # request.response.body is the raw response body in bytes
            if (request is None or request.response is None
                    or request.response.headers is None
                    or 'Content-Type' not in request.response.headers):
                continue

            ext = guess_extension(request.response.headers['Content-Type'].split(';')[0].strip())
            if ext is None or ext == "" or ext not in exts:
                # Don't know the file extension, or not in the whitelist
                continue
            parsed_url = urlparse(request.url)

            # Skip requests to unwanted domains
            skip = False
            for d in skip_domains:
                if d in parsed_url.netloc:
                    skip = True
                    break
            if skip:
                continue

            # Build a relative file path from the URL path
            frelpath = parsed_url.path.strip()
            if frelpath == "":
                timestamp = str(datetime.datetime.now().replace(microsecond=0).isoformat())
                frelpath = f"{default_fname}_{req_idx}_{timestamp}{ext}"
            elif frelpath.endswith("\\") or frelpath.endswith("/"):
                timestamp = str(datetime.datetime.now().replace(microsecond=0).isoformat())
                frelpath = frelpath + f"{default_fname}_{req_idx}_{timestamp}{ext}"
            elif append_ext and not frelpath.endswith(ext):
                frelpath = frelpath + f"_{default_fname}{ext}"  # Missing file extension but may not be a problem
            if frelpath.startswith("\\") or frelpath.startswith("/"):
                frelpath = frelpath[1:]

            fpath = os.path.join(asset_dir, parsed_url.netloc, frelpath)
            if os.path.isfile(fpath):
                continue
            os.makedirs(os.path.dirname(fpath), exist_ok=True)
            print(f"Downloading {request.url} to {fpath}")
            asset_list[fpath] = request.url
            try:
                with open(fpath, "wb") as file:
                    file.write(request.response.body)
            except OSError:
                print(f"Cannot download {request.url} to {fpath}")
        return asset_list

Let’s download some images from the Google homepage to the temp folder.

    # Create a new instance of the Chrome/Firefox driver
    driver = webdriver.Chrome()

    # Go to the Google home page
    driver.get('https://www.google.com')

    # Download content to temp folder
    asset_dir = "temp"

    try:
        while True:
            # Browse the internet in the opened window; images are collected every second
            time.sleep(1)
            download_assets(driver.requests, asset_dir=asset_dir)
    except KeyboardInterrupt:
        pass  # stop collecting with Ctrl+C
    finally:
        driver.close()

Note that it cannot tell which images are visible on the page and which are only loaded in the background, so the user has to actively click buttons or links to trigger new download requests.
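The path-building step of the function above can be tried without a browser. The following is a simplified sketch using only the standard library; `asset_path` and the fallback name `unnamed` are hypothetical, and it deliberately omits the domain filter and the timestamp logic.

```python
import os
from mimetypes import guess_extension
from urllib.parse import urlparse


def asset_path(url, content_type, asset_dir="temp"):
    """Map a response URL and Content-Type header to a local file path.

    Returns None when the MIME type has no known file extension.
    """
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime = content_type.split(";")[0].strip()
    ext = guess_extension(mime)
    if ext is None:
        return None
    parsed = urlparse(url)
    rel = parsed.path.lstrip("/")
    if not rel or rel.endswith("/"):
        rel += "unnamed" + ext  # hypothetical fallback file name
    return os.path.join(asset_dir, parsed.netloc, *rel.split("/"))
```

For instance, `asset_path("https://example.com/img/logo.png", "image/png")` places the file under `temp/example.com/img/`, mirroring the site's own layout.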


How to get an img src URL in Python?

I want to get the following src URL.

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.common.by import By
    from datetime import date
    import openpyxl
    import time
    from webdriver_manager.chrome import ChromeDriverManager

    driver = webdriver.Chrome(ChromeDriverManager().install())

    img_class = 'thumbnail_thumb_wrap__1pEkS _wrapper'  # div
    img_xpath = '/html/body/div/div/div[2]/div[2]/div[3]/div[1]/ul/div/div[1]/li/div/div[1]/div/a/img'
    img_css = '.thumbnail_thumb_wrap__1pEkS .thumbnail_thumb__3Agq6:before'

    item_img = driver.find_elements(By.XPATH, img_xpath).__getattribute__('src')
    print(item_img)

I searched by Xpath, css_select, and class_name, but I see an error message that none of them have ‘src’. What am I missing here?

1 Answer

I believe you have to use the .get_attribute("src") method instead of the __getattribute__ method. The former is implemented by Selenium to read an attribute of the element on the web page, while the latter is a built-in Python method that looks up an attribute of the Python object itself. Note also that find_elements returns a list of elements, so call get_attribute on a single element, for example the one returned by find_element.

See the Selenium documentation for get_attribute, and the Python documentation for the __getattribute__ method, which you should not use in this situation.

Example

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()

    img_xpath = '/html/body/div[4]/main/div[1]/div[1]/div[1]/div/div/div[2]/div/div/img'

    driver.get("https://github.com")

    item_web_element = driver.find_element(By.XPATH, img_xpath)
    item_img = item_web_element.get_attribute("src")

    print("The source is:", item_img)

    driver.quit()
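The difference between the two lookups can be illustrated without a browser. In this sketch `FakeElement` is a hypothetical stand-in, not Selenium's real WebElement class; it only imitates the idea that page attributes are stored separately from the Python object's own attributes.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (not the real class)."""

    def __init__(self, dom_attributes):
        self._dom_attributes = dom_attributes

    def get_attribute(self, name):
        # Looks the name up among the page element's attributes.
        return self._dom_attributes.get(name)


element = FakeElement({"src": "https://example.com/logo.png"})

# get_attribute consults the attributes of the page element.
src = element.get_attribute("src")

# __getattribute__ consults the Python object, which has no `src` member.
try:
    element.__getattribute__("src")
    python_lookup_failed = False
except AttributeError:
    python_lookup_failed = True

# find_elements (plural) returns a list, and a list has no get_attribute
# method, so each element has to be queried individually.
elements = [element]
sources = [el.get_attribute("src") for el in elements]
```

This matches the error in the question: the Python-level lookup fails because `src` exists only on the page element, not on the Python object or on the list wrapping it.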


Python selenium get image src


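If your HTML contains an element with the id demo whose text is hello world, then the following code (with By imported from selenium.webdriver.common.by) will give you hello world:

    driver.find_element(By.ID, "demo").text

and the following code will give you demo:

    driver.find_element(By.ID, "demo").get_attribute("id")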

Answer by Frank Kent

    driver.find_element(By.ID, "element_id").get_attribute("src")

Answer by Brixton Graham

We can get the source of an image in Selenium. An image in an HTML document has the img tag name. Each image also has a src attribute, which contains the source of the image in the page.

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.chrome.ChromeDriver;
    import java.util.concurrent.TimeUnit;

    public class Imagesrc {
        public static void main(String[] args) {
            System.setProperty("webdriver.chrome.driver", "C:\\Users\\ghs6kor\\Desktop\\Java\\chromedriver.exe");
            WebDriver driver = new ChromeDriver();
            String url = "https://www.tutorialspoint.com/index.htm";
            driver.get(url);
            driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
            // identify image
            WebElement l = driver.findElement(By.xpath("//img[@title='Tutorialspoint']"));
            // getAttribute() to get src of image
            System.out.println("Src attribute is: " + l.getAttribute("src"));
            driver.quit();
        }
    }

Answer by Waylon Stephens

I am working with Selenium in Python and using the Firefox web driver. I am trying to get the src of an image, and I am not sure why the image data is returned the first time the code is run and the src on the second run. It almost seems that once the image is cached, the src becomes available. (One answer notes that Amazon website elements are JavaScript-enabled, so to extract the src attribute of an element you have to induce WebDriverWait for visibility_of_element_located().)

When I first request the src I get the actual image data, not the src:

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQ . 

If I run the exact same code a second time I will get the SRC

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    fireFoxOptions = webdriver.FirefoxOptions()
    fireFoxOptions.add_argument("--headless")  # set_headless() was removed in Selenium 4
    browser = webdriver.Firefox(options=fireFoxOptions)

    element = browser.find_element(By.ID, "idOfImageHere")
    imageUrl = element.get_attribute("src")
    print("image src: " + imageUrl)
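A value that starts with data:image/jpeg;base64, is a data: URI, meaning the image bytes are embedded in the attribute itself. One way to handle both cases is to check for that prefix before treating the value as a link. The sketch below uses only the standard library; the function name `decode_data_uri` is hypothetical, and it handles only the base64 form shown above.

```python
import base64


def decode_data_uri(src):
    """Return (mime_type, raw_bytes) for a base64 data: URI, else None."""
    if not src.startswith("data:"):
        return None  # an ordinary URL; download it instead
    header, _, payload = src.partition(",")
    if not header.endswith(";base64"):
        return None  # this sketch ignores percent-encoded data URIs
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

If the function returns None, the src can be treated as a normal URL; otherwise the returned bytes can be written straight to a file.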


Answer by Rylee Beard


With a wait strategy, you’ll probably want to find the audio element itself rather than getting it through the containing span element. Here’s an example along those lines (using an implicit wait):

    driver.implicitly_wait(3)
    sound_url = driver.find_element(By.TAG_NAME, 'audio').get_attribute('src')
    # sound_url now contains 'https://s.yimg.com/bg/dict/ox/mp3/v1/real@_us_2.mp3'
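An implicit wait only waits for the element to exist, not for its src to hold the final value. The idea behind an explicit wait can be sketched as a small polling loop. The helper `wait_for` below is a hypothetical illustration using only the standard library; Selenium ships its own WebDriverWait for the same purpose, which is the better choice in real tests.

```python
import time


def wait_for(getter, is_ready, timeout=3.0, interval=0.1):
    """Call getter() repeatedly until is_ready(value) is true or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        value = getter()
        if is_ready(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"value not ready after {timeout} s: {value!r}")
        time.sleep(interval)
```

With a live driver, `getter` could be a lambda that reads the src attribute and `is_ready` a check that the value is non-empty and does not start with `data:`. Exceptions raised by the getter are not caught in this sketch.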

Answer by Kiara Correa

Hello, how can I get an img src (URL) using Selenium + PhantomJS? I know the XPath, but I do not know how to return the URL of this image. I tried adding attrib['href'] and attrib['src'], but it does not work.

How are you getting to the image in the first place if you do not know the URL? Is it just on the web page somewhere? I'm going to assume it is on the web page somewhere and you have the XPath to its location. So this is what you would do:

    img = driver.find_element(By.XPATH, '//your/xpath[@to_image]')
    src = img.get_attribute('src')


How to extract the src attribute from the img using Selenium

When we choose "Inspect" in Chrome, it finds the jpg file and lets us copy the XPath. When I try to fetch it in Python, the src comes back as base64. When I decode it, I get a small, blank white image.

    import base64
    from io import BytesIO

    from PIL import Image
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from webdriver_manager.chrome import ChromeDriverManager


    def base64_to_image(base64_string):
        image_data = base64.b64decode(base64_string)
        image = Image.open(BytesIO(image_data))
        return image


    options = Options()
    options.add_argument("--disable-notifications")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])  # works around a Bluetooth error
    options.add_experimental_option("useAutomationExtension", False)

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
    driver.implicitly_wait(5)
    driver.set_window_size(800, 200)
    driver.get('https://www.migros.com.tr/sut-kahvaltilik-c-4?sayfa=1')

    # Product: Yumurtacım 15'li L Boy Yumurta (63-72 G)
    elems = driver.find_element(By.XPATH, "/html/body/sm-root/div/main/sm-product/article/sm-list/div/div[4]/div[2]/div[4]/sm-list-page-item[1]/mat-card/div[1]/fe-product-image/a/img")
