- Get HTML Source of WebElement in Selenium WebDriver using Python.
- Syntax
- Syntax
- Example
- How To Get Page Source In Selenium Using Python?
- What Is An HTML Page Source?
- What Is An HTML Web Element?
- How To Get Page Source In Selenium WebDriver Using Python?
- Get HTML Page source Using driver.page_source
- python selenium get html
- Install selenium
- Get html source
- How to get HTML source of page in Selenium Python?
- Example
- Summary
Get HTML Source of WebElement in Selenium WebDriver using Python.
We can get the HTML source of a web element with Selenium WebDriver by reading the element's innerHTML attribute.
innerHTML holds the HTML markup that sits between the element's opening and closing tags. To read it, call the get_attribute method and pass innerHTML as the argument.
Syntax
s = element.get_attribute('innerHTML')
We can also obtain the HTML source of a web element with the JavaScript executor. Call the execute_script method with the script return arguments[0].innerHTML; and pass the web element whose HTML source is to be retrieved as the script argument.
Syntax
from selenium.webdriver.common.by import By
s = driver.find_element(By.ID, "txt-search")
driver.execute_script("return arguments[0].innerHTML;", s)
For the element used in this example, the innerHTML is: You are browsing the best resource for Online Education.
Example
Code Implementation with get_attribute.
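from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(executable_path=r"C:\chromedriver.exe"))
# implicit wait applied
driver.implicitly_wait(0.5)
driver.get("https://www.tutorialspoint.com/index.htm")
# identify the element and obtain its innerHTML with get_attribute
l = driver.find_element(By.CSS_SELECTOR, "h4")
print("HTML code of element: " + l.get_attribute('innerHTML'))
driver.quit()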
Code Implementation with Javascript Executor.
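from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(executable_path=r"C:\chromedriver.exe"))
# implicit wait applied
driver.implicitly_wait(0.5)
driver.get("https://www.tutorialspoint.com/index.htm")
# identify the element and obtain its innerHTML with execute_script
l = driver.find_element(By.CSS_SELECTOR, "h4")
h = driver.execute_script("return arguments[0].innerHTML;", l)
print("HTML code of element: " + h)
driver.quit()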
How To Get Page Source In Selenium Using Python?
Retrieving the page source of a website under test is a day-to-day task for most test automation engineers. Analyzing the page source helps fix bugs found during regular website UI testing, functional testing, or security testing drills. In a complex application testing process, automation test scripts can be written so that, when an error is detected, the script automatically:
- saves that particular page’s source code.
- notifies the person responsible for the URL of the page.
- extracts the HTML source of a specific element or code-block and delegates it to responsible authorities if the error has occurred in one particular independent HTML WebElement or code block.
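The first item in the list above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function name run_step_with_source_capture, the failures directory and the file-naming scheme are hypothetical, and the notification step is left out. The only Selenium features it relies on are the driver's page_source and current_url attributes.

```python
import time
from pathlib import Path


def run_step_with_source_capture(driver, step, out_dir="failures"):
    """Run step(driver); if it raises, save the current page source first.

    `step`, `out_dir` and the file-name scheme are illustrative assumptions.
    """
    try:
        return step(driver)
    except Exception:
        folder = Path(out_dir)
        folder.mkdir(parents=True, exist_ok=True)
        # Timestamped file name so repeated failures do not overwrite each other
        target = folder / f"page_source_{int(time.time() * 1000)}.html"
        target.write_text(driver.page_source, encoding="utf-8")
        print(f"Saved source of {driver.current_url} to {target}")
        raise  # re-raise so the test run still reports the failure
```

A test would pass its own callable as step, for example a function that locates an element and asserts on its text. The saved file can then be attached to whatever report or message the team already uses.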
This is an easy way to trace and fix logical and syntactical errors in the front-end code. In this article, we first explain the terminology involved and then explore how to get the page source in Selenium WebDriver using Python.
What Is An HTML Page Source?
In non-technical terms, it's a set of instructions that tells browsers how to display information on the screen. Browsers interpret these instructions in their own ways to render the page on the client side. The instructions are usually written in HyperText Markup Language (HTML), Cascading Style Sheets (CSS) and JavaScript.
This entire set of HTML instructions that make a web page is called page source or HTML source, or simply source code. Website source code is a collection of source code from individual web pages.
Here’s an example of a Source Code for a basic page with a title, form, image & a submit button.
What Is An HTML Web Element?
The easiest way to describe an HTML web element would be, “any HTML tag that constitutes the HTML page source code is a web element.” It could be an HTML code block, an independent HTML tag, a media object on the web page (image, audio, video), a JS function, or even a JSON object wrapped within tags.
In the above example, each tag is an HTML web element, and the children of the body tag are HTML web elements too.
How To Get Page Source In Selenium WebDriver Using Python?
Selenium WebDriver is a robust automation testing tool that provides automation test engineers with a diverse set of ready-to-use APIs. To get the page source, the Selenium Python bindings provide a driver attribute called page_source, which returns the HTML source of the page currently loaded in the browser.
Alternatively, we can use the get function of Python's requests library to load the page source. Another way is to execute JavaScript through the driver method execute_script. A not-recommended way of getting the page source is using XPath in tandem with a “view-source:” URL. Let's explore examples for these four ways of getting the page source in Selenium WebDriver using Python.
We’ll be using a sample small web page hosted on GitHub for all four examples. This page was created to demonstrate drag and drop testing in Selenium Python using LambdaTest.
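As a compact sketch of three of these approaches, the helper functions below are hypothetical names introduced only for illustration. The JavaScript snippet document.documentElement.outerHTML is one script that returns the serialized DOM; it is an assumption here, not necessarily the script the original examples used. The “view-source:” variant is left out because it is the not-recommended one.

```python
def source_via_page_source(driver):
    # Selenium's own attribute: HTML of the page currently loaded in the browser
    return driver.page_source


def source_via_javascript(driver):
    # Ask the browser to serialize the live DOM and return it as a string
    return driver.execute_script("return document.documentElement.outerHTML;")


def source_via_requests(url):
    # Plain HTTP fetch without a browser; JavaScript on the page is not executed
    import requests  # third-party package, imported lazily so the rest works without it

    response = requests.get(url)
    response.raise_for_status()
    return response.text
```

The first two functions expect a driver that has already loaded a page with driver.get(...). The third needs only a URL and returns the markup exactly as the server sent it, which can differ from what the browser shows after scripts have run.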
Get HTML Page source Using driver.page_source
We’ll fetch “pynishant.github.io” in the ChromeDriver and save its content to a file named “page_source.html.” This filename could be anything of your choice. Next, we read the file’s content and print it on the terminal before closing the driver.
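The flow described above might look roughly like this. It is a sketch under assumptions: the helper name save_and_print_source and the main wrapper are illustrative, the https:// scheme in the URL is assumed, and Chrome with a matching driver is assumed to be available to Selenium.

```python
from pathlib import Path


def save_and_print_source(driver, url, filename="page_source.html"):
    """Load url, write driver.page_source to filename, then read it back."""
    driver.get(url)
    path = Path(filename)
    path.write_text(driver.page_source, encoding="utf-8")
    # Read the file back and print it on the terminal
    content = path.read_text(encoding="utf-8")
    print(content)
    return content


def main():
    # Assumes Chrome and a matching driver are available to Selenium
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        save_and_print_source(driver, "https://pynishant.github.io/")
    finally:
        driver.quit()  # close the browser even if saving fails
```

Calling main() runs the example against a real browser. Keeping the file handling in a separate function makes it possible to reuse it with any driver instance.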
python selenium get html
Selenium is a web automation module that can be used to get a webpage's HTML code. In this article we will show how to achieve that.
You can use the web driver's .page_source attribute to grab the HTML code of any webpage.
Install selenium
If you haven’t done so, install the selenium module (pip), the web browser and the web driver.
For this example, you may need to set the path to chromium:
export PATH=$PATH:/usr/lib/chromium/
Get html source
You can import the webdriver from the selenium module. A webdriver object is created (for Chromium), and we can optionally specify that certificate errors should be ignored.
Of course any web browser can be used, but for this example I've used Chromium.
Once the web browser has started, we navigate it to a webpage URL using the get() method. Then we get the page source.
from selenium import webdriver
options = webdriver.ChromeOptions()
# optional: ignore certificate errors
options.add_argument('--ignore-certificate-errors')
options.add_argument("--test-type")
# path to the Chromium binary
options.binary_location = "/usr/bin/chromium"
driver = webdriver.Chrome(options=options)
driver.get('https://python.org')
# page_source holds the HTML of the loaded page as a string
html = driver.page_source
print(html)
Selenium will start the Chromium browser automatically.
The program outputs the webpage source, which is stored in the variable html.
How to get HTML source of page in Selenium Python?
In this tutorial, you will learn how to get the HTML source of a webpage using Selenium in Python.
To get the HTML source of a webpage in Selenium Python, load the URL, and read the page_source attribute of the driver object. The attribute returns the source of the HTML page as a string.
source = driver.page_source
Example
Consider the following HTML file.
Hello User!
This is child 1. This is child 2. This is child 3.
In the following program, we initialize a driver and load the index.html page running on our local server (you may instead give the URL of any page you are interested in). We then read the page_source attribute of the driver object, store the returned value in a variable, and print it to standard output.
Python Program
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service as ChromeService

# Setup chrome driver
service = ChromeService(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

# Navigate to the url
driver.get('http://127.0.0.1:5500/index.html')

# Get HTML source of webpage
source = driver.page_source
print(source)

# Close the driver
driver.quit()
Summary
In this Python Selenium tutorial, we have seen how to get the HTML source of a page, with an example program.