Attacking Web Applications With Python: Exploiting Web Forms and Requests
This article introduces web application penetration testers to Python and explains how it can be used to make customized HTTP requests. These requests can then be built into custom scripts and tools for the special conditions where scanners fail. Readers will be introduced to libraries that help a penetration tester make custom HTTP requests from Python. All examples in this article were developed using Python 3.
Detecting vulnerabilities with Python
Let us begin by discussing how Python can be used to detect vulnerabilities in web applications. In this example, we will write a simple Python script that detects SQL injection in a vulnerable web application.
The target URL in this case looks as follows.
http://192.168.1.106/webapps/sqli/sqli.php?id=1
The parameter id is vulnerable to error-based SQL injection. Any attempt to pass an SQL injection payload such as a single quote (') triggers a MySQL error in the response. Detecting this with an automated script is simple: we fuzz the parameter value with various SQL injection payloads and check whether the response contains the string "MySQL". The following script does exactly that.
from termcolor import colored
if http_request.content.find(b'MySQL') != -1:
    print(url_mod + colored(" – potential error based SQLi detected", 'red'))
else:
    print(url_mod + colored(" – no injection found", 'green'))
In the preceding script, we fuzz the parameter value by reading payloads from a text file. The key steps in the script are as follows.
- The target URL's parameter value is replaced with the string INJECT_HERE.
- The function detect(url) is invoked when the script is run.
- When the function is invoked, it reads the payloads from fuzzing.txt.
- Each payload read replaces the word INJECT_HERE in the target URL.
- For each modified URL, we make an HTTP request using Python's requests module.
- Finally, we search for the error string in the response using the http_request.content.find() function.
- If the error string is found, the URL parameter id is potentially vulnerable.
Following is the simplified output of the preceding script.
http://192.168.1.106/webapps/sqli/sqli.php?id=1 – no injection found
As we can see, a few lines of Python code are enough to write a simple vulnerability scanner. This comes in handy when we need to write custom scripts for new vulnerabilities or automate their discovery, and it is especially useful when we need to scan web apps for vulnerabilities at scale.
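The steps listed earlier can be pulled together into a minimal sketch. It is not the article's exact script: the `load_payloads` and `report` helpers, the `session` argument, and the return value are assumptions made for illustration. The HTTP client is passed in (for example a `requests.Session()`) so that the fuzzing loop can be exercised without a live target.

```python
ERROR_MARKER = b"MySQL"        # error string searched for in each response
PLACEHOLDER = "INJECT_HERE"    # marker in the URL that each payload replaces


def load_payloads(path="fuzzing.txt"):
    """Read one payload per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def detect(session, url, payloads, marker=ERROR_MARKER):
    """Return a list of (url_mod, vulnerable) pairs, one per payload.

    `session` is any object with a requests-style get() method, such as
    requests.Session(); passing it in is an assumption of this sketch.
    """
    results = []
    for payload in payloads:
        url_mod = url.replace(PLACEHOLDER, payload)
        http_request = session.get(url_mod, timeout=10)
        results.append((url_mod, marker in http_request.content))
    return results


def report(results):
    """Print each finding, coloured as in the article's script."""
    from termcolor import colored  # third-party; imported only for printing
    for url_mod, vulnerable in results:
        if vulnerable:
            print(url_mod + colored(" – potential error based SQLi detected", "red"))
        else:
            print(url_mod + colored(" – no injection found", "green"))
```

With requests and termcolor installed, calling `report(detect(requests.Session(), "http://192.168.1.106/webapps/sqli/sqli.php?id=INJECT_HERE", load_payloads()))` would run the loop against the lab target used above.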
Extracting, filling and submitting forms in Python
Now, let us discuss how Python can be leveraged when dealing with application forms. In some scenarios we need to automatically extract the HTML elements of a web application form, fill them in, and submit the form. Let us go through an example to understand how we can achieve this using Python.
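As a rough illustration of the extraction step, the sketch below collects each form's action, method, and named input fields. It uses the standard library's html.parser rather than a third-party parser so that it has no dependencies. The `FormExtractor` class, the `extract_forms` helper, and the sample markup (including the field names username, password, and token) are hypothetical and are not taken from the application discussed here.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect each form's action, method and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").upper(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Keep pre-filled values (e.g. hidden fields); default to empty.
            self._current["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_forms(html):
    """Parse an HTML string and return a list of form descriptions."""
    parser = FormExtractor()
    parser.feed(html)
    parser.close()
    return parser.forms


# Hypothetical login form used only to demonstrate the parser.
SAMPLE_HTML = """
<form action="/ExamResults/Login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="token" value="abc123">
  <input type="submit" value="Login">
</form>
"""

forms = extract_forms(SAMPLE_HTML)
```

The extracted `fields` dictionary can then be filled in and passed as the `data` argument of a POST request to the form's action URL.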
We have the following login page at the URL. We will need to automatically submit this form using Python and verify if we are successfully logged in.
Before automating this process, let us fill in the form fields manually and submit the request, which looks as follows.
As we can notice in the preceding figure, a POST request is made to the following URL with the parameters shown under the Form Data section of the HTTP request.
http://192.168.1.106:8080/ExamResults/Login
After successfully logging in, we will see the following home page.
Now, let us see how we can automate this process using Python. The following Python script submits a POST request using the requests module, and we should be able to log in with it.
response = s.post(login_url, data=payload)
The line print(response.content) prints the HTML response as follows.
b'\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n \r\n \r\n\r\n \r\n \r\n \r\n \r\n\r\n \r\n \r\n \r\n \r\n\r\n\r\n'
The HTML tag highlighted below confirms that the user has been successfully logged in.
Let us see if we can improve our script to determine whether we are logged in without manually inspecting the response. The third-party BeautifulSoup library (imported from the bs4 package) comes in handy when we need to parse HTML documents. The following script shows how we can parse the HTML content returned when the user is logged in.
from bs4 import BeautifulSoup
response = s.post(login_url, data=payload)
soup = BeautifulSoup(html, 'html.parser')
if welcome.string == "welcome admin":
The following steps are used to determine if the user login is successful.
- First, we import BeautifulSoup using the line from bs4 import BeautifulSoup.
- Next, we parse the complete HTML document using the line soup = BeautifulSoup(html, 'html.parser').
- Next, we extract the H2 tag from the response using form = soup.find('h2').
- The H2 tag identified in the previous step contains a label, which holds the string we are looking for.
- We use the line welcome = form.find('label') to extract the label.
- Finally, welcome.string should contain the string we are looking for.
- Note that the whole process is wrapped in try and except blocks, because looking for specific tags in the HTML response can raise exceptions, especially if the login is not successful.
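The steps above can be sketched as a small helper. This is an approximation rather than the article's exact script: the function names `is_logged_in` and `login`, the choice to catch AttributeError, and the boolean return value are assumptions. The `session` argument is expected to behave like a `requests.Session()`.

```python
from bs4 import BeautifulSoup


def is_logged_in(html, expected="welcome admin"):
    """Return True if the page has <h2><label>expected</label></h2>."""
    soup = BeautifulSoup(html, "html.parser")
    try:
        form = soup.find("h2")        # None when the page has no <h2>
        welcome = form.find("label")  # raises AttributeError if form is None
        return welcome.string == expected
    except AttributeError:
        # A missing tag means we are not looking at the logged-in home page.
        return False


def login(session, login_url, payload):
    """Submit the login form and report whether the login succeeded."""
    response = session.post(login_url, data=payload)
    return is_logged_in(response.content)
```

Returning a boolean instead of printing makes the check easy to reuse, for instance inside a loop that tries several credential pairs.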
Clearly, Python makes it easy to interact with forms, and this process is useful for fuzzing forms and for performing brute-force attacks on login pages.
Introduction to HTTP requests: URLs, headers and message body
This section of the article provides a brief introduction to HTTP requests. We will go through the fundamental building blocks of a simple HTTP request: the URL, the headers, and the message body.
Requests and Responses:
During HTTP communication, clients (e.g., browsers, curl, netcat) and servers communicate by exchanging individual messages. Each message sent by the client is called a request, and each message returned by the server is called a response.
Following is a sample HTTP Request:
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:83.0) Gecko/20100101 Firefox/83.0
Accept-Encoding: gzip, deflate
The preceding request contains headers and a body. Let us go through some of the headers. The following line from the preceding request specifies that the request method is POST. Usually, the POST method is used to submit content to the server, whereas the GET method is used to request content.
POST /xvwa/vulnerabilities/sqli/ HTTP/1.1
HTTP also has various other methods, such as PUT, DELETE, OPTIONS and TRACE, along with non-standard ones such as TRACK that some servers support. When a request is sent using the GET method, the parameters are passed in the query string of the URL.
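To illustrate the difference, the following sketch encodes parameters the way they would travel in a GET query string and in a form-encoded POST body, using only the standard library. The URL is the one from the SQL injection example earlier in this article; the form field names and values are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# GET: parameters are appended to the URL as a query string.
base_url = "http://192.168.1.106/webapps/sqli/sqli.php"
get_url = base_url + "?" + urlencode({"id": "1"})

# POST: the same encoding is used, but the result is sent in the body.
form_fields = {"username": "admin", "password": "p@ss word"}  # hypothetical
post_body = urlencode(form_fields)

# The query string can be parsed back into a dict of value lists.
query = parse_qs(urlsplit(get_url).query)

print(get_url)    # http://192.168.1.106/webapps/sqli/sqli.php?id=1
print(post_body)  # username=admin&password=p%40ss+word
print(query)      # {'id': ['1']}
```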
Next, the Host header in the request specifies the domain name or the IP address of the server with which the client is interacting. In our case, it is 192.168.1.105.
Next, let us take a look at the User-Agent header.
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:83.0) Gecko/20100101 Firefox/83.0
The User-Agent header field helps the server identify the client software originating the request.
If the client is something other than Firefox, the value will be different. For example, a request sent from curl instead of the browser carries a value that begins with curl/ followed by the curl version.
Next, let us observe the line with the header Accept.
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
As we can see, the browser specifies several values in this header. The Accept header lists the media types that are acceptable for the response, and an optional q value indicates the relative preference for each type. Wildcards are also supported to represent any type.
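As a small illustration of how such a header can be read, the sketch below splits an Accept value into media types and sorts them by their q value, treating a missing q as 1.0. The `parse_accept` helper is hypothetical and handles only the simple syntax shown above, not every case the HTTP specification allows.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted from most to least preferred."""
    entries = []
    for part in header.split(","):
        media_type, *params = [piece.strip() for piece in part.split(";")]
        q = 1.0  # a media type without an explicit q value defaults to 1.0
        for param in params:
            if param.startswith("q="):
                q = float(param[2:])
        entries.append((media_type, q))
    # sorted() is stable, so equally weighted types keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
preferences = parse_accept(accept)
```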
Next, the following header shows the cookie being sent to the server. Cookies are usually used to identify the logged-in user.
Cookie: PHPSESSID=fjv5te289b8he60k9qoss7ldj5
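When a script needs to reuse a session captured from the browser, the cookie value first has to be pulled out of the header. The sketch below does this with the standard library's http.cookies module. The helper name `parse_cookie_header` is an assumption made for illustration.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value):
    """Turn the value of a Cookie header into a {name: value} dict."""
    cookie = SimpleCookie()
    cookie.load(value)
    return {name: morsel.value for name, morsel in cookie.items()}


cookies = parse_cookie_header("PHPSESSID=fjv5te289b8he60k9qoss7ldj5")
```

With the requests module, the resulting dictionary can be supplied through the `cookies` argument of a request so that the script acts within the same session.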
Lastly, we can see the parameters being passed from the web application to the server in the following excerpt.
There are a few parameters, which also include a parameter named hidden that appears to be a hidden parameter.
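To make the anatomy of a request concrete, the following sketch assembles a raw HTTP/1.1 request from its parts: the request line, the headers, a blank line, and the message body. The `build_request` helper and the body value are hypothetical; the path, host, and cookie are the ones shown in this section.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request as bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The server needs the body length to know where the message ends.
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the message body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw_request = build_request(
    "POST",
    "/xvwa/vulnerabilities/sqli/",
    "192.168.1.105",
    headers={
        "Cookie": "PHPSESSID=fjv5te289b8he60k9qoss7ldj5",
        "Content-Type": "application/x-www-form-urlencoded",
    },
    body=b"item=1&hidden=1",  # hypothetical parameter values
)
```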
Following is a sample HTTP Response returned from the server.
Date: Sat, 12 Dec 2020 07:02:06 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
As we can notice, the response contains several headers along with the requested content.
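The same structure can be taken apart in code. The sketch below splits a raw response into its status line, headers, and body. The `parse_response` helper, the status line, and the body in the sample are assumptions for illustration; the three headers are the ones shown above.

```python
def parse_response(raw):
    """Split raw response bytes into (status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"  # hypothetical status line
    b"Date: Sat, 12 Dec 2020 07:02:06 GMT\r\n"
    b"Expires: Thu, 19 Nov 1981 08:52:00 GMT\r\n"
    b"Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0\r\n"
    b"\r\n"
    b"<html></html>"  # hypothetical body
)

status, reason, headers, body = parse_response(SAMPLE_RESPONSE)
```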
Intercepting and manipulating HTTP requests with Python
In this section of the article, let us see how we can automatically tamper with HTTP requests and responses using Python. We will achieve this using an intercepting proxy tool called mitmproxy. mitmproxy is a free and open source interactive HTTPS proxy that comes preinstalled in Kali Linux.
We can use the following command to launch mitmproxy in Kali Linux.
By default, mitmproxy listens on port 8080. We can configure our browser to proxy all the traffic through mitmproxy as shown below.
After configuring the proxy, we can access any web application using the same browser as shown below.
Once the application is loaded, we should be able to see HTTP requests and responses in mitmproxy command line console as follows.
The request and response shown in the preceding figures contain the default headers both in the request and response. Let us intercept the request and response to add a custom HTTP header.
We can use the following addon script, which can be loaded when starting mitmproxy.
def request(flow):
    flow.request.headers["customer-request-header"] = "custom-value1"
def response(flow):
    flow.response.headers["customer-response-header"] = "custom-value2"
As we can see in the preceding excerpt, we are adding a custom HTTP header to the request and response.
We can start mitmproxy using the following command to load this addon script.
Once again, access the web application and the newly added headers can be seen.
This technique of intercepting requests and responses comes in handy for automated vulnerability discovery using Python.
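As a possible extension, the same idea can be written as a class-based addon that only touches traffic for one target host, which keeps unrelated browsing untouched. This is a sketch under assumptions: the class name `AddCustomHeaders`, the host restriction, and the choice of 192.168.1.105 as the in-scope host are illustrative and not part of the article's script.

```python
class AddCustomHeaders:
    """mitmproxy addon that tags requests and responses for one host."""

    def __init__(self, target_host):
        self.target_host = target_host

    def _in_scope(self, flow):
        # pretty_host is the host name the client asked for.
        return flow.request.pretty_host == self.target_host

    def request(self, flow):
        # Called by mitmproxy for each client request.
        if self._in_scope(flow):
            flow.request.headers["customer-request-header"] = "custom-value1"

    def response(self, flow):
        # Called by mitmproxy for each server response.
        if self._in_scope(flow):
            flow.response.headers["customer-response-header"] = "custom-value2"


# mitmproxy picks up addon instances from this module-level list.
addons = [AddCustomHeaders("192.168.1.105")]
```

Saved under a hypothetical name such as add_headers.py, the script would be loaded with `mitmproxy -s add_headers.py`.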
Conclusion
Python is an easy-to-learn language that helps penetration testers create custom tools for achieving coverage, plugging the gaps that vulnerability scanners leave when they are unable to reach certain pages for one reason or another. Python also lets users write reusable code, such as classes that can be inherited and extended. It can be used not only for quick and dirty scripting of small automation tasks but also for building enterprise-class vulnerability scanning routines.
Sources
- Black Hat Python: Python Programming for Hackers and Pentesters Book by Justin Seitz – https://www.amazon.com/Black-Hat-Python-Programming-Pentesters/dp/1593275900
- Learning Python Web Penetration Testing: Automate Web Penetration Testing Activities Using Python Book by Christian Martorella – https://www.packtpub.com/product/learning-python-web-penetration-testing/9781789533972
- https://github.com/mitmproxy/mitmproxy