- Analyzing Packet Captures with Python
- What programming language?
- What modules?
- The code
- A few notes before we start
- Step 1: Program skeleton
- Summary
Pure-Python library to parse the pcap-ng format used by newer versions of dumpcap & similar tools.
rshk/python-pcapng
Python library to parse the pcap-ng format used by newer versions of dumpcap & similar tools (Wireshark, WinPcap, ...).
If you prefer the RTD theme, or want documentation for any version other than the latest, head here:
If you prefer the more comfortable, page-wide, default sphinx theme, a documentation mirror is hosted on GitHub pages:
git clone https://github.com/rshk/python-pcapng
Download zip of the latest version:
The official page on the Python Package Index is: https://pypi.python.org/pypi/python-pcapng
- I need to decently extract some information from a bunch of pcap-ng files, but apparently tcpdump has some problems reading those files, I couldn’t find other nice tools nor Python bindings to a library able to parse this format, so..
- In general, it appears there are (quite a bunch of!) Python modules to parse the old (much simpler) format, but nothing for the new one.
- And, they usually completely lack any form of documentation.
Yes, I guess it would be much slower than something written in C, but I’m much better at Python than C.
..and I need to get things done, and CPU time is not that expensive 🙂
(Maybe I’ll give a try porting the thing to Cython to speed it up, but anyways, pure-Python libraries are always useful, eg. for PyPy).
Basic usage is as simple as:
from pcapng import FileScanner

with open('/tmp/mycapture.pcap', 'rb') as fp:
    scanner = FileScanner(fp)
    for block in scanner:
        pass  # do something with the block.
Have a look at the blocks documentation to see what they do; also, the examples directory contains some example scripts using the library.
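To give a feel for what a "block" is before reading the blocks documentation, here is a minimal standard-library sketch of the outer block framing in a pcapng file: a 4-byte block type, a 4-byte total length, the body, and the total length repeated. It is not how this library is implemented; the names `iter_raw_blocks` and `demo_capture` are made up for illustration, and the sketch ignores options and block contents entirely.

```python
import io
import struct

SHB_TYPE = 0x0A0D0D0A          # Section Header Block; reads the same in either byte order
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # first field of the SHB body, reveals the endianness


def iter_raw_blocks(stream):
    """Yield (block_type, body) pairs from a binary pcapng stream (illustrative only)."""
    endian = "<"  # overwritten as soon as a Section Header Block is seen
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # end of stream
        extra = b""
        if head[:4] == b"\x0a\x0d\x0d\x0a":
            # New section: peek at the byte-order magic to learn the endianness.
            extra = stream.read(4)
            if len(extra) < 4:
                raise ValueError("truncated section header")
            if struct.unpack("<I", extra)[0] == BYTE_ORDER_MAGIC:
                endian = "<"
            elif struct.unpack(">I", extra)[0] == BYTE_ORDER_MAGIC:
                endian = ">"
            else:
                raise ValueError("bad byte-order magic")
        block_type, total_length = struct.unpack(endian + "II", head)
        if total_length < 12:
            raise ValueError("block too short")
        remaining = total_length - 8 - len(extra)
        rest = stream.read(remaining)
        if len(rest) != remaining:
            raise ValueError("truncated block")
        payload = extra + rest
        (trailer,) = struct.unpack(endian + "I", payload[-4:])
        if trailer != total_length:
            raise ValueError("leading and trailing block lengths differ")
        yield block_type, payload[:-4]


def demo_capture(endian="<"):
    """Build a tiny in-memory capture: one section header and one interface description."""
    shb = struct.pack(endian + "IIIHHqI", SHB_TYPE, 28, BYTE_ORDER_MAGIC, 1, 0, -1, 28)
    idb = struct.pack(endian + "IIHHII", 1, 20, 1, 0, 0, 20)
    return shb + idb


for block_type, body in iter_raw_blocks(io.BytesIO(demo_capture())):
    print(hex(block_type), len(body))
```

For real work, `FileScanner` is the better choice: it decodes the block bodies and options that this sketch leaves as raw bytes.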
Format specification is here:
Contributions are welcome, please contact me if you’re planning to do some big change, so that we can sort out the best way to integrate it.
Or even better, open an issue so the whole world can participate in the discussion 🙂
Write support exists as of version 2.0.0. See the file examples/generate_pcapng.py for an example of the minimum code needed to generate a pcapng file.
In most cases, this library will prevent you from creating broken data. If you want to create marginal pcapng files, e.g. as test cases for other software, you can do that by adjusting the "strictness" of the library, as in:
from pcapng.strictness import Strictness, set_strictness
set_strictness(Strictness.FIX)
Recognized values are Strictness.FORBID (the default), Strictness.FIX (warn about problems, fix if possible), Strictness.WARN (warn only), and Strictness.NONE (no warnings). Circumstances that will result in strictness warnings include:
- Adding multiples of a non-repeatable option to a block
- Adding a SPB to a file with more than one interface
- Writing a PB (PBs are obsolete and not to be used in new files)
- Writing EPB/SPB/PB/ISB before writing any IDBs
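To make three of the conditions above concrete, here is a small standalone sketch that checks a sequence of pcapng block type codes for them. It is an illustration only and not the library's strictness code: the function name `check_block_order` is hypothetical, and it only looks at block order, not at options.

```python
# Block type codes from the pcapng specification.
IDB, PB, SPB, ISB, EPB = 1, 2, 3, 5, 6
NEEDS_INTERFACE = {PB, SPB, ISB, EPB}


def check_block_order(block_types):
    """Return human-readable warnings for a sequence of block type codes."""
    warnings = []
    interfaces = 0  # number of IDBs seen so far
    for index, block_type in enumerate(block_types):
        if block_type == IDB:
            interfaces += 1
        elif block_type in NEEDS_INTERFACE and interfaces == 0:
            warnings.append(f"block {index}: type {block_type} written before any IDB")
        if block_type == PB:
            warnings.append(f"block {index}: PB is obsolete")
        if block_type == SPB and interfaces > 1:
            warnings.append(f"block {index}: SPB in a file with more than one interface")
    return warnings


# Section header, then an EPB before any IDB, two IDBs, an SPB and a PB.
for warning in check_block_order([0x0A0D0D0A, EPB, IDB, IDB, SPB, PB]):
    print(warning)
```

The sketch only counts interfaces seen before each block, so it would miss an SPB that is followed by a second IDB later in the file.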
git tag v2.0.0 -m 'Version 2.0.0'
python -m venv ./.build-venv
./.build-venv/bin/python -m pip install build twine
rm -rf ./dist *.egg-info
./.build-venv/bin/python -m build
If you get some crazy version number like 2.0.1.dev0+g7bd8575.d20220310 instead of what you expect (e.g. 2.0.0), it's because you have uncommitted or untracked files in your local working copy, or you created more commits after creating the tag. Such a version number will be refused by PyPI (and it's not a good version number anyway), so make sure you have a clean working copy before building.
Analyzing Packet Captures with Python
For most situations involving analysis of packet captures, Wireshark is the tool of choice. And for good reason too — Wireshark provides an excellent GUI that not only displays the contents of individual packets, but also analysis and statistics tools that allow you to, for example, track individual TCP conversations within a pcap, and pull up related metrics.
There are situations, however, where the ability to process a pcap programmatically becomes extremely useful. Consider:
- given a pcap that contains hundreds of thousands of packets, find the first connection to a particular server/service where the TCP SYN-ACK took more than 300ms to appear after the initial SYN
- in a pcap that captures thousands of TCP connections between a client and several servers, find the connections that were prematurely terminated because of a RST sent by the client; at that point in time, determine how many other connections were in progress between that client and other servers
- you are given two pcaps, one gathered on a SPAN port on an access switch, and another on an application server a few L3 hops away. At some point the application server sporadically becomes slow (retransmits on both sides, TCP windows shrinking etc.). Prove that it is (or is not) because of the network.
- repeat the above exercises several times a week (or several times a day) with different sets of packet captures
In all these cases, it is immensely helpful to write a custom program to parse the pcaps and yield the data points you are looking for.
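As a taste of what such a program looks like, here is a minimal sketch of the first scenario (a SYN-ACK that takes more than 300ms). It assumes the packets have already been reduced to simple records; the `TcpRecord` type, its field names, and the flag strings ("S" for SYN, "SA" for SYN-ACK) are illustrative assumptions, not the output of any particular library.

```python
from typing import NamedTuple


class TcpRecord(NamedTuple):
    number: int       # packet number within the pcap
    timestamp: float  # seconds
    src: str
    sport: int
    dst: str
    dport: int
    flags: str        # "S" = SYN, "SA" = SYN-ACK, and so on


def first_slow_handshake(records, server, port, threshold=0.3):
    """Return the (SYN, SYN-ACK) pair of the first connection to server:port
    whose SYN-ACK arrived more than `threshold` seconds after the SYN."""
    pending = {}  # (client ip, client port) -> first SYN seen for that connection
    for rec in records:
        if rec.flags == "S" and rec.dst == server and rec.dport == port:
            pending.setdefault((rec.src, rec.sport), rec)  # ignore retransmitted SYNs
        elif rec.flags == "SA" and rec.src == server and rec.sport == port:
            syn = pending.pop((rec.dst, rec.dport), None)
            if syn is not None and rec.timestamp - syn.timestamp > threshold:
                return syn, rec
    return None


# Made-up sample data: the second handshake is the slow one.
sample = [
    TcpRecord(1, 1.00, "10.0.0.5", 40000, "10.0.0.9", 443, "S"),
    TcpRecord(2, 1.05, "10.0.0.9", 443, "10.0.0.5", 40000, "SA"),
    TcpRecord(3, 2.00, "10.0.0.5", 40001, "10.0.0.9", 443, "S"),
    TcpRecord(4, 2.45, "10.0.0.9", 443, "10.0.0.5", 40001, "SA"),
]
print(first_slow_handshake(sample, "10.0.0.9", 443))
```

The packet numbers in the returned pair are exactly the kind of pointer you can take back into Wireshark.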
It is important to realize that we are not precluding the use of Wireshark; for example, after your program locates the proverbial needle(s) in the haystack, you can use that information (say a packet number or a timestamp) in Wireshark to look at a specific point inside the pcap and gain more insight.
So, this is the topic of this blog post: how to go about programmatically processing packet capture (pcap) files.
What programming language?
I will be using Python (3). Why Python? Apart from the well-known benefits of Python (open-source, relatively gentle learning curve, ubiquity, abundance of modules and so forth), it is also the case that Network Engineers are gaining expertise in this language and are using it in other areas of their work (device management and monitoring, workflow applications etc.).
What modules?
I will be using scapy, plus a few other modules that are not specific to packet processing or networking (argparse, pickle, pandas).
Note that there are other Python modules that can be used to read and parse pcap files, like pyshark and pypcapfile. Pyshark in particular is interesting because it simply leverages the underlying tshark installed on the system to do its work, so if you are in a situation where you need to leverage tshark's powerful protocol decoding ability, pyshark is the way to go. In this blog however I am restricting myself to regular Ethernet/IPv4/TCP packets, and I can just use scapy.
The code
A few notes before we start
The code below was written and executed on Linux (Linux Mint 18.3 64-bit), but the code is OS-agnostic; it should work as well in other environments, with little or no modification.
In this post I use an example pcap file captured on my computer.
Step 1: Program skeleton
Build a skeleton for the program. This will also serve to check if your Python installation is OK.
Use the argparse module to get the pcap file name from the command line. If your argparse knowledge needs a little brushing up, you can look at my argparse recipe book, or at any other of the dozens of tutorials on the web.
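A skeleton along these lines might look like the sketch below. It uses only the standard library; the `--pcap` option name and the `process_pcap` function are my own choices for illustration, and `process_pcap` is just a placeholder for the packet processing that comes later.

```python
import argparse
import os
import sys


def build_parser():
    """Command-line interface: a single required option naming the pcap file."""
    parser = argparse.ArgumentParser(description="PCAP reader")
    parser.add_argument("--pcap", metavar="<pcap file name>", required=True,
                        help="pcap file to parse")
    return parser


def process_pcap(file_name):
    """Placeholder: the actual packet processing goes here."""
    print(f"Opening {file_name}...")


def main(argv=None):
    args = build_parser().parse_args(argv)
    if not os.path.isfile(args.pcap):
        print(f'"{args.pcap}" does not exist', file=sys.stderr)
        return 1
    process_pcap(args.pcap)
    return 0

# When saved as a script, finish the file with:
#     if __name__ == "__main__":
#         sys.exit(main())
```

Running the script with `--pcap` pointing at a missing file should print the error and exit with status 1, which is a quick way to confirm that the Python installation and the argument parsing both work.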
You will notice from the graph that the window size shows a sudden dip to some value between 400000 and 500000 shortly after timestamp 21.1. If you find this suspicious, you can again write more code to help you narrow down the exact packet number in the capture:
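One way to do that is sketched below. It assumes the per-packet data has already been collected as (packet number, relative timestamp, window size) tuples; the function name `find_window_dip` and the sample values are made up for illustration and are not taken from the example pcap.

```python
def find_window_dip(samples, after_time, below):
    """Return the first (packet number, timestamp, window) sample at or after
    `after_time` whose window size is smaller than `below`, or None."""
    for number, timestamp, window in samples:
        if timestamp >= after_time and window < below:
            return number, timestamp, window
    return None


# Made-up samples: (packet number, relative timestamp, window size)
samples = [
    (100, 21.05, 900000),
    (101, 21.12, 880000),
    (102, 21.15, 450000),
    (103, 21.20, 870000),
]
print(find_window_dip(samples, after_time=21.1, below=500000))
```

The packet number it returns can then be looked up directly in Wireshark.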
Summary
With Python code, you can iterate over the packets in a pcap, extract relevant data, and process that data in ways that make sense to you. You can use code to go over the pcap and locate a specific sequence of packets (i.e. locate the needle in the haystack) for later analysis in a GUI tool like Wireshark. Or you can create customized graphical plots that can help you visualize the packet information. Further, since this is all code, you can do this repeatedly with multiple pcaps.
Parse pcap files and display HTTP traffic with Python
erikodiony/pcap-parser
Parse and show HTTP traffic. Python 2.7.* required.
This module parses pcap/pcapng files, retrieves the HTTP data, and shows it as text. Pcap files can be obtained via tcpdump, Wireshark, or other network traffic capture tools.
- HTTP requests/responses are grouped by TCP connection; the requests in one keep-alive HTTP connection are displayed together.
- Handles chunked and compressed HTTP requests/responses.
- Handles character encodings.
- Formats JSON content for readability.
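As an illustration of what handling chunked and compressed bodies involves, here is a standalone sketch using only the standard library. It is not this module's code: it is written for Python 3 (the module itself targets Python 2.7), the function names are hypothetical, and it assumes the headers are given as a dict with lowercase keys.

```python
import gzip
import zlib


def decode_chunked(data):
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size_field = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; optional trailers are ignored
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


def decode_body(raw, headers):
    """Undo transfer and content encodings; `headers` uses lowercase keys."""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        raw = decode_chunked(raw)
    encoding = headers.get("content-encoding", "").lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        raw = zlib.decompress(raw)
    return raw


# A gzip-compressed JSON body split into two chunks.
compressed = gzip.compress(b'{"ok": true}')
first, second = compressed[:5], compressed[5:]
wire = (b"%x\r\n" % len(first) + first + b"\r\n"
        + b"%x\r\n" % len(second) + second + b"\r\n"
        + b"0\r\n\r\n")
print(decode_body(wire, {"transfer-encoding": "chunked", "content-encoding": "gzip"}))
```

The order matters: the chunked framing has to be removed before decompression, because the compressed stream is what gets split into chunks.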
This module can be installed via pip:
Use tcpdump to capture packets:
tcpdump -wtest.pcap tcp port 80
# only output the requested URL and response status
parse_pcap test.pcap
# output http req/resp headers
parse_pcap -v test.pcap
# output http req/resp headers and body which belong to text type
parse_pcap -vv test.pcap
# output http req/resp headers and body
parse_pcap -vvv test.pcap
# display and attempt to do url decoding and formatting json output
parse_pcap -vvb test.pcap
sudo tcpdump -w- tcp port 80 | parse_pcap
Use -g to group HTTP requests/responses:
********** [10.66.133.90:56240] -- -- --> [220.181.90.13:80] **********
GET http://s1.rr.itc.cn/w/u/0/20120611181946_24.jpg
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/imgloading.jpg
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/w/u/0/20130201103132_66.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/w/u/0/20120719174136_77.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_prev_open.png
HTTP/1.1 200 OK
********** [10.66.133.90:47526] -- -- --> [220.181.90.13:80] **********
GET http://s1.rr.itc.cn/w/u/0/20130227132442_43.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_next.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_prev.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_next_open.png
HTTP/1.1 200 OK
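The grouping idea can be sketched in a few lines: key each request/response pair by its (client, server) endpoints and print one banner per connection. This is an illustration written for Python 3, not the module's implementation; the function names and the tuple layout of `exchanges` are assumptions.

```python
def group_by_connection(exchanges):
    """Group (client, server, request_line, status_line) tuples by connection,
    keeping connections in the order they first appear."""
    groups = {}
    for client, server, request_line, status_line in exchanges:
        groups.setdefault((client, server), []).append((request_line, status_line))
    return groups


def render(groups):
    """Format the groups as text, one banner per TCP connection."""
    lines = []
    for (client, server), pairs in groups.items():
        lines.append(f"********** [{client}] -- -- --> [{server}] **********")
        for request_line, status_line in pairs:
            lines.append(request_line)
            lines.append(status_line)
    return "\n".join(lines)


exchanges = [
    ("10.66.133.90:56240", "220.181.90.13:80",
     "GET http://s1.rr.itc.cn/p/images/imgloading.jpg", "HTTP/1.1 200 OK"),
    ("10.66.133.90:47526", "220.181.90.13:80",
     "GET http://s1.rr.itc.cn/p/images/pic_next.png", "HTTP/1.1 200 OK"),
    ("10.66.133.90:56240", "220.181.90.13:80",
     "GET http://s1.rr.itc.cn/p/images/pic_prev_open.png", "HTTP/1.1 200 OK"),
]
print(render(group_by_connection(exchanges)))
```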
Use -p and -i to specify the port and IP address of the source and destination; only HTTP data that meets the specified conditions is displayed:
parse_pcap -p55419 -vv test.pcap
parse_pcap -i192.168.109.91 -vv test.pcap
Use -d to specify the HTTP domain; only HTTP requests/responses for that domain are displayed:
parse_pcap -dwww.baidu.com -vv test.pcap
Use -u to specify a URI pattern; only HTTP requests/responses whose URL contains the pattern are displayed:
parse_pcap -u/api/update -vv test.pcap
Use -e to force the encoding used for the HTTP body:
parse_pcap -i192.168.109.91 -p80 -vv -eutf-8 test.pcap
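The effect of forcing an encoding can be illustrated with a short Python 3 sketch: use the forced charset when one is given, otherwise fall back to the charset in the Content-Type header, and finally to UTF-8. The function name and the fallback order are assumptions for illustration, not a description of how parse_pcap decides.

```python
import re


def decode_text_body(body, content_type="", forced=None):
    """Decode HTTP body bytes to text, preferring a forced encoding if given."""
    if forced:
        charset = forced
    else:
        match = re.search(r'charset="?([\w.-]+)', content_type, re.IGNORECASE)
        charset = match.group(1) if match else "utf-8"
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: fall back to UTF-8 rather than failing.
        return body.decode("utf-8", errors="replace")


# The header claims GBK, but the body is really UTF-8; forcing fixes the output.
body = "你好".encode("utf-8")
print(decode_text_body(body, "text/html; charset=gbk", forced="utf-8"))
```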