Pcap parsing in Python

Pure-Python library to parse the pcap-ng format used by newer versions of dumpcap & similar tools.

rshk/python-pcapng

Python library to parse the pcap-ng format used by newer versions of dumpcap & similar tools (Wireshark, WinPcap, ...).

If you prefer the RTD theme, or want documentation for any version other than the latest, head here:

If you prefer the more comfortable, page-wide, default sphinx theme, a documentation mirror is hosted on GitHub pages:

git clone https://github.com/rshk/python-pcapng

Download zip of the latest version:

The official page on the Python Package Index is: https://pypi.python.org/pypi/python-pcapng

  • I needed to extract some information from a bunch of pcap-ng files, but tcpdump apparently has problems reading them, and I couldn’t find other good tools or Python bindings to a library able to parse this format, so...
  • In general, there appear to be quite a few Python modules that parse the old (much simpler) format, but nothing for the new one.
  • And they usually lack any form of documentation.

Yes, I guess it would be much slower than something written in C, but I’m much better at Python than C.

..and I need to get things done, and CPU time is not that expensive 🙂

(Maybe I’ll try porting it to Cython to speed it up, but pure-Python libraries are always useful anyway, e.g. for PyPy.)

Basic usage is as simple as:

from pcapng import FileScanner

with open('/tmp/mycapture.pcap', 'rb') as fp:
    scanner = FileScanner(fp)
    for block in scanner:
        pass  # do something with the block

Have a look at the blocks documentation to see what they do; also, the examples directory contains some example scripts using the library.
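To give a feel for what a scanner has to walk over, here is a stdlib-only sketch of pcap-ng block framing. It is not the library's implementation: the function names `iter_raw_blocks` and `make_shb` are made up for this illustration, and the sketch only splits a stream into raw blocks without decoding their contents. It relies on the framing defined in the format specification: every block is a 4-byte type, a 4-byte total length, a body, and the total length repeated.

```python
import io
import struct

SHB_TYPE = 0x0A0D0D0A          # Section Header Block type (reads the same in both byte orders)
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # stored in the byte order used by the section


def iter_raw_blocks(stream):
    """Yield (block_type, body_bytes) for each block in a pcap-ng stream.

    Hypothetical helper for illustration; it does not parse block bodies.
    """
    endian = "<"  # assumption until the first Section Header Block says otherwise
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated block header")
        if struct.unpack("<I", header[:4])[0] == SHB_TYPE:
            # The byte-order magic follows the length field and fixes endianness.
            magic = stream.read(4)
            if len(magic) < 4:
                raise ValueError("truncated section header")
            if struct.unpack("<I", magic)[0] == BYTE_ORDER_MAGIC:
                endian = "<"
            elif struct.unpack(">I", magic)[0] == BYTE_ORDER_MAGIC:
                endian = ">"
            else:
                raise ValueError("bad byte-order magic")
            block_type = SHB_TYPE
            total_len = struct.unpack(endian + "I", header[4:])[0]
            # 8 header bytes + 4 magic bytes read so far, 4 trailer bytes to come.
            body = magic + stream.read(total_len - 16)
        else:
            block_type, total_len = struct.unpack(endian + "II", header)
            body = stream.read(total_len - 12)
        trailer = stream.read(4)
        if len(trailer) < 4 or struct.unpack(endian + "I", trailer)[0] != total_len:
            raise ValueError("block length trailer mismatch")
        yield block_type, body


def make_shb():
    """Build a minimal little-endian Section Header Block (version 1.0, no options)."""
    body = struct.pack("<IHHq", BYTE_ORDER_MAGIC, 1, 0, -1)
    total = len(body) + 12
    return struct.pack("<II", SHB_TYPE, total) + body + struct.pack("<I", total)


for block_type, body in iter_raw_blocks(io.BytesIO(make_shb())):
    print(hex(block_type), len(body))
```

For real work, FileScanner does this and also decodes each block into an object, so a sketch like this is only useful for understanding the container format.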

Format specification is here:

Contributions are welcome. If you’re planning a big change, please contact me first so that we can sort out the best way to integrate it.

Or even better, open an issue so the whole world can participate in the discussion 🙂

Write support exists as of version 2.0.0. See the file examples/generate_pcapng.py for an example of the minimum code needed to generate a pcapng file.

In most cases, this library will prevent you from creating broken data. If you want to create marginal pcapng files, e.g. as test cases for other software, you can do that by adjusting the "strictness" of the library, as in:

from pcapng.strictness import Strictness, set_strictness

set_strictness(Strictness.FIX)

Recognized values are Strictness.FORBID (the default), Strictness.FIX (warn about problems, fix if possible), Strictness.WARN (warn only), and Strictness.NONE (no warnings). Circumstances that will result in strictness warnings include:

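  • Adding multiples of a non-repeatable option to a block
  • Adding an SPB (Simple Packet Block) to a file with more than one interface
  • Writing a PB (Packet Block; PBs are obsolete and not to be used in new files)
  • Writing an EPB (Enhanced Packet Block), SPB, PB, or ISB (Interface Statistics Block) before writing any IDBs (Interface Description Blocks)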
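
The four levels can be pictured as a small policy switch. The sketch below is a standalone illustration, not the library's code: `Policy`, `PolicyError` and `handle_problem` are hypothetical names, and the assumption that the strictest level raises an exception is mine, since the text above only says that the library prevents broken data by default.

```python
import enum
import warnings


class Policy(enum.Enum):
    """Hypothetical stand-in for the four strictness levels named above."""
    FORBID = "forbid"
    FIX = "fix"
    WARN = "warn"
    NONE = "none"


class PolicyError(Exception):
    """Raised for a problem under Policy.FORBID (assumed behaviour)."""


def handle_problem(policy, message, value, fix=None):
    """Apply `policy` to a detected problem and return the value to write.

    `fix` is an optional callable that repairs `value`.
    """
    if policy is Policy.FORBID:
        raise PolicyError(message)
    if policy is Policy.NONE:
        return value  # stay silent and write the data unchanged
    warnings.warn(message)
    if policy is Policy.FIX and fix is not None:
        return fix(value)  # warn, then repair
    return value  # WARN, or FIX with nothing to repair


# Example: a non-repeatable option was supplied twice; keep only the first.
comments = ["first comment", "second comment"]
print(handle_problem(Policy.FIX, "option given more than once",
                     comments, fix=lambda v: v[:1]))
```

With `Policy.WARN` the same call would return both comments unchanged after emitting the warning.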

To tag and build a release:

git tag v2.0.0 -m 'Version 2.0.0'
python -m venv ./.build-venv
./.build-venv/bin/python -m pip install build twine
rm -rf ./dist *.egg-info
./.build-venv/bin/python -m build

If you get a strange version number like 2.0.1.dev0+g7bd8575.d20220310 instead of what you expect (e.g. 2.0.0), it’s because you have uncommitted or untracked files in your local working copy, or you created more commits after creating the tag. Such a version number will be refused by PyPI (and it’s not a good version number anyway), so make sure you have a clean working copy before building.
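
A quick guard against uploading such a build could look like the sketch below. The helper `is_clean_release` is hypothetical and not part of the project's tooling. It is also deliberately stricter than PyPI: it accepts only dot-separated integers, so it would reject legitimate pre-release versions as well.

```python
import re


def is_clean_release(version: str) -> bool:
    """Return True only for plain releases such as "2.0.0".

    Anything with a ".devN" segment or a "+local" suffix is rejected.
    """
    return re.fullmatch(r"\d+(\.\d+)*", version) is not None


for candidate in ("2.0.0", "2.0.1.dev0+g7bd8575.d20220310"):
    status = "ok to upload" if is_clean_release(candidate) else "rebuild from a clean tag"
    print(candidate, "->", status)
```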

Analyzing Packet Captures with Python

For most situations involving analysis of packet captures, Wireshark is the tool of choice, and for good reason: it provides an excellent GUI that not only displays the contents of individual packets but also offers analysis and statistics tools that let you, for example, track individual TCP conversations within a pcap and pull up related metrics.

There are situations, however, where the ability to process a pcap programmatically becomes extremely useful. Consider:

  • given a pcap that contains hundreds of thousands of packets, find the first connection to a particular server/service where the TCP SYN-ACK took more than 300ms to appear after the initial SYN
  • in a pcap that captures thousands of TCP connections between a client and several servers, find the connections that were prematurely terminated because of a RST sent by the client; at that point in time, determine how many other connections were in progress between that client and other servers
  • you are given two pcaps, one gathered on a SPAN port on an access switch, and another on an application server a few L3 hops away. At some point the application server sporadically becomes slow (retransmits on both sides, TCP windows shrinking etc.). Prove that it is (or is not) because of the network.
  • repeat the above exercises several times a week (or several times a day) with different sets of packet captures

In all these cases, it is immensely helpful to write a custom program to parse the pcaps and yield the data points you are looking for.
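
As a taste of what such a program might look like, here is a sketch for the first scenario (a slow SYN-ACK). It works on already-extracted packet records rather than on a pcap file, so it runs with the standard library alone. The `Pkt` record, its field names and the function `first_slow_handshake` are assumptions made for this illustration; in practice the records would be filled in from whatever library reads the capture.

```python
from typing import NamedTuple


class Pkt(NamedTuple):
    """Hypothetical, already-decoded view of one TCP packet."""
    number: int   # 1-based packet number in the capture
    time: float   # capture timestamp in seconds
    src: str      # "ip:port" of the sender
    dst: str      # "ip:port" of the receiver
    flags: str    # "S" for SYN, "SA" for SYN-ACK, and so on


def first_slow_handshake(packets, server, threshold=0.300):
    """Return (syn, syn_ack) for the first connection to `server` whose
    SYN-ACK arrived more than `threshold` seconds after the initial SYN,
    or None if there is no such connection."""
    pending = {}  # client endpoint -> first SYN seen from it
    for pkt in packets:
        if pkt.flags == "S" and pkt.dst == server:
            pending.setdefault(pkt.src, pkt)  # ignore SYN retransmissions
        elif pkt.flags == "SA" and pkt.src == server:
            syn = pending.pop(pkt.dst, None)
            if syn is not None and pkt.time - syn.time > threshold:
                return syn, pkt
    return None


# Made-up sample data: the second handshake takes 450 ms.
sample = [
    Pkt(1, 1.00, "10.0.0.1:5000", "10.0.0.9:80", "S"),
    Pkt(2, 1.05, "10.0.0.9:80", "10.0.0.1:5000", "SA"),
    Pkt(3, 2.00, "10.0.0.1:5001", "10.0.0.9:80", "S"),
    Pkt(4, 2.45, "10.0.0.9:80", "10.0.0.1:5001", "SA"),
]
hit = first_slow_handshake(sample, "10.0.0.9:80")
if hit:
    print("slow handshake: SYN is packet", hit[0].number,
          "and SYN-ACK is packet", hit[1].number)
```

The packet numbers it reports are exactly the kind of pointer you can then take into Wireshark.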

It is important to realize that we are not precluding the use of Wireshark; for example, after your program locates the proverbial needle(s) in the haystack, you can use that information (say a packet number or a timestamp) in Wireshark to look at a specific point inside the pcap and gain more insight.

So, this is the topic of this blog post: how to go about programmatically processing packet capture (pcap) files.

What programming language?

I will be using Python (3). Why Python? Apart from the well-known benefits of Python (open-source, relatively gentle learning curve, ubiquity, abundance of modules and so forth), it is also the case that Network Engineers are gaining expertise in this language and are using it in other areas of their work (device management and monitoring, workflow applications etc.).

What modules?

I will be using scapy, plus a few other modules that are not specific to packet processing or networking (argparse, pickle, pandas).

Note that there are alternative Python modules that can be used to read and parse pcap files, such as pyshark and pypcapfile. Pyshark is particularly interesting because it leverages the tshark installation already on the system to do its work, so if you need tshark’s powerful protocol decoding, pyshark is the way to go. In this post, however, I am restricting myself to regular Ethernet/IPv4/TCP packets, so scapy is sufficient.

The code

A few notes before we start

The code below was written and executed on Linux (Linux Mint 18.3 64-bit), but the code is OS-agnostic; it should work as well in other environments, with little or no modification.

In this post I use an example pcap file captured on my computer.

Step 1: Program skeleton

Build a skeleton for the program. This will also serve to check if your Python installation is OK.

Use the argparse module to get the pcap file name from the command line. If your argparse knowledge needs a little brushing up, you can look at my argparse recipe book, or at any other of the dozens of tutorials on the web.
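
A skeleton along these lines might look like the sketch below. The option name `--pcap` and the function names are my own choices for illustration, not necessarily those of the original program. The script only checks that the file exists; no packet processing happens yet.

```python
import argparse
import os
import sys


def parse_args(argv=None):
    """Read the pcap file name from the command line."""
    parser = argparse.ArgumentParser(description="PCAP reader")
    parser.add_argument("--pcap", metavar="<pcap file name>",
                        required=True, help="pcap file to parse")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    if not os.path.isfile(args.pcap):
        print(f'"{args.pcap}" does not exist', file=sys.stderr)
        return 1
    print(f"Opening {args.pcap}...")  # packet processing would start here
    return 0


# In a script, finish with the usual guard:
#     if __name__ == "__main__":
#         sys.exit(main())
```

Running it without arguments makes argparse print a usage message and exit, which doubles as the check that your Python installation works.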

Analyzing Packet Captures with Python Part 1 Figure 1

You will notice from the graph that the window size shows a sudden dip to some value between 400000 and 500000 shortly after timestamp 21.1. If you find this suspicious, you can again write more code to help you narrow down the exact packet number in the capture:
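
One way such a search could look is sketched below, assuming the per-packet data has already been collected as (packet number, timestamp, window size) tuples. The function name, the threshold and the sample values are all made up for illustration and are not taken from the example capture.

```python
def first_window_below(samples, threshold):
    """Return the first (packet_number, timestamp, window_size) tuple whose
    window size is below `threshold`, or None if the window never dips."""
    for number, timestamp, window in samples:
        if window < threshold:
            return number, timestamp, window
    return None


# Made-up samples: the window collapses shortly after t = 21.1 s.
samples = [
    (4010, 21.02, 1_048_576),
    (4011, 21.09, 1_048_576),
    (4012, 21.13, 450_000),
    (4013, 21.20, 1_048_576),
]
print(first_window_below(samples, threshold=500_000))
```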

Analyzing Packet Captures with Python Part 1 Figure 2

Summary

With Python code, you can iterate over the packets in a pcap, extract relevant data, and process that data in ways that make sense to you. You can use code to go over the pcap and locate a specific sequence of packets (i.e. locate the needle in the haystack) for later analysis in a GUI tool like Wireshark. Or you can create customized graphical plots that can help you visualize the packet information. Further, since this is all code, you can do this repeatedly with multiple pcaps.

Parse pcap files and display HTTP traffic with Python

erikodiony/pcap-parser

Parse and show HTTP traffic. Python 2.7.* is required.

This module parses pcap/pcapng files, retrieves the HTTP data, and shows it as text. Pcap files can be obtained via tcpdump, Wireshark, or other network traffic capture tools.

  • HTTP requests/responses are grouped by TCP connection; the requests in one keep-alive HTTP connection are displayed together.
  • Handles chunked and compressed HTTP requests/responses.
  • Handles character encodings.
  • Pretty-prints JSON content.
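
To make the chunked, compressed and JSON items concrete, here is a stdlib-only sketch of those three steps applied to a single response body. It is written for Python 3 and is not this module's code; `dechunk` and `pretty_json_body` are hypothetical names. The sketch assumes the whole body is already in memory, that the compression is gzip, and that the text is UTF-8.

```python
import gzip
import json


def dechunk(raw: bytes) -> bytes:
    """Undo HTTP/1.1 chunked transfer coding (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        # The size line is hexadecimal and may carry ";extensions".
        size = int(raw[pos:eol].split(b";", 1)[0].strip().decode("ascii"), 16)
        if size == 0:
            return bytes(out)  # the last chunk has size zero
        start = eol + 2
        out += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk data


def pretty_json_body(raw: bytes, chunked: bool, gzipped: bool) -> str:
    """Decode a response body and return its JSON content indented."""
    body = dechunk(raw) if chunked else raw
    if gzipped:
        body = gzip.decompress(body)
    return json.dumps(json.loads(body.decode("utf-8")), indent=2, sort_keys=True)


# Build a small gzip-compressed, chunked body to try it on.
payload = gzip.compress(b'{"b": 1, "a": [true]}')
pieces = [payload[i:i + 10] for i in range(0, len(payload), 10)]
wire = b"".join(format(len(p), "x").encode("ascii") + b"\r\n" + p + b"\r\n"
                for p in pieces) + b"0\r\n\r\n"
print(pretty_json_body(wire, chunked=True, gzipped=True))
```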

This module can be installed via pip:

Use tcpdump to capture packets:

tcpdump -wtest.pcap tcp port 80

# only output the requested URL and response status
parse_pcap test.pcap
# output http req/resp headers
parse_pcap -v test.pcap
# output http req/resp headers and body which belong to text type
parse_pcap -vv test.pcap
# output http req/resp headers and body
parse_pcap -vvv test.pcap
# display and attempt to do url decoding and formatting json output
parse_pcap -vvb test.pcap

sudo tcpdump -w- tcp port 80 | parse_pcap

Use -g to group http request/response:

********** [10.66.133.90:56240] -- -- --> [220.181.90.13:80] **********
GET http://s1.rr.itc.cn/w/u/0/20120611181946_24.jpg
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/imgloading.jpg
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/w/u/0/20130201103132_66.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/w/u/0/20120719174136_77.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_prev_open.png
HTTP/1.1 200 OK

********** [10.66.133.90:47526] -- -- --> [220.181.90.13:80] **********
GET http://s1.rr.itc.cn/w/u/0/20130227132442_43.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_next.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_prev.png
HTTP/1.1 200 OK
GET http://s1.rr.itc.cn/p/images/pic_next_open.png
HTTP/1.1 200 OK

Use -p/-i to specify the port/IP of the source and destination; only HTTP data that meets the specified conditions is displayed:

parse_pcap -p55419 -vv test.pcap
parse_pcap -i192.168.109.91 -vv test.pcap

Use -d to specify the HTTP domain; only requests/responses for that domain are displayed:

parse_pcap -dwww.baidu.com -vv test.pcap

Use -u to specify a URI pattern; only requests/responses whose URL contains the pattern are displayed:

parse_pcap -u/api/update -vv test.pcap

Use -e to force the encoding used for the HTTP body:

parse_pcap -i192.168.109.91 -p80 -vv -eutf-8 test.pcap
