Parsing Windows logs with Python

Pure Python parser for recent Windows Event Log files (.evtx)

williballenthin/python-evtx

python-evtx is a pure Python parser for recent Windows Event Log files (those with the file extension ".evtx"). The module provides programmatic access to the File and Chunk headers, record templates, and event entries. For example, you can use python-evtx to review the event logs of Windows 7 systems from a Mac or Linux workstation. The structure definitions and parsing strategies were heavily inspired by the work of Andreas Schuster and his Perl implementation "Parse-Evtx".

With the release of Windows Vista, Microsoft introduced an updated event log file format. The format used in Windows XP was a circular buffer of record structures that each contained a list of strings. A viewer resolved templates hosted in system library files and inserted the strings into appropriate positions. The newer event log format is proprietary binary XML. Unpacking chunks from an event log file from Windows 7 results in a complete XML document with a variable schema. The changes helped Microsoft tune the file format to real-world uses of event logs, such as long running logs with hundreds of megabytes of data, and system independent template resolution.

Andreas Schuster released the first public description of the .evtx file format in 2007. He is the author of the thorough document "Introducing the Microsoft Vista event log file format" that describes the motivation and details of the format. Mr. Schuster also maintains the Perl implementation of a parser called "Parse-Evtx". I referred to the source code of this library extensively during the development of python-evtx.

Joachim Metz also released a cross-platform, LGPL licensed C++ based parser in 2011. His document "Windows XML Event Log (EVTX): Analysis of EVTX" provides a detailed description of the structures and context of newer event log files.

python-evtx works on both the 2.7 and 3.x versions of the Python programming language. As it is purely Python, the module works equally well across platforms. The code does not depend on any modules that require separate compilation; however, if you have lxml installed, it's even nicer.

python-evtx operates on event log files from Windows Vista and newer Windows operating systems. These files typically have the file extension .evtx. Version 5.09 of the file utility identifies such a file as "MS Vista Windows Event Log". To manually confirm the file type, look for the ASCII string "ElfFile" in the first seven bytes:

    willi/evtx » xxd -l 32 Security.evtx
    0000000: 456c 6646 696c 6500 0000 0000 0000 0000  ElfFile.........
    0000010: d300 0000 0000 0000 375e 0000 0000 0000  ........7^......
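The manual check can also be scripted. This is a minimal sketch that assumes only what the dump shows, namely that the file starts with the seven ASCII bytes ElfFile. The helper name looks_like_evtx is hypothetical and not part of python-evtx.

```python
EVTX_MAGIC = b"ElfFile"  # the seven ASCII bytes visible at offset 0 in the dump


def looks_like_evtx(path):
    """Return True if the file at `path` starts with the EVTX signature."""
    with open(path, "rb") as handle:
        # Read exactly as many bytes as the signature is long and compare.
        return handle.read(len(EVTX_MAGIC)) == EVTX_MAGIC
```

A call such as looks_like_evtx("Security.evtx") only inspects the signature; it does not validate the rest of the file.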

Provided with the parsing module Evtx are four scripts that mimic the tools distributed with Parse-Evtx. evtx_info.py prints metadata about the event log and verifies the checksums of each chunk. evtx_templates.py builds and prints the templates used throughout the event log. evtx_dump.py parses the event log and transforms the binary XML into a human-readable ASCII XML format. Finally, evtx_dump_json.py parses event logs in the same way as evtx_dump.py, but transforms the binary XML into JSON and can write the resulting JSON array to a file.

Note the length of the evtx_dump.py script: it's only 20 lines. Now, review the contents and notice the complete implementation of the logic:

    print(e_views.XML_HEADER)
    print('<Events>')
    for record in log.records():
        print(record.xml())
    print('</Events>')

Working with python-evtx is really easy!
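For reference, the excerpt can be wrapped into a small reusable module. This is a sketch rather than the shipped evtx_dump.py. The helper names wrap_records and dump are hypothetical, and dump() assumes python-evtx is installed so that Evtx.Evtx and Evtx.Views can be imported. The records are wrapped in one Events root element so that the output is a single well-formed document.

```python
import sys


def wrap_records(header, record_xml_strings):
    """Yield the XML header, then every record inside one <Events> element."""
    yield header
    yield "<Events>"
    for record_xml in record_xml_strings:
        yield record_xml
    yield "</Events>"


def dump(path, out=sys.stdout):
    """Print every record of the EVTX file at `path` as one XML document."""
    # Imported lazily so wrap_records() stays usable without python-evtx.
    import Evtx.Evtx as evtx
    import Evtx.Views as e_views

    with evtx.Evtx(path) as log:
        # A generator expression keeps only one record in memory at a time.
        records = (record.xml() for record in log.records())
        for line in wrap_records(e_views.XML_HEADER, records):
            print(line, file=out)
```

Calling dump("Security.evtx") writes the document to standard output; passing an open file object as out writes it to disk instead.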

Updates to python-evtx are pushed to PyPI, so you can install the module using either easy_install or pip. For example, you can use pip like so:

    pip install python-evtx

The source code for python-evtx is hosted at GitHub, and you may download, fork, and review it from this repository (http://www.github.com/williballenthin/python-evtx). Please report issues or feature requests through GitHub's bug tracker associated with the project.

python-evtx is licensed under the Apache License, Version 2.0. This means it is freely available for use and modification in a personal and professional capacity.

Chapter 3: Windows Event Log Parsing

Example for opening EVTX files, iterating over events, and filtering events.

Demonstrates how to open an EVTX file and get basic details about the event log. This section makes use of python-evtx, a Python library for reading event log files. To install, run pip install python-evtx.

Other libraries for parsing these event logs exist and we welcome others to add snippets that showcase how to make use of them in reading EVTX files.

$ python using_python_evtx.py System.evtx

Open Windows Event Logs (EVTX)

This function shows an example of opening an EVTX file and parsing out several header metadata parameters about the file.

    from collections import OrderedDict

    import Evtx.Evtx as evtx


    def open_evtx(input_file):
        """Opens a Windows Event Log and displays common log parameters.

        Arguments:
            input_file (str): Path to evtx file to open

        Examples:
            >>> open_evtx("System.evtx")
            File version (major): 3
            File version (minor): 1
            File is dirty: True
            File is full: False
            Next record number: 10549
        """
        with evtx.Evtx(input_file) as open_log:
            header = open_log.get_file_header()
            # Map header accessor names to human-readable labels.
            properties = OrderedDict(
                [
                    ("major_version", "File version (major)"),
                    ("minor_version", "File version (minor)"),
                    ("is_dirty", "File is dirty"),
                    ("is_full", "File is full"),
                    ("next_record_number", "Next record number"),
                ]
            )
            for key, value in properties.items():
                # Each accessor is a method, so look it up by name and call it.
                print(f"{value}: {getattr(header, key)()}")
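To make these header parameters less opaque, the sketch below reads the same values directly from the first 128 bytes of a file with the standard struct module. The field offsets and flag bits are assumptions based on public descriptions of the format, not on anything python-evtx documents, so verify them before relying on the result. The name read_header_fields is hypothetical.

```python
import struct

# Assumed little-endian layout of the start of the file header:
#   0x00  8s  signature ("ElfFile" plus a NUL byte)
#   0x08  Q   oldest chunk number
#   0x10  Q   current chunk number
#   0x18  Q   next record number
#   0x20  I   header size
#   0x24  H   minor version
#   0x26  H   major version
HEADER_PREFIX = struct.Struct("<8sQQQIHH")
FLAGS_OFFSET = 0x78  # assumed position of the 32-bit flags field
FLAG_DIRTY = 0x1     # assumed meaning: log was not closed cleanly
FLAG_FULL = 0x2      # assumed meaning: log reached its size limit


def read_header_fields(raw):
    """Decode selected header fields from the first 128 bytes of an EVTX file."""
    if len(raw) < 128:
        raise ValueError("need at least 128 bytes of header data")
    magic, _oldest, _current, next_record, _size, minor, major = (
        HEADER_PREFIX.unpack_from(raw, 0)
    )
    if magic != b"ElfFile\x00":
        raise ValueError("missing ElfFile signature")
    (flags,) = struct.unpack_from("<I", raw, FLAGS_OFFSET)
    return {
        "major_version": major,
        "minor_version": minor,
        "is_dirty": bool(flags & FLAG_DIRTY),
        "is_full": bool(flags & FLAG_FULL),
        "next_record_number": next_record,
    }
```

With a real log, pass in the result of reading 128 bytes from a file opened in binary mode. For anything beyond a quick look, the accessors on the python-evtx header object remain the safer choice.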

Iterate over record XML data (EVTX)

In this function, we iterate over the records within an EVTX file and expose the raw XML. It is written as a generator, so records are produced one at a time and the impact on resources stays low.

Additionally, if you would like to parse the XML, or interact with the child elements, you can enable it by setting the parse_xml parameter to True, which will then call the .lxml() method on the individual event record. This requires the installation of the lxml library, as it returns an lxml.etree object that you can interact with.

    import Evtx.Evtx as evtx


    def get_events(input_file, parse_xml=False):
        """Opens a Windows Event Log and returns XML information from
        the event record.

        Arguments:
            input_file (str): Path to evtx file to open
            parse_xml (bool): If True, return an lxml object, otherwise a string

        Yields:
            (generator): XML information in object or string format

        Examples:
            >>> for event_xml in get_events("System.evtx"):
            ...     print(event_xml)
        """
        with evtx.Evtx(input_file) as event_log:
            for record in event_log.records():
                if parse_xml:
                    # Requires lxml; yields a parsed element.
                    yield record.lxml()
                else:
                    # Yields the record as an XML string.
                    yield record.xml()
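If lxml is not available, the strings yielded with parse_xml=False can still be inspected with the standard library. This is a minimal sketch that assumes the record XML uses the default namespace http://schemas.microsoft.com/win/2004/08/events/event. The sample record is a hand-written stand-in, not real log output, and event_id_from_xml is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed default namespace of the <Event> root element.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Hand-written stand-in for one record; real records carry many more fields.
SAMPLE_RECORD = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>7036</EventID>
    <Computer>WORKSTATION01</Computer>
  </System>
  <EventData>
    <Data Name="param1">Example Service</Data>
    <Data Name="param2">running</Data>
  </EventData>
</Event>
"""


def event_id_from_xml(record_xml):
    """Return the EventID text of one record XML string, or None if absent."""
    root = ET.fromstring(record_xml.strip())
    # The namespace prefix "e" is resolved through the NS mapping.
    node = root.find("e:System/e:EventID", NS)
    return node.text if node is not None else None


print(event_id_from_xml(SAMPLE_RECORD))  # prints 7036
```

The same helper could be applied to each string yielded by get_events() to pre-filter records before any heavier processing.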

Filtering records within event logs

Now that we have get_events(), we can begin to perform operations on the newly accessible data. In this function, we extract information from the lxml object, and use that to filter results based on Event ID and other fields within the results. You can easily extend this to support other fields, filters, and return values. Some examples include:

  • Extract all login and logoff events, with their session identifiers, then calculate the session durations
  • Identify PowerShell events and expose arguments for further processing (e.g. Base64 decoding, shellcode analysis)

    def filter_events_json(event_data, event_ids, fields=None):
        """Provide events where the event id is found within the provided list
        of event ids. If found, it will return a JSON formatted object per event.

        If a list of fields is provided, it will filter the resulting JSON event
        object to contain only those fields.

        Arguments:
            event_data (generator): Iterable containing event data as XML.
                Preferably the result of the :func:`get_events()` method.
            event_ids (list): A list of event identifiers. Each element should
                be a string value, even though the identifier is an integer.
            fields (list): Collection of fields from the XML data to include
                in the JSON output. Only supports top-level fields.

        Yields:
            (dict): A dictionary containing the filtered record information

        Example:
            >>> filtered_logins = filter_events_json(
            ...     get_events("System.evtx", parse_xml=True),
            ...     event_ids=['4624', '4625'],
            ...     fields=["SubjectUserName", "SubjectUserSid",
            ...             "SubjectDomainName", "TargetUserName", "TargetUserSid",
            ...             "TargetDomainName", "WorkstationName", "IpAddress",
            ...             "IpPort", "ProcessName"]
            ... )
            >>> for filtered_login in filtered_logins:
            ...     print(json.dumps(filtered_login, indent=2))
        """
        for evt in event_data:
            system_tag = evt.find("System", evt.nsmap)
            event_id = system_tag.find("EventID", evt.nsmap)
            if event_id.text in event_ids:
                # Use a separate name so the event_data argument is not shadowed.
                event_data_tag = evt.find("EventData", evt.nsmap)
                json_data = {}
                # Iterating an lxml element walks its direct children.
                for data in event_data_tag:
                    if not fields or data.attrib["Name"] in fields:
                        # If we don't have a specified field filter list, keep all.
                        # Otherwise keep only those fields within the list.
                        json_data[data.attrib["Name"]] = data.text
                yield json_data
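The first idea in the list, calculating session durations, could be sketched as follows. It assumes logon events (ID 4624) and logoff events (ID 4634) share a TargetLogonId value, and that each input dictionary also carries EventID and an ISO 8601 TimeCreated string. filter_events_json() as written does not emit those two keys, so they are hypothetical additions, as is the name session_durations.

```python
from datetime import datetime

LOGON_ID = "4624"   # assumed logon event ID
LOGOFF_ID = "4634"  # assumed logoff event ID


def session_durations(events):
    """Pair logon and logoff events by TargetLogonId and yield durations.

    `events` is an iterable of dicts in chronological order with the keys
    "EventID", "TargetLogonId" and "TimeCreated" (ISO 8601 string).
    """
    open_sessions = {}
    for event in events:
        logon_id = event["TargetLogonId"]
        timestamp = datetime.fromisoformat(event["TimeCreated"])
        if event["EventID"] == LOGON_ID:
            # Remember when this session started.
            open_sessions[logon_id] = timestamp
        elif event["EventID"] == LOGOFF_ID and logon_id in open_sessions:
            # Close the session and report how long it lasted.
            yield logon_id, timestamp - open_sessions.pop(logon_id)


# Hand-written sample data, not taken from a real log.
sample = [
    {"EventID": "4624", "TargetLogonId": "0x3e7a1",
     "TimeCreated": "2020-01-01T09:00:00"},
    {"EventID": "4634", "TargetLogonId": "0x3e7a1",
     "TimeCreated": "2020-01-01T09:45:30"},
]
for logon_id, duration in session_durations(sample):
    print(logon_id, duration.total_seconds())  # prints 0x3e7a1 2730.0
```

Logoff events without a matching logon are skipped, and sessions that never log off are never reported; a real analysis would probably want to surface both cases.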

Docstring References

filter_events_json(event_data, event_ids, fields=None)

Provide events where the event id is found within the provided list of event ids. If found, it will return a JSON formatted object per event.

If a list of fields is provided, it will filter the resulting JSON event object to contain only those fields.

Arguments:

  • event_data (generator): Iterable containing event data as XML. Preferably the result of the get_events() method.
  • event_ids (list): A list of event identifiers. Each element should be a string value, even though the identifier is an integer.
  • fields (list): Collection of fields from the XML data to include in the JSON output. Only supports top-level fields.

Yields: (dict): A dictionary containing the filtered record information

Example:

    >>> filtered_logins = filter_events_json(
    ...     get_events("System.evtx", parse_xml=True),
    ...     event_ids=['4624', '4625'],
    ...     fields=["SubjectUserName", "SubjectUserSid",
    ...             "SubjectDomainName", "TargetUserName", "TargetUserSid",
    ...             "TargetDomainName", "WorkstationName", "IpAddress",
    ...             "IpPort", "ProcessName"]
    ... )
    >>> for filtered_login in filtered_logins:
    ...     print(json.dumps(filtered_login, indent=2))

get_events(input_file, parse_xml=False)

Opens a Windows Event Log and returns XML information from the event record.

Arguments:

  • input_file (str): Path to evtx file to open
  • parse_xml (bool): If True, return an lxml object, otherwise a string

Yields: (generator): XML information in object or string format

Example:

    >>> for event_xml in get_events("System.evtx"):
    ...     print(event_xml)

open_evtx(input_file)

Opens a Windows Event Log and displays common log parameters.

Arguments:

  • input_file (str): Path to evtx file to open

Example:

    >>> open_evtx("System.evtx")
    File version (major): 3
    File version (minor): 1
    File is dirty: True
    File is full: False
    Next record number: 10549
