- headerparser 0.4.0
- The Format
- Installation
- Examples
- dmeranda/httpheader
headerparser 0.4.0
headerparser parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library’s argparse module. (Everyone loves argparse, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included.
The Format
RFC 822-style headers are header fields that follow the general format of e-mail headers as specified by RFC 822 and friends: each field is a line of the form “Name: Value”, with long values continued onto multiple lines (“folded”) by indenting the extra lines. A blank line marks the end of the header section and the beginning of the message body.
This basic grammar has been used by numerous textual formats besides e-mail, including but not limited to:
- HTTP request & response headers
- Usenet messages
- most Python packaging metadata files
- Debian packaging control files
- META-INF/MANIFEST.MF files in Java JARs
- a subset of the YAML serialization format
— all of which this package can parse.
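To make the grammar concrete, here is a minimal standard-library sketch of the rules described above (one "Name: Value" field per line, indented lines folded into the previous value, a blank line before the body). The function name `scan_fields` is hypothetical and this is a simplified reading of the format, not headerparser's actual implementation.

```python
def scan_fields(text):
    """Split RFC 822-style text into (name, value) pairs plus an optional body.

    Illustrative only: a simplified reading of the grammar, not the
    implementation used by headerparser.
    """
    lines = text.splitlines()
    fields = []
    body = None
    for i, line in enumerate(lines):
        if line == "":
            # A blank line ends the header section; the rest is the body.
            body = "\n".join(lines[i + 1:])
            break
        if line[0] in " \t":
            # An indented line is a folded continuation of the previous value.
            if not fields:
                raise ValueError("continuation line before any field")
            name, value = fields[-1]
            fields[-1] = (name, value + "\n" + line)
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a header field: {line!r}")
            fields.append((name.strip(), value.strip()))
    return fields, body
```

Calling `scan_fields("Name: Sample\nTag: a,\n  b\n\nBody text")` yields two fields, with the folded line kept inside the `Tag` value, and the text after the blank line as the body.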
Installation
Just use pip (You have pip, right?) to install headerparser and its dependencies:
    pip install headerparser
Examples
    >>> import headerparser
    >>> parser = headerparser.HeaderParser()
    >>> parser.add_field('Name', required=True)
    >>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example')
    >>> parser.add_field('Public', type=headerparser.BOOL, default=False)
    >>> parser.add_field('Tag', multiple=True)
    >>> parser.add_field('Data')
Parse some headers and inspect the results:
    >>> msg = parser.parse_string('''\
    ... Name: Sample Input
    ... Public: yes
    ... tag: doctest, examples,
    ...  whatever
    ... TAG: README
    ...
    ... Wait, why I am using a body instead of the "Data" field?
    ... ''')
    >>> sorted(msg.keys())
    ['Name', 'Public', 'Tag', 'Type']
    >>> msg['Name']
    'Sample Input'
    >>> msg['Public']
    True
    >>> msg['Tag']
    ['doctest, examples,\n whatever', 'README']
    >>> msg['TYPE']
    'example'
    >>> msg['Data']
    Traceback (most recent call last):
        ...
    KeyError: 'data'
    >>> msg.body
    'Wait, why I am using a body instead of the "Data" field?\n'
Fail to parse headers that don’t meet your requirements:
    >>> parser.parse_string('Type: demonstration')
    Traceback (most recent call last):
        ...
    headerparser.errors.MissingFieldError: Required header field 'Name' is not present
    >>> parser.parse_string('Name: Bad type\nType: other')
    Traceback (most recent call last):
        ...
    headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type'
    >>> parser.parse_string('Name: unknown field\nField: Value')
    Traceback (most recent call last):
        ...
    headerparser.errors.UnknownFieldError: Unknown header field 'Field'
Allow fields you didn’t even think of:
    >>> parser.add_additional()
    >>> msg = parser.parse_string('Name: unknown field\nField: Value')
    >>> msg['Field']
    'Value'
Just split some headers into names & values and worry about validity later:
    >>> for field in headerparser.scan_string('''\
    ... Name: Scanner Sample
    ... Unknown headers: no problem
    ... Unparsed-Boolean: yes
    ... CaSe-SeNsItIvE-rEsUlTs: true
    ... Whitespace around colons:optional
    ... Whitespace around colons : I already said it's optional.
    ...  That means you have the _option_ to use as much as you want!
    ...
    ... And there's a body, too, I guess.
    ... '''): print(field)
    ('Name', 'Scanner Sample')
    ('Unknown headers', 'no problem')
    ('Unparsed-Boolean', 'yes')
    ('CaSe-SeNsItIvE-rEsUlTs', 'true')
    ('Whitespace around colons', 'optional')
    ('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!")
    (None, "And there's a body, too, I guess.\n")
Python module for parsing HTTP headers: Accept with qvalues, byte ranges, etc.
dmeranda/httpheader
Httpheader is a Python module for dealing with HTTP headers and content negotiation. It provides a set of utility functions and classes which properly implement all the details and edge cases of the HTTP 1.1 protocol headers. Httpheader is intended to be used as part of a larger web framework or any application that must deal with HTTP.
In particular, httpheader can handle:
- Byte range requests (multipart/byteranges)
- Content negotiation (content type, language, and all the Accept-* style headers), including full support for priority/qvalue handling
- Content/media type parameters
- Conversion to and from HTTP date and time formats
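As a rough illustration of what priority/qvalue handling involves, the following standard-library sketch ranks the entries of an Accept-style header by their `q` parameter. The function name `rank_by_qvalue` is hypothetical and is not part of httpheader; the sketch also ignores quoted strings and every parameter other than `q`, which a complete implementation would have to handle.

```python
def rank_by_qvalue(header):
    """Return (value, q) pairs from an Accept-style header, best first.

    Simplified sketch: no quoted-string handling, and parameters other
    than q are ignored.
    """
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        value, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # HTTP/1.1 default when no qvalue is given
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        items.append((value, q))
    # sorted() is stable, so entries with equal qvalues keep header order.
    return sorted(items, key=lambda item: item[1], reverse=True)
```

For example, `rank_by_qvalue("text/html;q=0.8, application/json, */*;q=0.1")` places `application/json` first because it carries the implicit default of 1.0.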
There are a few classes defined by this module:
- class content_type — media types such as ‘text/plain’
- class language_tag — language tags such as ‘en-US’
- class range_set — a collection of (byte) range specifiers
- class range_spec — a single (byte) range specifier
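The interfaces of these classes are not shown here, so the sketch below does not use them. It only illustrates, with the standard library, the kind of data a byte range specifier and a range set carry. `ByteRange` and `parse_byte_ranges` are hypothetical names, and validation is kept to a minimum.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ByteRange:
    """One byte range specifier (hypothetical stand-in, not range_spec)."""
    first: Optional[int]  # None for a suffix range such as "-500"
    last: Optional[int]   # None for an open-ended range such as "1000-";
                          # for a suffix range this holds the suffix length


def parse_byte_ranges(header):
    """Parse a value like 'bytes=0-499, 1000-, -500' into ByteRange objects."""
    unit, sep, specs = header.partition("=")
    if not sep or unit.strip().lower() != "bytes":
        raise ValueError(f"unsupported range header: {header!r}")
    ranges = []
    for spec in specs.split(","):
        first, dash, last = spec.strip().partition("-")
        if not dash or (not first and not last):
            raise ValueError(f"bad range specifier: {spec!r}")
        ranges.append(ByteRange(int(first) if first else None,
                                int(last) if last else None))
    return ranges
```

The returned list plays the role of a range set: `parse_byte_ranges("bytes=0-499, 1000-")` gives one closed range and one open-ended range.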
The primary functions in this module may be categorized as follows:
- Content negotiation functions.
- acceptable_content_type()
- acceptable_language()
- acceptable_charset()
- acceptable_encoding()
- parse_accept_header()
- parse_accept_language_header()
- parse_range_header()
- http_datetime()
- parse_http_datetime()
- quote_string()
- remove_comments()
- canonical_charset()
- parse_comma_list()
- parse_comment()
- parse_qvalue_accept_list()
- parse_media_type()
- parse_number()
- parse_parameter_list()
- parse_quoted_string()
- parse_range_set()
- parse_range_spec()
- parse_token()
- parse_token_or_quoted_string()
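The signatures of the functions listed above are not given here, so the following does not call them. As a hedged illustration of the HTTP date conversion that `http_datetime()` and `parse_http_datetime()` are named for, this sketch uses only the standard library's `email.utils`; the wrapper names `to_http_date` and `from_http_date` are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_date(moment):
    """Format an aware UTC datetime in the preferred HTTP date format."""
    # usegmt=True emits "GMT" instead of "+0000"; it requires a UTC datetime.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def from_http_date(text):
    """Parse an HTTP date string back into an aware datetime."""
    return parsedate_to_datetime(text)


stamp = datetime(1994, 11, 6, 8, 49, 37, tzinfo=timezone.utc)
text = to_http_date(stamp)       # 'Sun, 06 Nov 1994 08:49:37 GMT'
round_trip = from_http_date(text)
```

Round-tripping through both helpers returns a datetime equal to the original, which is a convenient sanity check for any date conversion code.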
And there are some specialized exception classes:
See also:
- RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1", June 1999. http://www.ietf.org/rfc/rfc2616.txt (errata at http://purl.org/NET/http-errata)
- RFC 2046, "(MIME) Part Two: Media Types", November 1996. http://www.ietf.org/rfc/rfc2046.txt
- RFC 3066, "Tags for the Identification of Languages", January 2001. http://www.ietf.org/rfc/rfc3066.txt
Complete documentation and additional information are available on the httpheader project homepage.
This module is also registered on the Python Package Index (PyPI) as the package "httpheader", which should make it easy to install into most Python environments.