Fast csv reader python

juancarlospaco/faster-than-csv

Benchmark Results

  • This CSV Lib is ~300 Lines of Code.
  • Benchmarks run on Docker from Dockerfile on this repo.
  • Speed is the real-world (wall-clock) time to complete 10000 CSV parsings.
  • Lines Of Code counted using CLOC.
  • Direct dependencies of the package when ready to run.
  • Stats as of year 2021.
  • x86_64 64Bit AMD, SSD, Arch Artix Linux.
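As a rough illustration of how a figure like "time to complete 10000 CSV parsings" can be measured, the sketch below times the standard-library csv module with timeit. The sample data and the names parse_once and time_parsings are made up for this example; this is not the benchmark script from the repository.

```python
import csv
import io
import timeit

# Tiny in-memory sample (hypothetical data, not the benchmark file).
SAMPLE = "id,name\n1,Ada\n2,Linus\n"


def parse_once():
    """Parse the sample once with the standard-library csv module."""
    return list(csv.reader(io.StringIO(SAMPLE)))


def time_parsings(repetitions=10000):
    """Return the wall-clock seconds needed for `repetitions` parsings."""
    return timeit.timeit(parse_once, number=repetitions)


elapsed = time_parsings()
print(f"{elapsed:.4f} seconds for 10000 parsings")
```

Swapping parse_once for a call into another CSV library gives a like-for-like comparison on the same machine.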
    import faster_than_csv as csv

    csv.csv2list("example.csv")                       # See Docs for more info. Custom Separators supported.
    csv.csv2json("example.csv", indentation=4)        # CSV to JSON, Pretty-Printed.
    csv.csv2htmltable("example.csv")                  # CSV to HTML+CSS Table (No JavaScript).
    csv.read_clipboard()                              # CSV from the Clipboard.
    csv.diff_csvs("example.csv", "anotherfile.csv")   # Diff optimized for CSVs.

  • Input: CSV, TSV, Clipboard, File, URL, Custom.
  • Output: CSV, TSV, HTML, JSON, NDJSON, Diff, File, Custom.

Description: Takes the path of a CSV file as a string, processes the CSV and returns a list of dictionaries. This is very similar to pandas.read_csv(filename).

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: Data from the CSV, dict type.
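For comparison, the same "path in, list of dictionaries out" behaviour can be sketched with the standard-library csv module. This is only an illustration of the described behaviour, not how faster_than_csv is implemented; the function name read_rows_as_dicts and the sample data are hypothetical.

```python
import csv
import os
import tempfile


def read_rows_as_dicts(csv_file_path, separator=",", quote='"'):
    """Return each data row as a dict keyed by the header row (stdlib sketch)."""
    if not csv_file_path:
        raise ValueError("csv_file_path must not be an empty string")
    with open(csv_file_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=separator, quotechar=quote)
        return [dict(row) for row in reader]


# Build a small throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "example.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        handle.write('name,city\n"Ada","London"\n"Linus","Helsinki"\n')
    rows = read_rows_as_dicts(sample_path)

print(rows)
```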

Description: csv2list takes the path of a CSV file as a string, processes the CSV and returns a list.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: Data from the CSV, list type.

Description: read_clipboard reads a CSV string from the clipboard, processes the CSV and returns a list of dictionaries. This is very similar to pandas.read_clipboard(). It works on Linux, Mac and Windows.

  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: Data from the CSV, dict type.

Description: csv2json takes the path of a CSV file as a string, processes the CSV and returns JSON.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.
  • indentation: Pretty-Printed or Minified JSON output, int type, optional; 0 is Minified, 4 is Pretty-Printed, and any other integer adjusts the indentation.

Returns: Data from the CSV as a JSON string, str type. With indentation 0 the output is minified, single-line and computer-friendly.
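The effect of an indentation argument (0 for minified, a positive integer for pretty-printed) can be sketched with the standard-library csv and json modules. This is an illustration under assumptions, not the library's code: csv_text_to_json is a hypothetical name, and it takes CSV text rather than a file path to keep the example short.

```python
import csv
import io
import json


def csv_text_to_json(csv_text, separator=",", quote='"', indentation=0):
    """Convert CSV text to a JSON array of objects (stdlib sketch)."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator, quotechar=quote)
    rows = [dict(row) for row in reader]
    if indentation > 0:
        # Pretty-printed: one key per line, indented by `indentation` spaces.
        return json.dumps(rows, indent=indentation)
    # Minified: a single line without any optional whitespace.
    return json.dumps(rows, separators=(",", ":"))


minified = csv_text_to_json("a,b\n1,2\n")
pretty = csv_text_to_json("a,b\n1,2\n", indentation=4)
print(minified)
print(pretty)
```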

Description: Takes the path of a CSV file as a string, processes the CSV and writes the data as NDJSON.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • ndjson_file_path: path of the NDJSON file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: None. The data from the CSV is written as NDJSON (https://github.com/ndjson/ndjson-spec) to ndjson_file_path.
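NDJSON is simply one JSON object per line. A minimal standard-library sketch of a CSV-to-NDJSON file conversion follows; the function name csv_to_ndjson_file and the sample file names are assumptions made for this example, and the code is not taken from faster_than_csv.

```python
import csv
import json
import os
import tempfile


def csv_to_ndjson_file(csv_file_path, ndjson_file_path, separator=",", quote='"'):
    """Write one JSON object per CSV data row, one object per line."""
    with open(csv_file_path, newline="", encoding="utf-8") as source, \
            open(ndjson_file_path, "w", encoding="utf-8") as target:
        reader = csv.DictReader(source, delimiter=separator, quotechar=quote)
        for row in reader:
            target.write(json.dumps(row) + "\n")


with tempfile.TemporaryDirectory() as folder:
    csv_path = os.path.join(folder, "example.csv")
    ndjson_path = os.path.join(folder, "example.ndjson")
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("id,name\n1,Ada\n2,Linus\n")
    result = csv_to_ndjson_file(csv_path, ndjson_path)
    with open(ndjson_path, encoding="utf-8") as handle:
        ndjson_lines = handle.read().splitlines()

print(ndjson_lines)
```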

Description: csv2htmltable takes the path of a CSV file as a string, processes the CSV and returns the data rendered as an HTML table.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • html_file_path: path of the HTML file, str type, optional, can be an empty string, defaults to ""; if it is an empty string then no file is written.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.
  • header_html: HTML header, str type, optional, defaults to Bulma CSS, can be an empty string.

Returns: Data from the CSV as HTML Table, str type, raw HTML (no style at all).
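A raw, unstyled HTML table can be built from CSV rows with the standard library alone. The sketch below is only an illustration of the idea: csv_text_to_html_table is a hypothetical name, it works on CSV text instead of a file path, and the exact markup produced by faster_than_csv may differ.

```python
import csv
import io
from html import escape


def csv_text_to_html_table(csv_text, separator=",", quote='"'):
    """Render CSV text as a raw HTML table; the first row becomes the header."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=separator, quotechar=quote))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    parts = ["<table>", "<thead><tr>"]
    # Escape every cell so that "<" or "&" in the data cannot break the markup.
    parts += [f"<th>{escape(cell)}</th>" for cell in header]
    parts += ["</tr></thead>", "<tbody>"]
    for row in body:
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        parts.append(f"<tr>{cells}</tr>")
    parts += ["</tbody>", "</table>"]
    return "".join(parts)


table = csv_text_to_html_table("a,b\n1,<2>\n")
print(table)
```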

Description: Takes the path of a CSV file as a string, processes the CSV and returns the data rendered as a Karax HTML table.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: Karax DSL, str type.

Description: Takes the path of a CSV file as a string, processes the CSV and prints a colored, pretty-printed table to the terminal.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • column_width: column width of the widest column, int type, required, must not be 0, must not be negative.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: None.

Description: Takes the path of a CSV file as a string, processes the CSV and returns a valid XML string. The output is guaranteed to always be valid XML.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator: separator character of the CSV data, str type, optional, defaults to ',', must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.
  • header_xml: XML header of the XML string, str type, optional, can be an empty string, defaults to "\n".

Returns: XML, str type.
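One way to keep CSV-to-XML output well-formed is to let an XML library do the escaping. The sketch below uses xml.etree.ElementTree from the standard library; the element names csv, row and cell, the function name csv_text_to_xml and the empty default for header_xml are all assumptions for this example, not the structure faster_than_csv emits.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_text_to_xml(csv_text, separator=",", quote='"', header_xml=""):
    """Convert CSV text to an XML string; ElementTree escapes the cell text."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=separator, quotechar=quote)
    root = ET.Element("csv")  # hypothetical element names
    for row in reader:
        row_element = ET.SubElement(root, "row")
        for cell in row:
            ET.SubElement(row_element, "cell").text = cell
    return header_xml + ET.tostring(root, encoding="unicode")


xml_text = csv_text_to_xml("a,b\n1,<&>\n")
print(xml_text)
```

Parsing the result back with ET.fromstring is a quick check that special characters in the data did not break the document.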

Description: Takes the path of a CSV file as a string, processes the CSV and returns a TSV.

  • csv_file_path: path of the CSV file, str type, required, must not be an empty string.
  • separator1: separator character of the CSV data, str type, optional, must not be an empty string.
  • separator2: separator character of the CSV data, str type, optional, must not be an empty string.
  • quote: quote character of the CSV data, str type, optional, defaults to '"', must not be an empty string.

Returns: Data from the CSV as TSV, str type.
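Converting between separators amounts to reading with one delimiter and writing with another. Here is a minimal standard-library sketch; the function name convert_separator and the defaults chosen for separator1 (comma) and separator2 (tab) are assumptions, since the defaults are not stated above.

```python
import csv
import io


def convert_separator(csv_text, separator1=",", separator2="\t", quote='"'):
    """Re-emit CSV text using separator2 in place of separator1."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=separator1, quotechar=quote)
    output = io.StringIO()
    writer = csv.writer(output, delimiter=separator2, quotechar=quote, lineterminator="\n")
    # Quotes are re-added only where the new separator would make a cell ambiguous.
    writer.writerows(reader)
    return output.getvalue()


tsv_text = convert_separator('a,b\n"x,y",2\n')
print(tsv_text)
```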

Description: diff_csvs takes the paths of 2 CSV files, processes the CSVs and returns the diff of the 2 files.

  • csv_file_path0: path of the first CSV file, str type, required, must not be an empty string, file must exist.
  • csv_file_path1: path of the second CSV file, str type, required, must not be an empty string, file must exist.

Returns: Diff.
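The format of the returned diff is not described here. As a generic illustration of diffing two CSV files line by line, the sketch below uses difflib from the standard library; diff_text_files is a hypothetical name and its unified-diff output is not necessarily what diff_csvs returns.

```python
import difflib
import os
import tempfile


def diff_text_files(csv_file_path0, csv_file_path1):
    """Return a unified diff of two files, compared line by line."""
    with open(csv_file_path0, encoding="utf-8") as first:
        old_lines = first.readlines()
    with open(csv_file_path1, encoding="utf-8") as second:
        new_lines = second.readlines()
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile="old", tofile="new")
    )


with tempfile.TemporaryDirectory() as folder:
    path0 = os.path.join(folder, "example.csv")
    path1 = os.path.join(folder, "anotherfile.csv")
    with open(path0, "w", encoding="utf-8") as handle:
        handle.write("id,name\n1,Ada\n2,Linus\n")
    with open(path1, "w", encoding="utf-8") as handle:
        handle.write("id,name\n1,Ada\n2,Grace\n")
    difference = diff_text_files(path0, path1)

print(difference)
```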

Instead of having a couple of functions with many arguments that you must provide to make them work, we have tiny functions with very few arguments that each do one thing and do it as fast as possible.

$ ./build-docker.sh
$ ./run-docker.sh
$ ./run-benchmark.sh  # Inside Docker.

win-compile

  • Git Clone and Compile on Windows 10 in just 2 commands!
  • Alternatively you can try Docker for Windows.
  • Alternatively you can try WSL for Windows.
  • The file extension must be .pyd , NOT .dll .

If you don't understand how to install it, you can just download it, extract it and put the files in the same folder as your *.py file, and you are good to go.

Maybe it works on 32-bit, but it is not supported: integer sizes are too small and performance can be worse.

Maybe it works on Python 2, but it is not supported and performance can be worse; we suggest migrating to Python 3.

Functions do not have internal try: except: blocks, so you can wrap them inside try: except: blocks if you need very resilient code.
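A small wrapper is one way to add that resilience around any call. The sketch below is generic: safe_call is a hypothetical helper, and because the exception types raised by the library are not listed here, it catches Exception broadly and returns a fallback value.

```python
def safe_call(function, *args, default=None, **kwargs):
    """Call `function`; on any exception, report it and return `default`."""
    try:
        return function(*args, **kwargs)
    except Exception as error:  # broad on purpose: exact exception types are unknown
        print(f"{function.__name__} failed: {error!r}")
        return default


# Demonstrated with a built-in so the example runs without extra packages.
good = safe_call(int, "7")
bad = safe_call(int, "not a number", default=0)
print(good, bad)
```

With the library installed, a call such as safe_call(csv.csv2list, "example.csv", default=[]) would follow the same pattern.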

Add at the end of the pip install command:

--isolated --disable-pip-version-check --no-cache-dir --no-binary :all:

Unmodified raw output of the Python timeit module.

Please send a Pull Request to Python to improve the output of timeit.

Send Crypto, request features, donate today

BEP20 Binance Smart Chain Network BSC

0xb78c4cf63274bb22f83481986157d234105ac17e 

BTC Bitcoin Network

1Pnf45MgGgY32X4KDNJbutnpx96E4FxqVi 

ERC20 Ethereum Network

0xb78c4cf63274bb22f83481986157d234105ac17e 

TRC20 Tron Network

TWGft53WgWvH2mnqR8ZUXq1GD8M4gZ4Yfu 

SOL Solana Network

FKaPSd8kTUpH7Q76d77toy1jjPGpZSxR4xbhQHyCMSGq 

ADA Cardano Network

DdzFFzCqrht9Y1r4Yx7ouqG9yJNWeXFt69xavLdaeXdu4cQi2yXgNWagzh52o9k9YRh3ussHnBnDrg7v7W2hSXWXfBhbo2ooUKRFMieM 

ALGO Algorand Network

WM54DHVZQIQDVTHMPOH6FEZ4U2AU3OBPGAFTHSCYWMFE7ETKCUUOYAW24Q 
