- How to convert text file into dict?
- 1 Answer
- 4 Ways to Convert text file to dictionary in Python
- 1. Convert a text file to a dictionary in Python using For loop
- 3. JSON Module to Convert text file to dictionary in Python
- To convert the text file to a dict
- 4. pickle.dump() to Convert dictionary to a text file in Python
- Program for Writing and Reading a dictionary to file
- Conclusion
- Parsing file into a dictionary in python
- 7 Answers
- Output
- Python: Create Dictionary from Text/File that’s in Dictionary Format
- 6 Answers
How to convert text file into dict?
This file we are assuming is a txt file. How would I convert something like this into a dictionary? This is what I tried:
    try:
        with open(filename) as f:
            line_dict = {}
            for part in filename:
                key, value = part.split(":")
                line_dict[key] = value
    except Exception as e:
        print e
I get "need more than 1 value to unpack". I’m guessing it’s mad about the extra bracket, right? What would be the best way to go about this and what are some options I can look into?
You can’t parse something unless it follows some format. The data you show is JSON, so json.load(open("filename")) is all you need. In your code, for part in filename: certainly isn’t what you want; part would just be the letters of the filename. Generally, it’s good to know the format of the thing you are trying to parse and to use existing parsers if you can.
If I use json.load I end up with strings with "u" characters and the order looks different from the original file. Is the best solution here to use a for loop to get rid of anything that fits (u')?
Do you need to use 2.x? It has been deprecated for a long time. Those are unicode strings (in Python 3, str is unicode and the separate unicode type no longer exists) and they work much the same as regular strings.
A unicode string like u"employeeID" doesn’t really have a "u" on the front. It’s Python’s way of telling you it’s unicode. It may work perfectly fine in your application. If not, you could parse with json, then go through the result and convert everything to str by encoding.
1 Answer
Since the data is in JSON format, you can use the json module to parse it into a Python dictionary. In Python 2, plain strings aren’t Unicode, so you’ll also need to convert all the Unicode strings that json returns into plain str objects.
Here’s how to do that using a helper function. Note that the order of items in the result may not be the same as the input because in Python 2, dictionaries don’t preserve the order-of-insertion.
    import json
    from pprint import pprint
    import sys

    if sys.version_info[0] > 2:
        raise RuntimeError('Requires Python 2')

    def unicode_convert(obj):
        """ Convert unicode data to string in object and return it. """
        if isinstance(obj, unicode):
            obj = obj.encode()
        elif isinstance(obj, list):
            obj = map(unicode_convert, obj)
        elif isinstance(obj, dict):  # Converts contents in-place.
            for key, value in obj.items():  # Copy of dict's (key, value) pairs (Py2)
                del obj[key]
                if isinstance(key, unicode):
                    key = key.encode()
                if isinstance(value, (unicode, list, dict)):
                    value = unicode_convert(value)
                obj[key] = value
        return obj

    with open('personal_info.json') as file:
        info = json.load(file)  # Parse JSON object into Python dictionary.
        line_dict = unicode_convert(info)

    pprint(line_dict)
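For comparison, here is a minimal Python 3 sketch of the same task. In Python 3, json.load already returns str values, and from Python 3.7 on a dict keeps its insertion order, so no conversion helper is needed. The file contents below are hypothetical, since the original data is not shown, and the file is written to a temporary directory only to keep the sketch self-contained.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the contents of personal_info.json.
sample_text = '{"employeeID": "A100", "name": "Ann", "skills": ["python", "sql"]}'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "personal_info.json")

    # Create the sample file so the example does not depend on external data.
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample_text)

    # Parse the JSON object into a Python dictionary.
    with open(path, encoding="utf-8") as f:
        info = json.load(f)

print(info)
print(list(info))  # keys appear in the same order as in the file
```

If the data really is JSON, this is usually all that is required; the manual split(":") loop from the question is only worth considering for formats that no existing parser handles.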
4 Ways to Convert text file to dictionary in Python
In this article, we are going to learn about 4 ways to convert a text file to a dictionary in Python. This is a helpful exercise for programmers who work with data files. Data analysts and data scientists often have large amounts of data in files and need to convert that data into Python objects, which makes it easier to handle the data in an organized manner.
We will look at one use case: converting text file data to a Python dictionary object. To understand this, we will take a sample text file, read its content, and load it into a Python dictionary.
1. Convert a text file to a dictionary in Python using For loop
How does it work
- First, we will create an empty dictionary that will hold the file data in the form of dictionary data. Then we will open the file ‘lang.txt’ to start reading the content of the file.
- Now we have the file open and next we will read the data line by line from this file. Once we read each line then we will use the split function to split the line contents.
The line content is then assigned to the key and value pair of the dictionary. Since we are using the for loop to iterate the lines of a file and then create the dictionary items, this will keep going until we reach the end of the file. Consider the example below to convert the data content to a dictionary.
    dictionary = {}
    with open("lang.txt") as file:
To start, assume we have the following text file (lang.txt), which contains this data:
    myline = data
    myline2 = This is data
    myline3 = Dict_Keydata
    myline4 = Dict_datavalue
Let’s understand with an example how to convert this text file into a dictionary.
    data_dict = {}
    with open("lang.txt", 'r') as myfile:
        for line in myfile:
            # split at the first '=' so the key and value can be separated
            k, v = line.strip().split('=', 1)
            data_dict[k.strip()] = v.strip()
    print(' text file to dictionary =\n ', data_dict)
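The loop above raises a ValueError as soon as a line has no "=" in it, for example a blank line. A more forgiving variant can be sketched with str.partition, which always returns three parts. The sketch below is only an illustration: it reads from an io.StringIO object instead of lang.txt, and the blank line, the line without a delimiter, and the "a = b" value are hypothetical additions to the sample data.

```python
import io

# Stand-in for lang.txt: two lines from the sample data plus three
# hypothetical awkward lines added for illustration.
sample_file = io.StringIO(
    "myline = data\n"
    "myline2 = This is data\n"
    "\n"
    "no delimiter here\n"
    "myline3 = a = b\n"
)

data_dict = {}
for line in sample_file:
    # partition splits at the first "=" only and never raises
    key, sep, value = line.partition("=")
    if not sep:
        continue  # no "=" on this line, so there is nothing to store
    data_dict[key.strip()] = value.strip()

print(data_dict)
```

Skipping malformed lines silently is a design choice; depending on the data, logging or raising an error for such lines may be more appropriate.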
3. JSON Module to Convert text file to dictionary in Python
JSON is a language-independent text format. It is easy for people to read and write, and easy for machines to parse and generate. Here we use the json module’s dumps() method.
- Import the JSON Module.
- Open the file in write mode using a with statement: with open("DictFile.txt", "w") as file:
- Write the dictionary to the file using file.write(json.dumps(dict_students)).
To convert the text file to a dict
- Open the file in read mode using a with statement: with open("DictFile.txt", "r") as file:
- Read the file contents using file_content = file.read(). This returns a string; pass it to json.loads() to get a dictionary back.
- Print the file contents using the print() function.
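    import json

    dict_students = {}  # placeholder: the sample entries are not shown here

    # writing the dictionary to the file as JSON
    with open("DictFile.txt", "w") as file:
        file.write(json.dumps(dict_students))

    # reading the json file
    with open("DictFile.txt", "r") as file:
        file_content = file.read()
        print('file contents:', file_content)

    # converting the text back into a dictionary
    dict_from_file = json.loads(file_content)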
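The write and read steps can also be sketched as a complete round trip with json.dump() and json.load(), which work directly on the file object. The student names and marks below are hypothetical placeholder data, and the file is created in a temporary directory rather than the current directory so that the example cleans up after itself.

```python
import json
import os
import tempfile

# Hypothetical sample data; the names and marks are placeholders.
dict_students = {"Rack": 90, "Max": 85, "Sam": 78}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "DictFile.txt")

    # Write the dictionary to the text file as JSON.
    with open(path, "w", encoding="utf-8") as file:
        json.dump(dict_students, file)

    # Read the file back and parse it into a dictionary again.
    with open(path, "r", encoding="utf-8") as file:
        restored = json.load(file)

print(restored)
print(type(restored))
```

One thing to keep in mind: JSON object keys are always strings, so a dictionary with integer keys comes back with string keys after the round trip.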
4. pickle.dump() to Convert dictionary to a text file in Python
First, open the file in write mode using "wb". This mode opens the file for writing in binary format. Then use pickle.dump() to serialize the dictionary and write it to the file.
To read the file, open it in binary read mode ("rb"), then use the pickle.load() method to deserialize the file contents. To check the output, we print the file contents using the print() function.
Program for Writing and Reading a dictionary to file
    # Python 3 program to write and read a dictionary to a file
    import pickle

    dict_students = {}  # placeholder: the sample entries are not shown here

    with open("DictFile.pkl", "wb") as file:
        pickle.dump(dict_students, file)

    # reading the "DictFile.pkl" contents
    with open("DictFile.pkl", "rb") as file:
        file_contents = pickle.load(file)
    print(file_contents)
The file will get created in the current directory with the following data format.
Conclusion
We have explored 4 ways to convert a text file to a dictionary in Python. In the second approach, we use a delimiter to separate each key-value pair. We can use any of these approaches, depending on our requirements.
Parsing file into a dictionary in python
The number of strings between ClutchXXX and the next ClutchXXX may vary, but it is never zero. I was wondering if it’s possible to take a specific string from a file and use it as a key (in my case it would be ClutchXXX), with the text up to the next occurrence of that string as the value for a dictionary. I want to get a dictionary like this:
I am mostly interested in the part where we take string pattern and save it as a key and the text after as a value. Any suggestions or directions to a useful approach would be appreciated.
Is it possible that the word Clutch will appear in any other line? If not, you could use .split('Clutch')
See my answer below. No need for regex as long as the alphabetic part of the keyword ("Clutch") doesn’t appear elsewhere.
7 Answers
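    import re
    from functools import partial
    from itertools import groupby
    from pprint import pprint

    # key returns a match object for "ClutchNNN" lines and None for all others
    key = partial(re.match, r'Clutch\d\d\d')

    with open('foo.txt') as f:
        # each key line forms its own group, followed by a group of its value lines
        groups = (', '.join(map(str.strip, g)) for k, g in groupby(f, key=key))
        # pair each key group with the value group that follows it
        pprint(dict(zip(*[iter(groups)]*2)))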
Collect the lines in lists, storing that list in a dictionary at the same time:
    d = {}
    values = None
    with open(filename) as inputfile:
        for line in inputfile:
            line = line.strip()
            if line.startswith('Clutch'):
                values = d[line] = []
            else:
                values.append(line)
It’s easy enough to turn all those lists into single strings though, after loading the file:
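One way to do that join is a dict comprehension, sketched below. The starting dictionary is written out by hand from the first two clutches of the sample data, standing in for the result of the loop above.

```python
# Stand-in for the dictionary built by the loop above, with list values.
d = {
    "Clutch001": ["Albino X Pastel", "Bumble Bee X Albino Lesser"],
    "Clutch002": ["Bee X Fire Bee", "Albino Cinnamon X Albino", "Mojave X Bumble Bee"],
}

# Join each list of lines into a single comma-separated string.
d = {key: ", ".join(values) for key, values in d.items()}

print(d["Clutch001"])
```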
You can also do the joining as you read the file; I’d use a generator function to process the file in groups:
    def per_clutch(inputfile):
        clutch = None
        lines = []
        for line in inputfile:
            line = line.strip()
            if line.startswith('Clutch'):
                if lines:
                    yield clutch, lines
                clutch, lines = line, []
            else:
                lines.append(line)
        if clutch and lines:
            yield clutch, lines
then just slurp all groups into a dictionary:
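    with open(filename) as inputfile:
        # join each group of lines into one string while building the dictionary
        d = {clutch: ', '.join(lines) for clutch, lines in per_clutch(inputfile)}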
    >>> def per_clutch(inputfile):
    ...     clutch = None
    ...     lines = []
    ...     for line in inputfile:
    ...         line = line.strip()
    ...         if line.startswith('Clutch'):
    ...             if lines:
    ...                 yield clutch, lines
    ...             clutch, lines = line, []
    ...         else:
    ...             lines.append(line)
    ...     if clutch and lines:
    ...         yield clutch, lines
    ...
    >>> sample = '''\
    ... Clutch001
    ... Albino X Pastel
    ... Bumble Bee X Albino Lesser
    ... Clutch002
    ... Bee X Fire Bee
    ... Albino Cinnamon X Albino
    ... Mojave X Bumble Bee
    ... Clutch003
    ... Black Pastel X Banana Ghost Lesser
    ... '''.splitlines(True)
    >>>
    >>> from pprint import pprint
    >>> pprint(_)
@BallPython: then your first line does not start with ‘Clutch’ ; only when a line starting with ‘Clutch’ is encountered is values set to a list.
plus one, values = d[line] = [] is pretty amazing. I would use your first code but what good are the other approaches, they don’t match the simplicity of the first one
As noted in comments, if "Clutch" (or whatever keyword) can be relied on not to appear in the non-keyword lines, you could use the following:
    keyword = "Clutch"
    with open(filename) as inputfile:
        t = inputfile.read()
        d =
This reads the whole file in to memory at once, so should be avoided if your file may get very large.
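A minimal sketch of this split-on-keyword idea follows. It is one possible way to build the dictionary, not necessarily the exact expression from the answer, and it uses the sample data from the question as an inline string in place of inputfile.read().

```python
keyword = "Clutch"

# Stand-in for inputfile.read(), using the sample data from the question.
t = (
    "Clutch001\nAlbino X Pastel\nBumble Bee X Albino Lesser\n"
    "Clutch002\nBee X Fire Bee\nAlbino Cinnamon X Albino\nMojave X Bumble Bee\n"
    "Clutch003\nBlack Pastel X Banana Ghost Lesser\n"
)

d = {}
# Splitting on the keyword leaves an empty first element, dropped by [1:].
for chunk in t.split(keyword)[1:]:
    lines = chunk.splitlines()
    # The first line of each chunk is the rest of the key, e.g. "001".
    key = keyword + lines[0].strip()
    # The remaining non-empty lines make up the value.
    d[key] = ", ".join(line.strip() for line in lines[1:] if line.strip())

print(d)
```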
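You could use re.split() to enumerate "Clutch" parts in the file:
    import re

    # `file` is assumed to be an already opened text file
    tokens = iter(re.split(r'(^Clutch\d{3}\s*$)\s+', file.read(), flags=re.M))
    next(tokens)  # skip until the first Clutch
    # pair each key with the text that follows it
    print({key: value.splitlines() for key, value in zip(tokens, tokens)})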
Output
Let the file 'file.txt' contain:
    Clutch001
    Albino X Pastel
    Bumble Bee X Albino Lesser
    Clutch002
    Bee X Fire Bee
    Albino Cinnamon X Albino
    Mojave X Bumble Bee
    Clutch003
    Black Pastel X Banana Ghost Lesser
To get your dictionary, try this:
    import re

    with open('file.txt', 'r') as f:
        result = re.split(
            r'(Clutch\d{3}).*?',
            f.read(),
            flags=re.DOTALL  # including '\n'
        )[1:]
    # result is ['Clutch001', '\nAlbino X Pastel\nBumble Bee X Albino Lesser\n', 'Clutch002', '\nBee X Fire Bee\nAlbino Cinnamon X Albino\nMojave X Bumble Bee\n', 'Clutch003', '\nBlack Pastel X Banana Ghost Lesser\n']

    keys = result[::2]
    # keys is ['Clutch001', 'Clutch002', 'Clutch003']

    values = result[1::2]
    # values is ['\nAlbino X Pastel\nBumble Bee X Albino Lesser\n', '\nBee X Fire Bee\nAlbino Cinnamon X Albino\nMojave X Bumble Bee\n', '\nBlack Pastel X Banana Ghost Lesser\n']

    values = [value.strip().replace('\n', ', ') for value in values]
    # values is ['Albino X Pastel, Bumble Bee X Albino Lesser', 'Bee X Fire Bee, Albino Cinnamon X Albino, Mojave X Bumble Bee', 'Black Pastel X Banana Ghost Lesser']

    d = dict(zip(keys, values))
    # d is {'Clutch001': 'Albino X Pastel, Bumble Bee X Albino Lesser', 'Clutch002': 'Bee X Fire Bee, Albino Cinnamon X Albino, Mojave X Bumble Bee', 'Clutch003': 'Black Pastel X Banana Ghost Lesser'}
Python: Create Dictionary from Text/File that’s in Dictionary Format
I’d like to create a dictionary from a text file that I have, whose contents are in a ‘dictionary’ format. Here’s a sample of what the file contains:
It’s exactly this except it contains 125,000 entries. I am able to read in the text file using read(), but it creates a variable of the literal text of the file even when I initialize the variable with
@JBernardo +1 as long as you have Python 2.6 or newer, that’s the way to go. The ast module was introduced in 2.5, but didn’t have the helper functions (such as literal_eval ). Those came in 2.6.
6 Answers
You can use the eval built-in. For example, this would work if each dictionary entry is on a different line:
    dicts_from_file = []
    with open('myfile.txt','r') as inf:
        for line in inf:
            dicts_from_file.append(eval(line))
    # dicts_from_file now contains the dictionaries created from the text file
Alternatively, if the file is just one big dictionary (even on multiple lines), you can do this:
    with open('myfile.txt','r') as inf:
        dict_from_file = eval(inf.read())
This is probably the most simple way to do it, but it’s not the safest. As others mentioned in their answers, eval has some inherent security risks. The alternative, as mentioned by JBernardo, is to use ast.literal_eval which is much safer than eval since it will only evaluate strings which contain literals. You can simply replace all the calls to eval in the above examples with ast.literal_eval after importing the ast module.
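As a rough illustration of that replacement, the sketch below parses a dictionary literal with ast.literal_eval and shows that a non-literal expression is rejected rather than executed. The text is a hypothetical stand-in for the result of inf.read(), since the real file contents are not shown.

```python
import ast

# Hypothetical file contents: one dictionary literal spread over several lines.
text = """{
    'first': 1,
    'second': [2, 3],
    'third': {'nested': True},
}"""

dict_from_text = ast.literal_eval(text)
print(dict_from_text)

# An expression that is not a plain literal is rejected instead of executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

For the one-dictionary-per-line case, the same call can be applied to each line inside the loop.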
If you’re using Python 2.4 you are not going to have the ast module, and you’re not going to have with statements. The code will look more like this:
    inf = open('myfile.txt','r')
    dict_from_file = eval(inf.read())
    inf.close()
Don’t forget to call inf.close() . The beauty of with statements is they do it for you, even if the code block in the with statement raises an exception.