- How to Merge Dictionaries in Python
- Our Problem
- Possible Solutions
- Multiple update
- Copy and update
- Dictionary constructor
- Keyword arguments hack
- Dictionary comprehension
- Concatenate items
- Union items
- Chain items
- ChainMap
- Why did we put user before defaults?
- Why is there an empty dictionary before user?
- Does this actually give us a dictionary?
- Dictionary from ChainMap
- Dictionary concatenation
- Dictionary unpacking
- Dictionary unioning
- Summary
How to Merge Dictionaries in Python
Have you ever wanted to combine two or more dictionaries in Python?
There are multiple ways to solve this problem: some are awkward, some are inaccurate, and most require multiple lines of code.
Let’s walk through the different ways of solving this problem and discuss which is the most Pythonic.
Our Problem
Before we can discuss solutions, we need to clearly define our problem.
Our code has two dictionaries: user and defaults. We want to combine these two dictionaries into a new dictionary called context.
We have some requirements:
1. user values should override defaults values in cases of duplicate keys
2. keys in defaults and user may be any valid keys
3. the values in defaults and user can be anything
4. defaults and user should not change during the creation of context
5. updates made to context should never alter defaults or user
Note: In requirement 5, we’re focused on updates to the dictionary itself, not to the objects it contains. For concerns about mutability of nested objects, we should look into copy.deepcopy.
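To make that note concrete, here is a small sketch of the difference between a shallow copy and copy.deepcopy. The nested tags list is invented purely for illustration and is not part of our problem:

```python
from copy import deepcopy

# Hypothetical data: the nested list exists only to illustrate the point.
defaults = {'name': "Anonymous User", 'tags': ["python"]}

shallow = defaults.copy()    # new dictionary, same nested list object
deep = deepcopy(defaults)    # new dictionary, independent nested list

# Updating the copy itself never touches defaults...
shallow['name'] = "Trey"

# ...but mutating a contained object is visible through both dictionaries.
shallow['tags'].append("django")

print(defaults['name'])  # unchanged by the assignment on the copy
print(defaults['tags'])  # the appended item shows up here as well
print(deep['tags'])      # unaffected by the append
```

The rest of this article only worries about the first kind of change (updates to the dictionary itself).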
So we want something like this:
>>> user = {'name': "Trey", 'website': "http://treyhunner.com"}
>>> defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
>>> context = merge_dicts(defaults, user)  # magical merge function
>>> context
{'name': 'Trey', 'page_name': 'Profile Page', 'website': 'http://treyhunner.com'}
We’ll also consider whether a solution is Pythonic. This is a very subjective and often illusory measure. Here are a few of the particular criteria we will use:
- The solution should be concise but not terse
- The solution should be readable but not overly verbose
- The solution should be one line if possible so it can be written inline if needed
- The solution should not be needlessly inefficient
Possible Solutions
Now that we’ve defined our problem, let’s discuss some possible solutions.
We’re going to walk through a number of methods for merging dictionaries and discuss which of these methods is the most accurate and which is the most idiomatic.
Multiple update
Here’s one of the simplest ways to merge our dictionaries:
context = {}
context.update(defaults)
context.update(user)
Here we’re making an empty dictionary and using the update method to add items from each of the other dictionaries. Notice that we’re adding defaults first so that any common keys in user will override those in defaults.
All five of our requirements were met so this is accurate. This solution takes three lines of code and cannot be performed inline, but it’s pretty clear.
Copy and update
Alternatively, we could copy defaults and update the copy with user.
context = defaults.copy()
context.update(user)
This solution is only slightly different from the previous one.
For this particular problem, I prefer this solution of copying the defaults dictionary to make it clear that defaults represents default values.
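If we wanted the merge_dicts function from our problem statement, one possible sketch wraps copy-and-update in a function. The implementation and the overrides parameter name are assumptions made here for illustration; the article only names the function:

```python
def merge_dicts(defaults, overrides):
    """Return a new dictionary; values in overrides win on duplicate keys."""
    merged = defaults.copy()   # shallow copy, so defaults is left alone
    merged.update(overrides)   # overrides replace any shared keys
    return merged


user = {'name': "Trey", 'website': "http://treyhunner.com"}
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}

context = merge_dicts(defaults, user)
context['page_name'] = "Settings"   # requirement 5: only context changes

print(context)
print(defaults['page_name'])        # defaults still has its original value
```

Wrapping the two lines in a function also gets us something we can call inline, at the cost of defining a helper.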
Dictionary constructor
We could also pass our dictionary to the dict constructor which will also copy the dictionary for us:
context = dict(defaults)
context.update(user)
This solution is very similar to the previous one, but it’s a little bit less explicit.
Keyword arguments hack
You may have seen this clever answer before, possibly on StackOverflow:
context = dict(defaults, **user)
This is just one line of code. That’s kind of cool. However, this solution is a little hard to understand.
Beyond readability, there’s an even bigger problem: this solution is wrong.
The keys must be strings. In Python 2 (with the CPython interpreter) we can get away with non-strings as keys, but don’t be fooled: this is a hack that only works by accident in Python 2 using the standard CPython runtime.
- Accurate: no. Requirement 2 is not met (keys may be any valid key)
- Idiomatic: no. This is a hack.
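To see the failure for ourselves, here is a short sketch using made-up dictionaries with integer keys. On Python 3, the ** in a function call requires every key to be a string, so the merge raises a TypeError:

```python
# Hypothetical dictionaries with non-string keys (perfectly valid dict keys).
defaults = {1: "one", 2: "two"}
user = {2: "dos", 3: "tres"}

try:
    context = dict(defaults, **user)   # keyword arguments must be strings
except TypeError as error:
    context = None
    print(f"Merge failed: {error}")
```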
Dictionary comprehension
Just because we can, let’s try doing this with a dictionary comprehension:
context = {k: v for d in [defaults, user] for k, v in d.items()}
This works, but it’s a little hard to read.
If we have an unknown number of dictionaries this might be a good idea, but we’d probably want to break our comprehension over multiple lines to make it more readable. In our case of two dictionaries, this doubly-nested comprehension is a little much.
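As a rough sketch of that multi-line form, assuming our dictionaries live in a list ordered from lowest to highest priority (the dicts name and the extra overrides dictionary are invented for this example):

```python
user = {'name': "Trey", 'website': "http://treyhunner.com"}
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
overrides = {'page_name': "Settings"}   # hypothetical third dictionary

# Later dictionaries in the list win on duplicate keys.
dicts = [defaults, user, overrides]

context = {
    key: value
    for d in dicts
    for key, value in d.items()
}

print(context)
```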
Concatenate items
What if we get a list of items from each dictionary, concatenate them, and then create a new dictionary from that?
context = dict(list(defaults.items()) + list(user.items()))
This actually works. We know that the user keys will win out over defaults because those keys come at the end of our concatenated list.
In Python 2 we actually don’t need the list conversions, but we’re working in Python 3 here (you are on Python 3, right?).
Union items
In Python 3, the items method returns a dict_items object, which is a quirky object that supports union operations.
context = dict(defaults.items() | user.items())
That’s kind of interesting. But this is not accurate.
Requirement 1 (user should “win” over defaults) fails because the union of two dict_items objects is a set of key-value pairs and sets are unordered so duplicate keys may resolve in an unpredictable way.
Requirement 3 (the values can be anything) fails because sets require their items to be hashable so both the keys and values in our key-value tuples must be hashable.
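The hashability problem is easy to reproduce. In this sketch the languages key and its list value are invented for illustration; any unhashable value would do:

```python
# Hypothetical data: a list is a valid dictionary value but is not hashable.
defaults = {'name': "Anonymous User", 'languages': ["en"]}
user = {'name': "Trey"}

try:
    # Building the set of (key, value) tuples requires hashing each value.
    context = dict(defaults.items() | user.items())
except TypeError as error:
    context = None
    print(f"Merge failed: {error}")
```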
Side note: I’m not sure why the union operation is even allowed on dict_items objects. What is this good for?
Chain items
So far the most idiomatic way we’ve seen to perform this merge in a single line of code involves creating two lists of items, concatenating them, and forming a dictionary.
We can join our items together more succinctly with itertools.chain:
from itertools import chain
context = dict(chain(defaults.items(), user.items()))
This works well and may be more efficient than creating two unnecessary lists.
ChainMap
A ChainMap allows us to create a new dictionary without even looping over our initial dictionaries (well sort of, we’ll discuss this):
from collections import ChainMap
context = ChainMap({}, user, defaults)
A ChainMap groups dictionaries together into a proxy object (a “view”); lookups query each provided dictionary until a match is found.
This code raises a few questions.
Why did we put user before defaults?
We ordered our arguments this way to ensure requirement 1 was met. The dictionaries are searched in order, so user returns matches before defaults.
Why is there an empty dictionary before user?
This is for requirement 5. Changes to ChainMap objects affect the first dictionary provided and we don’t want user to change so we provided an empty dictionary first.
Does this actually give us a dictionary?
A ChainMap object is not a dictionary but it is a dictionary-like mapping. We may be okay with this if our code practices duck typing, but we’ll need to inspect the features of ChainMap to be sure. Among other features, ChainMap objects are coupled to their underlying dictionaries and they handle removing items in an interesting way.
- Accurate: possibly, we’ll need to consider our use cases
- Idiomatic: yes if we decide this suits our use case
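Here is a small sketch of the behaviors described above: lookup order, where writes go, the coupling to the underlying dictionaries, and how deletion behaves. The replacement values ("Guido", the example.com URL) are made up for the demonstration:

```python
from collections import ChainMap

user = {'name': "Trey", 'website': "http://treyhunner.com"}
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
context = ChainMap({}, user, defaults)

# Lookups search the maps in order, so user wins over defaults.
print(context['name'], context['page_name'])

# Writes land in the first mapping (our empty dictionary), not in user.
context['name'] = "Guido"
print(context.maps[0], user['name'])

# Deleting removes the key from the first mapping only,
# so the value from user becomes visible again.
del context['name']
print(context['name'])

# The ChainMap is a live view: later changes to user show through.
user['website'] = "https://example.com"
print(context['website'])

# A key that isn't in the first mapping can't be deleted through the ChainMap.
try:
    del context['website']
except KeyError:
    delete_failed = True
    print("website is not in the first mapping")
```

Whether this coupling is a feature or a bug depends on whether context is supposed to track later changes to user and defaults.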
Dictionary from ChainMap
If we really want a dictionary, we could convert our ChainMap to a dictionary:
context = dict(ChainMap(user, defaults))
It’s a little odd that user must come before defaults in this code whereas this order was flipped in most of our other solutions. Outside of that oddity, this code is fairly simple and should be clear enough for our purposes.
Dictionary concatenation
What if we simply concatenate our dictionaries?
context = defaults + user
This is cool, but it isn’t valid. This was discussed in a python-ideas thread some time ago.
Some of the concerns brought up in this thread include:
- Maybe | makes more sense than + because dictionaries are like sets
- For duplicate keys, should the left-hand side or right-hand side win?
- Should there be an updated built-in instead (kind of like sorted)?
Dictionary unpacking
Since Python 3.5 (thanks to PEP 448) you can merge dictionaries with the ** operator:
context = {**defaults, **user}
This is simple and Pythonic. There are quite a few symbols, but it’s fairly clear that the output is a dictionary at least.
This is functionally equivalent to our very first solution where we made an empty dictionary and populated it with all items from defaults and user in turn. All of our requirements are met and this is likely the simplest solution we’ll ever get.
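Unlike the keyword arguments hack, unpacking inside a dictionary display is not limited to string keys. A quick sketch with made-up keys of several types:

```python
# Hypothetical dictionaries using integer, tuple, and None keys.
defaults = {1: "one", ('a', 'b'): "tuple key"}
user = {1: "uno", None: "none key"}

context = {**defaults, **user}   # later unpackings win on duplicate keys

print(context)
```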
Dictionary unioning
Since Python 3.9, there’s finally an operator for combining two dictionaries.
The | operator will combine two dictionaries into a new dictionary:
context = defaults | user
The + and | operators were already in use on collections.Counter objects (see my article on counting things in Python) and | on Counter objects worked the same way it now does on all dictionaries.
So why use | instead of **? Two reasons: | is more readable (for new Pythonistas certainly) and it’s also more flexible.
When using the | operator between mappings (dictionary-like objects), the mappings you’re merging have control over what type is returned (usually they’ll maintain the same type).
For example, let’s say we have two defaultdict objects we’d like to merge:
>>> from collections import defaultdict
>>> flags1 = defaultdict(bool, {"purple": True, "blue": False})
>>> flags2 = defaultdict(bool, {"blue": True, "green": False})
Since defaultdict objects are dictionary-like, we can use ** to merge them into a new dictionary:
>>> {**flags1, **flags2}
{'purple': True, 'blue': True, 'green': False}
But note that the returned object is a dictionary: it’s of type dict, not collections.defaultdict.
Since Python 3.9, we can instead use the | operator to merge two defaultdict objects:
>>> flags1 | flags2
defaultdict(<class 'bool'>, {'purple': True, 'blue': True, 'green': False})
Unlike **, using the | operator between mappings will maintain the mapping type.
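The practical consequence shows up when we look up a missing key. This sketch assumes Python 3.9 or later, and the "red" key is made up for the demonstration:

```python
from collections import defaultdict

flags1 = defaultdict(bool, {"purple": True, "blue": False})
flags2 = defaultdict(bool, {"blue": True, "green": False})

unpacked = {**flags1, **flags2}   # plain dict: the default factory is lost
unioned = flags1 | flags2         # defaultdict: the default factory is kept

print(type(unpacked).__name__, type(unioned).__name__)

# A missing key raises KeyError on the plain dict...
try:
    unpacked["red"]
except KeyError:
    plain_dict_raised = True

# ...but falls back to bool() on the merged defaultdict.
red_flag = unioned["red"]
print(red_flag)
```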
Summary
There are a number of ways to combine multiple dictionaries, but there are few elegant ways to do this with just one line of code.
If you’re using Python 3.5 through 3.8, this is the most idiomatic way to merge two dictionaries:
context = {**defaults, **user}
If you’re using Python 3.9 or above, this is the most idiomatic way to merge two dictionaries:
context = defaults | user
Note: If you are particularly concerned with performance, I also measured the performance of these different dictionary merging methods.
If you’re interested in deep-merging this dictionary (merging a dictionary of dictionaries for example), check out this deep merging technique from Mahmoud Hashemi.
- If you’re interested in learning more about the new features of * and ** in Python 3.5 and their history, you may want to read the article I wrote on asterisks in Python.
- This post has been updated to note the | operator that Python 3.9 added
Posted by Trey Hunner, Feb 23rd, 2016 10:00 am, in favorite, python