Python concatenate two dicts

Contents

How to Merge Dictionaries in Python
Our Problem
Possible Solutions
Multiple update
Copy and update
Dictionary constructor
Keyword arguments hack
Dictionary comprehension
Concatenate items
Union items
Chain items
ChainMap
Why did we put user before defaults?
Why is there an empty dictionary before user?
Does this actually give us a dictionary?
Dictionary from ChainMap
Dictionary concatenation
Dictionary unpacking
Dictionary unioning
Summary

How to Merge Dictionaries in Python

Have you ever wanted to combine two or more dictionaries in Python?

There are multiple ways to solve this problem: some are awkward, some are inaccurate, and most require multiple lines of code.

Let’s walk through the different ways of solving this problem and discuss which is the most Pythonic.

Our Problem

Before we can discuss solutions, we need to clearly define our problem.

Our code has two dictionaries: user and defaults. We want to combine these two dictionaries into a new dictionary called context.

We have some requirements:

1. user values should override defaults values in cases of duplicate keys
2. keys in defaults and user may be any valid keys
3. the values in defaults and user can be anything
4. defaults and user should not change during the creation of context
5. updates made to context should never alter defaults or user

Note: In the fifth requirement, we’re focused on updates to the dictionary itself, not to the objects it contains. For concerns about mutability of nested objects, we should look into copy.deepcopy.
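The difference the note points at can be seen in a small sketch. The nested tags list below is an invented value added only for illustration, and the copy-then-update merge is just one simple shallow merge; the point is that a shallow merge shares nested objects while copy.deepcopy does not.

```python
import copy

# Hypothetical data: the nested list is an assumption added for illustration.
defaults = {'name': "Anonymous User", 'tags': ["new"]}
user = {'name': "Trey"}

# A shallow merge: top-level keys are copied, nested objects are shared.
shallow = defaults.copy()
shallow.update(user)
shallow['tags'].append("shared")  # mutates the list that defaults also holds

# A deep copy duplicates nested objects too, so later mutation stays isolated.
deep = copy.deepcopy(defaults)
deep.update(user)
deep['tags'].append("isolated")

print(defaults['tags'])  # ['new', 'shared']
```

Requirement 5 is still satisfied by the shallow version, since assigning or deleting keys on the merged dictionary never touches defaults; only in-place mutation of a shared nested object leaks through.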

So we want something like this:

>>> user = {'name': "Trey", 'website': "http://treyhunner.com"}
>>> defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
>>> context = merge_dicts(defaults, user)  # magical merge function
>>> context
{'name': 'Trey', 'page_name': 'Profile Page', 'website': 'http://treyhunner.com'}
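To compare candidates fairly, it may help to turn some of the requirements into a small check. The sketch below is not part of the original problem statement: check_merge is a hypothetical helper name, and it only exercises requirements 1, 4, and 5 against the example data above.

```python
def check_merge(merge):
    """Run a candidate merge function against requirements 1, 4 and 5."""
    user = {'name': "Trey", 'website': "http://treyhunner.com"}
    defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
    user_before, defaults_before = dict(user), dict(defaults)

    context = merge(defaults, user)

    # Requirement 1: user values win on duplicate keys.
    assert context['name'] == "Trey"
    assert context['page_name'] == "Profile Page"
    # Requirement 4: the inputs are unchanged by the merge itself.
    assert user == user_before and defaults == defaults_before
    # Requirement 5: updating the result does not leak into the inputs.
    context['name'] = "Someone Else"
    assert user == user_before and defaults == defaults_before
    return context
```

Any function that takes (defaults, user) and returns the merged mapping can be passed in; a failing requirement surfaces as an AssertionError.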

We’ll also consider whether a solution is Pythonic. This is a very subjective and often illusory measure. Here are a few of the particular criteria we will use:

The solution should be concise but not terse
The solution should be readable but not overly verbose
The solution should be one line if possible so it can be written inline if needed
The solution should not be needlessly inefficient

Possible Solutions

Now that we’ve defined our problem, let’s discuss some possible solutions.

We’re going to walk through a number of methods for merging dictionaries and discuss which of these methods is the most accurate and which is the most idiomatic.

Multiple update

Here’s one of the simplest ways to merge our dictionaries:

context = {}
context.update(defaults)
context.update(user)

Here we’re making an empty dictionary and using the update method to add items from each of the other dictionaries. Notice that we’re adding defaults first so that any common keys in user will override those in defaults.

All five of our requirements were met so this is accurate. This solution takes three lines of code and cannot be performed inline, but it’s pretty clear.
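Because this approach cannot be written inline, one option is to wrap it in a function. The sketch below assumes a hypothetical helper name, merge_with_update, and extends the same idea to any number of dictionaries.

```python
def merge_with_update(*dicts):
    """Merge any number of dictionaries; later ones win on duplicate keys."""
    context = {}
    for d in dicts:
        context.update(d)
    return context


defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}

context = merge_with_update(defaults, user)
print(context['name'])  # Trey
```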

Copy and update

Alternatively, we could copy defaults and update the copy with user.

context = defaults.copy()
context.update(user)

This solution is only slightly different from the previous one.

For this particular problem, I prefer this solution of copying the defaults dictionary to make it clear that defaults represents default values.

Dictionary constructor

We could also pass our dictionary to the dict constructor which will also copy the dictionary for us:

context = dict(defaults)
context.update(user)

This solution is very similar to the previous one, but it’s a little bit less explicit.

Keyword arguments hack

You may have seen this clever answer before, possibly on StackOverflow:

context = dict(defaults, **user)

This is just one line of code. That’s kind of cool. However, this solution is a little hard to understand.

Beyond readability, there’s an even bigger problem: this solution is wrong.

The keys must be strings. In Python 2 (with the CPython interpreter) we can get away with non-strings as keys, but don’t be fooled: this is a hack that only works by accident in Python 2 using the standard CPython runtime.

Accurate: no. Requirement 2 is not met (keys may be any valid key)
Idiomatic: no. This is a hack.
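A short sketch makes the failure concrete. The integer key below is an invented example of a valid but non-string key; on Python 3 the call is rejected with a TypeError.

```python
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}

# With string keys only, the hack produces the expected merge.
string_keys = dict(defaults, **{'name': "Trey"})
print(string_keys['name'])  # Trey

# Hypothetical user dict with an integer key, which requirement 2 allows.
user = {'name': "Trey", 42: "answer"}

error = None
try:
    context = dict(defaults, **user)
except TypeError as exc:
    error = exc  # Python 3 refuses non-string keyword names

print(type(error).__name__)  # TypeError
```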

Dictionary comprehension

Just because we can, let’s try doing this with a dictionary comprehension:

context = {k: v for d in [defaults, user] for k, v in d.items()}

This works, but this is a little hard to read.

If we have an unknown number of dictionaries this might be a good idea, but we’d probably want to break our comprehension over multiple lines to make it more readable. In our case of two dictionaries, this doubly-nested comprehension is a little much.
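Here is one way the multi-line version might look for a list of dictionaries. The third dictionary, overrides, is a made-up addition to show that the pattern scales past two inputs.

```python
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}
overrides = {'page_name': "Settings"}  # hypothetical third dictionary

dicts = [defaults, user, overrides]

context = {
    key: value
    for d in dicts                # earlier dictionaries are visited first
    for key, value in d.items()   # later duplicates overwrite earlier ones
}

print(context['page_name'])  # Settings
```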

Concatenate items

What if we get a list of items from each dictionary, concatenate them, and then create a new dictionary from that?

context = dict(list(defaults.items()) + list(user.items()))

This actually works. We know that the user keys will win out over defaults because those keys come at the end of our concatenated list.

In Python 2 we actually don’t need the list conversions, but we’re working in Python 3 here (you are on Python 3, right?).

Union items

In Python 3, the items method returns a dict_items object, a quirky set-like view that supports union operations.

context = dict(defaults.items() | user.items())

That’s kind of interesting. But this is not accurate.

Requirement 1 (user should “win” over defaults) fails because the union of two dict_items objects is a set of key-value pairs and sets are unordered, so duplicate keys may resolve in an unpredictable way.

Requirement 3 (the values can be anything) fails because sets require their items to be hashable so both the keys and values in our key-value tuples must be hashable.
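Both failures can be observed directly. In this sketch the tags list is an invented unhashable value; the rest is the example data from the problem statement.

```python
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}

# The union is a set of (key, value) tuples. Both 'name' pairs survive,
# so which one dict() keeps depends on the set's iteration order.
pairs = defaults.items() | user.items()
name_pairs = {pair for pair in pairs if pair[0] == 'name'}
print(len(name_pairs))  # 2

# Unhashable values cannot be placed in a set at all.
unhashable_error = None
try:
    {'tags': ["a", "b"]}.items() | user.items()
except TypeError as exc:
    unhashable_error = exc

print(type(unhashable_error).__name__)  # TypeError
```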

Side note: I’m not sure why the union operation is even allowed on dict_items objects. What is this good for?

Chain items

So far the most idiomatic way we’ve seen to perform this merge in a single line of code involves creating two lists of items, concatenating them, and forming a dictionary.

We can join our items together more succinctly with itertools.chain:

from itertools import chain
context = dict(chain(defaults.items(), user.items()))

This works well and may be more efficient than creating two unnecessary lists.
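If the number of dictionaries is not known ahead of time, chain.from_iterable can take over. This is a sketch with a hypothetical function name, merge_chained, rather than anything the approach requires.

```python
from itertools import chain


def merge_chained(*dicts):
    """Merge any number of dicts without building intermediate lists."""
    # Items are produced lazily, in order, so later dictionaries win.
    return dict(chain.from_iterable(d.items() for d in dicts))


defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}

context = merge_chained(defaults, user)
print(context['name'])  # Trey
```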

ChainMap

A ChainMap allows us to create a new dictionary without even looping over our initial dictionaries (well sort of, we’ll discuss this):

from collections import ChainMap
context = ChainMap({}, user, defaults)

A ChainMap groups dictionaries together into a proxy object (a “view”); lookups query each provided dictionary until a match is found.

This code raises a few questions.

Why did we put user before defaults?

We ordered our arguments this way to ensure requirement 1 was met. The dictionaries are searched in order, so user returns matches before defaults.

Why is there an empty dictionary before user?

This is for requirement 5. Changes to ChainMap objects affect the first dictionary provided and we don’t want user to change so we provided an empty dictionary first.

Does this actually give us a dictionary?

A ChainMap object is not a dictionary but it is a dictionary-like mapping. We may be okay with this if our code practices duck typing, but we’ll need to inspect the features of ChainMap to be sure. Among other features, ChainMap objects are coupled to their underlying dictionaries and they handle removing items in an interesting way.

Accurate: possibly, we’ll need to consider our use cases
Idiomatic: yes if we decide this suits our use case
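The coupling and the unusual deletion behavior mentioned above are easier to judge with a concrete run. This sketch uses the example data from the problem statement; the replacement URL is a placeholder chosen for illustration.

```python
from collections import ChainMap

defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}
context = ChainMap({}, user, defaults)

# Writes land in the first mapping only, so user and defaults are untouched.
context['name'] = "Guest"
print(context.maps[0])  # {'name': 'Guest'}
print(user['name'])     # Trey

# The view is coupled to its sources: later changes to user show through.
user['website'] = "https://example.com"  # placeholder URL
print(context['website'])  # https://example.com

# Deleting only affects the first mapping; the key then falls through.
del context['name']
print(context['name'])  # Trey

# A key that lives only in user or defaults cannot be deleted via context.
try:
    del context['page_name']
except KeyError:
    print("page_name is not in the first mapping")
```

Whether this counts as meeting requirement 5 depends on how we read it: updates to context never alter user or defaults, but changes to user or defaults do alter what context reports.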

Dictionary from ChainMap

If we really want a dictionary, we could convert our ChainMap to a dictionary:

context = dict(ChainMap(user, defaults))

It’s a little odd that user must come before defaults in this code whereas this order was flipped in most of our other solutions. Outside of that oddity, this code is fairly simple and should be clear enough for our purposes.
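A brief sketch of why no empty dictionary is needed this time: dict() copies the keys and values out of the ChainMap, so the result is decoupled from user and defaults. The replacement URL is a placeholder.

```python
from collections import ChainMap

defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}

context = dict(ChainMap(user, defaults))

# The result is a plain, independent dictionary.
context['name'] = "Guest"
user['website'] = "https://example.com"  # placeholder URL

print(type(context).__name__)  # dict
print(user['name'])            # Trey
print(context['website'])      # http://treyhunner.com
```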

Dictionary concatenation

What if we simply concatenate our dictionaries?

context = defaults + user

This is cool, but it isn’t valid. This was discussed in a python-ideas thread some time ago.

Some of the concerns brought up in this thread include:

Maybe | makes more sense than + because dictionaries are like sets
For duplicate keys, should the left-hand side or right-hand side win?
Should there be an updated built-in instead (kind of like sorted)?
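To make the discussion concrete, the sketch below shows that + fails on dictionaries and then outlines what an updated function in the spirit of sorted might look like. No such built-in exists; updated here is purely a hypothetical helper.

```python
defaults = {'name': "Anonymous User", 'page_name': "Profile Page"}
user = {'name': "Trey", 'website': "http://treyhunner.com"}

# Dictionaries do not support +, so this raises TypeError.
plus_error = None
try:
    context = defaults + user
except TypeError as exc:
    plus_error = exc

print(type(plus_error).__name__)  # TypeError


def updated(mapping, other):
    """Hypothetical helper modeled on sorted(): return a new merged dict."""
    result = dict(mapping)
    result.update(other)  # the right-hand side wins on duplicate keys
    return result


context = updated(defaults, user)
print(context['name'])  # Trey
```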

Dictionary unpacking

Since Python 3.5 (thanks to PEP 448) you can merge dictionaries with the ** operator:

context = {**defaults, **user}

This is simple and Pythonic. There are quite a few symbols, but it’s fairly clear that the output is a dictionary at least.

This is functionally equivalent to our very first solution where we made an empty dictionary and populated it with all items from defaults and user in turn. All of our requirements are met and this is likely the simplest solution we’ll ever get.
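As a quick check of that equivalence, the sketch below compares unpacking with the update-based approach. The non-string keys and the third dictionary, overrides, are invented to show that requirement 2 holds and that the syntax extends past two dictionaries.

```python
# Hypothetical data with non-string keys, which requirement 2 allows.
defaults = {'name': "Anonymous User", 'page_name': "Profile Page", 1: "one"}
user = {'name': "Trey", ('x', 'y'): "tuple key"}
overrides = {'page_name': "Settings"}  # hypothetical third dictionary

# Unpacking: later dictionaries win on duplicate keys.
context = {**defaults, **user, **overrides}

# The same result built with an empty dictionary and repeated update calls.
expected = {}
for d in (defaults, user, overrides):
    expected.update(d)

print(context == expected)  # True
```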

Dictionary unioning

Since Python 3.9, there’s finally an operator for combining two dictionaries.

The | operator will combine two dictionaries into a new dictionary:

context = defaults | user

The + and | operators were already in use on collections.Counter objects (see my article on counting things in Python) and | on Counter objects worked the same way it now does on all dictionaries.

So why use | instead of **? Two reasons: | is more readable (for new Pythonistas certainly) and it’s also more flexible.

When using the | operator between mappings (dictionary-like objects), the mappings you’re merging have control over what type is returned (usually they’ll maintain the same type).

For example, let’s say we have two defaultdict objects we’d like to merge:

>>> from collections import defaultdict
>>> flags1 = defaultdict(bool, {"purple": True, "blue": False})
>>> flags2 = defaultdict(bool, {"blue": True, "green": False})

Since defaultdict objects are dictionary-like, we can use ** to merge them into a new dictionary:

>>> {**flags1, **flags2}
{'purple': True, 'blue': True, 'green': False}

But note that the returned object is a dictionary: it’s of type dict, not collections.defaultdict.

Since Python 3.9, we can instead use the | operator to merge two defaultdict objects:

>>> flags1 | flags2
defaultdict(<class 'bool'>, {'purple': True, 'blue': True, 'green': False})

Unlike **, using the | operator between mappings will maintain the mapping type.
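The practical consequence shows up when we look up a missing key. This sketch assumes Python 3.9 or later, and the "red" key is an invented lookup used only to trigger the default.

```python
from collections import defaultdict

flags1 = defaultdict(bool, {"purple": True, "blue": False})
flags2 = defaultdict(bool, {"blue": True, "green": False})

unpacked = {**flags1, **flags2}  # plain dict
unioned = flags1 | flags2        # defaultdict (Python 3.9+)

print(type(unpacked).__name__)  # dict
print(type(unioned).__name__)   # defaultdict

# The plain dict has no default for a missing key.
try:
    unpacked["red"]
except KeyError:
    print("plain dict has no default for 'red'")

# The defaultdict still fills in missing keys with bool(), which is False.
print(unioned["red"])  # False
```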

Summary

There are a number of ways to combine multiple dictionaries, but there are few elegant ways to do this with just one line of code.

If you’re using Python 3.5 through 3.8, this is the most idiomatic way to merge two dictionaries:

context = {**defaults, **user}

If you’re using Python 3.9 or above, this is the most idiomatic way to merge two dictionaries:

context = defaults | user

Note: If you are particularly concerned with performance, I also measured the performance of these different dictionary merging methods.

If you’re interested in deep-merging this dictionary (merging a dictionary of dictionaries for example), check out this deep merging technique from Mahmoud Hashemi.

If you’re interested in learning more about the new features of * and ** in Python 3.5 and their history, you may want to read the article I wrote on asterisks in Python.

This post has been updated to note the | operator that Python 3.9 added.

Posted by Trey Hunner on Feb 23rd, 2016 at 10:00 am in favorite, python
