Python data class nested

Creating nested dataclass objects in Python

This request is about as complex as the dataclasses module itself, so probably the best way to get this "nested fields" capability is to define a new decorator, akin to @dataclass.

Fortunately, if you don’t need the signature of the __init__ method to reflect the fields and their defaults, as it does for the classes produced by calling dataclass, this can be a whole lot simpler: a class decorator that calls the original dataclass and wraps some functionality around its generated __init__ method can do it with a plain "...(*args, **kwargs):" style function.

In other words, all one needs to do is write a wrapper around the generated __init__ method that will inspect the parameters passed in "kwargs", check if any corresponds to a "dataclass field type", and if so, generate the nested object prior to calling the original __init__. Maybe this is harder to spell out in English than in Python:

    from dataclasses import dataclass, is_dataclass


    def nested_dataclass(*args, **kwargs):
        def wrapper(cls):
            cls = dataclass(cls, **kwargs)
            original_init = cls.__init__

            def __init__(self, *args, **kwargs):
                for name, value in kwargs.items():
                    field_type = cls.__annotations__.get(name, None)
                    # Build the nested dataclass when a dict was passed for it
                    if is_dataclass(field_type) and isinstance(value, dict):
                        new_obj = field_type(**value)
                        kwargs[name] = new_obj
                original_init(self, *args, **kwargs)

            cls.__init__ = __init__
            return cls

        return wrapper(args[0]) if args else wrapper

Note that besides not worrying about __init__ signature, this also ignores passing init=False — since it would be meaningless anyway.

(The if in the return line lets this work either when called with named parameters or when used directly as a decorator, like dataclass itself.)
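To make the two calling conventions concrete, here is a minimal sketch that repeats the decorator from above and applies it both bare and with a keyword argument. The class names `Inner`, `Outer` and `FrozenOuter` are made up for illustration.

```python
from dataclasses import dataclass, is_dataclass


def nested_dataclass(*args, **kwargs):
    def wrapper(cls):
        cls = dataclass(cls, **kwargs)
        original_init = cls.__init__

        def __init__(self, *args, **kwargs):
            for name, value in kwargs.items():
                field_type = cls.__annotations__.get(name, None)
                if is_dataclass(field_type) and isinstance(value, dict):
                    kwargs[name] = field_type(**value)
            original_init(self, *args, **kwargs)

        cls.__init__ = __init__
        return cls

    # Bare use passes the class positionally; keyword use passes nothing positionally.
    return wrapper(args[0]) if args else wrapper


@dataclass
class Inner:  # hypothetical example class
    value: int = 0


@nested_dataclass  # used directly as a decorator
class Outer:
    inner: Inner


@nested_dataclass(frozen=True)  # called with named parameters first
class FrozenOuter:
    inner: Inner


plain = Outer(inner={"value": 1})
frozen = FrozenOuter(inner={"value": 2})
print(plain)
print(frozen)
```

In both cases the dict passed for `inner` should arrive in the generated __init__ already converted to an `Inner` instance.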


And on the interactive prompt:

    In [85]: @dataclass
        ...: class A:
        ...:     b: int = 0
        ...:     c: str = ""
        ...:

    In [86]: @dataclass
        ...: class A:
        ...:     one: int = 0
        ...:     two: str = ""
        ...:

    In [87]: @nested_dataclass
        ...: class B:
        ...:     three: A
        ...:     four: str
        ...:

    In [88]: @nested_dataclass
        ...: class C:
        ...:     five: B
        ...:     six: str
        ...:

    In [89]: obj = C(five={"three": {"two": "narf"}, "four": "zort"}, six="fnord")

    In [90]: obj.five.three.two
    Out[90]: 'narf'

If you want the signature to be kept, I’d recommend using the private helper functions in the dataclasses module itself, to create a new __init__ .
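A lighter alternative, if the goal is only for introspection tools to report the field-based signature, is to wrap the new __init__ with `functools.wraps`. This is not the private-helper approach suggested above but a different, simpler technique: `wraps` sets `__wrapped__`, which `inspect.signature` follows. The sketch below is a simplified, bare-decorator-only variant, and `Inner` and `Outer` are hypothetical names.

```python
import functools
import inspect
from dataclasses import dataclass, is_dataclass


def nested_dataclass(cls):
    """Simplified variant: bare decorator only, no keyword arguments."""
    cls = dataclass(cls)
    original_init = cls.__init__

    # wraps() copies metadata and sets __wrapped__ to the generated __init__
    @functools.wraps(original_init)
    def __init__(self, *args, **kwargs):
        for name, value in kwargs.items():
            field_type = cls.__annotations__.get(name)
            if is_dataclass(field_type) and isinstance(value, dict):
                kwargs[name] = field_type(**value)
        original_init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@dataclass
class Inner:  # hypothetical example class
    value: int = 0


@nested_dataclass
class Outer:
    inner: Inner
    label: str = ""


# The reported parameters come from the generated __init__, not from *args/**kwargs.
params = list(inspect.signature(Outer.__init__).parameters)
obj = Outer(inner={"value": 3}, label="x")
print(params)
print(obj)
```

The function that actually runs still takes `*args, **kwargs`; only the reported signature changes.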

Solution 2

You can use __post_init__ for this:

    from dataclasses import dataclass


    @dataclass
    class One:
        f_one: int
        f_two: str


    @dataclass
    class Two:
        f_three: str
        f_four: One

        def __post_init__(self):
            # f_four arrives as a dict; replace it with a One instance
            self.f_four = One(**self.f_four)


    data = {'f_three': 'three', 'f_four': {'f_one': 1, 'f_two': 'two'}}

    print(Two(**data))
    # Two(f_three='three', f_four=One(f_one=1, f_two='two'))
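One caveat: this __post_init__ assumes f_four is always a dict, so passing an actual One instance would fail when it is unpacked with `**`. A minimal sketch of a guarded variant, using the same class names, might look like this:

```python
from dataclasses import dataclass


@dataclass
class One:
    f_one: int
    f_two: str


@dataclass
class Two:
    f_three: str
    f_four: One

    def __post_init__(self):
        # Convert only when a plain dict was passed; leave One instances alone.
        if isinstance(self.f_four, dict):
            self.f_four = One(**self.f_four)


from_dict = Two(f_three="three", f_four={"f_one": 1, "f_two": "two"})
from_obj = Two(f_three="three", f_four=One(f_one=1, f_two="two"))
print(from_dict == from_obj)
```

With the guard in place, both construction styles should produce equal objects.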

Solution 3

You can try the dacite module. This package simplifies the creation of data classes from dictionaries, and it also supports nested structures.

    from dataclasses import dataclass
    from dacite import from_dict


    @dataclass
    class A:
        x: str
        y: int


    @dataclass
    class B:
        a: A


    data = {
        'a': {
            'x': 'test',
            'y': 1,
        }
    }

    result = from_dict(data_class=B, data=data)

    assert result == B(a=A(x='test', y=1))

To install dacite, simply use pip:

    pip install dacite

Solution 4

Instead of writing a new decorator I came up with a function modifying all fields of type dataclass after the actual dataclass is initialized.

    import dataclasses


    def dicts_to_dataclasses(instance):
        """Convert all fields of type `dataclass` into an instance of the
        specified data class if the current value is of type dict."""
        cls = type(instance)
        for f in dataclasses.fields(cls):
            # Skip fields whose declared type is not a dataclass
            if not dataclasses.is_dataclass(f.type):
                continue

            value = getattr(instance, f.name)
            if not isinstance(value, dict):
                continue

            new_value = f.type(**value)
            setattr(instance, f.name, new_value)

The function could be called manually or in __post_init__ . This way the @dataclass decorator can be used in all its glory.

The example from above with a call to __post_init__ :

    from dataclasses import dataclass


    @dataclass
    class One:
        f_one: int
        f_two: str


    @dataclass
    class Two:
        def __post_init__(self):
            dicts_to_dataclasses(self)

        f_three: str
        f_four: One


    data = {'f_three': 'three', 'f_four': {'f_one': 1, 'f_two': 'two'}}

    two = Two(**data)
    # Two(f_three='three', f_four=One(f_one=1, f_two='two'))
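For completeness, the manual call might look like the sketch below. It repeats a condensed version of the function so the block runs on its own. It assumes the field annotations are real classes: with `from __future__ import annotations`, `f.type` is a string and the `is_dataclass` check would skip the field.

```python
import dataclasses
from dataclasses import dataclass


def dicts_to_dataclasses(instance):
    """Condensed copy of the function above, repeated so this block is runnable."""
    for f in dataclasses.fields(instance):
        if not dataclasses.is_dataclass(f.type):
            continue
        value = getattr(instance, f.name)
        if isinstance(value, dict):
            setattr(instance, f.name, f.type(**value))


@dataclass
class One:
    f_one: int
    f_two: str


@dataclass
class Two:  # note: no __post_init__ here
    f_three: str
    f_four: One


two = Two(f_three="three", f_four={"f_one": 1, "f_two": "two"})
was_dict = isinstance(two.f_four, dict)  # still a plain dict at this point

dicts_to_dataclasses(two)  # manual conversion step
print(two)
```

Because the conversion uses setattr, this sketch would not work on a class declared with frozen=True.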

Solution 5

I have created an augmentation of the solution by @jsbueno that also accepts typing in the form List[] .

    from dataclasses import dataclass, is_dataclass
    from typing import List


    def nested_dataclass(*args, **kwargs):
        def wrapper(cls):
            cls = dataclass(cls, **kwargs)
            original_init = cls.__init__

            def __init__(self, *args, **kwargs):
                for name, value in kwargs.items():
                    field_type = cls.__annotations__.get(name, None)
                    if isinstance(value, list):
                        # getattr guards against annotations that have no
                        # __origin__ / __args__ (e.g. a bare list or a missing one)
                        origin = getattr(field_type, "__origin__", None)
                        item_types = getattr(field_type, "__args__", ())
                        if origin in (list, List) and item_types:
                            sub_type = item_types[0]
                            if is_dataclass(sub_type):
                                # Convert dict items; keep any other items unchanged
                                kwargs[name] = [
                                    sub_type(**child) if isinstance(child, dict) else child
                                    for child in value
                                ]
                    if is_dataclass(field_type) and isinstance(value, dict):
                        new_obj = field_type(**value)
                        kwargs[name] = new_obj
                original_init(self, *args, **kwargs)

            cls.__init__ = __init__
            return cls

        return wrapper(args[0]) if args else wrapper
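The `__origin__` and `__args__` attributes used above are implementation details of the typing module. On Python 3.8+ the same check can be sketched with the public helpers `typing.get_origin` and `typing.get_args`. The helper name `dataclass_item_type` and the class `Item` are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass, is_dataclass
from typing import List, Optional, get_args, get_origin


def dataclass_item_type(annotation) -> Optional[type]:
    """Return the element type of a list annotation if it is a dataclass, else None."""
    if get_origin(annotation) is not list:
        return None
    args = get_args(annotation)
    if len(args) == 1 and is_dataclass(args[0]):
        return args[0]
    return None


@dataclass
class Item:  # hypothetical example class
    name: str


print(dataclass_item_type(List[Item]))  # the Item class
print(dataclass_item_type(List[int]))   # None: element type is not a dataclass
print(dataclass_item_type(Item))        # None: not a list annotation
```

A decorator like the one above could call such a helper instead of reading `__origin__` directly, which also sidesteps the case where the annotation is missing.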


Python nested dataclasses ... is this valid?

I’m using dataclasses to create a nested data structure, that I use to represent a complex test output. Previously I’d been creating a hierarchy by creating multiple top-level dataclasses and then using composition:

    from dataclasses import dataclass


    @dataclass
    class Meta:
        color: str
        size: float


    @dataclass
    class Point:
        x: float
        y: float
        stuff: Meta


    point1 = Point(x=5, y=-5, stuff=Meta(color='blue', size=20))

Problem

I was wondering if there was a way of defining the classes in a self-contained way, rather than polluting my top level with a bunch of lower-level classes. So above, the definition of the Point dataclass would contain the definition of Meta, rather than the definition being at the top level.

Solution?

I wondered if it’s possible to use inner (dataclass) classes with a dataclass and have things all work. So I tried this:

    from dataclasses import dataclass
    from typing import get_type_hints


    @dataclass
    class Point:
        @dataclass
        class Meta:
            color: str
            size: float

        @dataclass
        class Misc:
            elemA: bool
            elemB: int

        x: float
        y: float
        meta: Meta
        misc: Misc


    point1 = Point(x=1, y=2,
                   meta=Point.Meta(color='red', size=5.5),
                   misc=Point.Misc(elemA=True, elemB=-100))

    print("This is the point:", point1)
    print(point1.x)
    print(point1.y)
    print(point1.meta)
    print(point1.misc)
    print(point1.meta.color)
    print(point1.misc.elemB)

    # Assigning to a member of a nested element
    point1.misc.elemB = 99
    print(point1)
    print(point1.misc.elemB)

This all seems to work — the print outputs all work correctly, and the assignment to a (sub) member element works as well. You can even support defaults for nested elements:

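    from dataclasses import dataclass, field


    @dataclass
    class Point:
        @dataclass
        class Meta:
            color: str = 'red'
            size: float = 10.0

        x: float
        y: float
        # A plain `meta: Meta = Meta()` is rejected on Python 3.11+ (unhashable
        # default) and would share one Meta between all Points on older versions.
        meta: Meta = field(default_factory=Meta)


    pt2 = Point(x=10, y=20)
    print('pt2', pt2)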
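One way to check what the dataclass machinery actually sees is to list the fields of the outer class. The sketch below assumes the nested default is supplied through `default_factory`. It illustrates that the inner class is an ordinary class attribute with no annotation, so it is not treated as a field.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Point:
    @dataclass
    class Meta:
        color: str = "red"
        size: float = 10.0

    x: float
    y: float
    meta: Meta = field(default_factory=Meta)


# Only annotated names become fields; the nested class Meta itself does not.
field_names = [f.name for f in fields(Point)]
print(field_names)

# default_factory builds a fresh Meta per instance, so defaults are not shared.
a = Point(x=1, y=2)
b = Point(x=3, y=4)
a.meta.color = "blue"
print(a.meta.color, b.meta.color)
```

This only shows how the current dataclasses module treats the construct; it does not settle whether nesting is good style.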

Question

Is this a correct way to implement nested dataclasses? (Meaning: is it just luck that it works now, and it would likely break in future? ... or is it just fugly and Not How You Do Things? ... or is it just Bad?)

It’s certainly a lot cleaner and a million times easier to understand and support than a gazillion top-level ‘mini’ dataclasses being composed together.

It’s also a lot easier than trying to use marshmallow or jury-rigging a JSON-schema-to-class-structure model.

It is also very simple (which I like).
