V1: What's New? - rnag/dataclass-wizard GitHub Wiki
The upcoming major release, V1, is a complete, ground-up rewrite of the Dataclass Wizard's core logic. This release addresses critical inefficiencies and simplifies the library’s usage while making the documentation easier to follow and understand.
The overarching goal of V1 is to optimize how the library processes data, enhance performance, and provide a more intuitive and "magical" experience for users.
EnvWizard Improvements in V1
EnvWizard benefits significantly from the v1 engine rewrite.
Notable improvements include:
- Explicit and configurable environment precedence
- Native support for nested EnvWizard dataclasses
- Field-level environment configuration via Env(...) and Alias(env=...)
- Improved error diagnostics for missing or invalid environment values
- Cached dotenv and secrets resolution for better performance
See EnvWizard v1 — Quickstart (Opt-In) for a hands-on introduction.
Breaking Changes in v1.0
1. Default Key Transformation Update
- Breaking Change: In v1.0, the default key transformation for JSON serialization now keeps keys as-is, rather than converting them to camelCase.
- New Default: key_transform='NONE'.
How to Prepare:
You can enforce this behavior using the JSONPyWizard helper:
from dataclasses import dataclass
from dataclass_wizard import JSONPyWizard

@dataclass
class MyModel(JSONPyWizard):
    my_field: str

print(MyModel(my_field="value").to_dict())
# Output: {'my_field': 'value'}
2. Default __str__() Behavior Change
- Breaking Change: The __str__() method no longer prettifies JSON with camelCase. It now uses the pprint module for better readability.
How to Prepare:
Test this using JSONPyWizard:
from dataclasses import dataclass
from dataclass_wizard import JSONWizard, JSONPyWizard

@dataclass
class CurrentModel(JSONWizard):
    my_field: str

@dataclass
class NewModel(JSONPyWizard):
    my_field: str

print(CurrentModel(my_field="value"))
# Output:
# {
#   "myField": "value"
# }

print(NewModel(my_field="value"))
# Output: NewModel(my_field='value')
3. Float to Int Conversion Change
- Breaking Change: In v1.0, floats with fractional parts (e.g., 123.4) will raise an error instead of being silently converted to integers. Floats without fractional parts (e.g., 3.0) will still convert to integers.
How to Prepare:
Make sure your fields use float annotations where appropriate:
from dataclasses import dataclass
from dataclass_wizard import JSONPyWizard

@dataclass
class Test(JSONPyWizard):
    class _(JSONPyWizard.Meta):
        v1 = True

    list_of_int: list[int]

input_dict = {'list_of_int': [1, '2.0', '3.', -4, '-5.00', '6', '-7']}
t = Test.from_dict(input_dict)
print(t)  # Output: Test(list_of_int=[1, 2, 3, -4, -5, 6, -7])

# ERROR!
_ = Test.from_dict({'list_of_int': [123.4]})
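The conversion rule described above can be sketched without the library. This is a minimal illustration only: the function name to_int_strict is hypothetical, and raising ValueError is an assumption (the library may raise its own error type).

```python
def to_int_strict(value):
    """Convert value to int, rejecting anything with a fractional part.

    Hypothetical helper: mirrors the rule described above, not the
    library's actual implementation.
    """
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            # Plain integer strings such as '6' or '-7'.
            return int(value)
        except ValueError:
            # Strings such as '2.0', '3.' or '-5.00' go through float first.
            value = float(value)
    if isinstance(value, float):
        if not value.is_integer():
            # Assumption: ValueError stands in for the library's error type.
            raise ValueError(f"cannot convert {value!r} to int without losing data")
        return int(value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


values = [1, '2.0', '3.', -4, '-5.00', '6', '-7']
converted = [to_int_strict(v) for v in values]

try:
    to_int_strict(123.4)
    rejected = False
except ValueError:
    rejected = True
```

Here converted holds whole-number results for every input in the list, while 123.4 is rejected because float.is_integer() returns False for it.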
4. String coercion (None handling)
In v1, fields annotated as str no longer silently coerce None to the empty string by default.
- str: None → "None"
- Optional[str]: None preserved
To restore the old behavior:
class _(Meta):
    v1 = True
    v1_coerce_none_to_empty_str = True
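As a rough, library-free sketch of the three outcomes described in this section: the function load_str and its parameters are hypothetical, and the assumption that an Optional[str] field keeps None even when the flag is enabled is mine, not something the text above confirms.

```python
def load_str(value, *, optional=False, coerce_none_to_empty_str=False):
    """Illustrate how None might be handled for str-like fields.

    Hypothetical helper; not the library's implementation.
    """
    if value is None:
        if optional:
            # Optional[str]: None is preserved (assumed to take priority).
            return None
        if coerce_none_to_empty_str:
            # Old behavior, restored via the opt-in flag.
            return ''
    # v1 default for plain str: the value is passed through str().
    return str(value)


default_result = load_str(None)
optional_result = load_str(None, optional=True)
legacy_result = load_str(None, coerce_none_to_empty_str=True)
```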
Current Issues in Dataclass Wizard
There are several inefficiencies and flaws in how the current version of Dataclass Wizard works, particularly in deserialization:
- Deserialization Process
  Currently, deserialization iterates over the keys of the JSON object (o) rather than the fields of the target dataclass. This approach is flawed because:
  - Key Count Mismatch: The number of keys in a JSON object is often greater than or equal to the number of fields defined in the dataclass. Iterating over the dataclass fields would be inherently more efficient since the expected keys are predefined.
  - Missed Optimizations: By knowing the dataclass fields upfront, several optimizations can be implemented during class instantiation.
- Serialization
  While the serialization process in the current version works, there are areas for improvement. Updates for this will be revealed as development progresses.
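The difference between the two loops can be sketched with the standard library alone. The User class and both loader functions below are hypothetical illustrations, not the library's actual code.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    id: int
    name: str


def load_by_keys(cls, o):
    """Loop over the JSON keys: every key is checked, including unknown ones."""
    known = {f.name for f in fields(cls)}
    kwargs = {}
    for key, value in o.items():
        if key in known:
            kwargs[key] = value
    return cls(**kwargs)


def load_by_fields(cls, o):
    """Loop over the dataclass fields: work is bounded by the field count."""
    # A missing key raises KeyError here; defaults are ignored for brevity.
    return cls(**{f.name: o[f.name] for f in fields(cls)})


# Two of the four keys are irrelevant to User.
payload = {'id': 1, 'name': 'Ann', 'extra_1': None, 'extra_2': None}

by_keys = load_by_keys(User, payload)
by_fields = load_by_fields(User, payload)
```

Both functions build the same object, but load_by_keys visits all four keys while load_by_fields touches only the two that User declares.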
Improvements in V1
The V1 rewrite introduces numerous optimizations and architectural changes to address these issues:
- Iterating Over Dataclass Fields
  Instead of looping over JSON keys, V1 now iterates directly over the fields defined in the dataclass. This change provides several advantages:
  - Reduced Operations: Since the dataclass fields are predefined, unnecessary runtime checks and iterations are eliminated.
  - Eliminates Mapping Overhead: No need for runtime dictionaries to map fields to parsers. All mappings are determined at class generation time, reducing memory and execution overhead.
- Code Generation for Parsing
  The new version dynamically generates parsing functions for each field based on its type annotations. This eliminates the need for multiple @dataclass-decorated classes to extend generic parser classes. The result is cleaner, leaner, and faster code.
- Positional Argument Support
  With V1, the Dataclass Wizard shifts gears to pass positional arguments (when possible) to a dataclass’s __init__ method during instantiation. This approach significantly speeds up class creation and improves the overall performance of the from_dict function.
- Memory and Execution Efficiency
  By leveraging precomputed field mappings and type-specific parsers, V1 reduces runtime overhead and speeds up deserialization end to end. This makes the library faster, more lightweight, and scalable for larger datasets.
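To make the code-generation and positional-argument ideas concrete, here is a minimal sketch using only the standard library. The names make_from_dict and PARSERS are hypothetical, and the generated function is far simpler than anything the library would produce; it assumes every field is a plain int, float, or str and is present in the input.

```python
from dataclasses import dataclass, fields

# Hypothetical per-type parsers; a real implementation would cover many more types.
PARSERS = {int: int, float: float, str: str}


def make_from_dict(cls):
    """Generate a from_dict function specialised for cls (illustrative only)."""
    namespace = {'cls': cls}
    args = []
    for i, f in enumerate(fields(cls)):
        # Bind each field's parser once, at generation time.
        parser_name = f'_parse_{i}'
        namespace[parser_name] = PARSERS[f.type]
        args.append(f"{parser_name}(o[{f.name!r}])")
    # Arguments are emitted in field order, so they can be passed positionally.
    source = f"def from_dict(o):\n    return cls({', '.join(args)})\n"
    exec(source, namespace)
    return namespace['from_dict']


@dataclass
class Point:
    x: int
    y: float
    label: str


point_from_dict = make_from_dict(Point)
point = point_from_dict({'x': '1', 'y': '2.5', 'label': 'a'})
```

The field-to-parser mapping is resolved once in make_from_dict, so the generated function does no per-call lookups beyond reading the input dictionary. Keyword-only fields would need keyword arguments instead, which is why positional passing applies only when possible.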
Long-Term Goals
While V1 currently focuses on improving deserialization, work is underway to bring similar optimizations to the serialization process. The goal is to maintain the same level of simplicity and intuitiveness for users while optimizing performance.
Why V1 Is a Game-Changer
The existing version of Dataclass Wizard has several inefficiencies that make usage less optimal and can lead to slower execution times. With V1, these issues are resolved, and the library is reimagined for better performance, usability, and clarity. The new release simplifies the de/serialization process while preserving the flexibility and ease of use that Dataclass Wizard is known for.
In short, V1 represents a significant leap forward in both performance and design, setting a strong foundation for future development.
Performance Improvements
The upcoming major release, v1, will deliver significantly improved performance in both serialization and deserialization.
Based on personal tests and benchmarks, the v1 configuration makes Dataclass Wizard approximately 2x faster than pydantic! It also positions Dataclass Wizard alongside other high-performance serialization libraries in Python.
The minor release v0.33.0 offers v1 as an opt-in (optimizing deserialization only at this point).
Tip: Not all features are fully supported just yet -- see Progress Overview for completed work so far.