-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
generics support #163
base: main
Are you sure you want to change the base?
generics support #163
Conversation
cdf40bd
to
18e5b55
Compare
marshmallow_recipe/serialization.py
Outdated
@@ -36,7 +36,9 @@ def __call__( | |||
|
|||
|
|||
class DumpFunction(Protocol): | |||
def __call__(self, data: Any, *, naming_case: NamingCase | None = None) -> dict[str, Any]: ... | |||
def __call__( | |||
self, data: Any, *, naming_case: NamingCase | None = None, t: type | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the point of passing t here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added commit
that function c396fa8#diff-7f5a7271c4eef44b2f5fda6334ecae0952550de169520dde4a919d86949361c2R655
and that test c396fa8#diff-7f5a7271c4eef44b2f5fda6334ecae0952550de169520dde4a919d86949361c2R655
should give an answer
the issue is that type(instance)
returns unsubscripted (open) generic but we can get subscripted (closed) generic from instance.__origin_class__
but it does not for frozen=True
or slots=True
dataclasses that is why we need to specify type explicitly
I will also add more details in PR description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR description updated
marshmallow_recipe/generics.py
Outdated
|
||
origin = get_origin(t) | ||
if origin is Union or origin is types.UnionType: | ||
return Union[*(build_subscripted_type(x, type_var_map) for x in get_args(t))] # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pyright highlights *
as unsupported for Python 3.10 in this case of usage but as I see there are no Python 3.10 in tests matrix.
Should I support this behaviour for Python 3.10 or can I increase Python version for Pyright up to 3.11?
The same thing for Annotated
type on the next line
Also Pyright requesting for update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated this branch and removed # type: ignore
for this case
also I noticed that Pyright does not notify about redundant # type: ignore
as mypy do
395fd0b
to
68aff75
Compare
@@ -72,81 +80,81 @@ def bake_schema( | |||
if result := _schema_types.get(key): | |||
return result | |||
|
|||
fields_with_metadata = [ | |||
( | |||
fields_type_map = get_fields_type_map(cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we pass cls
here and not origin
? If there is why do we use origin
on line 95 and why we even collecting field names one more time and not using the map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we pass cls here and not origin?
in case with genetic type origin
will be unsubscripted (open) generic like Xxx[T]
without args and we can not use it to calculate actual field type and cls
will be subscripted (closed) generic like Xxx[int]
which has int
as arg to replace with it all TypeVars used in fields types.
If there is why do we use origin on line 95
subscripted generic of unsubscripted dataclass is not a dataclass. we can get fields only from origin - from unsubscripted declared type. please check PR description. There you can find explanation with examples.
why we even collecting field names one more time and not using the map?
that map is only about fields types with access by field name. Here we collect fields to get field descriptor and field metadata to build schema
Generics support
This PR contains implementation for generics support for both
dump
andload
methods. I have supported all cases which I met and I could imagine. But may be I missed something. Hope no.Key points:
dump
anddump_many
were improved to extract actual runtime type for serialization with_extract_type
functionget_fields_type_map
returns dict with actual runtime type for each field accessible by field's name.bake_schema
enumerates all dataclass fields and gets it's type according dict returned byget_fields_type_map
get_field_for
mostly was not change because it receives already prepared type to build schema fieldI went deeper into Python generics first time and it was not obviously how it works. All issues I met with explanation how I solved them are listed below in Issues section.
All provided code snippets can be executed in Python playground to check how it works:
Generics in Python
Let we have generic dataclass
class Xxx(Generic[T1, T2])
then:Xxx[int, str]
- subscripted type (closed generic)(int, str)
- tuple with args ofXxx[int, str]
Xxx
- unsubscripted type (open generic), aka origin(T1, T2)
- tuple with TypeVar items which are params ofXxx
Issues
Extracting generic instance type
This is what we should take into account:
type(...)
function always returns unsubscripted type__orig_class__
returns subscripted type__orig_class__
does not exist for dataclasses withfrozen=True
orslots=True
__orig_class__
does not exist for non generic dataclassesIt means in some cases we can not extract subscripted generic type from instance and it should be passed explicitly to dump function. All this stuff implemented in
_extract_type
and tested bytest_generic_extract_type_on_dump
Extracting generic dataclass field type
Subscripted type
Xxx[int, str]
is not a dataclass. Only unsubscriptedXxx
is dataclass. Fields ofXxx
are also have unsubscripted types. Here are steps how to calculate actual fields types:Xxx[int, str]
Xxx
fromXxx[int, str]
Xxx
Class field type with nested generic
Class field type can be more complicated than just
TypeVar
:types.UnionType
Union
Annotated
types.GenericAlias
_GenericAlias
Subscripted type can be built recursively using the same
type_var_map
from snippet above. It is implemented inbuild_subscripted_type
and types recognition can be tested using the code snippet below.Inheritance with generics
There are three connected issues to solve
Parents or parents of parents can be also generic
To solve this issue we can recursively iterate all parents via
__orig_bases__
and applybuild_subscripted_type
for each parent class withtype_var_map
of child classClass in hierarchy can use same TypeVar as generic param
v1
isint
andv2
isstr
but for bothfield.type
isT
So we can not have single
type_var_map
we should have separate maps for each class in hierarchy. All this stuff is implemented inget_class_type_var_map
. Then we just need to get field's owner class to find itstype_ver_map
and build subscripted field type. BUT... Here is another issue. Check it below.Dataclass fields have no owner class reference
It means we need to build a map from field to its owner class. But we need take into account these things:
dataclasses.fields(...)
returns all dataclass fields including all fields from all parentsTo build
fields_class_map
can be used the same approach as indataclasses
module using enumeration of reversed__mro__
. Field name can be used as a key as it is unique. This stuff is implemented inget_fields_class_map
function. The snippet below explains how it works and can be used to play with it.That's it
It works!