YTEP-0038: Type Annotations #17
This is great! One general question -- at what point should PRs that introduce new methods be type hinted? At some point it would make sense to encourage (and at some later point require?) PR authors to use type hinting so that those methods don't have to be subsequently converted, but I'm not sure what point that is.
That's a good question! I think ideally this would not happen for some time, but eventually we would encourage it or even mandate it if it's user-facing.
Disclaimer: Obviously my opinion is biased because I have a couple PRs pending already that have type hints, but I think that we should start encouraging them or at least letting them in ASAP. The reason for this is that it takes a lot of time to add them as a standalone task, but it's more or less marginal to add types to functions when you're already working on them. In particular, I've found that when I'm working on a fix for a part of the code that I'm not very familiar with, part of my process involves understanding what types fit e.g. a function signature. At this point, I'm spending time thinking about typing no matter what, but this time is wasted in the long run if I can't add the hints to the main branch.
Thanks for doing this! I just skimmed through it and here are a handful of questions:
from typing import Tuple
FieldType = Tuple[str, str]
StrTuple = Tuple[str, str]
def get_data(field: FieldType):  # type conveys what the object represents
    ...
def get_data(field: StrTuple):  # type reflects how the object is internally stored
    ...
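One way to look at the two spellings above: plain aliases are interchangeable for a type checker, so the alias name only informs human readers. If the goal is for a checker to tell a field apart from an arbitrary pair of strings, typing.NewType is one option. This is a minimal sketch; FieldKey is a hypothetical name used for illustration, not something that exists in yt.

```python
from typing import NewType, Tuple

FieldType = Tuple[str, str]
StrTuple = Tuple[str, str]

# Plain aliases compare equal: a type checker treats them as the same type.
aliases_are_equal = FieldType == StrTuple

# Hypothetical distinct type: a checker would reject a bare tuple
# where a FieldKey is expected, even though both are tuples at runtime.
FieldKey = NewType("FieldKey", Tuple[str, str])


def get_data(field: FieldKey) -> str:
    ftype, fname = field  # still an ordinary tuple at runtime
    return f"{ftype}:{fname}"


key = FieldKey(("gas", "density"))
label = get_data(key)
```

The trade-off is that callers must wrap their tuples explicitly, which may or may not be acceptable for user-facing functions.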
from typing import List, Tuple
def doSomethingComplex(ds: Dataset) -> List[str]:  # Dataset is yt's dataset class
    def _comment_field(field):  # nice and compact, and obvious from reading it
        ...
    # or
    def _comment_field(field: Tuple[str, str]) -> str:  # explicit but tedious to write
        if field[0] in ds.particle_types:
            return "Great, I love particles"
        else:
            return "Blimey, where are all my particles?"
    comments = []
    # or
    comments: List[str] = []
    for field in ds.derived_field_list:
        comments.append(_comment_field(field))
    return comments
Tentative list of rules for type hinting or the Zen of Typing Yt
# yt/some_objects.py
class SlicePlot:
    ...
class ProjectionPlot:
    ...
# yt/typing.py
from yt.some_objects import SlicePlot, ProjectionPlot
from typing import List, TypeVar, Union
T = TypeVar("T")
Plot = Union[SlicePlot, ProjectionPlot]
ListOrSingle = Union[T, List[T]]
PlotsOrPlot = ListOrSingle[Plot]
# yt/another_file.py
from typing import List, Optional, Union
from yt.some_objects import SlicePlot, ProjectionPlot
from yt.typing import ListOrSingle, PlotsOrPlot, Plot
# This is fine
def returnSomething() -> SlicePlot:
    ...
# The return type is overly complex. The Union[SlicePlot, ProjectionPlot] should be defined as a meaningful
# type in `yt/typing.py`, e.g. imported from there
def returnComplexThing() -> Optional[Union[SlicePlot, ProjectionPlot, List[SlicePlot], List[ProjectionPlot]]]:
    ...
def returnComplexThing() -> PlotsOrPlot:  # ok I guess? Though we have to inspect yt/typing.py to figure out what PlotsOrPlot is
    ...
def returnComplexThing() -> Union[Plot, List[Plot]]:  # not ok if ListOrSingle is defined
    ...
def returnComplexThing() -> ListOrSingle[Plot]:  # best?
    ...
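To make the ListOrSingle idea concrete, here is a minimal runnable sketch. The two plot classes are empty stand-ins for the real yt objects, and as_list is a hypothetical helper (not a yt function) showing how a function that accepts ListOrSingle[T] might normalize its input.

```python
from typing import List, TypeVar, Union


class SlicePlot:  # empty stand-in for the real class
    ...


class ProjectionPlot:  # empty stand-in for the real class
    ...


T = TypeVar("T")
Plot = Union[SlicePlot, ProjectionPlot]
ListOrSingle = Union[T, List[T]]  # generic alias, parametrized where it is used
PlotsOrPlot = ListOrSingle[Plot]  # intended to mean Union[Plot, List[Plot]]


def as_list(plots: ListOrSingle[T]) -> List[T]:
    """Hypothetical helper: wrap a single object in a list, pass lists through."""
    return plots if isinstance(plots, list) else [plots]


single = as_list(SlicePlot())
several = as_list([SlicePlot(), ProjectionPlot()])
```

A generic alias like this keeps signatures short while leaving the element type visible at the call site, which is the argument for the "best?" variant above.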
This is a great point. How about in an update, I change it to say that we're immediately going to open up to adding type hinting?
Thanks again for bootstrapping this. I want to flesh this out a little but I can't do it for the time being, so here are some comments in the meantime. If you guys want to reuse some of the material as part of the YTEP I'm fine with it, otherwise I'll do it myself soon !
However, there are some features of type-hinting that would be extremely
useful to yt that only started appearing in Python 3.7. Part of this
As Corentin said, it actually started with Python 3.6. The builtin typing and types (as well as collections from Python 3.9 on) modules have been getting more and more support each version, and it's true that dropping Python 3.6 makes the task a little less difficult, but it's not a requirement to get started.
For completeness, the specific improvement in Python 3.7 over 3.6 is that evaluation of type hints can be postponed instead of immediate, by opting in with from __future__ import annotations (PEP 563). Without that import the following snippet is invalid in both versions; with it, the snippet is valid in Python 3.7 and above
from __future__ import annotations  # opt-in, Python 3.7+
def baconify(spam) -> Bacon:  # NameError without the import above: Bacon isn't defined yet !
    return Bacon(spam)
class Bacon:
    def __init__(self, spam):
        ...
Note that in order to make this Python 3.6 compliant, it is sufficient to quote the problematic annotation as
def baconify(spam) -> "Bacon":
The main downside is that it creates unnecessary churn to write Python 3.6 compliant annotations if we're going to drop it immediately after, which is why those two problems should indeed be discussed together.
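As a small illustration of the quoted form, the sketch below shows that a quoted annotation is stored as a plain string when the function is defined and only resolved on demand, here with typing.get_type_hints. The attribute assignment in Bacon.__init__ is added purely so the example does something.

```python
import typing


def baconify(spam) -> "Bacon":  # quoted: not evaluated at definition time
    return Bacon(spam)


class Bacon:
    def __init__(self, spam):
        self.spam = spam


# The raw annotation is still the string that was written in the signature.
raw = baconify.__annotations__["return"]

# get_type_hints resolves the string against the module namespace,
# which by now contains the Bacon class.
resolved = typing.get_type_hints(baconify)["return"]
```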
* Identify any type hint `Union` types that need to be made (to account for,
for instance, supplying either a `unyt_quantity` or a tuple of (value,
unit))
Alas, typing.Union isn't the magical beast I once thought it was, and there's much more to typing than defining well thought instances of this. For instance typing.TypeVar is slightly different and close to templating in C++, and one needs to know both and think about which one applies in a given situation.
For instance, let's say we have the following function, whose design exploits duck typing
def sum_strings(strings, joiner):
    return joiner.join(strings)
(For the sake of the argument, I'm going to willingly ignore the fact that typing.AnyStr exists)
What input types are relevant here ? str and bytes work, so we might simply express this as a typing.Union
from typing import Sequence, Union
MyStrT = Union[str, bytes]
def sum_strings(strings: Sequence[MyStrT], joiner: MyStrT) -> MyStrT:
    ...
The problem is that then it's not clear what the output type will be in case the user code mixes allowed types as
sum_strings(["spam", "bacon"], b"eggs")
but even more importantly... it's not clear from the signature that this call won't even work !
The solution here is to use typing.TypeVar instead:
from typing import Sequence, TypeVar
MyStrT = TypeVar("MyStrT", str, bytes)
# MyStrT can be str or bytes (and nothing else), but two MyStrT objects defined in the same signature can't have mixed types
def sum_strings(strings: Sequence[MyStrT], joiner: MyStrT) -> MyStrT:
    ...
Now a type checker will be able to tell that my user code is violating the signature.
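For reference, here is a runnable version of the TypeVar variant with the body filled in from the untyped function above. It is only a sketch of the behaviour being discussed: the mixed call is the one a type checker is expected to flag, and it also happens to fail at runtime.

```python
from typing import Sequence, TypeVar

# Constrained TypeVar: either str or bytes, consistently within one call.
MyStrT = TypeVar("MyStrT", str, bytes)


def sum_strings(strings: Sequence[MyStrT], joiner: MyStrT) -> MyStrT:
    return joiner.join(strings)


text = sum_strings(["spam", "bacon"], ", ")
raw = sum_strings([b"spam", b"bacon"], b", ")

# Mixing str and bytes violates the signature; bytes.join also raises TypeError.
try:
    sum_strings(["spam", "bacon"], b"eggs")  # type: ignore
    mixed_ok = True
except TypeError:
    mixed_ok = False
```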
Let's take another example where Union isn't the right tool
def usum(it):
    ret = it[0]
    for elem in it:
        ret += elem
    ret -= it[0]
    return ret
Now if we want to add type hints to this signature, the easy way out is
from typing import Any, Sequence
def usum(it: Sequence[Any]) -> Any:
    ...
But that's not exactly relevant is it ? The return type will be the same as the element type in the input sequence, and the signature should reflect that. This can't be expressed by typing.Union, but it can be expressed with typing.TypeVar
from typing import Sequence, TypeVar
T = TypeVar("T")  # this can be anything, but two "T"s in a single signature _have_ to be the same type
def usum(it: Sequence[T]) -> T:
    ...
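Putting the body and the TypeVar signature together gives the runnable sketch below. One caveat, stated as an assumption rather than a tested fact about any particular checker: because T is unconstrained, a checker such as mypy is likely to complain about += and -= inside the body, so the sketch only illustrates how the signature ties the return type to the element type.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def usum(it: Sequence[T]) -> T:
    ret = it[0]
    for elem in it:
        ret += elem
    ret -= it[0]  # remove the extra copy of the first element
    return ret


total_int = usum([1, 2, 3])      # element type int, so the result is an int
total_float = usum([1.5, 2.5])   # element type float, so the result is a float
```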
* Type-hint constructors for all top-level objects (`SlicePlot`, `Sphere`, etc)
* Type-hint all visualization function calls (`p.set_width`)
* Type-hint public methods on top-level objects (`.profile`)
* Utilize annotators for internal (underscore) methods.
Unfortunately, the feasibility of this last part remains to be demonstrated afaic.
source/YTEPs/YTEP-0038.rst
Outdated
Alternatives
------------

None that I can identify.
There's one: do nothing and spend our dev time elsewhere 🙈
But as I mentioned in a comment already, doing this while we're working on the code for different reasons adds a marginal cost, at least for people who want to do it.
I'll go ahead and update the ytep with all of these comments. Thank you!
I want to answer some of the points in @cphyc 's comment.
Good news : I have faith that this will eventually be automated with pyupgrade, which we're already using as part of our pre-commit tool belt. This is a few years down the road but that's what the tool is for, and I trust its developer (who's also the person behind pre-commit and pre-commit.ci) will have our back by then.
I'm sorry Corentin, there's no easy way to say this but your example that follows this quote is simply wrong. (wizardry takes time :/). Specifically
We might need to rely on
About the name of the typing module, I think there are only two serious contenders if we want to fit the ecosystem:
I think we should start with a
Otherwise, I love the "Zen" you wrote, and I support each of the rules you're proposing. Thanks a lot !
A question that should be addressed in the YTEP is when we should start adding typechecking to our CI. mypy can be used through pre-commit, which means the required additions to our infrastructure are minimal, but I'm not sure about performance (specifically, I don't know whether the hook runs on every file or just staged ones).
One thing we haven't done much of, but that we could, is use Data classes
or something like them where we've used, for instance, field tuples. (Only
if this would have no user-facing changes/breakages, which I am almost
convinced we could do.)
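One possible shape for this is sketched below with typing.NamedTuple rather than a dataclass, because a NamedTuple is still a tuple and so existing tuple-based code keeps working. FieldKey and its attribute names are hypothetical and chosen only for illustration; they are not yt API.

```python
from typing import NamedTuple


class FieldKey(NamedTuple):
    """Hypothetical typed replacement for a (field type, field name) tuple."""

    ftype: str
    fname: str


field = FieldKey("gas", "density")

# Existing tuple-style usage keeps working unchanged.
ftype, fname = field
same_as_tuple = field == ("gas", "density")

# It also hashes like the plain tuple, so dictionary lookups are unaffected.
lookup = {("gas", "density"): "found"}[field]
```

A real dataclass would need extra work (for example __iter__ and __eq__ overrides) to stay compatible with code that unpacks or compares field tuples.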
…On Mon, Mar 1, 2021, 4:54 PM Clément Robert ***@***.***> wrote:
A question that should be addressed in the YTEP is when we should start
adding typechecking to our CI. mypy can be used through pre-commit, which
means the required additions to our infrastructure are minimal, but I'm not
sure about performance (specifically, I don't know whether the hook runs on every
file or just staged ones).
There may be important difficulties as to how this should be done too.
Dataclasses were introduced in Python 3.7
Oops, my bad you're right. I intended to write
I have updated my example.
@cphyc One thing you brought up in your comments was the idea of, should we type for clarity, or purpose. I am inclined to think that we might want to think of this in combination with #8 , which lets us add on validation in some circumstances. I think that we might want to use this for very specific end-user facing things, and utilize traitlets for validation. But that's not an incredibly well-thought out response, I confess. I'm also kind of inclined to think of the contents of the
I'm just now realising
oops sorry, misclicked
For the record I love this idea, though it seems hard to find good (or any) resource on how to make both work together. Anything helps !
This looks good to me. I just spotted a few typos on the way ;)
Let's type then!
Co-authored-by: Corentin Cadiou <[email protected]>
Quick note since it hasn't been mentioned yet: although typing adds no performance gain yet, it opens the way to using mypyc (or similar tools ?), which compiles ordinary type-annotated Python (no additional micro language or framework).
This is a draft of a type hinting YTEP. I may have missed some things or not gone into enough detail.
I particularly want to ask @neutrinoceros and @cphyc to take a look, and to feel free to push changes to this PR (and to add their names as authors.)