-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support pandas 2 in flytekit #1818
Changes from 19 commits
6c15fda
e4a5e55
1b49769
909fbce
f01e3b7
3962fe6
015a0ac
68c032c
30eee83
0347700
8e0b58f
4fb0c03
4d30607
e2b5450
3e64dcb
bf968f3
ae53480
c2921d0
df88479
6d5d9fb
d3b8e5f
2e88e74
4edec78
5f477c7
fdaa5f6
f6ab978
10b00c3
e68da19
c57c0fe
94c9aca
10d8466
e15910d
d5669da
85a431b
97bd5e9
242436d
bb7ab63
22ba6c0
dc6119f
1dcba35
0e8ea2c
b29ec36
e6c3a0d
8a2d047
b311082
37af712
68fcc8f
b47f436
788f918
18c4fb8
97b8547
1185631
351d673
d05a11f
30a986a
55c8847
1da781d
cae2746
a155a0e
915818f
cef8fae
9a8278b
52a1143
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,19 @@ | ||
import importlib.util | ||
import sys | ||
import types | ||
|
||
LAZY_MODULES = [] | ||
|
||
|
||
class DummyModule(types.ModuleType): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: can we name this LazyModule? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I feel it's a bit overkill. All the pandas API we use in flytekit works in both pandas v1 and v2. I'll add it upon any user runs into any issue. |
||
def __init__(self, module_name: str): | ||
super().__init__(module_name) | ||
self._module_name = module_name | ||
|
||
def __getattribute__(self, attr): | ||
raise ImportError(f"Module {object.__getattribute__(self, '_module_name')} is not yet installed.") | ||
|
||
|
||
def is_imported(module_name): | ||
""" | ||
This function is used to check if a module has been imported by the regular import. | ||
|
@@ -24,6 +34,10 @@ | |
return sys.modules[fullname] | ||
# https://docs.python.org/3/library/importlib.html#implementing-lazy-imports | ||
spec = importlib.util.find_spec(fullname) | ||
if spec is None: | ||
# Return a dummy module if the module is not found in the python environment, | ||
# so that we can raise a proper error when the user tries to access an attribute in the module. | ||
return DummyModule(fullname) | ||
pingsutw marked this conversation as resolved.
Show resolved
Hide resolved
|
||
loader = importlib.util.LazyLoader(spec.loader) | ||
spec.loader = loader | ||
module = importlib.util.module_from_spec(spec) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,7 +31,6 @@ | |
"googleapis-common-protos>=1.57", | ||
"flyteidl>=1.10.0", | ||
"wheel>=0.30.0,<1.0.0", | ||
"pandas>=1.0.0,<2.0.0", | ||
"pyarrow>=4.0.0,<11.0.0", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just as a reminder. This can also be deleted. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like it was added back to the dependencies in pyproject.toml? Is that intended? Sorry I'm not following up the PR closely. |
||
"click>=6.6,<9.0", | ||
"croniter>=0.3.20,<4.0.0", | ||
|
@@ -69,7 +68,6 @@ | |
"diskcache>=5.2.1", | ||
"cloudpickle>=2.0.0", | ||
"cookiecutter>=1.7.3", | ||
"numpy", | ||
"gitpython", | ||
"kubernetes>=12.0.1", | ||
"rich", | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
import pytest | ||
|
||
from flytekit.lazy_import.lazy_module import DummyModule, lazy_module | ||
|
||
|
||
def test_lazy_module(): | ||
mod = lazy_module("pandas") | ||
assert mod.__name__ == "pandas" | ||
mod = lazy_module("fake_module") | ||
assert isinstance(mod, DummyModule) | ||
with pytest.raises(ImportError, match="Module fake_module is not yet installed."): | ||
print(mod.attr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this would be too annoying for local development? We'll remember to install pandas + pyarrow to enable some for the test suite.
I am okay with it. For others maybe a comment:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added pandas to dev-requirement, and uninstall it in the GA workflow instead.