Table.to_pandas() converts ints to doubles #43112

Open
bveeramani opened this issue Jul 1, 2024 · 1 comment · May be fixed by #44538

Describe the bug, including details regarding any error messages, version, and platform.

When you call to_pandas on a table whose integer column contains nulls, Arrow converts the ints to doubles. This leads to precision issues (e.g., some ints can't be represented exactly with doubles).
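To make the precision concern concrete, here is a minimal sketch in plain Python (no Arrow involved) of an int that does not survive a round trip through a double. The value 2**53 + 1 is picked purely for illustration; it is the smallest positive integer that a 64-bit float cannot represent exactly.

```python
# A double has a 53-bit significand, so integers above 2**53
# are not all exactly representable.
big = 2**53 + 1

as_double = float(big)          # rounds to the nearest representable double
round_tripped = int(as_double)  # convert back to an int

print(big)            # 9007199254740993
print(round_tripped)  # 9007199254740992
```

Any int64 value in that range that ends up in a float64 column is silently rounded in the same way.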

We can potentially avoid this issue by using the pandas nullable integer data type: https://pandas.pydata.org/docs/user_guide/integer_na.html.

import pyarrow

table = pyarrow.Table.from_pydict({"column": [0, None]})
df = table.to_pandas()
assert df.dtypes[0] == int, df.dtypes[0]
Traceback (most recent call last):
  File "/Users/balaji/Documents/GitHub/ray/1.py", line 5, in <module>
    assert df.dtypes[0] == int, df.dtypes[0]
           ^^^^^^^^^^^^^^^^^^^
AssertionError: float64

Component(s)

Python
