`Table.to_pandas()` converts ints to doubles #43112

bveeramani · 2024-07-01T21:12:57Z

Describe the bug, including details regarding any error messages, version, and platform.

When you call `to_pandas` on a table whose integer column contains nulls, Arrow converts the ints to doubles. This leads to precision issues (e.g., some ints can't be represented exactly as doubles).
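To make the precision concern concrete, here is a minimal sketch using only plain Python floats (which are IEEE 754 doubles). The value `2**53 + 1` is chosen for illustration and is not taken from a real dataset.

```python
# 2**53 + 1 is the smallest positive integer a float64 cannot represent exactly.
big = 2**53 + 1
as_double = float(big)  # rounds to the nearest representable double

print(big)             # 9007199254740993
print(int(as_double))  # 9007199254740992

# The round trip through a double silently changes the value.
assert int(as_double) != big
```

Any int64 column holding values of this magnitude would be affected the same way once it is cast to `float64`.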

We can potentially avoid this issue by using the pandas nullable integer data type: https://pandas.pydata.org/docs/user_guide/integer_na.html.

import pyarrow

table = pyarrow.Table.from_pydict({"column": [0, None]})
df = table.to_pandas()
assert df.dtypes[0] == int, df.dtypes[0]

Traceback (most recent call last):
  File "/Users/balaji/Documents/GitHub/ray/1.py", line 5, in <module>
    assert df.dtypes[0] == int, df.dtypes[0]
           ^^^^^^^^^^^^^^^^^^^
AssertionError: float64

Component(s)

Python

attwelveDev · 2024-10-18T09:29:39Z

take

bveeramani added the Type: bug label Jul 1, 2024

github-actions bot added the Component: Python label Jul 1, 2024

github-actions bot assigned attwelveDev Oct 18, 2024

attwelveDev linked a pull request Oct 26, 2024 that will close this issue

GH-43112: [Python] Set nullable Int64 dtype for integer columns with None values when converting to pandas #44538

Open
