Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force cast to pd.DataFrame on nonsensical manipulation of TableDataFrame #105

Open
jfcorbett opened this issue Mar 23, 2021 · 2 comments
Open
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@jfcorbett
Copy link
Member

jfcorbett commented Mar 23, 2021

Currently. .transpose() raises a warning but continues on its merry way, resulting in units that are nonsense. This shouldn't be allowed.

Should fall back on returning a vanilla pd.DataFrame instead.

Example:

from pdtable import Table
import pandas as pd
df = pd.DataFrame({"a":[1,2,3], "b":['q','w','e']})
t = Table(df, name="foo", units=["m", "text"])
print(t)

displays

**foo
all
 a [m] b [text]
     1        q
     2        w
     3        e

So far so good. Now let's .transpose() the backing TableDataFrame, and create a Table from that:

print(Table(t.df.transpose()))

We get a warning:

[...]\pdtable\frame.py:92: UserWarning: While combining pdTable metadata an unknown __finalize__ method "transpose" was encountered. Will try to propagate metadata with generic methods, but please check outcome of this and notify pdTable maintainers.
  warnings.warn(
**foo
all
0 [text] 1 [text] 2 [text]
       1        2        3
       q        w        e

... but this still spits out a valid Table whose units are nonsense. This should not be allowed.

The proposed fix (i.e. force cast to pd.DataFrame) can be done by adding an elif clause as indicated below, which would eventually cover all methods that are nonsensical in the StarTable universe, such as transpose().

    if method is None or method in frozenset({"reindex", "take", "copy"}):
        # method: None - copy, slicing (pandas <1.1)
        src = [other]
    elif method == "merge":
        src = [other.left, other.right]
    elif method == "concat":
        src = other.objs
    elif method == "transpose":  # <<<<<<<<<<<<<<<<<<<<
        return None              # <<<<<<<<<<<<<<<<<<<<
    else:
        # Unknown method - try to handle this as well as possible, but rather warn and drop units than break things.
        src = [other]
        warnings.warn(
            f'While combining pdTable metadata an unknown __finalize__ method "{method}" was encountered. '
            f"Will try to propagate metadata with generic methods, but please check outcome of this "
            f"and notify pdTable maintainers."
        )
@jfcorbett jfcorbett added bug Something isn't working good first issue Good for newcomers labels Mar 23, 2021
@jfcorbett
Copy link
Member Author

@JanusWesenberg thoughts?
@heidikira FYI

@jfcorbett
Copy link
Member Author

Note to self and others interested: pandas' implementation of __finalize__():
https://github.com/pandas-dev/pandas/blob/d01561fb7b9ad337611fa38a3cfb7e8c2faec608/pandas/core/generic.py#L5447

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

1 participant