Commit

Fixed error of wrong type by removing warning for distinct categoricals (#502)

* Fixed error of wrong type by removing warning for distinct categoricals #501

* Prepare release 5.3.0
steffen-schroeder-by authored Dec 10, 2021
1 parent 89ef9d2 commit 1821ea5
Showing 2 changed files with 2 additions and 10 deletions.
3 changes: 2 additions & 1 deletion CHANGES.rst
@@ -2,9 +2,10 @@
 Changelog
 =========

-Version 5.3.0 (2021-11-26)
+Version 5.3.0 (2021-12-10)
 ==========================
 * Add Deprecation warnings and migration helpers in order to facilitate the Kartothek version 6.0.0 migration.
+* Removed warning for distinct categoricals (#501)


 Version 5.2.0 (2021-11-22)
9 changes: 0 additions & 9 deletions kartothek/io/dask/_utils.py
@@ -1,7 +1,6 @@
 # -*- coding: utf-8 -*-


-import warnings
 from functools import partial

 import pandas as pd
@@ -12,8 +11,6 @@
 except ImportError:
     pass

-CATEGORICAL_EFFICIENCY_WARN_LIMIT = 100000
-

 def _identity():
     def _id(x):
@@ -49,12 +46,6 @@ def _cast_categorical_to_index_cat(df, categories):
 def _construct_categorical(column, dataset_metadata_factory):
     dataset_metadata = dataset_metadata_factory.load_index(column)
     values = dataset_metadata.indices[column].index_dct.keys()
-    if len(values) > CATEGORICAL_EFFICIENCY_WARN_LIMIT:
-        warnings.warn(
-            "Column {} has {} distinct values, reading as categorical may increase memory consumption.",
-            column,
-            len(values),
-        )
     return pd.api.types.CategoricalDtype(values, ordered=False)
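
Note on the removed call: the "error of wrong type" in the commit title is most likely the TypeError that the deleted warnings.warn call raises. The format arguments were passed positionally, so `column` ends up in the `category` parameter of warnings.warn, which must be a Warning subclass. The sketch below reproduces that failure using only the standard library. The helper names `broken_warn` and `formatted_warn` are hypothetical and do not exist in kartothek; `formatted_warn` shows one possible way the message could have been formatted instead, which is not what this commit does (the commit removes the warning entirely).

```python
import warnings

# Limit copied from the constant removed in this commit.
CATEGORICAL_EFFICIENCY_WARN_LIMIT = 100000


def broken_warn(column, n_values):
    # Mirrors the removed call: `column` lands in the `category` slot,
    # which must be a Warning subclass, so this raises TypeError.
    warnings.warn(
        "Column {} has {} distinct values, reading as categorical may increase memory consumption.",
        column,
        n_values,
    )


def formatted_warn(column, n_values):
    # Hypothetical alternative (an assumption, not the commit's fix):
    # format the message first and pass an explicit warning category.
    warnings.warn(
        "Column {} has {} distinct values, reading as categorical may "
        "increase memory consumption.".format(column, n_values),
        UserWarning,
        stacklevel=2,
    )


if __name__ == "__main__":
    try:
        broken_warn("my_column", CATEGORICAL_EFFICIENCY_WARN_LIMIT + 1)
    except TypeError as exc:
        print("removed call fails with TypeError:", exc)

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        formatted_warn("my_column", CATEGORICAL_EFFICIENCY_WARN_LIMIT + 1)
    print("formatted call warns:", caught[0].message)
```

Removing the check, as the commit does, avoids the exception at the cost of no longer flagging columns with many distinct values.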

