I was running a workflow.transform(sampled_dataset) step on a sample of my inference dataset and received the following error:
Traceback (most recent call last):
  File "/databricks/python/lib/python3.8/site-packages/nvtabular/ops/categorify.py", line 510, in transform
    encoded = _encode(
  File "/databricks/python/lib/python3.8/site-packages/nvtabular/ops/categorify.py", line 1707, in _encode
    if isinstance(df[cl].dropna().iloc[0], (np.ndarray, list)):
  File "/databricks/python/lib/python3.8/site-packages/pandas/core/indexing.py", line 1073, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/databricks/python/lib/python3.8/site-packages/pandas/core/indexing.py", line 1625, in _getitem_axis
    self._validate_integer(key, axis)
  File "/databricks/python/lib/python3.8/site-packages/pandas/core/indexing.py", line 1557, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/databricks/python/lib/python3.8/site-packages/merlin/dag/executors.py", line 237, in _run_node_transform
    transformed_data = node.op.transform(selection, input_data)
  File "/databricks/python/lib/python3.8/site-packages/merlin/core/dispatch.py", line 69, in inner2
    return func(*args, **kwargs)
  File "/databricks/python/lib/python3.8/site-packages/nvtabular/ops/categorify.py", line 534, in transform
    raise RuntimeError(f"Failed to categorical encode column {name}") from e
RuntimeError: Failed to categorical encode column my_categorical_column
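I noticed this happens when the dataset to be transformed has a categorical column (my_categorical_column) that is 100% NaN. It looks like the error is raised when this line is reached 👇, where we do a dropna() followed by iloc[0]. With an all-NaN column, dropna() returns an empty Series, so iloc[0] has nothing to index and raises the IndexError shown in the traceback.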
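For anyone who wants to reproduce this outside of a workflow, here is a minimal sketch in plain pandas. It only mimics the dropna().iloc[0] pattern from the traceback. The DataFrame is a made-up stand-in for my data, and first_value_is_list_like is a hypothetical guarded variant of the check, not NVTabular's actual code or a proposed patch.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sampled dataset: one categorical column, all NaN.
df = pd.DataFrame({"my_categorical_column": [np.nan, np.nan, np.nan]})

# dropna() on an all-NaN column leaves an empty Series.
non_null = df["my_categorical_column"].dropna()

# iloc[0] on an empty Series raises IndexError, as in the traceback.
try:
    non_null.iloc[0]
    raised = False
except IndexError:
    raised = True


def first_value_is_list_like(series: pd.Series) -> bool:
    """Hypothetical guarded check: treat an all-NaN column as not list-like."""
    values = series.dropna()
    if values.empty:
        # Nothing to inspect, so skip the positional lookup entirely.
        return False
    return isinstance(values.iloc[0], (np.ndarray, list))


guarded_result = first_value_is_list_like(df["my_categorical_column"])
```

Checking for an empty Series before the positional lookup is just one possible way to avoid the exception; whether an all-NaN column should be handled like that or rejected with a clearer message is exactly the question I'm asking.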
It's not a huge blocker for me right now, as this mostly happens on dataset samples, but I'm wondering whether that behavior is expected. Any thoughts? 😃
NVTabular/nvtabular/ops/categorify.py, line 1707 in ee21af0