fix columns None after map (#373)
drcege authored Jul 30, 2024
1 parent aaa404a commit ef38589
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions data_juicer/core/ray_data.py
@@ -57,8 +57,9 @@ def set_dataset_to_absolute_path(dataset, dataset_path, cfg):
 def preprocess_dataset(dataset: Dataset, dataset_path, cfg) -> Dataset:
     if dataset_path:
         dataset = set_dataset_to_absolute_path(dataset, dataset_path, cfg)
-    if Fields.stats not in dataset.columns(fetch_if_missing=False):
-        logger.info(f'columns {dataset.columns(fetch_if_missing=False)}')
+    columns = dataset.columns()
+    if Fields.stats not in columns:
+        logger.info(f'columns {columns}')

         def process_batch_arrow(table: pa.Table) -> pa.Table:
             new_column_data = [{} for _ in range(len(table))]
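
Going by the commit title, the old check appears to have failed because `dataset.columns(fetch_if_missing=False)` can return `None` when the schema is not yet known, for example after a lazy `map`. The sketch below reproduces that failure mode with a hypothetical stand-in class. `LazyDataset`, its `schema_known` flag, and the `STATS` placeholder are assumptions made for illustration and are not Ray's or Data-Juicer's actual code.

```python
from typing import Callable, List, Optional


class LazyDataset:
    """Hypothetical stand-in for a lazily evaluated dataset (not Ray's class)."""

    def __init__(self, column_names: List[str], schema_known: bool = True):
        self._column_names = column_names
        self._schema_known = schema_known

    def map(self, fn: Callable) -> "LazyDataset":
        # A lazy map does not run fn, so the output schema is unknown for now.
        return LazyDataset(self._column_names, schema_known=False)

    def columns(self, fetch_if_missing: bool = True) -> Optional[List[str]]:
        if not self._schema_known:
            if not fetch_if_missing:
                return None  # schema unknown and the caller declined to fetch it
            self._schema_known = True  # stands in for executing to infer the schema
        return self._column_names


STATS = "stats"  # placeholder for Fields.stats; the real value is not shown in the diff

ds = LazyDataset(["text"]).map(lambda row: row)

# Old behaviour: the membership test runs against None and raises TypeError.
old_result = ds.columns(fetch_if_missing=False)
try:
    _ = STATS not in old_result
    old_check_failed = False
except TypeError:
    old_check_failed = True

# New behaviour: fetch the columns once, then reuse the list for the check and the log.
columns = ds.columns()
needs_stats = STATS not in columns
```

Storing the result in `columns` also avoids calling `dataset.columns(...)` a second time just to build the log message.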