The reader functions fail when a (perfectly legal) comment is placed to the right of the column names, on the column name row of a non-transposed table. For example, when reading this CSV data:
**places;
all
place;distance;ETA;is_hot;;;---> parser chokes on this perfectly legal comment <---;
text;km;datetime;onoff
home;0.0;2020-08-04 08:00:00;1
work;1.0;2020-08-04 09:00:00;0
beach;2.0;2020-08-04 17:00:00;1
This is due to a misconceived "leniency" in pdtable.io.parsers.blocks.preprocess_column_names():
def preprocess_column_names(col_names_raw: Sequence[str], fixer: ParseFixer):
    """
    handle known issues in column_names
    """
    n_names_col = len(col_names_raw)
    for el in reversed(col_names_raw):
        if el is not None and len(el) > 0:
            break
        n_names_col -= 1
    ...
Thus everything on the column name line is counted as a column name up to the last non-blank cell, including any comments and all the empty cells between the actual column names and comments.
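To make the effect concrete, here is a minimal standalone sketch that applies the same trailing-blank trimming to the column name row from the example. The helper name `count_names_trailing_trim` is hypothetical and exists only for this illustration; it is not part of pdtable, and splitting on `;` stands in for the real CSV reader.

```python
from typing import Optional, Sequence


def count_names_trailing_trim(cells: Sequence[Optional[str]]) -> int:
    """Hypothetical helper: count cells up to and including the last non-blank one.

    This mirrors the loop quoted above; it is not pdtable's actual function.
    """
    n = len(cells)
    for el in reversed(cells):
        if el is not None and len(el) > 0:
            break
        n -= 1
    return n


# The column name row from the example, split naively on the separator.
row = "place;distance;ETA;is_hot;;;---> parser chokes on this perfectly legal comment <---;".split(";")

n = count_names_trailing_trim(row)
print(n)        # 7: four real names, two blank cells, and the comment
print(row[:n])  # the blank cells and the comment are all treated as column names
```

Only the single blank cell after the comment is trimmed, so the two blank cells and the comment itself end up in the list of column names.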
This is later passed to a ParseFixer via fixer.fix_missing_column_name(input_columns=column_names). The fixer then assumes that the empty cells are simply column names that the user forgot to write in, and replaces them with placeholder column names 'missing_fixed_000', 'missing_fixed_001', ....
This breaks support for comments. All of this should be ripped out.
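One possible replacement, sketched here purely as an assumption and not as the project's agreed fix, is to treat the first blank cell as the end of the column names, so that anything to its right (blank cells, comments) is ignored. The function name `column_names_until_first_blank` is hypothetical.

```python
from typing import List, Optional, Sequence


def column_names_until_first_blank(cells: Sequence[Optional[str]]) -> List[str]:
    """Hypothetical alternative: column names end at the first blank cell.

    Assumption for this sketch: a real column name is never blank, so any
    cell after the first blank one is a comment or padding, not a name.
    """
    names: List[str] = []
    for el in cells:
        if el is None or len(el.strip()) == 0:
            break  # first blank cell marks the end of the names
        names.append(el)
    return names


row = "place;distance;ETA;is_hot;;;---> parser chokes on this perfectly legal comment <---;".split(";")
names = column_names_until_first_blank(row)
print(names)  # ['place', 'distance', 'ETA', 'is_hot']
```

The trade-off is that a genuinely forgotten column name in the middle of the row would now truncate the table's columns instead of being given a placeholder, so that case would have to be reported as an error elsewhere.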