Error loading dataset from URL #752

directedLink · 2024-10-14T10:05:19Z

I get an error when trying to load the PubMed dataset from the URL used in this chapter of the Hugging Face NLP course:

https://huggingface.co/learn/nlp-course/chapter5/4

I use the following code to load the dataset:

from datasets import load_dataset

data_files = "https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst"
pubmed_dataset = load_dataset("json", data_files=data_files, split="train")

When running the code, I encounter the following error message:

FileNotFoundError: Unable to find 'https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst'
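
One way to confirm that the problem lies with the remote host rather than with the loading code is to request the URL directly. The following is a minimal sketch using only the standard library; the helper name `url_is_reachable` and its `opener` parameter are hypothetical and not part of `datasets`.

```python
import urllib.request


def url_is_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` succeeds with a 2xx/3xx status.

    `opener` is a hypothetical hook that defaults to urllib's `urlopen`;
    it exists only so the check can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (e.g. 404) and timeouts are all OSError
        # subclasses, so any of them means the file could not be reached.
        return False
```

Calling `url_is_reachable(data_files)` with the URL above and getting `False` would suggest the file is no longer served from that location, in which case no change to the `load_dataset` call itself would help.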

grib0ed0v · 2024-10-30T12:45:02Z

You could use the workaround highlighted here, which loads the same file from a copy hosted on the Hugging Face Hub:

data_files = "https://huggingface.co/datasets/casinca/PUBMED_title_abstracts_2019_baseline/resolve/main/PUBMED_title_abstracts_2019_baseline.jsonl.zst"
pubmed_dataset = load_dataset("json", data_files=data_files, split="train")
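
If you want the code to keep working whichever host is available, the two URLs from this thread can be tried in order. This is only a sketch: the function `load_first_available`, its `loader` parameter, and the constant names are hypothetical, and it assumes a failed download surfaces as `FileNotFoundError` (as in the report above) or `ConnectionError`.

```python
ORIGINAL_URL = "https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst"
MIRROR_URL = "https://huggingface.co/datasets/casinca/PUBMED_title_abstracts_2019_baseline/resolve/main/PUBMED_title_abstracts_2019_baseline.jsonl.zst"


def load_first_available(urls, loader=None):
    """Try each URL in order and return the first dataset that loads."""
    if loader is None:
        # Imported lazily so the helper can be tested without `datasets`.
        from datasets import load_dataset

        def loader(url):
            return load_dataset("json", data_files=url, split="train")

    errors = []
    for url in urls:
        try:
            return loader(url)
        except (FileNotFoundError, ConnectionError) as exc:
            # Assumption: these are the errors raised for an unreachable file.
            errors.append(f"{url}: {exc}")
    raise FileNotFoundError("No URL could be loaded:\n" + "\n".join(errors))
```

With this in place, `pubmed_dataset = load_first_available([ORIGINAL_URL, MIRROR_URL])` would fall back to the Hub copy whenever the first host fails.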
