AESLC Checksum error #89

Open
imoneoi opened this issue Jul 21, 2023 · 5 comments

@imoneoi

imoneoi commented Jul 21, 2023

When I try to create "flan2021_submix", the process fails with a wrong checksum on the AESLC dataset.

Downloading and preparing dataset 11.10 MiB (download: 11.10 MiB, generated: Unknown size, total: 11.10 MiB) to /home/one/tensorflow_datasets/aeslc/1.0.0...
Extraction completed...: 0 file [00:05, ? file/s]█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.05s/ url]
Dl Size...: 11 MiB [00:05,  2.17 MiB/s]
Dl Completed...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.06s/ url]
ERROR:absl:Failed to load task 'aeslc_template_0to10_no_opt_x_shot' as part of mixture 'flan2021_submix'
Traceback (most recent call last):
  File "/home/one/anaconda3/envs/flan/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/one/anaconda3/envs/flan/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/datadrive/Datasets/LLM/FLAN/flan/v2/run_example.py", line 100, in <module>
    dataset = selected_mixture.get_dataset(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/dataset_providers.py", line 1805, in get_dataset
    ds = task.get_dataset(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/dataset_providers.py", line 1443, in get_dataset
    ds = source.get_dataset(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/experimental.py", line 370, in get_dataset
    train_ds = _get_maybe_sharded_dataset(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/experimental.py", line 333, in _get_maybe_sharded_dataset
    ds = self._original_source.get_dataset(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/dataset_providers.py", line 496, in get_dataset
    return self.tfds_dataset.load(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/seqio/utils.py", line 182, in load
    return tfds.load(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/load.py", line 640, in load
    _download_and_prepare_builder(dbuilder, download, download_and_prepare_kwargs)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/load.py", line 499, in _download_and_prepare_builder
    dbuilder.download_and_prepare(**download_and_prepare_kwargs)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/dataset_builder.py", line 646, in download_and_prepare
    self._download_and_prepare(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1498, in _download_and_prepare
    split_generators = self._split_generators(  # pylint: disable=unexpected-keyword-arg
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/datasets/aeslc/aeslc_dataset_builder.py", line 46, in _split_generators
    dl_path = dl_manager.download_and_extract(_URL)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 687, in download_and_extract
    return _map_promise(self._download_extract, url_or_urls)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 830, in _map_promise
    res = tree_utils.map_structure(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tree/__init__.py", line 435, in map_structure
    [func(*args) for args in zip(*map(flatten, structures))])
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tree/__init__.py", line 435, in <listcomp>
    [func(*args) for args in zip(*map(flatten, structures))])
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 831, in <lambda>
    lambda p: p.get(), all_promises
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/promise/promise.py", line 512, in get
    return self._target_settled_value(_raise=True)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/promise/promise.py", line 516, in _target_settled_value
    return self._target()._settled_value(_raise)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/promise/promise.py", line 226, in _settled_value
    reraise(type(raise_val), raise_val, self._traceback)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/promise/promise.py", line 87, in try_catch
    return (handler(*args, **kwargs), None)
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 407, in <lambda>
    lambda dl_result: self._register_or_validate_checksums(  # pylint: disable=g-long-lambda
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 464, in _register_or_validate_checksums
    _validate_checksums(
  File "/home/one/anaconda3/envs/flan/lib/python3.10/site-packages/tensorflow_datasets/core/download/download_manager.py", line 808, in _validate_checksums
    raise NonMatchingChecksumError(msg)
tensorflow_datasets.core.download.download_manager.NonMatchingChecksumError: Artifact https://github.com/ryanzhumich/AESLC/archive/master.zip, downloaded to /home/one/tensorflow_datasets/downloads/ryanzhumich_AESLC_archive_masterACSpoxw627Ay4UrkswMeyz6RrOey8kKfkhEM4VySJWU.zip.tmp.30a54832017f490096e6110966e756b4/AESLC-master.zip, has wrong checksum:
* Expected: UrlInfo(size=11.10 MiB, checksum='b5ea2ffb837c5cfb9b033d62b3940a8a2330a9eb69bd2a39a9f55db6a23a40a4', filename='AESLC-master.zip')
* Got: UrlInfo(size=11.11 MiB, checksum='cd042fa034790609468250518d44060017df153c382ae888091a37d5fa320032', filename='AESLC-master.zip')
To debug, see: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror
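
One way to narrow this down is to hash the archive yourself and see which of the two logged values it matches. The sketch below uses only the Python standard library. It assumes the checksums in the log are hex SHA-256 digests (the 64-character values suggest this), and `file_size_and_sha256` and `which_checksum` are hypothetical helper names, not part of tensorflow-datasets.

```python
import hashlib

# Values copied from the "Expected" and "Got" lines of the error message above.
EXPECTED_SHA256 = "b5ea2ffb837c5cfb9b033d62b3940a8a2330a9eb69bd2a39a9f55db6a23a40a4"
GOT_SHA256 = "cd042fa034790609468250518d44060017df153c382ae888091a37d5fa320032"


def file_size_and_sha256(path, chunk_size=1 << 20):
    """Return (size in bytes, hex SHA-256 digest) of the file at `path`."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def which_checksum(path):
    """Report which of the two logged checksums, if either, the file matches."""
    _, checksum = file_size_and_sha256(path)
    if checksum == EXPECTED_SHA256:
        return "expected"
    if checksum == GOT_SHA256:
        return "got"
    return "neither"
```

If a manually downloaded copy of https://github.com/ryanzhumich/AESLC/archive/master.zip returns `"got"` from `which_checksum`, that would suggest the upstream archive itself changed rather than the download being corrupted.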
@abhinavnmagic

I am encountering the same error when trying to create flan2021_submix. Any resolution?

@imoneoi
Author

imoneoi commented Jul 27, 2023

Any updates?

@tongyao-zhu

I am facing a similar issue. Is there any update?

@imoneoi
Author

imoneoi commented Sep 3, 2023

I manually modified the hash of AESLC in tensorflow-datasets, and it worked fine.
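
A rough sketch of what such a manual edit could look like follows. It assumes the installed tensorflow-datasets keeps a tab-separated checksums file for the dataset (for example a `checksums.tsv` next to `aeslc_dataset_builder.py`) with the columns URL, size in bytes, SHA-256, and filename. That layout is an assumption and may differ between versions, so inspect the file before changing it. `update_checksum_row` is a hypothetical helper, and overriding a checksum only makes sense if you trust the new archive.

```python
from pathlib import Path


def update_checksum_row(tsv_path, url, new_size, new_sha256):
    """Rewrite the size and checksum of the row whose first column is `url`.

    Assumed row layout (tab-separated): url, size in bytes, sha256[, filename].
    Returns True if a row was changed, False if the URL was not found.
    """
    path = Path(tsv_path)
    changed = False
    rows = []
    for line in path.read_text().splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0] == url:
            fields[1] = str(new_size)
            fields[2] = new_sha256
            changed = True
        rows.append("\t".join(fields))
    # Only touch the file when a matching row was actually found.
    if changed:
        path.write_text("\n".join(rows) + "\n")
    return changed
```

The URL and the new SHA-256 can be taken from the error message (`https://github.com/ryanzhumich/AESLC/archive/master.zip` and the `Got` checksum). The size in bytes has to be read from the downloaded file, since the log only reports it rounded to 11.11 MiB.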

@lynneTest

> I manually modified the hash of AESLC in tensorflow-datasets, and it worked fine.

I ran into the same issue, and I think your approach would solve it. Could you please show me how to modify the hash of AESLC in tensorflow-datasets? Thanks a lot.
