ModuleNotFoundError: No module named 'datasets' #140

speechchemistry · 2024-10-04T09:19:47Z

Hi. Thanks for making this repository available. I'm hoping to use the speaker diarization.

I followed the install instructions at https://github.com/modelscope/3D-Speaker (except that I cloned from this repository rather than the one under the alibaba-damo-academy GitHub profile). I then changed directory to 3D-Speaker/egs/3dspeaker/speaker-diarization and followed the usage instructions at https://github.com/modelscope/3D-Speaker/tree/main/egs/3dspeaker/speaker-diarization, i.e. all the requirements were installed. But I got the following error when trying to run run_audio.sh:

...
run_audio.sh Stage2: Do vad for input wavs...
Traceback (most recent call last):
  File "local/voice_activity_detection.py", line 27, in <module>
    from modelscope.pipelines import pipeline
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/pipelines/__init__.py", line 4, in <module>
    from .base import Pipeline
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/pipelines/base.py", line 16, in <module>
    from modelscope.msdatasets import MsDataset
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/__init__.py", line 2, in <module>
    from modelscope.msdatasets.ms_dataset import MsDataset
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/ms_dataset.py", line 9, in <module>
    from datasets import Dataset, DatasetDict, IterableDataset, IterableDatasetDict
ModuleNotFoundError: No module named 'datasets'

I tried pip install datasets and ran run_audio.sh again:

...
run_audio.sh Stage2: Do vad for input wavs...
Traceback (most recent call last):
  File "local/voice_activity_detection.py", line 27, in <module>
    from modelscope.pipelines import pipeline
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/pipelines/__init__.py", line 4, in <module>
    from .base import Pipeline
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/pipelines/base.py", line 16, in <module>
    from modelscope.msdatasets import MsDataset
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/__init__.py", line 2, in <module>
    from modelscope.msdatasets.ms_dataset import MsDataset
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/ms_dataset.py", line 24, in <module>
    from modelscope.msdatasets.utils.hf_datasets_util import load_dataset_with_ctx
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/utils/hf_datasets_util.py", line 63, in <module>
    from modelscope.msdatasets.utils.hf_file_utils import get_from_cache_ms
  File "/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/modelscope/msdatasets/utils/hf_file_utils.py", line 18, in <module>
    from datasets.utils.file_utils import hash_url_to_filename, get_authentication_headers_for_url, ftp_head, fsspec_head, \
ImportError: cannot import name 'ftp_head' from 'datasets.utils.file_utils' (/home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages/datasets/utils/file_utils.py)

Have you got any suggestions on how I fix this? Is it something I'm doing wrong or is there a specific version of a module missing from requirements.txt?
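
One way to narrow this down without going through modelscope is to probe the name from the second traceback directly. The snippet below is only a rough sketch: `module_has_name` is a hypothetical helper (not part of modelscope or datasets), and it simply reports whether a module can be imported and exposes a given attribute. The module path and the name `ftp_head` are taken from the traceback above.

```python
import importlib


def module_has_name(module_name, attr):
    """Return True if module_name imports cleanly and defines attr.

    Hypothetical helper for illustration; returns False when the module
    is missing (ModuleNotFoundError is a subclass of ImportError).
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Module path and name copied from the ImportError in the traceback.
    found = module_has_name("datasets.utils.file_utils", "ftp_head")
    print("ftp_head available:", found)
```

If this prints False while datasets is installed, the installed datasets release presumably does not provide the name that this modelscope version tries to import, which would point to a version mismatch rather than a broken install.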


wanghuii1 · 2024-10-07T13:30:06Z

There might be an issue with the modelscope package;
you could try upgrading it with "pip install -U modelscope" to see if that helps. @speechchemistry

speechchemistry · 2024-10-09T09:39:38Z

Thanks for the suggestion. But it looks like I have the latest version of modelscope already:

$ pip install -U modelscope
Requirement already satisfied: modelscope in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (1.18.1)
Requirement already satisfied: requests>=2.25 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from modelscope) (2.32.3)
Requirement already satisfied: tqdm>=4.64.0 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from modelscope) (4.66.5)
Requirement already satisfied: urllib3>=1.26 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from modelscope) (2.2.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from requests>=2.25->modelscope) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from requests>=2.25->modelscope) (3.10)
Requirement already satisfied: certifi>=2017.4.17 in /home/user/miniconda3/envs/3D-Speaker/lib/python3.8/site-packages (from requests>=2.25->modelscope) (2024.8.30)

speechchemistry · 2024-10-09T10:16:11Z

OK, I see the datasets module was updated to 3.0.1 last month. The following installs fixed the problem:

pip install datasets==2.21.0
pip install simplejson
pip install sortedcontainers

I'm not sure if this means the requirements.txt file needs to be tweaked.
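
For anyone comparing their environment against the one that worked here, a small sketch like the following lists the installed versions of the packages mentioned in this thread. `installed_versions` is a hypothetical helper name; it relies only on the standard-library `importlib.metadata` (available from Python 3.8) and makes no claim about which versions the project officially supports.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from this thread; the list itself is illustrative.
    names = ["modelscope", "datasets", "simplejson", "sortedcontainers"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

In the setup described above, the combination that worked was modelscope 1.18.1 with datasets 2.21.0.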
