ValueError: mutable default <class 'stopes.core.stopes_module.Requirements'> for field requirements is not allowed: use #8

Open
dtometzki opened this issue Dec 27, 2024 · 11 comments

@dtometzki

Hello everyone,

Merry Christmas. I get the following error when I run the script below. What is wrong here?

/large_concept_model/scripts$ python3.11 prepare_wikipedia.py ./wiki_data
Traceback (most recent call last):
  File "/root/large_concept_model/scripts/prepare_wikipedia.py", line 13, in <module>
    from stopes.modules.preprocess.sonar_text_embedding import (
  File "/opt/conda/lib/python3.11/site-packages/stopes/modules/preprocess/__init__.py", line 7, in <module>
    from .line_processor import LineProcessorCallback as LineProcessorCallback
  File "/opt/conda/lib/python3.11/site-packages/stopes/modules/preprocess/line_processor.py", line 62, in <module>
    @dataclass
     ^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 1232, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 1222, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'stopes.core.stopes_module.Requirements'> for field requirements is not allowed: use default_factory

Best regards
Damian

@dtometzki
Author

dtometzki commented Dec 28, 2024

Hello,
the following change resolved my issue for Python >= 3.11:

diff --git a/stopes/modules/preprocess/line_processor.py b/stopes/modules/preprocess/line_processor.py
index d6dedc3..7768203 100644
--- a/stopes/modules/preprocess/line_processor.py
+++ b/stopes/modules/preprocess/line_processor.py
@@ -9,7 +9,7 @@ import logging
 import typing as tp
 from abc import abstractmethod
 from contextlib import AbstractContextManager
-from dataclasses import dataclass
+from dataclasses import dataclass, field
 from pathlib import Path
 
 import hydra
@@ -69,13 +69,13 @@ class LineProcessorConfig:
     # if only hydra allowed, the right type would be tp.Union[str, tp.List[str]]
     shards: tp.Any = MISSING
     buffer_size: int = 10_000
-    requirements: Requirements = Requirements(
-        nodes=1,
-        tasks_per_node=1,
-        gpus_per_node=0,
-        cpus_per_task=4,
-        timeout_min=120,
-    )
+    requirements: Requirements = field(default_factory=lambda: Requirements(
+        nodes=1,
+        tasks_per_node=1,
+        gpus_per_node=0,
+        cpus_per_task=4,
+        timeout_min=120,
+    ))
     custom_name: str = ""
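
For readers who want to see the pattern from the diff in isolation, here is a minimal, self-contained sketch. The `Requirements` class below is a hypothetical stand-in with the same five fields that appear in the diff (it is not the real `stopes.core.stopes_module.Requirements`), and `LineProcessorConfigSketch` is an illustrative name, not the real config class.

```python
from dataclasses import dataclass, field


@dataclass
class Requirements:
    """Hypothetical stand-in for stopes.core.stopes_module.Requirements."""

    nodes: int = 1
    tasks_per_node: int = 1
    gpus_per_node: int = 0
    cpus_per_task: int = 4
    timeout_min: int = 120


@dataclass
class LineProcessorConfigSketch:
    """Illustrative config class; only the default_factory pattern matters."""

    buffer_size: int = 10_000
    # default_factory is called once per instance, so every config object
    # gets its own Requirements instead of sharing one class-level default.
    requirements: Requirements = field(
        default_factory=lambda: Requirements(cpus_per_task=4, timeout_min=120)
    )
    custom_name: str = ""


first = LineProcessorConfigSketch()
second = LineProcessorConfigSketch()
first.requirements.nodes = 8  # mutate one instance only
print(second.requirements.nodes)  # the other instance keeps its own default
```

Because the factory runs for each new instance, changing `first.requirements` leaves `second.requirements` untouched, which is the sharing problem the dataclass check is meant to prevent.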

@Charley-xiao

The changes proposed by @dtometzki worked for me. However, I'm encountering another issue:

Traceback (most recent call last):
  File "/home/xqw/miniconda3/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/tqdm/contrib/logging.py", line 93, in logging_redirect_tqdm
    yield
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/stopes/core/launcher.py", line 324, in schedule
    result = await self._schedule_array(module, value_array)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/stopes/core/launcher.py", line 540, in _schedule_array
    raise ArrayTaskError(
stopes.core.launcher.ArrayTaskError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/patchranking/large_concept_model/scripts/prepare_wikipedia.py", line 103, in <module>
    fire.Fire(run)
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/patchranking/large_concept_model/scripts/prepare_wikipedia.py", line 99, in run
    asyncio.run(launcher.schedule(stopes_module))
  File "/home/xqw/miniconda3/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/xqw/miniconda3/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xqw/miniconda3/lib/python3.12/asyncio/base_events.py", line 685, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/data/patchranking/large_concept_model/.venv/lib/python3.12/site-packages/stopes/core/launcher.py", line 319, in schedule
    with logging_redirect_tqdm():
  File "/home/xqw/miniconda3/lib/python3.12/contextlib.py", line 191, in __exit__
    exc.__traceback__ = traceback
    ^^^^^^^^^^^^^^^^^
  File "<string>", line 4, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field '__traceback__'

I'm not sure if this is related to the changes I just made, or it's just another issue regarding how the developers handle dataclasses.

@elbayadm
Contributor

elbayadm commented Jan 6, 2025

Hi, @dtometzki!
Indeed, IIUC Python >= 3.11 raises an error for mutable defaults and requires the use of default_factory.
In #11 I made changes to a few dataclasses to use default_factory.
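
As background on why a plain dataclass instance counts as a "mutable default": since Python 3.11 the `dataclasses` module rejects any unhashable default value, and a regular (non-frozen) dataclass with the default `eq=True` is unhashable. The sketch below illustrates this with hypothetical classes `MutableReq` and `FrozenReq`; the frozen variant is shown only to illustrate the check, and nothing in this thread establishes that it would suit stopes.

```python
import sys
from dataclasses import dataclass


@dataclass
class MutableReq:
    """Hypothetical non-frozen dataclass: eq=True sets __hash__ to None."""

    nodes: int = 1


@dataclass(frozen=True)
class FrozenReq:
    """Hypothetical frozen dataclass: a __hash__ is generated."""

    nodes: int = 1


def class_definition_fails(default: object) -> bool:
    """Return True if using `default` as a field default raises ValueError."""
    try:

        @dataclass
        class Config:
            requirements: object = default

    except ValueError:
        return True
    return False


print("MutableReq unhashable:", MutableReq.__hash__ is None)
print("FrozenReq unhashable:", FrozenReq.__hash__ is None)

# On Python >= 3.11 the unhashable instance is rejected as a default;
# older versions only rejected list, dict and set defaults.
rejects_unhashable = class_definition_fails(MutableReq())
accepts_frozen = not class_definition_fails(FrozenReq())
print("Python", sys.version_info[:2], "rejects MutableReq():", rejects_unhashable)
print("accepts FrozenReq():", accepts_frozen)
```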

@elbayadm
Contributor

elbayadm commented Jan 6, 2025

Hi, @Charley-xiao!
I haven't encountered this issue before. Can you share a reproducing example?
Is this simply from running python prepare_wikipedia.py ./wiki_data? I also see a mix of miniconda and venv in the trace. Is this intentional?

@elbayadm elbayadm self-assigned this Jan 6, 2025
@Charley-xiao

Hi, @elbayadm. Thanks for replying.

Yes, I ran python scripts/prepare_wikipedia.py in a Conda environment after following the installation instructions using uv, as outlined in README.md. Then, I encountered the issue mentioned by @dtometzki and tried his solution, which worked but then yielded another problem as in my first comment. Now that you’ve mentioned the possibility of mixing Miniconda and venv, I realize that could be causing the issue.

@elbayadm
Contributor

elbayadm commented Jan 6, 2025

@Charley-xiao give it a try after pulling the latest changes (to fix the default_factory issue) and running

conda deactivate
source .venv/bin/activate

@Charley-xiao

It worked! Thank you so much, @elbayadm. The mix of the environments is indeed the problem.
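
A quick way to spot this kind of environment mix is to ask the running interpreter where it lives and where it would load a package from. The helper below is a hedged sketch using only the standard library; `describe_environment` is a hypothetical name, and `"dataclasses"` is used in the demo only because it is always available (one would pass `"stopes"` to check the package from this thread).

```python
import importlib.util
import sys


def describe_environment(package: str) -> dict:
    """Report which interpreter is running and where `package` resolves to."""
    # For a top-level name, find_spec locates the package without importing it.
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # True inside a venv; a conda environment does not set this.
        "in_venv": sys.prefix != sys.base_prefix,
        "package_origin": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("dataclasses").items():
        print(f"{key}: {value}")
```

If `executable` points into one environment while `package_origin` points into another, the two are being mixed.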

@elbayadm elbayadm closed this as completed Jan 6, 2025
@StephenJHardy

StephenJHardy commented Jan 7, 2025

Hi, I still have the same error as reported above - this is from a checkout of main including #11 - python 3.12.3 on ubuntu.

Traceback (most recent call last):
  File "/home/stephen/checkouts/large_concept_model/scripts/prepare_wikipedia.py", line 13, in <module>
    from stopes.modules.preprocess.sonar_text_embedding import (
  File "/home/stephen/venv_fairseq/lib/python3.12/site-packages/stopes/modules/preprocess/__init__.py", line 7, in <module>
    from .line_processor import LineProcessorCallback as LineProcessorCallback
  File "/home/stephen/venv_fairseq/lib/python3.12/site-packages/stopes/modules/preprocess/line_processor.py", line 62, in <module>
    @dataclass
     ^^^^^^^^^
  File "/usr/lib/python3.12/dataclasses.py", line 1268, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/usr/lib/python3.12/dataclasses.py", line 1258, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/dataclasses.py", line 994, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/dataclasses.py", line 852, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'stopes.core.stopes_module.Requirements'> for field requirements is not allowed: use default_factory

@Charley-xiao

Hi @StephenJHardy. It seems like you're using two different environments: /usr/lib/python3.12 and /home/stephen/venv_fairseq/lib/python3.12. That might be the issue.

@dtometzki
Author

dtometzki commented Jan 7, 2025

Hello @elbayadm,

the changes are missing from the installed stopes package:

/opt/conda/lib/python3.11/site-packages/stopes/modules/preprocess/line_processor.py

I got the same error again:

[email protected]:~/large_concept_model/scripts$ python prepare_wikipedia.py ./wiki_dat
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/stopes/modules/preprocess/line_processor.py", line 62, in <module>
    @dataclass
     ^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 1232, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 1222, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'stopes.core.stopes_module.Requirements'> for field requirements is not allowed: use default_factory

After applying my changes from above, it works:

[email protected]:~/large_concept_model/scripts$ python prepare_wikipedia.py ./wiki_dat
2025-01-07 17:15 INFO 2064:datasets - PyTorch version 2.5.1+cu124 available.
2025-01-07 17:15 INFO 2064:datasets - Polars version 1.19.0 available.
2025-01-07 17:16 WARNING 2064:WrappedMapper - Output schema will NOT be validated
  0%|                                                                                                                                                                                        | 0/1 [00:00<?, ?it/s]/opt/conda/lib/python3.11/site-packages/submitit/core/plugins.py:24: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
/opt/conda/lib/python3.11/site-packages/pkg_resources/__init__.py:3144: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
2025-01-07 17:16 INFO 2064:stopes.launcher - for prep_wiki found 0 already cached array results,1 left to compute out of 1                                                                                         
2025-01-07 17:16 INFO 2064:stopes.launcher - submitted job array for prep_wiki: ['2132']                                                                                                                           
2025-01-07 17:16 INFO 2064:stopes.launcher - Logs at: /root/large_concept_model/scripts/executor_logs/prep_wiki/2132_0_log.err                                                                                     
  0%|                                              

Should we open a new issue for this? I can't reopen this one.
Damian

@elbayadm elbayadm reopened this Jan 7, 2025
@elbayadm
Contributor

elbayadm commented Jan 7, 2025

@dtometzki I re-opened this issue. In #11, I only fixed the training errors caused by default_factory in the lcm package. We need to also fix this issue in stopes.
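
To check whether an installed copy of stopes already carries the fix, it helps to inspect the file on disk rather than import it, since importing `stopes.modules.preprocess` is what raises the error. The following is a rough sketch under that assumption; `installed_source` and `mentions` are hypothetical helper names, and the relative path is the one quoted earlier in this thread.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def installed_source(package: str, relative: str) -> Optional[Path]:
    """Locate a file inside an installed top-level package without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for root in spec.submodule_search_locations:
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None


def mentions(path: Path, needle: str) -> bool:
    """Return True if the text file at `path` contains `needle`."""
    return needle in path.read_text(encoding="utf-8")


if __name__ == "__main__":
    path = installed_source("stopes", "modules/preprocess/line_processor.py")
    if path is None:
        print("stopes was not found for this interpreter")
    else:
        print(path, "mentions default_factory:", mentions(path, "default_factory"))
```

A plain text search like this only hints at whether the patch is present; it does not prove that every affected dataclass was updated.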
