Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NNCF helpers #2979

Merged
merged 18 commits into from
Oct 1, 2024
Merged

NNCF helpers #2979

merged 18 commits into from
Oct 1, 2024

Conversation

KodiaqQ
Copy link
Collaborator

@KodiaqQ KodiaqQ commented Sep 20, 2024

Changes

  • Added helpers module to NNCF.
  • Added dataset helper.

Reason for changes

  • Extended data-free usage.

Related tickets

  • 152550

Tests

  • Added tests/common/test_helpers.py

@KodiaqQ KodiaqQ requested a review from a team as a code owner September 20, 2024 10:29
@AlexKoff88
Copy link
Contributor

I am ok with the code but two comments from my side:

  • Why is it not a part of nncf.data namespace?
  • Why do we need requirements file inside, especially with "-c constraints.txt"?

@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Sep 20, 2024

  • Why is it not a part of nncf.data namespace?

Since NNCF does not have helpers namespace, it was introduced. Should I introduce nncf.data.helpers module for a corresponded namespace?

  • Why do we need requirements file inside, especially with "-c constraints.txt"?

The provided code utilizes torch and transformers modules. They are not part of the main NNCF requirements, so I provided an external requirements.txt list for the helpers module.

@AlexKoff88
Copy link
Contributor

  • Why is it not a part of nncf.data namespace?

Since NNCF does not have helpers namespace, it was introduced. Should I introduce nncf.data.helpers module for a corresponded namespace?

I missed that part during the last review but I would like to name the function generate_text_data(model: PreTrainedModel, tokenizer: PreTrainedTokenizer, max_length: int = 32, num_samples: int = 128) -> List[str]
It is up to you how to arrange it internally. I would like to see the import like this: from nncf.data import generate_text_data.

  • Why do we need requirements file inside, especially with "-c constraints.txt"?

The provided code utilizes torch and transformers modules. They are not part of the main NNCF requirements, so I provided an external requirements.txt list for the helpers module.

Ok, thanks.

@KodiaqQ KodiaqQ marked this pull request as draft September 23, 2024 10:39
@github-actions github-actions bot added the NNCF Common Pull request that updates NNCF Common label Sep 23, 2024
@KodiaqQ KodiaqQ marked this pull request as ready for review September 23, 2024 14:58
@AlexanderDokuchaev
Copy link
Collaborator

@KodiaqQ @AlexKoff88 @alexsu52
Is planning to add other helper function to nncf.helpers?
May be will be better to locate the function in nncf.data or nncf/quantization/algorithms ?

@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Sep 24, 2024

function in nncf.data or nncf/quantization/algorithms ?

Please suggest the appropriate place for this function considering that this method requires transformers & torch Python packages and shouldn't be mixed with the nncf.Dataset.

nncf/helpers/dataset.py Outdated Show resolved Hide resolved
nncf/helpers/requirements.txt Outdated Show resolved Hide resolved
tests/common/test_dataset_generators.py Outdated Show resolved Hide resolved
tests/common/requirements.txt Outdated Show resolved Hide resolved
nncf/data/generators.py Show resolved Hide resolved
nncf/data/generators.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added NNCF PT Pull requests that updates NNCF PyTorch and removed NNCF Common Pull request that updates NNCF Common labels Sep 25, 2024
Copy link
Contributor

@alexsu52 alexsu52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to add an example of usage generate_text_data in https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression. It could be done in the follow-up PR, but you need to think about convenient use of the function in this PR.

nncf/data/generators.py Show resolved Hide resolved
nncf/data/generators.py Show resolved Hide resolved
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Sep 26, 2024
@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Sep 26, 2024

I would suggest to add an example of usage generate_text_data in https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression. It could be done in the follow-up PR, but you need to think about convenient use of the function in this PR.

An example of the non-API method usage was added.

@alexsu52
Copy link
Contributor

I would suggest to add an example of usage generate_text_data in https://github.com/openvinotoolkit/nncf/tree/develop/examples/llm_compression. It could be done in the follow-up PR, but you need to think about convenient use of the function in this PR.

An example of the non-API method usage was added.

Great example! Please add test for the example.

@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Sep 26, 2024

@AlexKoff88, re-review an example, please.

@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Sep 26, 2024

Great example! Please add test for the example.

Done.

Copy link
Contributor

@alexsu52 alexsu52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KodiaqQ
Copy link
Collaborator Author

KodiaqQ commented Oct 1, 2024

Since there are some issues with the test_examples, I propose to disable this example in tests until the issue is investigated and merge this functionality.
@alexsu52, @MaximProshin, what do you think?

@MaximProshin
Copy link
Collaborator

@KodiaqQ

Since there are some issues with the test_examples, I propose to disable this example in tests until the issue is investigated and merge this functionality. @alexsu52, @MaximProshin, what do you think?

I agree

@alexsu52
Copy link
Contributor

alexsu52 commented Oct 1, 2024

Since there are some issues with the test_examples, I propose to disable this example in tests until the issue is investigated and merge this functionality. @alexsu52, @MaximProshin, what do you think?

Good idea!

@KodiaqQ KodiaqQ merged commit ce1fb51 into openvinotoolkit:develop Oct 1, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation NNCF PT Pull requests that updates NNCF PyTorch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants