
ItemNet tests and docs #202

Open · wants to merge 9 commits into base: experimental/sasrec
Conversation

In48semenov (Collaborator)

Description

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Optimization

How Has This Been Tested?

Before submitting a PR, please check yourself against the following list. It will save us quite a lot of time.

  • Have you read the contribution guide?
  • Have you updated the relevant docstrings? We use NumPy format; please double-check.
  • Does your change require any new tests?
  • Have you updated the changelog file?


@pytest.mark.parametrize(
"n_items,n_factors",
(

Seems we can put it into a single row
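A minimal sketch of what the single-row form could look like. The parameter values and the test name are made up for illustration; only the `"n_items,n_factors"` argument comes from the snippet above.

```python
import pytest

# Hypothetical single-row layout of the parametrize call, as suggested;
# the value pairs (2, 10) and (4, 100) are illustrative, not from the PR.
@pytest.mark.parametrize("n_items,n_factors", ((2, 10), (4, 100)))
def test_embedding_shape(n_items: int, n_factors: int) -> None:
    shape = (n_items, n_factors)  # stand-in for the embedding-layer output shape
    assert shape == (n_items, n_factors)
```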

@@ -126,6 +169,11 @@ def from_dataset(cls, dataset: Dataset, n_factors: int, dropout_rate: float) ->
raise ValueError("`item_features` in `dataset` must be `SparseFeatures` instance.")

item_cat_features = item_features.get_cat_features()

if item_cat_features.values.size == 0:

I don't think it's a good idea. We will have this block enabled by default. If the dataset doesn't contain categorical features, these embeddings will not be created; when such features are present, they will be used.
This behaviour is the same as for all of our models that work with features (like ALS and LightFM).

actual_embedding_dim = cat_item_embeddings.category_embeddings.embedding_dim

expected_item_features = dataset_item_features.item_features
# TODO: remove after adding Dense Features support

Dense features do not support categories, so there will be no support for CatFeaturesItemNet when features are dense.
@feldlime please correct me if I'm wrong

expected_item_ids = cat_item_embeddings(items)
assert expected_item_ids.shape == (n_items, n_factors)

def test_raises_when_dataset_no_features(self) -> None:

This is not the expected behaviour (see the other comment). The CatFeaturesItemNet block should be ignored if:

  1. There are no features in the dataset
  2. Features are dense
  3. No categorical features are present in the sparse features

In these cases the block should simply be dropped from the item net.

These would be good cases for tests. The model shouldn't raise errors in any of them.
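The drop-instead-of-raise behaviour the reviewer describes can be sketched as follows. All names here (`build_cat_item_net` and its parameters) are hypothetical stand-ins, not the real rectools API.

```python
from typing import Optional

# Hypothetical factory illustrating the three cases from the comment:
# instead of raising, it returns None so the block can be omitted
# from the item net.
def build_cat_item_net(
    has_features: bool, is_sparse: bool, n_cat_features: int
) -> Optional[str]:
    if not has_features:
        return None  # 1. no features in the dataset
    if not is_sparse:
        return None  # 2. features are dense
    if n_cat_features == 0:
        return None  # 3. no categorical features in the sparse features
    return "CatFeaturesItemNet"  # stand-in for the constructed block

# None of the three cases raises; the block is simply skipped.
for args in ((False, True, 3), (True, False, 3), (True, True, 0)):
    assert build_cat_item_net(*args) is None
```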

@@ -45,7 +45,7 @@ def get_all_embeddings(self) -> torch.Tensor:

@property
def device(self) -> torch.device:

I think we should drop all device properties, stop doing to(device), and do to(&lt;tensor&gt;) instead. Otherwise we can hit problems with Lightning multi-device training.
Let's follow the Lightning advice for device-agnostic code: https://pytorch-lightning.readthedocs.io/en/1.0.8/multi_gpu.html

We can do it in this PR or in the next one.
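A minimal sketch of the device-agnostic pattern being suggested (not the PR's code): rather than storing a `device` property and calling `.to(self.device)`, move new tensors to wherever an existing tensor already lives.

```python
import torch

# `weights` stands in for e.g. an embedding weight matrix that Lightning
# may have placed on any device.
weights = torch.randn(4, 8)
item_ids = torch.arange(4)              # created on the default device
item_ids = item_ids.to(weights.device)  # follows `weights`; no device property needed

# Tensor.to(other_tensor) also matches dtype, which suits float tensors:
bias = torch.zeros(8).to(weights)
```

For integer index tensors, `.to(weights.device)` is preferable to `.to(weights)`, since the latter would also cast them to the float dtype of `weights`.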
