
[BUG] Can't reuse a loaded model saved after training #1224

Closed
PaulSteffen-betclic opened this issue Nov 6, 2023 · 5 comments
Labels: bug (Something isn't working) · status/needs-triage

Comments

@PaulSteffen-betclic

Bug description

A model saved after training with the .fit() method can't be used once it is loaded back.

Steps/Code to reproduce bug

import pandas as pd
import tensorflow as tf

import nvtabular as nvt
import merlin.models.tf as mm
from merlin.models.tf import Loader  # Loader import was missing in the original report
from merlin.schema.tags import Tags
from merlin.models.utils.dataset import unique_by_tag

interactions_df = pd.DataFrame({
    'CustomerIdCat': [42, 76],
    'ItemIdCat': [1, 2],
    'ItemFeature1': [3, 3],
    'ItemFeature2': [72, 15]
})

items_df = pd.DataFrame({
    'ItemIdCat': [2, 2],
    'ItemFeature1': [3, 3],
    'ItemFeature2': [15, 15]
})

train = nvt.Dataset(interactions_df)
schema = train.schema  # schema was undefined in the original snippet; in the real pipeline it carries USER/ITEM tags
item_candidates = nvt.Dataset(items_df, schema=schema.select_by_tag(Tags.ITEM))

train_retrieval_loader = Loader(train, schema=train.schema, batch_size=1024)

tower_dim = 8

# create user schema using USER tag
user_schema = schema.select_by_tag(Tags.USER_ID)
# create user (query) tower input block
user_inputs = mm.InputBlockV2(user_schema)
# create user (query) encoder block
query_tower = mm.Encoder(user_inputs, mm.MLPBlock([16, tower_dim], no_activation_last_layer=True))

# create item schema using ITEM tag
item_schema = schema.select_by_tag(Tags.ITEM)
# create item (candidate) tower input block
item_inputs = mm.InputBlockV2(item_schema)
# create item (candidate) encoder block
candidate_tower = mm.Encoder(item_inputs, mm.MLPBlock([16, tower_dim], no_activation_last_layer=True))

retrieval_model = mm.TwoTowerModelV2(query_tower, candidate_tower)

with tf.device('/cpu:0'):
    retrieval_model.compile(optimizer="adam", run_eagerly=True, metrics=[mm.RecallAt(10), mm.NDCGAt(10)])
    retrieval_model.fit(train_retrieval_loader, epochs=1, batch_size=1024)

retrieval_model.save("dir_models/two_tower")

loaded_model = tf.keras.models.load_model("dir_models/two_tower")

candidate_features = unique_by_tag(item_candidates, Tags.ITEM, Tags.ITEM_ID)

topk_model = loaded_model.to_top_k_encoder(candidate_features, k=10, batch_size=128)

First, a warning is logged when saving the retrieval model (screenshot omitted).

Then, an error occurs when using the loaded model (screenshot omitted).

Expected behavior

Use the loaded model to create a top k encoder.

Environment details

  • Merlin version: 23.8.0 & 23.8.0+5.g16d289a77
  • Platform: macOS
  • Python version: 3.10.12
  • Tensorflow version (GPU?): 2.12.0 (yes)
@rnyak (Contributor) commented Nov 6, 2023

@PaulSteffen-betclic please pull the latest branches; there was a recent fix. If you are using our merlin-tensorflow:23.08 image, you need to do

cd /models
git pull origin main
pip install .

@PaulSteffen-betclic (Author) replied:

I'm currently using version 23.8.0+5.g16d289a77, which includes this recent fix.

@PaulSteffen-betclic (Author) replied:

I also tried the latest branches, using the merlin-tensorflow:nightly image (23.08 does not work on macOS), and I still get the same error.

@rnyak (Contributor) commented Nov 7, 2023

@PaulSteffen-betclic after you load the model, can you please do this step before you convert it to the topk_encoder model?

loaded_model = tf.keras.models.load_model(path)
# this is necessary when re-loading the model, before building the top-k encoder
_ = loaded_model(mm.sample_batch(dataset, batch_size=128, include_targets=False))
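Putting the workaround together with the original repro, the full reload path would look roughly like this. This is only a sketch assembled from the snippets in this thread: `train` and `item_candidates` are the datasets from the repro above, and the warm-up call on a sample batch is what builds the loaded Keras model before `to_top_k_encoder` is invoked.

```python
import tensorflow as tf
import merlin.models.tf as mm
from merlin.schema.tags import Tags
from merlin.models.utils.dataset import unique_by_tag

# Reload the two-tower model saved after .fit()
loaded_model = tf.keras.models.load_model("dir_models/two_tower")

# Warm-up call: build the loaded model on one batch of features
# (targets excluded) before deriving the top-k encoder from it
_ = loaded_model(mm.sample_batch(train, batch_size=128, include_targets=False))

# Deduplicate candidate items, then build the top-10 retrieval encoder
candidate_features = unique_by_tag(item_candidates, Tags.ITEM, Tags.ITEM_ID)
topk_model = loaded_model.to_top_k_encoder(candidate_features, k=10, batch_size=128)
```

The warm-up call matters because a model restored with `tf.keras.models.load_model` has not yet been traced on concrete inputs; calling it once on a sample batch builds its layers so the top-k conversion can proceed.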

@PaulSteffen-betclic (Author) replied:

That fixes the issue! Thanks!
