[Triplet Margin Loss] Issue 1118 #1120
@vroulet May I know if there's anything that needs to be changed?
Thank you @cvnad1 for doing this! Sorry for the delay. Here are some comments.
optax/losses/_self_supervised.py
Outdated
anchor: The anchor embeddings. Shape: [batch_size, feature_dim].
positive: The positive embeddings. Shape: [batch_size, feature_dim].
negative: The negative embeddings. Shape: [batch_size, feature_dim].
margin: The margin value. Default: 1.0.
No need to put the default values since they are given in the signature.
optax/losses/_self_supervised.py
Outdated
by V. Balntas et al. Default: False.
reduction: Specifies the reduction to apply to the output:
'none' | 'mean' | 'sum'. Default: 'mean'.
Add reference
optax/losses/_self_supervised.py
Outdated
margin: The margin value. Default: 1.0.
p: The norm degree for pairwise distance. Default: 2.
eps: Small epsilon value to avoid numerical issues. Default: 1e-6.
swap: Use the distance swap optimization from "Learning shallow
Use rst formatting for references (see e.g. the docstring of Adam)
optax/losses/_self_supervised.py
Outdated
swap: bool = False,
reduction: str = 'mean',
) -> chex.Array:
"""Triplet margin loss function.
Add an example (doctest)
@@ -53,5 +53,41 @@ def test_batched(self):
)


class TripletMarginLossTest(chex.TestCase):

  def setUp(self):
Avoid using numerical values as expected returns.
They may fail depending on the backend for example.
You may consider simple test cases with a "handmade" function (see e.g. the lbfgs tests). You can check for specific inputs (like zeros or ones).
You may also add a test for some specific behaviors (like using swap here).
Also, you should test this function under jit, vmap, etc. (see the chex.all_variants utility in some other tests).
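A rough sketch of what such a test could look like, assuming a Euclidean-only stand-in for the loss. The names `triplet_loss`, `handmade_triplet_loss` and `check_triplet_loss` are hypothetical and not part of the PR, and plain `jax.jit` / `jax.vmap` are used here in place of the chex variants utility to keep the snippet self-contained:

```python
import jax
import jax.numpy as jnp
import numpy as np


def triplet_loss(anchors, positives, negatives, margin=1.0, eps=1e-6):
  # Hypothetical stand-in for the function under test (Euclidean case only).
  pos = jnp.sqrt(jnp.sum((anchors - positives) ** 2, axis=-1) + eps)
  neg = jnp.sqrt(jnp.sum((anchors - negatives) ** 2, axis=-1) + eps)
  return jnp.maximum(pos - neg + margin, 0.0)


def handmade_triplet_loss(anchors, positives, negatives, margin=1.0, eps=1e-6):
  # Independent NumPy reference, written as an explicit loop over the batch.
  losses = []
  for a, p, n in zip(anchors, positives, negatives):
    pos = np.sqrt(np.sum((a - p) ** 2) + eps)
    neg = np.sqrt(np.sum((a - n) ** 2) + eps)
    losses.append(max(pos - neg + margin, 0.0))
  return np.array(losses)


def check_triplet_loss():
  key_a, key_p, key_n = jax.random.split(jax.random.PRNGKey(0), 3)
  anchors = jax.random.normal(key_a, (4, 8))
  positives = jax.random.normal(key_p, (4, 8))
  negatives = jax.random.normal(key_n, (4, 8))
  expected = handmade_triplet_loss(
      np.asarray(anchors), np.asarray(positives), np.asarray(negatives))

  # The result should agree eagerly, under jit, and under vmap over the batch.
  eager = triplet_loss(anchors, positives, negatives)
  jitted = jax.jit(triplet_loss)(anchors, positives, negatives)
  vmapped = jax.vmap(triplet_loss)(anchors, positives, negatives)
  for got in (eager, jitted, vmapped):
    np.testing.assert_allclose(got, expected, rtol=1e-5, atol=1e-5)

  # Specific input: identical embeddings give equal distances, so the loss
  # reduces to the margin.
  zeros = jnp.zeros((2, 3))
  np.testing.assert_allclose(triplet_loss(zeros, zeros, zeros), 1.0, atol=1e-6)


check_triplet_loss()
```

Comparing against a reference computed on the same inputs, rather than against hard-coded numbers, keeps the test meaningful across backends as long as the tolerances are loose enough.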
@vroulet we have worked on your suggestion and all the tests are passing. I think the code is ready to be merged.
@vroulet
@vroulet we tried multiple things to solve the error in the pipeline; the tests for triplet_loss are passing locally. The errors we are getting here do not seem to come from the function we implemented. Can you guide us on this?
Hello @cvnad1 , @Saanidhyavats ,
@vroulet We have modified the code based on your review. Could you please verify if everything's correct?
Last round of comments, I'll take care of further formattings on my end.
Could you also squash your commits if possible?
anchors: chex.Array,
positives: chex.Array,
negatives: chex.Array,
axis: chex.Numeric = -1,
axis: int = -1
positives: chex.Array,
negatives: chex.Array,
axis: chex.Numeric = -1,
p: chex.Numeric = 2,
norm_degree rather than p.
Moreover, we have
||x||_p = (sum_i |x_i|^p)**(1/p)
not
||x||_p = sqrt(sum_i x_i^p)
You may want to include the case ||x||_inf in a separate PR?
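For illustration, a minimal sketch of the per-example loss with the p-norm as written in the first formula above. The function name `triplet_margin_loss_sketch` and the placement of `eps` inside the root are assumptions made for this example, not the code that was merged:

```python
import jax.numpy as jnp


def triplet_margin_loss_sketch(
    anchors, positives, negatives, axis=-1, norm_degree=2, margin=1.0, eps=1e-6
):
  """Per-example triplet margin loss with a p-norm distance (illustrative)."""

  def p_distance(x, y):
    # ||x - y||_p = (sum_i |x_i - y_i|^p)^(1/p); eps keeps the root away
    # from zero (an assumption about where eps is applied).
    powered = jnp.abs(x - y) ** norm_degree
    return (powered.sum(axis) + eps) ** (1.0 / norm_degree)

  positive_distance = p_distance(anchors, positives)
  negative_distance = p_distance(anchors, negatives)
  return jnp.maximum(positive_distance - negative_distance + margin, 0.0)
```

With `norm_degree=2` this matches the square-root form; for any other degree the exponent `1 / norm_degree` and the absolute value are what distinguish it from `sqrt(sum_i x_i^p)`. The `||x||_inf` case is not handled here.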
>>> Array([0.14142442, 0.14142442], dtype=float32)

Args:
anchors: An array of anchor embeddings, with shape [batch, feature_dim].
Add indents appropriately like that:
anchors: An array of anchor embeddings, with shape [batch, feature_dim].
positives: An array of positive embeddings
  (similar to anchors), with shape [batch, feature_dim].
negatives: An array of negative embeddings
  (dissimilar to anchors), with shape [batch, feature_dim].
axis: The axis along which to compute the distances
  (default is -1).
p: The norm degree for distance calculation
  (default is 2 for Euclidean distance).
margin: The minimum margin by which the positive distance
  should be smaller than the negative distance.
eps: A small epsilon value to ensure numerical stability
  in the distance calculation.
reduction: Specifies the reduction to apply to the
  output: 'none' | 'mean' | 'sum'.
If reduction is 'mean' or 'sum', returns a scalar.

References:
Learning shallow convolutional feature descriptors with triplet losses
Use the following formatting for references:
References:
  V. Balntas et al, `Learning shallow convolutional feature descriptors with triplet losses
  <https://bmva-archive.org.uk/bmvc/2016/papers/paper119/abstract119.pdf>`_, 2016
by V. Balntas, E. Riba et al.
<https://bmva-archive.org.uk/bmvc/2016/papers/paper119/abstract119.pdf>
"""
chex.assert_type([anchors], float)
Remove the three chex.assert_type(...).
p: chex.Numeric = 2,
margin: chex.Numeric = 1.0,
eps: chex.Numeric = 1e-6,
reduction: str = 'none',
Remove the reduction option. No loss in optax reduces its output after computation,
so it is better for this loss to follow the same principle.
The user can take care of the reduction easily after computing the losses.
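As a small illustration of that last point, here is a sketch in which the loss returns one value per example and the caller applies the reduction. `per_example_hinge` and the distance values are made up for this example:

```python
import jax.numpy as jnp


def per_example_hinge(positive_distance, negative_distance, margin=1.0):
  # Hypothetical per-example loss: no reduction happens inside the function.
  return jnp.maximum(positive_distance - negative_distance + margin, 0.0)


positive_distance = jnp.array([0.5, 2.0, 1.0])
negative_distance = jnp.array([2.0, 1.0, 1.0])
losses = per_example_hinge(positive_distance, negative_distance)  # shape (3,)

# The caller picks the reduction after the fact.
mean_loss = losses.mean()
sum_loss = losses.sum()
```

Keeping the reduction outside the loss also leaves room for weighted or masked averages without adding more options to the function signature.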
negative_distance = jnp.sqrt(jnp.power(anchors - negatives, p).sum(axis) + eps
)
loss = jnp.maximum(positive_distance - negative_distance + margin, 0)
if reduction == 'mean':
As said above, remove the reduction options.
@vroulet Hi Vincent, I added code and tests for the Triplet Margin Loss function (#1118). Kindly review the code and comment if any changes are needed.