Problems in the training process #2

Open
Pang-b0 opened this issue Jul 19, 2023 · 9 comments

Comments

@Pang-b0

Pang-b0 commented Jul 19, 2023

Hello author, thank you for such a great article. However, I found some problems when reproducing the results. The extract_features function seems to influence the result at the step where the code extracts the cluster centers. I tried obtaining the centers by clustering in advance and training directly with them as indices, and the outcome differs from calling the function inside the program to extract features and then clustering before training, even though the cluster centers are exactly the same.
[screenshot]

@ludles
Member

ludles commented Jul 19, 2023

The cluster centres are used only to select seed samples. As in

km = KMeans(n_clusters=round(basic_frac * len(trainset)), random_state=args.seed).fit(train_features.cpu())
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features.cpu(), metric='cosine')
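
A runnable, self-contained version of this step looks roughly like the sketch below (the random matrix stands in for the extracted DINO features, and basic_frac = 0.01 is only an illustrative value, not the repository's default):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

# Stand-in for the L2-normalised features returned by extract_features.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(2000, 384))
train_features /= np.linalg.norm(train_features, axis=1, keepdims=True)

# Cluster, then take the sample nearest to each centre as a seed.
basic_frac = 0.01
km = KMeans(n_clusters=round(basic_frac * len(train_features)), random_state=0).fit(train_features)
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features, metric='cosine')
print(sorted(indices.tolist()))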

I am sorry that I do not get what you mean by "clustering in advance and train it directly as indicators". Could you elaborate?

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

Thank you for your reply. What I mean is: if I take this code out and run it on its own, the cluster centers are indices = [1414, ..., 500]. Since we now have the cluster centers, to simplify subsequent runs I assign these recorded values to indices directly (that is, I drop the extract_features call), and this ends up affecting the accuracy.
For example:
We run the original code.
[screenshot]
I get the indices in the output.
[screenshot]
Since neither the DINO model nor the dataset has changed, the next run will definitely yield these same 19 cluster centers (I have verified this experimentally).
Therefore, to simplify the next execution, I skip the extract_features step and change the code to the following example.
[screenshot]
Re-running then shows that the accuracy of the results is affected, in both "seed" and "boost" modes.
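
A reproducible way to express this shortcut would be to cache the indices once and reload them, roughly as in the sketch below (the file name and the stand-in compute function are made up for illustration; the real compute function would be the extract_features + KMeans block):

import os
import numpy as np

CACHE = "seed_indices.npy"

def get_seed_indices(compute_fn, cache_path=CACHE):
    """Return cached seed indices if present, otherwise compute and cache them."""
    if os.path.exists(cache_path):
        return np.load(cache_path)
    indices = np.asarray(compute_fn())
    np.save(cache_path, indices)
    return indices

# Stand-in usage: in the real code, compute_fn would run extract_features + KMeans.
indices = get_seed_indices(lambda: [1414, 500])
print(indices)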

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

Thank you for your reply. What I mean is: if I take this code out and run it on its own, the cluster centers are indices = [1414, ..., 500]. Since we now have the cluster centers, to simplify subsequent runs I assign these recorded values to indices directly (that is, I drop the extract_features call), and this ends up affecting the accuracy.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

The cluster centres are used only to select seed samples. As in

train_features = nn.functional.normalize(train_features, dim=1, p=2)
km = KMeans(n_clusters=round(basic_frac * len(trainset)), random_state=args.seed).fit(train_features.cpu())
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features.cpu(), metric='cosine')

What I mean is: I created a new Python file that runs the extract_features part of your code, and I have confirmed that the indices it outputs are identical to those produced by the source code. Then, after commenting out the extract_features part in the source code, I assign the values of indices directly (the cluster centers extracted in advance in the new Python file). This affects the accuracy; the difference can be as large as 2%.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

The cluster centres are used only to select seed samples. As in

if utils.get_rank() == 0:
train_features = nn.functional.normalize(train_features, dim=1, p=2)

I think the extract_features function at line 141 of the code may have affected the parameters in the model, but I saw that you set model.eval(), so I am not sure why it influences the accuracy.
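
One way to check this is to compare the model weights and the global random-number state before and after an extraction pass, roughly as in the self-contained sketch below (tiny_model and dummy_extract are stand-ins for the DINO backbone and extract_features, not the repository's code):

import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
tiny_model = nn.Linear(8, 4)

def dummy_extract(model, n=16):
    """Stand-in extraction pass: eval mode, no gradients, but it still draws random numbers."""
    model.eval()
    with torch.no_grad():
        x = torch.randn(n, 8)  # consumes the global RNG, as shuffling or augmentation might
        return model(x)

params_before = copy.deepcopy(tiny_model.state_dict())
rng_before = torch.get_rng_state()

_ = dummy_extract(tiny_model)

params_unchanged = all(torch.equal(params_before[k], v) for k, v in tiny_model.state_dict().items())
rng_unchanged = torch.equal(rng_before, torch.get_rng_state())
print(f"weights unchanged: {params_unchanged}, RNG state unchanged: {rng_unchanged}")

If the weights are unchanged but the RNG state is not, the extraction pass still advances the global random state (for example through data loading or random transforms), which would shift every later random operation and could explain an accuracy gap even when indices is identical.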

@ludles
Member

ludles commented Jul 20, 2023

I have to admit that I have not tried this approach myself. To locate the problem, I would suggest trying torch.allclose() on train_features, km.cluster_centers_, etc., to narrow down where randomness might be introduced.

Randomness is a known issue for this problem. Different random seeds can certainly lead to fluctuations in performance. It actually makes more sense to run it several times and compare the mean and variance of the performance.
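
A sketch of that check (the file name is just an example): save the tensors from the original pipeline once, then compare them against the shortcut run:

import torch

# In the original run, something like:
# torch.save({"features": train_features.cpu(),
#             "centers": torch.as_tensor(km.cluster_centers_)}, "reference.pt")

def compare_with_reference(reference_path, features, centers, atol=1e-6):
    """Compare recomputed features/centres against a saved reference with torch.allclose."""
    ref = torch.load(reference_path)
    same_feat = torch.allclose(ref["features"].float(), features.cpu().float(), atol=atol)
    same_cent = torch.allclose(ref["centers"].float(), torch.as_tensor(centers).float(), atol=atol)
    print(f"features match: {same_feat}, cluster centres match: {same_cent}")
    return same_feat and same_cent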

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I have tested it many times and it still selects the same seed samples; that is, indices has not changed.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I have to admit that I have not tried this approach myself. To locate the problem, I would suggest trying torch.allclose() on train_features, km.cluster_centers_, etc., to narrow down where randomness might be introduced.

Randomness is a known issue for this problem. Different random seeds can certainly lead to fluctuations in performance. It actually makes more sense to run it several times and compare the mean and variance of the performance.

Do you mean that train_features affects the subsequent training? My understanding is that if indices remains unchanged, there should be no such influence.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I don't know whether I am doing something wrong. I found that once my DINO model is fixed, runs with consistent parameters give exactly the same result.
For example, I executed the code below twice without modifying the source code, and the output accuracy metrics were exactly the same.
[screenshot]
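
For reference, these are the usual PyTorch determinism settings (generic advice, not taken from this repository) that make repeated runs with the same seed line up exactly:

import random
import numpy as np
import torch

def seed_everything(seed: int = 0):
    """Seed Python, NumPy and PyTorch, and make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(0)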
