Problems in the training process #2

Open
Pang-b0 opened this issue Jul 19, 2023 · 9 comments

Comments

@Pang-b0

Pang-b0 commented Jul 19, 2023

Hello author, thank you for such a great article. However, I found some problems when reproducing the results. The extract_features function seems to influence the result at the step where the code extracts the cluster centers. I tried obtaining the centers by clustering in advance and training directly with them as indices, and the outcome differs from calling the function inside the program to extract features and then clustering before training, even though the cluster centers are exactly the same.
[screenshot]

@ludles
Member

ludles commented Jul 19, 2023

The cluster centres are used only to select seed samples. As in

km = KMeans(n_clusters=round(basic_frac * len(trainset)), random_state=args.seed).fit(train_features.cpu())
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features.cpu(), metric='cosine')
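
A runnable, self-contained version of this step looks roughly like the sketch below (the random matrix stands in for the extracted DINO features, and basic_frac = 0.01 is only an illustrative value, not the repository's default):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

# Stand-in for the L2-normalised features returned by extract_features.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(2000, 384))
train_features /= np.linalg.norm(train_features, axis=1, keepdims=True)

# Cluster, then take the sample nearest to each centre as a seed.
basic_frac = 0.01
km = KMeans(n_clusters=round(basic_frac * len(train_features)), random_state=0).fit(train_features)
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features, metric='cosine')
print(sorted(indices.tolist()))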

I am sorry that I do not get what you mean by "clustering in advance and train it directly as indicators". Could you elaborate?

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

Thank you for your reply. What I mean is: if I take this code out and run it on its own, the cluster centers are indices = [1414, ..., 500]. Since we now have the cluster centers, to simplify subsequent runs I assign these recorded values to indices directly (that is, I drop the extract_features call), and this ends up affecting the accuracy.
For example:
We run the original code.
[screenshot]
I get the indices in the output.
[screenshot]
Since neither the DINO model nor the dataset has changed, the next run will definitely yield these same 19 cluster centers (I have verified this experimentally).
Therefore, to simplify the next execution, I skip the extract_features step and change the code to the following example.
[screenshot]
Re-running then shows that the accuracy of the results is affected, in both "seed" and "boost" modes.
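
A reproducible way to express this shortcut would be to cache the indices once and reload them, roughly as in the sketch below (the file name and the stand-in compute function are made up for illustration; the real compute function would be the extract_features + KMeans block):

import os
import numpy as np

CACHE = "seed_indices.npy"

def get_seed_indices(compute_fn, cache_path=CACHE):
    """Return cached seed indices if present, otherwise compute and cache them."""
    if os.path.exists(cache_path):
        return np.load(cache_path)
    indices = np.asarray(compute_fn())
    np.save(cache_path, indices)
    return indices

# Stand-in usage: in the real code, compute_fn would run extract_features + KMeans.
indices = get_seed_indices(lambda: [1414, 500])
print(indices)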

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

Thank you for your reply. What I mean is: if I take this code out and run it on its own, the cluster centers are indices = [1414, ..., 500]. Since we now have the cluster centers, to simplify subsequent runs I assign these recorded values to indices directly (that is, I drop the extract_features call), and this ends up affecting the accuracy.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

The cluster centres are used only to select seed samples. As in

train_features = nn.functional.normalize(train_features, dim=1, p=2)
km = KMeans(n_clusters=round(basic_frac * len(trainset)), random_state=args.seed).fit(train_features.cpu())
indices, _ = pairwise_distances_argmin_min(km.cluster_centers_, train_features.cpu(), metric='cosine')

What I mean is: I created a new Python file that runs the extract_features part of your code, and I have confirmed that the indices it outputs are identical to those produced by the source code. Then, after commenting out the extract_features part in the source code, I assign the values of indices directly (the cluster centers extracted in advance in the new Python file). This affects the accuracy; the difference can be as large as 2%.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

The cluster centres are used only to select seed samples. As in

if utils.get_rank() == 0:
train_features = nn.functional.normalize(train_features, dim=1, p=2)

I think the extract_features function at line 141 of the code may have affected the parameters in the model, but I saw that you set model.eval(), so I am not sure why it influences the accuracy.
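
One way to check this is to compare the model weights and the global random-number state before and after an extraction pass, roughly as in the self-contained sketch below (tiny_model and dummy_extract are stand-ins for the DINO backbone and extract_features, not the repository's code):

import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
tiny_model = nn.Linear(8, 4)

def dummy_extract(model, n=16):
    """Stand-in extraction pass: eval mode, no gradients, but it still draws random numbers."""
    model.eval()
    with torch.no_grad():
        x = torch.randn(n, 8)  # consumes the global RNG, as shuffling or augmentation might
        return model(x)

params_before = copy.deepcopy(tiny_model.state_dict())
rng_before = torch.get_rng_state()

_ = dummy_extract(tiny_model)

params_unchanged = all(torch.equal(params_before[k], v) for k, v in tiny_model.state_dict().items())
rng_unchanged = torch.equal(rng_before, torch.get_rng_state())
print(f"weights unchanged: {params_unchanged}, RNG state unchanged: {rng_unchanged}")

If the weights are unchanged but the RNG state is not, the extraction pass still advances the global random state (for example through data loading or random transforms), which would shift every later random operation and could explain an accuracy gap even when indices is identical.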

@ludles
Member

ludles commented Jul 20, 2023

I have to admit that I have not tried this approach myself. To locate the problem, I would suggest trying torch.allclose() on train_features, km.cluster_centers_, etc., to narrow down where randomness might be introduced.

Randomness is a known issue for this problem. Different random seeds can certainly lead to fluctuations in performance. It actually makes more sense to run it several times and compare the mean and variance of the performance.
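
A sketch of that check (the file name is just an example): save the tensors from the original pipeline once, then compare them against the shortcut run:

import torch

# In the original run, something like:
# torch.save({"features": train_features.cpu(),
#             "centers": torch.as_tensor(km.cluster_centers_)}, "reference.pt")

def compare_with_reference(reference_path, features, centers, atol=1e-6):
    """Compare recomputed features/centres against a saved reference with torch.allclose."""
    ref = torch.load(reference_path)
    same_feat = torch.allclose(ref["features"].float(), features.cpu().float(), atol=atol)
    same_cent = torch.allclose(ref["centers"].float(), torch.as_tensor(centers).float(), atol=atol)
    print(f"features match: {same_feat}, cluster centres match: {same_cent}")
    return same_feat and same_cent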

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I have tested it many times and it still selects the same seed samples; that is, indices has not changed.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I have to admit that I have not tried this approach myself. To locate the problem, I would suggest trying torch.allclose() on train_features, km.cluster_centers_, etc., to narrow down where randomness might be introduced.

Randomness is a known issue for this problem. Different random seeds can certainly lead to fluctuations in performance. It actually makes more sense to run it several times and compare the mean and variance of the performance.

Do you mean that train_features affects the subsequent training? My understanding is that if indices remains unchanged, there should be no such influence.

@Pang-b0
Author

Pang-b0 commented Jul 20, 2023

I don't know whether I am doing something wrong. I found that once my DINO model is fixed, runs with consistent parameters give exactly the same result.
For example, I executed the code below twice without modifying the source code, and the output accuracy metrics were exactly the same.
[screenshot]
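
For reference, these are the usual PyTorch determinism settings (generic advice, not taken from this repository) that make repeated runs with the same seed line up exactly:

import random
import numpy as np
import torch

def seed_everything(seed: int = 0):
    """Seed Python, NumPy and PyTorch, and make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(0)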
