top-K of Streaming #174
Comments
Hope #177 answers this question?
Our code had already been developed and tested. At that time, given that the stated memory limit was 16GB, we only marked the deleted samples.
The intent of the track, as we shared in the competition announcement, is to develop algorithms that can update and compact the index instead of merely marking samples as deleted. We had indicated that the final runbook might be different in order to test this aspect. Would it be possible to change the configuration of your algorithm to make it work with limited memory? We could also have designed a runbook with 60M points, 16GB RAM, and a 2-hour limit; we decided to lower the time and number of points for faster experimentation.
In August, it was stated that 16GB of RAM would be available for the streaming track. We were only notified of the change to 8GB in October, which was too late, because we had already finished development by then.
This is not a problem of algorithm performance, but a problem of the rules being changed at short notice.
In early October, during online communication, there was no mention of modifying the memory requirements. |
In the email after the online communication meeting, there was no mention of modifying the memory requirements. |
@nk2014yj I acknowledge my mistake in not warning explicitly about the memory limit in prior communications. I apologize for that and will be careful with my communications in the future. However, please let me share our intent for this track. We want to encourage algorithms that can adapt to a long stream of updates, including deletions, as the word "streaming" indicates in the algorithms world. Marking an index with tombstones is a good starting point, but we want to think about algorithms that also compact and adapt the index over time to the active set of points. I think we have communicated this intent since the competition announcement. We also stated that the final dataset and runbook were going to be different from the initial choices. If the evaluation had used 16GB of memory with a runbook of 800-byte vectors + a 30M stream of updates, or 400-byte vectors + a 60M stream of updates (with many deletions), it would have been no different qualitatively from using 8GB of memory and 400-byte vectors + a 30M stream, right? Would you have considered a 60M update stream + a 16GB limit outside the scope of the rules? The intent of the final runbook was to test whether submissions were actually cleaning up the index or just marking tombstones. That said, I realize that you have spent time developing what you have. Please submit your algorithm if you are able to, and I will run it with 16GB and publish it on the leaderboard with a note that it requires more than 8GB. I am also happy to merge any lower-memory update you might have after the competition's deadline, outside the competition's scope, and update the leaderboard.
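For concreteness, here is a minimal sketch of the two strategies this comment contrasts: tombstone-only deletion, which never frees memory, versus an index that compacts itself once too many points are deleted. This is not the competition harness or any submission's actual code; the class names, the flat brute-force index, and the `max_tombstone_ratio` threshold are all hypothetical illustrations.

```python
# Minimal sketch (hypothetical, not the competition code): tombstone-style
# deletion vs. periodic compaction that actually reclaims memory.
import numpy as np


class TombstoneIndex:
    """Deletions only mark points; resident memory never shrinks."""

    def __init__(self, dim):
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.ids = []
        self.deleted = set()

    def insert(self, ids, vecs):
        self.vectors = np.vstack([self.vectors, vecs])
        self.ids.extend(ids)

    def delete(self, ids):
        self.deleted.update(ids)  # tombstone only; vectors stay resident

    def query(self, q, k):
        dists = np.linalg.norm(self.vectors - q, axis=1)
        order = np.argsort(dists)
        return [self.ids[i] for i in order if self.ids[i] not in self.deleted][:k]


class CompactingIndex(TombstoneIndex):
    """Same interface, but rebuilds once too many points are tombstoned."""

    def delete(self, ids, max_tombstone_ratio=0.2):
        super().delete(ids)
        if len(self.deleted) > max_tombstone_ratio * len(self.ids):
            keep = [i for i, pid in enumerate(self.ids) if pid not in self.deleted]
            self.vectors = self.vectors[keep]  # memory is reclaimed here
            self.ids = [self.ids[i] for i in keep]
            self.deleted.clear()
```

The key difference is that `CompactingIndex.delete` actually shrinks `self.vectors`, so resident memory tracks the active set of points rather than the total number of points ever inserted.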
I have two questions.
It passes top-1 to run_task of StreamingRunner; please help confirm this value, thanks (see the sketch below the questions).
big-ann-benchmarks/benchmark/runner.py, line 103 (commit f5bb90b)
Will the number of private queries remain consistent? It will affect the overall time consumption.
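Regarding the first question, one quick way to see what value actually reaches the algorithm is to log the count argument at the query step. The snippet below is only an illustration of that check, assuming a runbook loop roughly like the one referenced in benchmark/runner.py; the function and variable names here are guesses, not the repository's actual code.

```python
# Hypothetical excerpt of a streaming runbook loop, for illustration only.
# The real code is in benchmark/runner.py (line 103 at commit f5bb90b);
# the question is whether `count` arrives as 1 or as the intended top-k
# (e.g. 10) when it is forwarded to the algorithm's query call.
def run_streaming_steps(algo, runbook, queries, count):
    results = []
    for step in runbook:
        if step["operation"] == "insert":
            algo.insert(step["ids"], step["vectors"])
        elif step["operation"] == "delete":
            algo.delete(step["ids"])
        elif step["operation"] == "search":
            print(f"querying with count={count}")  # confirm this is not 1
            results.append(algo.query(queries, count))
    return results
```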