In this case, we first check whether the exact blob exists. If it does not, we fall back to checking for a partial match by testing every blob in the bucket with startswith. This introduces two possible issues:
In a bucket with a large number of blobs (in our case, a bucket with hundreds of thousands of blobs), this check can take unreasonably long.
If there is a prefix match, exists will return True, even though the match might not be the blob we are referring to.
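To make both issues concrete, here is a minimal sketch that uses a plain Python list as a stand-in for the bucket listing. `exists_with_prefix_fallback` and `BLOBS` are hypothetical names that only mimic the behavior described above; this is not the library's actual code.

```python
# Hypothetical stand-in for a bucket listing; not the real client API.
BLOBS = ["data/report.csv", "data/report_final.csv", "logs/app.log"]


def exists_with_prefix_fallback(blob_name, blobs):
    """Mimic the described behavior: exact check, then a startswith scan."""
    if blob_name in blobs:
        return True
    # Fallback: linear scan over every blob name in the bucket.
    # With hundreds of thousands of blobs, this is the slow path.
    return any(name.startswith(blob_name) for name in blobs)


# "data/report" is not a blob itself, but it is a prefix of two blobs,
# so the fallback reports it as existing.
false_positive = exists_with_prefix_fallback("data/report", BLOBS)
exact_hit = exists_with_prefix_fallback("logs/app.log", BLOBS)
miss = exists_with_prefix_fallback("missing.txt", BLOBS)
```

Note that a name with no match at all (`missing.txt`) is the worst case for the scan, since every blob name has to be compared before the function can return False.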
Possible solutions
Avoid looking for a blob prefix
Add a flag to exists, something like exact_match
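The second suggestion could look roughly like the sketch below, again written against a plain list rather than the real client. Only the `exact_match` name comes from the suggestion above; the function signature, the default of `False` (chosen here so that current behavior is unchanged unless the caller opts in), and the list-based lookup are assumptions for illustration.

```python
def exists(blob_name, blobs, exact_match=False):
    """Return True if blob_name exists; optionally skip the prefix scan."""
    if blob_name in blobs:
        return True
    if exact_match:
        # Caller asked for an exact blob only, so skip the scan entirely.
        return False
    # Current behavior (as described in this issue): prefix fallback.
    return any(name.startswith(blob_name) for name in blobs)


blobs = ["data/report.csv", "data/report_final.csv"]

prefix_default = exists("data/report", blobs)
prefix_exact = exists("data/report", blobs, exact_match=True)
blob_exact = exists("data/report.csv", blobs, exact_match=True)
```

With `exact_match=True`, the prefix `data/report` is no longer reported as existing, and the call never iterates over the bucket listing.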
@yaelmi3 thanks for providing this review/analysis! 🙇
Could you construct a performance test that measures how slow it is and compare it with your suggested change? I can run it on all the cloud providers to get a sense of the impact if you write a script that works with the local-mode implementation.
env: python3.10, tested with GS