Bypass-server upload of large files is slow because of index file generation #4201
Comments
Ashwin made a POC here: #4212. But it depends on fixing pauldmccarthy/indexed_gzip#102 and mxmlnkn/ratarmount#95 before it can work.
Ashwin: get pauldmccarthy/indexed_gzip#103 to work; Ashwin will also publish a fork of indexed_gzip. Jiani: fixing up #4212. Maybe try reverting to the simpler `import indexed_gzip_fileobj_fork_epicfaace as indexed_gzip`.
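A minimal sketch of that simpler import approach. The alias is taken from the comment above; the `IndexedGzipFile` usage below is illustrative and assumes the fork keeps the upstream indexed_gzip API:

```python
# Alias the published fork so the rest of the codebase can keep using
# the upstream `indexed_gzip` module name unchanged.
import indexed_gzip_fileobj_fork_epicfaace as indexed_gzip

# Illustrative usage (file name is made up): open a gzip file with
# random access backed by the seek-point index, as upstream does.
with indexed_gzip.IndexedGzipFile("bundle.tar.gz") as f:
    f.seek(1024)        # seek without decompressing from the start
    data = f.read(4096)
```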
POC to fix this problem is here: using peek() instead of read() in SQLiteIndexedTar for gz files. However, the current indexed_gzip implementation will not call peek(); instead, it calls read(). Simple script to reproduce this: epicfaace/test-repro#12
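For context, a minimal sketch of why peek() matters here: on an unseekable upload stream, format detection must look ahead without consuming bytes that the index builder still needs. The wrapper below is hypothetical (the name `PeekableStream` is not from ratarmount or indexed_gzip) and just shows a buffer-backed peek():

```python
import io

class PeekableStream(io.RawIOBase):
    """Hypothetical wrapper adding peek() to an unseekable stream by
    buffering look-ahead bytes, so a later read() still sees them."""

    def __init__(self, raw):
        self._raw = raw
        self._buffer = b""

    def peek(self, size):
        # Fill the buffer up to `size` bytes without consuming them.
        if len(self._buffer) < size:
            self._buffer += self._raw.read(size - len(self._buffer))
        return self._buffer[:size]

    def read(self, size=-1):
        # Serve buffered bytes first, then fall through to the raw stream.
        if size is None or size < 0:
            data, self._buffer = self._buffer + self._raw.read(), b""
            return data
        data = self._buffer[:size]
        self._buffer = self._buffer[len(data):]
        if len(data) < size:
            data += self._raw.read(size - len(data))
        return data

    def readable(self):
        return True
```

A caller that peeks at the gzip magic bytes and then hands the same wrapper to the index builder would not lose any data; a caller that uses read() for detection on the bare stream would.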
There are still some bugs in indexed_gzip that block this issue: pauldmccarthy/indexed_gzip#106
Update: related issue on the indexed_gzip GitHub repo: pauldmccarthy/indexed_gzip#107
Will sync with Ashwin.
Potential solution:
Did more tests based on this POC, and tested the potential solution above. However, it does not work for me. For step 2 & Option 1, I found an interesting observation: reading through the entire unseekable file to build the index only works for small files (smaller than 10 MB). In other words, the solution in this commit only works for small files. For large files, it raises a ZRAN_READ_FAIL error while reading through the file (line 753 or line 1439). For step 2 & Option 2,
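For reference, a hedged sketch of the "read through the whole file once to build the index" step described above, using indexed_gzip's public API (the file names are made up; on large inputs this is the path that was observed to fail with ZRAN_READ_FAIL):

```python
import indexed_gzip

# Build the full seek-point index by walking the entire compressed
# stream once, then persist it so later opens can import it instead
# of re-reading the whole file.
with indexed_gzip.IndexedGzipFile("bundle.tar.gz") as f:
    f.build_full_index()            # one full pass over the data
    f.export_index("bundle.gzidx")  # save seek points for reuse
```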
Solution:
Fix this, and test on the largest bundle uploaded to dev (158 GB).
Can we check that we can upload a 150+ GB file and close this?
According to #3370 and the test results, bypass-server upload needs a lot of time to calculate the index file.