http file system directed to stream by an "Accept-Ranges": "none" response #1631
Testing with the DAAC providing |
You seem to have the same situation as #1626 - there are some suggestions in that thread of what might be done. |
I think I'm actually running into nsidc/earthaccess#610. If I understand correctly what you say about the cache type being hard coded to "none" (i.e. BaseCache), then there should not have been a call to the cache (and no problem with the lack of If they agree, I will mark this ready for review. Caching on |
Is there any update here? |
Waiting for a review of nsidc/earthaccess#620, and I won't forget to pick this up after. |
OK thanks :) |
@martindurant The downstream issue has resolved as I'd hoped. I've synced this branch and run the tests locally. Ready for workflow approval and review. Thanks for your patience! |
Perfect, thank you. |
Background
`HTTPFileSystem._open` chooses between `HTTPFile` and `HTTPStreamFile`, and only the former allows random access, by using the `"Range"` header in subsequent HTTP GET requests. Servers do not always respond predictably to partial content requests, but they have the option of explicitly discouraging the use of `"Range"` by including `"Accept-Ranges": "none"` in response headers. It's rarely done, but the lack of an `"Accept-Ranges"` response header does not (in practice) indicate the lack of support for the `"Range"` header.
Changes
Previously, `HTTPFileSystem._open` only used `HTTPStreamFile` when NOT block caching (which is not the default) or when the content size could not be determined in advance (i.e. rarely). This PR makes a response header with `"Accept-Ranges": "none"` into another rare cause for the switch to streaming.
- `HTTPFileSystem._open`
  - now always makes a `self.info` call (even given `size`) 🤨
  - may see a "partial" key in `self.info` results
- `http._file_info` checks for the `"Accept-Ranges"` header in the response
- `HTTPFileSystem._info` may include `"partial": False`
- `test_no_range_support`: with `conftest` updated to implement the `accept_range` header, the `ignore_range` value was irrelevant and removed
Alternative
It seems like an alternative approach is to stick with `HTTPFile` but force `_fetch_all` for read. Please comment if that is preferred ... I haven't figured out what does and doesn't use caching.
Draft
I'm in the process of working with a NASA DAAC that does not provide partial content to test the approach, and will remove the "Draft" status when successful. Reviews in the meantime are very welcome.
Ugh, I give up waiting for them. Please review when able!