We can't blindly run normalisation before resolving a content path.
It would break access to data whose filenames are stored in non-normalized notation.
We also can't make an arbitrary decision to change the filenames while onboarding data.
There may be datasets which interlink and use different notation, and forcing normalization during onboarding to IPFS would break links in applications that operate on the data.
What is the problem we are trying to solve?
My understanding of the linked issue is that a user copies a "non-normalised" content path from somewhere and gets a "not found" error because the DAG uses normalised filenames (notation mismatch).
If so, I think the best we could do UX-wise is to retry on "not found", trying the composed (NFC) and decomposed (NFD) normalised forms of the name (to cover both variants).
This way we don't break datasets where the file already exists under the requested name, but still fix the HTTP 404 in cases where the file only exists in a different notation.
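To make the retry idea concrete, here is a rough sketch in Python (not kubo code). It assumes a directory is just a mapping from filename to CID. `candidate_names` and `resolve_segment` are hypothetical names used only for illustration. The exact name is always tried first, so existing data keeps resolving as before.

```python
import unicodedata


def candidate_names(name):
    """Yield the name as given, then its NFC and NFD forms, skipping duplicates."""
    seen = set()
    for candidate in (
        name,
        unicodedata.normalize("NFC", name),
        unicodedata.normalize("NFD", name),
    ):
        if candidate not in seen:
            seen.add(candidate)
            yield candidate


def resolve_segment(directory, name):
    """Resolve one path segment against a {filename: cid} mapping (hypothetical model).

    The exact name wins; the NFC/NFD forms are only tried after a miss.
    """
    for candidate in candidate_names(name):
        if candidate in directory:
            return directory[candidate]
    raise KeyError(name)  # still "not found" after trying every form
```

For example, a directory holding only the NFC name `"caf\u00e9.txt"` would resolve a request for the NFD spelling `"cafe\u0301.txt"`, while a directory holding both spellings returns whichever one was asked for.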
But this introduces magical behavior that hides the underlying problem macOS introduced; see my comment in ipfs/kubo#10286 (comment).
Perhaps it is better to NOT fix reads, and instead give users the ability to force a specific normalization during data onboarding? (like ipfs add --normalize-names none|nfd|nfc suggested in ipfs/kubo#10286 (comment)).
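As a sketch of what opt-in normalization at onboarding could look like (Python for illustration only): the `none|nfd|nfc` modes mirror the proposed flag, which is only a suggestion and does not exist in kubo today. `normalize_name` and `normalize_listing` are hypothetical helpers. The collision check is my assumption about what a careful implementation would want, since two distinct names can collapse into one after normalization.

```python
import unicodedata


def normalize_name(name, mode="none"):
    """Apply the requested normalization mode to a single filename."""
    if mode == "none":
        return name  # default: keep the name exactly as the user provided it
    if mode in ("nfc", "nfd"):
        return unicodedata.normalize(mode.upper(), name)
    raise ValueError(f"unknown normalization mode: {mode!r}")


def normalize_listing(names, mode="none"):
    """Normalize the names of one directory, refusing to merge distinct entries."""
    result = {}
    for original in names:
        normalized = normalize_name(original, mode)
        if normalized in result:
            raise ValueError(
                f"{original!r} and {result[normalized]!r} collide under {mode}"
            )
        result[normalized] = original
    return list(result)
```

With `mode="none"` nothing changes, so datasets that interlink using mixed notation are left alone unless the user explicitly asks otherwise.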
See context here: ipfs/kubo#10286 (comment)
Relevant Unicode spec: https://unicode.org/reports/tr15/