Read the fdScript field where it exists #16
Comments
Also |
Did some empirical testing after setting up a Japanese install of System 7.1 in Mini vMac and trying Japanese input in Mac OS 9 in SheepShaver. On Mac OS 9, I duplicated my test HFS image and created a new folder on it, and gave it a name in Japanese script. (I typed “katakana” and hoped the Japanese input method turned that into something sensible, or at least validly encoded.) On System 7, simply installing KanjiTalk gave me a hard drive with some Japanese-named files and folders on it, so I simply analyzed the hard drive image. In both cases, the names of those items had no script code saved.
(Here the script code is being retrieved by accessing …)

I also tried a commercial release (“The Dig” from this combo pack of Japanese localizations of LucasArts games), and that also had no script code set on its Japanese-named items.

I don't know if Mac OS has ever saved anything to that field. It might be that Mac OS 8 removed it simply because it had never been used. In any case, even on an OS that hypothetically might use it, a script code is only present when the high bit of the field is set. When it isn't, we need to do what System 7 and Mac OS 9 appear to do anyway, which is to go entirely off of the user's settings. In our case, that means #21.

I think it's still worth implementing this just in case it's ever used on some disk, but we can't rely on it as a substitute for a user setting, since a user setting is how Mac OS did it.
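For reference, the high-bit check could be sketched roughly like this in C. This is only an illustration: `embedded_script_code` is a hypothetical helper, not an existing function in this project, and it assumes the `fdScript`/`frScript` byte has already been read out of the catalog record's extended Finder info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the project).
 * The fdScript byte only carries a script code when its high bit is set;
 * in that case the remaining 7 bits are the script code
 * (0 = smRoman, 1 = smJapanese, ...).
 * Returns true and writes the code to *outScriptCode if one is embedded;
 * returns false (leaving *outScriptCode untouched) otherwise. */
static bool embedded_script_code(uint8_t fdScript, uint8_t *outScriptCode) {
    assert(outScriptCode != NULL);
    if ((fdScript & 0x80) == 0) {
        return false; /* No script code saved; fall back to the user's setting. */
    }
    *outScriptCode = (uint8_t)(fdScript & 0x7F);
    return true;
}
```

With this sketch, a byte of `0x81` would report script code 1 (Japanese), while `0x00` and `0x01` would both report that no script code is embedded.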
The fact that I need to change the lister, the analyzer, and the converter for this (and for #21) suggests that I need to centralize filename decoding. Perhaps a method in ImpHFSVolume:

    ///keyPtr is HFSCatalogKey *. payloadPtr is HFSCatalog{File,Folder} *. Both are void * so ImpHFSPlusVolume can override them and expect the corresponding catalog key and payload types.
    - (NSString *_Nonnull) nameOfItemWithCatalogKey:(void const *_Nonnull const)keyPtr payload:(void const *_Nonnull const)payloadPtr;

As listed in #21, this method would consult the file/folder record first, followed by its own encoding (as set by the analyzer/lister/converter, possibly coming from user input), and try MacRoman as a last resort. The ImpHFSVolume would look at …
…nt. convert now propagates an embedded script code to the textEncoding field in the converted record. extract currently doesn't use embedded script codes yet, and anyway, it's looking like it doesn't make sense to prefer the embedded script code since it has little or no relation to the script(s) used in the filename. It may actually make more sense to prefer the --encoding value.
I noted in the commit message on that one that it's beginning to look like preferring the embedded script code is the wrong thing to do in practice.

In theory, each file should know what encoding its name was encoded in. In practice, most records don't have an embedded script code regardless of what script(s) are present in their names, and the few that do have a script code with little to no relation to how the filename was encoded.

For example, I put Mac OS 9.0.4 into Japanese mode (by changing the views font to Osaka), launched SimpleText Japanese, and saved a new text file as whatever the Japanese for “untitled” is. (I don't know Japanese.) Result: A file whose name is encoded in… Shift-JIS, maybe? But definitely not MacRoman. But its embedded script code was 0, which is MacRoman.

So it looks like it actually makes more sense to just prefer --encoding. If that encoding fails, then it may be worth trying the embedded script code (if there is one) and ultimately falling back to MacRoman. It might also be worth asking the user, although that means conversion and extraction have to become interruptible and resumable, ugh.
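A rough sketch of that fallback order in C follows. Everything here is an assumption for illustration: `candidate_scripts`, `NO_USER_SCRIPT`, and `SCRIPT_ROMAN` are hypothetical names, and the `--encoding` value is represented as a script code for simplicity (the real option might name an encoding rather than a script). The function only decides which candidates to try and in what order; the actual decoding attempt is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch. */
enum { NO_USER_SCRIPT = -1, SCRIPT_ROMAN = 0 };

/* Appends `script` to out[0..count) unless it is already listed.
 * Returns the new count. */
static size_t append_unique(int out[], size_t count, int script) {
    for (size_t i = 0; i < count; i++) {
        if (out[i] == script) return count;
    }
    out[count] = script;
    return count + 1;
}

/* Fills `out` (room for 3 entries) with the scripts to try, in order:
 *   1. the user's --encoding value, if one was given;
 *   2. the embedded script code, if the high bit of fdScript is set;
 *   3. MacRoman as the last resort.
 * Duplicates are skipped. Returns the number of candidates written. */
static size_t candidate_scripts(int userScript, uint8_t fdScript, int out[3]) {
    assert(out != NULL);
    size_t count = 0;
    if (userScript != NO_USER_SCRIPT) {
        count = append_unique(out, count, userScript);
    }
    if (fdScript & 0x80) {
        count = append_unique(out, count, fdScript & 0x7F);
    }
    return append_unique(out, count, SCRIPT_ROMAN);
}
```

A caller would then attempt to decode the name with each candidate in turn and stop at the first one that succeeds. For the SimpleText file above (embedded script code 0, user passes Japanese), this order would try Japanese first and MacRoman second.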
I just came across this in the Finder Interface chapter of “Inside Macintosh: Macintosh Toolbox Essentials”:
That seems worth consulting. It'd be nice to not have to rely on the user passing in a correct encoding for the entire volume, and that encoding being correct for the entire volume. If an item has a script code set, we should try to respect it.
Classifying this as a bug since we're currently ignoring something we shouldn't be.