Custom "format detection" Attribute #14
Comments
Added new FormatDetectionAttribute for file formats, allows specifying a function to be executed when Scarlet attempts detection, ex. to sanity check header values when no clear magic numbers are available, see issue #14; made P5BustupBIN container format use new attribute
Just added some functionality for this in the form of the new FormatDetectionAttribute. You can now specify a static method to be executed during file detection, which should return either |
Yes, it seems to work. However, the FormatDetection should not be run as an alternative ("OR") to the other detections, but as an additional step ("AND"). Two examples:

If the FilenamePattern indicates a match and the FormatDetection function indicates no match, then the end result should be no match. I ran into this when trying to define a match based on a file name pattern AND a format detection function.

If the filename does not match but the FormatDetection indicates a match, the end result should be "no match" (or rather, in this case, the FormatDetection should probably not be run at all). This happens with P5BustupBIN giving false positives on some Disgaea3 files, which have a totally different extension (.pac instead of .bin/.dds2), but the P5 ContainerFormat still tries to unpack them because the (rather simple/generic) format detection flags them as a match. See my PR for a slight improvement to the detection function (still not perfect; for the PAC files I chose to do a full file header verification to reduce false positives to a minimum).

So, to summarize, I think it should work like that:
The rationale is this: there is probably little use in trying to, say, decompress an archive where you know the magic number is incorrect, or where you know (by some heuristic in the detection function) that it "looks" invalid. The developer could not possibly have foreseen how such a file should be handled; otherwise he would have put the correct magic number in, or changed his detection function.
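To make the "AND" behaviour from the two examples concrete, here is a minimal, language-neutral sketch in Python. It is not Scarlet's actual C# code: `is_match`, its parameters, and the use of glob-style patterns are all hypothetical, and the rule that a format with no criteria never matches is my own assumption. The point is only that every specified check must pass, and that the detection function runs last, after the cheaper checks.

```python
import fnmatch
import io
from typing import BinaryIO, Callable, Optional


def is_match(
    filename: str,
    stream: BinaryIO,
    *,
    filename_pattern: Optional[str] = None,           # hypothetical: glob pattern such as "*.bin"
    magic: Optional[bytes] = None,                    # hypothetical: bytes expected at offset 0
    detect: Optional[Callable[[BinaryIO], bool]] = None,  # hypothetical detection callback
) -> bool:
    """Return True only if every criterion that was specified accepts the file."""
    # Assumption: a format that declares no criteria at all never claims a file.
    if filename_pattern is None and magic is None and detect is None:
        return False

    # Cheapest check first: a file name mismatch ends detection immediately.
    if filename_pattern is not None:
        if not fnmatch.fnmatchcase(filename.lower(), filename_pattern.lower()):
            return False

    # A wrong magic number also ends detection.
    if magic is not None:
        stream.seek(0)
        if stream.read(len(magic)) != magic:
            return False

    # The detection function only runs if everything before it passed.
    if detect is not None:
        stream.seek(0)
        if not detect(stream):
            return False

    return True
```

With this ordering, a `.pac` file checked against a `*.bin` pattern is rejected before the detection callback is ever called, which is the behaviour asked for in the second example.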
After thinking about it a bit more, this might be a simpler heuristic:
Some file formats lack a reliable magic number at the start, and have no clear file name pattern.
In these cases it would be good to have an attribute that takes a delegate which can check, given a Stream (or EndianBinaryReader), whether the format would be able to process the file in question, for example by reading in a header and doing a sanity check on the values.
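As a rough illustration of what such a header sanity check could look like, here is a small Python sketch. The header layout (three little-endian 32-bit values: entry count, data offset, total size), the `MAX_ENTRIES` limit, and the name `looks_like_container` are all made up for this example and do not describe any real format or Scarlet's API.

```python
import io
import struct
from typing import BinaryIO

# Hypothetical header: entry_count, data_offset, total_size (little-endian uint32 each).
HEADER = struct.Struct("<III")
MAX_ENTRIES = 4096  # arbitrary plausibility limit, assumed for this sketch


def looks_like_container(stream: BinaryIO) -> bool:
    """Heuristic check: do the header values look plausible for this file?"""
    start = stream.tell()
    try:
        # Determine the real file size so header values can be compared against it.
        stream.seek(0, io.SEEK_END)
        file_size = stream.tell()

        stream.seek(0)
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            return False  # too short to even contain the header

        entry_count, data_offset, total_size = HEADER.unpack(raw)
        return (
            0 < entry_count <= MAX_ENTRIES
            and HEADER.size <= data_offset <= file_size
            and total_size == file_size
        )
    finally:
        # Leave the stream where the caller had it.
        stream.seek(start)
```

A function like this never proves that a file is valid; it only rules out files whose header values cannot be right, so stricter checks mean fewer false positives.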