BNGP-5504: Correctly check type of uploaded file #899
https://eaflood.atlassian.net/browse/BNGP-5504
We were given advance notice that the IT health check may raise the issue that we do not check that an uploaded file is actually the type it claims to be; we only check the file extension. For example, if a file that isn't a spreadsheet is renamed to `.xlsx`, it can be uploaded as a metric file and processing will be attempted. This could cause a problem if a malicious file is uploaded that isn't otherwise caught by malware scanning. Say, for example, a vulnerability is found in the spreadsheet processing module we use whereby attempting to process an image file causes a memory leak: someone could then rename a `.jpeg` file to `.xlsx` and upload it, with the subsequent processing potentially causing issues. This isn't a "malicious" file as such, since the image file is otherwise normal; it's just that processing it as a spreadsheet results in an issue. (I stress that this is just a theoretical vulnerability, to illustrate that a file could be "malicious" and still not be picked up by malware scanning!)

In advance of the full health check being delivered to us, we started to look at what we could do to mitigate this. The package `file-type` can be used to determine the type of a file and its expected file extension. We have implemented a simple check that uses `file-type` to determine the file's expected extension and compares it with the actual file extension; if they don't match then the file is rejected.

At present, we haven't received the full health check, so we don't know the scope of any recommendations. We will therefore be leaving this check part-implemented for now, with the expectation that the feature can be completed once the full health check is received and the full scope of the recommendation is known.
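As a rough illustration of the comparison described above, here is a minimal sketch. It is not the code in this PR: the helper name `extensionMatchesDetectedType` is hypothetical, and it assumes the detection result has the `{ ext, mime }` shape that `file-type` resolves to (for example from `fileTypeFromBuffer` in recent versions of the package), or `undefined` when the content is not recognised.

```javascript
const path = require('node:path')

// Hypothetical helper for illustration only, not the implementation in this PR.
// `detected` is assumed to be the result of a file-type lookup: an object such
// as { ext: 'xlsx', mime: '...' }, or undefined when the type is not recognised.
const extensionMatchesDetectedType = (filename, detected) => {
  // path.extname returns e.g. '.xlsx'; drop the dot and normalise the case.
  const actualExtension = path.extname(filename).slice(1).toLowerCase()

  // Unrecognised content, or a filename with no extension, counts as a mismatch.
  if (!detected || actualExtension === '') {
    return false
  }

  return detected.ext === actualExtension
}
```

Treating an unrecognised file as a mismatch is a design choice in this sketch; plain-text formats have no signature to detect, so a page that accepts them would need different handling.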
The checks can be enabled for a given upload page as follows:

- In `buildConfig()`, add `checkFileType: true`
- In `processErrorUpload()`, add `invalidFileTypeErrorMessage`

A couple of caveats to bear in mind with the current implementation:
- A `.doc` file was created in Word for Mac 16.90 by saving in the format it refers to as "Word 97-2004 (.doc)". However, on attempting to upload it we found that it was detected as a file of MIME type `application/x-cfb`, extension `.cfb`, aka Compound File Binary Format. It's unknown whether this is a result of saving in a legacy format from a new version of Word, of saving in a legacy format from Word for Mac, or simply a bug in `file-type` that means it cannot detect `.doc` files. Further testing may be required, or an exception made to allow files of this type.
- Only `.xlsx`, `.docx` and `.pdf` files have been tested at present. When enabling a page with types other than these, an upload of each file type should be attempted to ensure they are correctly detected.
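
If an exception for `.doc` is wanted, one possible shape is a small table of extra detected extensions that are accepted for a given actual extension. This is only a sketch under assumptions: `acceptedAliases` and `isAcceptedExtension` are hypothetical names, and the single entry reflects the `.cfb` detection described in the first caveat.

```javascript
// Hypothetical exception table for illustration only: for each actual file
// extension, the additional detected extensions that should also be accepted.
const acceptedAliases = new Map([
  ['doc', ['cfb']] // .doc saved from Word for Mac was detected as .cfb
])

// Returns true when the detected extension equals the actual extension,
// or is listed as an accepted alias for it.
const isAcceptedExtension = (actualExtension, detectedExtension) => {
  const actual = actualExtension.toLowerCase()
  if (detectedExtension === actual) {
    return true
  }
  const aliases = acceptedAliases.get(actual) || []
  return aliases.includes(detectedExtension)
}
```

Note that Compound File Binary Format is a container shared by other legacy Office formats, so an exception like this would also accept, say, a legacy `.xls` file renamed to `.doc`.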