Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rs hotfix/fix corrupt image uploads #7804

Merged
merged 18 commits into from
Jul 29, 2024

Conversation

RileySeaburg
Copy link
Member

@RileySeaburg RileySeaburg commented Jul 23, 2024

Summary

This PR refactors the file upload process to address issues with image uploads and file processing. It replaces the gulp-awspublish plugin with a custom TypeScript-based upload module using AWS SDK v3, while maintaining Gulp task integration for workflow consistency.

Solution

This PR implements the following changes:

  1. Removes the gulp-awspublish plugin and replaces it with a custom TypeScript module (config/typescript/upload.ts) using AWS SDK v3 for file uploads.
  2. Maintains separate functions for uploading images and files, preserving the existing structure.
  3. Implements a cleanup function to remove temporary working directories after successful uploads.
  4. Updates the Gulpfile to integrate the new TypeScript-based upload module within the existing Gulp task structure.

This approach was chosen to:

  1. Improve type safety and catch potential errors at compile-time by using TypeScript.
  2. Modernize the upload process by utilizing the latest AWS SDK v3, removing the dependency on the outdated gulp-awspublish plugin.
  3. Maintain the existing Gulp-based workflow for consistency and ease of integration.
  4. Ensure proper cleanup after the upload process.

The change was implemented by creating a new TypeScript file for uploads, compiling it to JavaScript, and integrating it into the existing Gulp tasks. This allows us to have more control over the upload process while keeping the familiar Gulp task structure.

How To Test

  1. Install dependencies: npm install
  2. Place test images in content/uploads/_inbox/__add image or static files to this folder__
  3. Run the Gulp upload task: npx gulp upload
  4. Verify that files are uploaded to the S3 bucket and local directories are cleaned up

Dependency updates

Dependency

Dev Dependency name Previous version New version
@aws-sdk/client-s3 -- ^3.614.0
dotenv -- ^16.3.1
mime-types -- ^2.1.35

Dev Dependency

Dev Dependency name Previous version New version
@typescript-eslint/eslint-plugin -- ^5.x.x
@typescript-eslint/parser -- ^5.x.x
@types/mime-types -- ^2.1.4
typescript -- ^4.x.x
gulp-awspublish ^8.0.0 Removed

Script updates

  • Modified lint:js and lint:js:fix to include TypeScript files
  • Modified prettier:js and prettier:js:fix to include TypeScript files
  • Added build script to compile TypeScript
  • Added type-check script to run TypeScript's type checker

Notes

  • TypeScript is now a local dependency; use npm run build to compile.
  • ESLint configuration has been updated for TypeScript in .eslintrc.yaml.
  • Run npm run build, npm run lint:js, and npm run type-check before submitting PR to ensure code quality.

Copy link

🔍 Preview in Federalist

@RileySeaburg RileySeaburg added Bug Fix This fixes an actual bug Content: Images labels Jul 23, 2024
@RileySeaburg RileySeaburg self-assigned this Jul 23, 2024
@RileySeaburg RileySeaburg requested review from nick-mon1 and mejiaj July 23, 2024 17:45
@mejiaj mejiaj requested a review from clmedders July 25, 2024 13:54
Copy link
Contributor

@nick-mon1 nick-mon1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I Tested

Tested all file types and all steps to get a successful upload.

name type upload cleanup valid public file
test-j-image-upload-7-25-2024.jpg jpg pass pass pass
test-p-image-upload-7-25-2024.png png pass pass pass
test-document-upload-7-25-2024.pdf pdf pass pass pass
test-powerpoint-upload-7-25-2024.pptx pptx pass pass pass
test-spreadsheet-upload-7-25-2024.xlsx xlsx pass pass pass

Note

I did notice in my test file names that when I used test for the pdf it would rename from test-document-upload-7-25-2024.pdf to test.document-upload-7-25-2024.pdf. Just something to keep note of.

Suggestions

I've added some jsdocs for the functions in the upload.ts (f4e50c9)

Questions

Can we add a linting and formatting step to the package.json for the typescript? Or is that handled when running tsc?

By adding another dependency with typescript, what are the benefits and some of the risks with having to support it for the uploading process over gulp?

If an image error occurs in the future, what would we need to know about the typescript code to address?

@RileySeaburg
Copy link
Member Author

What I Tested

Tested all file types and all steps to get a successful upload.

| name | type | upload | cleanup | valid public file |

| -------------------------------------- | ---- | ------ | ------- | ----------------- |

| test-j-image-upload-7-25-2024.jpg | jpg | pass | pass | pass |

| test-p-image-upload-7-25-2024.png | png | pass | pass | pass |

| test-document-upload-7-25-2024.pdf | pdf | pass | pass | pass |

| test-powerpoint-upload-7-25-2024.pptx | pptx | pass | pass | pass |

| test-spreadsheet-upload-7-25-2024.xlsx | xlsx | pass | pass | pass |

[!NOTE]

I did notice in my test file names that when I used test for the pdf it would rename from test-document-upload-7-25-2024.pdf to test.document-upload-7-25-2024.pdf. Just something to keep note of.

Suggestions

I've added some jsdocs for the functions in the upload.ts (f4e50c9)

Questions

Can we add a linting and formatting step to the package.json for the typescript? Or is that handled when running tsc?

By adding another dependency with typescript, what are the benefits and some of the risks with having to support it for the uploading process over gulp?

If an image error occurs in the future, what would we need to know about the typescript code to address?

@nick-mon1

This is not robust code to handle every edge case, merely a search for a peer review to ensure there are no obvious errors.

The key thing to remember about TypeScript is, JavaScript is valid TypeScript.

JsDocs and ESLint can assist in ensuring that formatting and documentation are handled.

I can add a commit that further integrates typescript into the tooling tomorrow.

Overall this does not complicate the gulp upload process.

They only thing it does is solve the data race issue by using Node.js to upload the files and images.

This could just as easily be written with vanilla.js.

@nick-mon1 nick-mon1 self-requested a review July 26, 2024 18:58
nick-mon1
nick-mon1 previously approved these changes Jul 26, 2024
Copy link
Contributor

@nick-mon1 nick-mon1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RileySeaburg Sounds good, please update the package.json with any formatting and linting commands to prevent build errors.

It would be preferable to use vanillia.js because it is one less dependency to support but it is best to get this unblocked now.

With that caveat in mind, thanks for getting this fixed.

@RileySeaburg
Copy link
Member Author

@nick-mon1 Idk why it said I dismissed you request.

I addressed them all here.
JSDoc and ESLint support enabled.

I added a settings.json for workspace ESLint Highlighting as well.

@RileySeaburg
Copy link
Member Author

I simplified the testing instructions since compiling typescript is not necessary.

@@ -7,24 +7,26 @@
"federalist": "export NODE_ENV=production && gulp buildAssets",
"hugo": "export $(grep -v '^#' .env | xargs) && hugo serve --bind='0.0.0.0'",
"lint:json": "find public/**/v1/json/index.html -print0 | xargs -0I {} jsonlint '{}'",
"lint:js": "eslint './themes/digital.gov/src/js/*.js'",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nick-mon1 ESlint was originally not linting any JS in ./config/gulp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could explain why it was so difficult to work with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to update this to lint the gulp files.

I know I'm asking for trouble, but it only makes sense to lint all the JS into the project.

Copy link
Contributor

@mejiaj mejiaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your hard work on this. I was able to upload files successfully without any corruption. I like the improvements to linting (JSDoc) and documentation too.

Added minor comments and questions. The biggest one being the addition of typescript.

Tested

  • Uploaded powerpoint without corruption
  • Uploaded PNG successfully 1
  • Uploaded JPEG successfully 2
Build log

[14:24:06] Using gulpfile ~/web/digitalgov.gov/gulpfile.js
[14:24:06] Starting 'upload'...
[14:24:06] Starting 'fileTidy'...
Moving file from content/uploads/_inbox/TEST-USWDS Monthly Call June 2024.pptx to content/uploads/_working-files/to-process/test-uswds-monthly-call-june-2024.pptx
Moving file from content/uploads/_inbox/TEST-dg-7804.png to content/uploads/_working-images/to-process/test-dg-7804.png
Moving file from content/uploads/_inbox/TEST-ooh.png to content/uploads/_working-images/to-process/test-ooh.png
[14:24:06] Finished 'fileTidy' after 2.24 ms
[14:24:06] Starting 'cleanInbox'...
[14:24:06] Finished 'cleanInbox' after 4.16 ms
[14:24:06] Starting 'writeDataFile'...
file is written
[14:24:06] Finished 'writeDataFile' after 64 ms
[14:24:06] Starting 'processImages'...
[14:24:06] Finished 'processImages' after 2.77 ms
[14:24:06] Starting 'removeProcessedImage'...
file is written
file is written
[14:24:06] Finished 'removeProcessedImage' after 37 ms
[14:24:06] Starting 'uploadTask'...
Error processing variant /Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: Error: /Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: unable to open for read
unix error: No such file or directory
/Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: unable to open for read
unix error: No such file or directory
pngload: stream error
    at Sharp.toFile (/Users/jmejia-a/web/digitalgov.gov/node_modules/sharp/lib/output.js:89:19)
    at createResponsiveVariant (/Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:90:6)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:110:5
    at Array.forEach (<anonymous>)
    at processImageVariants (/Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:109:34)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:139:11
    at Array.map (<anonymous>)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:135:41
    at FSReqCallback.oncomplete (node:fs:191:23)
    at FSReqCallback.callbackTrampoline (node:internal/async_hooks:130:17)
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w1200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w400.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w800.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w1200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w400.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w800.png
File uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/static/test-uswds-monthly-call-june-2024.pptx
All uploads completed successfully.
Cleanup completed
[14:24:10] Finished 'uploadTask' after 4.41 s
[14:24:10] Finished 'upload' after 4.52 s

Files tested

Additional chores

We'll need to delete all test files on S3 bucket.

Footnotes

  1. Uploaded a small PNG that triggered a build error, but uploaded successfully.

  2. Image was converted to PNG. Is this intentional? PNG files are larger and can be worse quality than JPEG.

config/gulp/readme.md Outdated Show resolved Hide resolved
config/typescript/upload.ts Show resolved Hide resolved
config/typescript/upload.ts Show resolved Hide resolved
config/typescript/upload.ts Show resolved Hide resolved
package.json Show resolved Hide resolved
config/typescript/upload.ts Show resolved Hide resolved
@RileySeaburg
Copy link
Member Author

RileySeaburg commented Jul 29, 2024

Thanks for your hard work on this. I was able to upload files successfully without any corruption. I like the improvements to linting (JSDoc) and documentation too.

Added minor comments and questions. The biggest one being the addition of typescript.

Tested

  • Uploaded powerpoint without corruption
  • Uploaded PNG successfully 1
  • Uploaded JPEG successfully 2

Build log

[14:24:06] Using gulpfile ~/web/digitalgov.gov/gulpfile.js
[14:24:06] Starting 'upload'...
[14:24:06] Starting 'fileTidy'...
Moving file from content/uploads/_inbox/TEST-USWDS Monthly Call June 2024.pptx to content/uploads/_working-files/to-process/test-uswds-monthly-call-june-2024.pptx
Moving file from content/uploads/_inbox/TEST-dg-7804.png to content/uploads/_working-images/to-process/test-dg-7804.png
Moving file from content/uploads/_inbox/TEST-ooh.png to content/uploads/_working-images/to-process/test-ooh.png
[14:24:06] Finished 'fileTidy' after 2.24 ms
[14:24:06] Starting 'cleanInbox'...
[14:24:06] Finished 'cleanInbox' after 4.16 ms
[14:24:06] Starting 'writeDataFile'...
file is written
[14:24:06] Finished 'writeDataFile' after 64 ms
[14:24:06] Starting 'processImages'...
[14:24:06] Finished 'processImages' after 2.77 ms
[14:24:06] Starting 'removeProcessedImage'...
file is written
file is written
[14:24:06] Finished 'removeProcessedImage' after 37 ms
[14:24:06] Starting 'uploadTask'...
Error processing variant /Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: Error: /Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: unable to open for read
unix error: No such file or directory
/Users/jmejia-a/web/digitalgov.gov/content/uploads/_working-images/to-process/test-ooh.png: unable to open for read
unix error: No such file or directory
pngload: stream error
    at Sharp.toFile (/Users/jmejia-a/web/digitalgov.gov/node_modules/sharp/lib/output.js:89:19)
    at createResponsiveVariant (/Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:90:6)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:110:5
    at Array.forEach (<anonymous>)
    at processImageVariants (/Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:109:34)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:139:11
    at Array.map (<anonymous>)
    at /Users/jmejia-a/web/digitalgov.gov/config/gulp/file-process.js:135:41
    at FSReqCallback.oncomplete (node:fs:191:23)
    at FSReqCallback.callbackTrampoline (node:internal/async_hooks:130:17)
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w1200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w400.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-dg-7804_w800.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w1200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w200.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w400.png
Image uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/test-ooh_w800.png
File uploaded successfully. Public URL: https://digitalgov.s3.amazonaws.com/static/test-uswds-monthly-call-june-2024.pptx
All uploads completed successfully.
Cleanup completed
[14:24:10] Finished 'uploadTask' after 4.41 s
[14:24:10] Finished 'upload' after 4.52 s

Files tested

Additional chores

We'll need to delete all test files on S3 bucket.

Footnotes

  1. Uploaded a small PNG that triggered a build error, but uploaded successfully.
  2. Image was converted to PNG. Is this intentional? PNG files are larger and can be worse quality than JPEG.

Addressing your footnote, this is probably an artifact from the jpg -> png. issue #7686.
I will address this as well.

@mejiaj

@RileySeaburg
Copy link
Member Author

I just found out Gulp has typescript support.

Let's merge this for now since it's confirmed working - To fix production upload and will reconfigure in another PR and trello card to clean up the workflow.

@RileySeaburg RileySeaburg merged commit 12387d7 into main Jul 29, 2024
8 checks passed
@RileySeaburg RileySeaburg deleted the rs-hotfix/fix-corrupt-image-uploads branch July 29, 2024 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Fix This fixes an actual bug Content: Images
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants