This repository has been archived by the owner on Dec 17, 2021. It is now read-only.

Make Lambda packaging aware of third party dependencies, incorporate them in repackaging #219

Open
konklone opened this issue Mar 25, 2018 · 0 comments

Comments

@konklone
Contributor

The pshtt and trustymail scanners each use the Public Suffix List (PSL), and the pshtt scanner also uses the Chrome preload list and the HSTS preload pending list.

The latter case (the Chrome preload and preload pending lists) is handled without repackaging, because I use a somewhat hacky workaround: I slice the lists down to just the domain being passed in, and send that up dynamically as part of the payload to the function. That does work (the function then just has to ask "is 18f.gov in [18f.gov]?"), but it obviously isn't a general-case solution.
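As a rough illustration of that workaround, here is a minimal sketch. The function names and payload keys (`slice_list_for_domain`, `build_payload`, `preload_list`, `preload_pending`) are made up for this example and are not taken from domain-scan or pshtt.

```python
def slice_list_for_domain(domain, full_list):
    """Keep only the entries of full_list that equal domain (case-insensitive)."""
    wanted = domain.lower()
    return [entry for entry in full_list if entry.lower() == wanted]


def build_payload(domain, preload_list, preload_pending):
    """Build a small per-domain payload instead of shipping the full lists.

    The key names here are hypothetical.
    """
    return {
        "domain": domain,
        "preload_list": slice_list_for_domain(domain, preload_list),
        "preload_pending": slice_list_for_domain(domain, preload_pending),
    }


def is_preloaded(event):
    """Inside the function: a plain membership check against the sliced list."""
    domain = event["domain"].lower()
    return domain in (entry.lower() for entry in event["preload_list"])
```

The local side calls `build_payload` once per domain, and the function only ever sees a list with zero or one entries. That is why this only helps when the data is used purely for membership checks.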

The PSL can't work that way, because during execution it is used for more than a simple "is in list" check. So right now the PSL is packaged in the function, which means it goes stale. That's not a big deal for USG purposes, but it's a much bigger deal with a general internet dataset.
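To illustrate the kind of lookup that goes beyond membership, here is a simplified sketch of PSL-style suffix matching (plain rules, `*.` wildcard rules, and `!` exception rules). It is not taken from either scanner; `registrable_domain` and the tiny rule set are assumptions for illustration only.

```python
def registrable_domain(hostname, rules):
    """Simplified PSL lookup: return the public suffix plus one label, or None.

    `rules` is a set of PSL-style rule strings, e.g. {"com", "co.uk", "*.ck", "!www.ck"}.
    """
    labels = hostname.lower().strip(".").split(".")
    suffix_len = 1  # implicit default rule "*": the last label is a public suffix
    for i in range(len(labels)):
        candidate = labels[i:]
        name = ".".join(candidate)
        wildcard = ".".join(["*"] + candidate[1:])
        if "!" + name in rules:
            # Exception rule wins: the suffix is the rule minus its leftmost label.
            suffix_len = len(candidate) - 1
            break
        if name in rules or wildcard in rules:
            # Otherwise the longest matching rule determines the suffix.
            suffix_len = max(suffix_len, len(candidate))
    if len(labels) <= suffix_len:
        return None  # the hostname is itself a public suffix
    return ".".join(labels[-(suffix_len + 1):])


# Tiny illustrative rule set, not the real list.
SAMPLE_RULES = {"com", "uk", "co.uk", "*.ck", "!www.ck"}
```

The answer depends on which rules match the hostname's suffixes, so the caller would need to know in advance every hostname the function might look up in order to slice the list safely.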

Some thoughts I put down in cisagov/trustymail#74 (comment):

Perhaps it's also worth domain-scan having a generalized solution to packaging Lambda functions with third party data sources. For example, a scanner could specify the source of third party data that is needed for that scanner to run, and the Lambda deploy process could automatically fetch and re-package them during packaging and deployment.

I think there is a fundamental tension between "don't have every Lambda instance make a network request to get this data" and "don't ever have to repackage Lambda functions to stay fresh with this data". I'm comfortable pushing some burden on the repackaging process (especially given how easy you've made it with Docker), and suggesting that staying fresh with the PSL and other sources means setting up (perhaps automated) repackaging of functions on a regular basis. Having repackaging be "aware" of third party dependencies per-scanner could make this easier.
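One way this could look, as a sketch only: each scanner declares the files it needs and where they come from, and the packaging step fetches them into the build directory before the function is zipped. The `THIRD_PARTY_DATA` mapping, `refresh_third_party_data`, and the choice of URL are assumptions for illustration; none of this exists in domain-scan today.

```python
import os
import urllib.request

# Hypothetical per-scanner declaration: filename in the package -> source URL.
THIRD_PARTY_DATA = {
    "pshtt": {
        "public_suffix_list.dat": "https://publicsuffix.org/list/public_suffix_list.dat",
    },
    "trustymail": {
        "public_suffix_list.dat": "https://publicsuffix.org/list/public_suffix_list.dat",
    },
}


def fetch_url(url):
    """Download a URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def refresh_third_party_data(scanner, build_dir, sources=THIRD_PARTY_DATA, fetch=fetch_url):
    """Fetch every declared data file for `scanner` into `build_dir`.

    Meant to run during packaging, so Lambda instances never fetch at runtime.
    Returns the list of paths written.
    """
    written = []
    for filename, url in sources.get(scanner, {}).items():
        data = fetch(url)
        if not data:
            # Fail the build rather than package an empty data file.
            raise RuntimeError("empty response for %s" % url)
        path = os.path.join(build_dir, filename)
        with open(path, "wb") as handle:
            handle.write(data)
        written.append(path)
    return written
```

Running this from a scheduled job before each repackage would put the freshness burden on deployment rather than on each invocation, which is the trade-off described above.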
