[RFC|WIP] deploy to GCP #240
Conversation
Thanks for working on this! This is super interesting as someone who works at an AWS + Terraform shop. 😃
I'm no expert in any of the gcloud stuff, so that looks good to me. Just want to flesh out the Docker part some more, because I consider it one of the primary installation methods. (Even though I haven't paid enough attention to it.)
contrib/gcp/README.md (Outdated)
> * there is no cast operator available that works on `$(ref ...)`
> * we cannot use [`outputs`](https://cloud.google.com/deployment-manager/docs/configuration/expose-information-outputs) as it uses pass by reference and creates the same problem
> * support for [in-transit encryption](https://cloud.google.com/memorystore/docs/redis/in-transit-encryption)
>   * [redis crate supports it](https://docs.rs/redis/0.20.0/redis/enum.ConnectionAddr.html#variant.TcpTls) though Portier's [`pubsub.rs`](../../src/utils/redis/pubsub.rs) explicitly does not
Ah, sorry about that. I'm not super happy with the pubsub code, but also haven't really worked with upstream to get us everything we need. Maybe newer versions already do what we want. (Can't say, don't remember off the top of my head what the exact pain points were.)
Filed #351 for this.
Sorry for taking some time to get back to this, but I have managed to get everything into a much better shape, including migrating to your […].

I have this now running, and we are about to make it live in production (~100k blocked domains and ~65k allowed origins[1] slipped in via […]).

Thanks

[1]
This looks great! Feel free to file more issues if you think there are more gains to be made here. I'll happily merge this as-is, if that's okay with you?
Has the production deploy gone well?
contrib/gcp/README.md (Outdated)
> * [Cloud Run](https://cloud.google.com/run/pricing#tables) (~$10/region/month)
>   * You may wish to pick a Tier 1 region for better pricing where possible
> * [Redis](https://cloud.google.com/memorystore/docs/redis/pricing#instance_pricing) (~$40/region/month)
Curious: Does GCP have other storage options that could be more attractive if we supported them? For example, AWS has DynamoDB, which allows request-based pricing.
There is Firestore, but to be honest, if you wanted to go down that path I would recommend providing a 'slum it'/'poor man's database' option of using an S3 backend, where garbage collection is handled by the bucket retention policy. Azure does not support S3 (though their storage accounts have interesting functionality that makes a DB unnecessary), but GCP does speak S3.
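To make the "garbage collection via bucket retention" idea concrete, here is a minimal sketch of an S3 lifecycle configuration that expires objects automatically; the `sessions/` prefix and one-day window are assumptions for illustration, not anything Portier currently supports. One caveat: S3 lifecycle expiration has day granularity, so short-lived objects would linger (harmlessly) for up to a day past their logical TTL.

```json
{
  "Rules": [
    {
      "ID": "expire-portier-sessions",
      "Filter": { "Prefix": "sessions/" },
      "Status": "Enabled",
      "Expiration": { "Days": 1 }
    }
  ]
}
```

This could be applied with `aws s3api put-bucket-lifecycle-configuration --bucket BUCKET --lifecycle-configuration file://lifecycle.json` (or the GCS/Azure equivalent retention setting).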
If I wanted to do this cheaper, I could have run Redis from a $4/month burstable shared-core CPU instance, but forking out $1000/year so I do not have to maintain authentication services is a bargain; it definitely is for the company I am doing this work for, where the monthly cloud services bill is at least an order of magnitude higher. The other option would have been a third-party identity management provider, which would have cost substantially more than the current GCP deployment.
The only real reason I can think of to avoid the Redis cost is "let's work to give Google less money", which I can get behind, but personally I would prefer to see the project focus on all the other things it wants to do.
So, in short, I would not fret about it; if someone wants to make this cheaper, they will balance the $1000/year against their dev effort, ongoing support and PR submission 'costs', and probably, like me, conclude that they have bigger fish to fry.
Though not possible for GCP (as you need $1m/year to play), maybe consider offering an Azure and AWS managed service, now that you know Portier is already worth at least $1000/year to lazy people like me. ;)
> The only real reason I can think of to avoid the Redis cost is "let's work to give Google less money", which I can get behind, but personally I would prefer to see the project focus on all the other things it wants to do.
Hmm, well, project goals are a little lacking, so it's "whatever one wants to work on". 🙃
> Though not possible for GCP (as you need $1m/year to play), maybe consider offering an Azure and AWS managed service, now that you know Portier is already worth at least $1000/year to lazy people like me. ;)
That is interesting! Hadn't considered these, and I can definitely investigate the AWS side.
> There is Firestore, but to be honest, if you wanted to go down that path I would recommend providing a 'slum it'/'poor man's database' option of using an S3 backend...
S3 honestly sounds like a pretty good idea, something I hadn't considered either. I was thinking about what it would take to adapt Portier to work with serverless runtimes, and that combined with S3 or DynamoDB could make the whole deployment more cloud native and reduce ops overhead.
I don't know how much you can share, but am curious about what factors played into the decision making for your org. Is self-hosting about control / trust? Would self-hosting have been a requirement for competing identity services? Is the branding part (custom data) important?
No worries if you're not allowed to answer these. 🙂
> I don't know how much you can share, but am curious about what factors played into the decision making for your org. Is self-hosting about control / trust? Would self-hosting have been a requirement for competing identity services? Is the branding part (custom data) important?
We would have preferred a managed service, simply because OAuth2/authentication is hard to get right and not something developers tend to be familiar with; most are only involved as far as importing the relevant library for their language of choice. Also... monitoring/response/recovery; I always like to make that someone else's problem, as I have been there, got the t-shirt, and it is really the worst. :)
I am struggling to see how 'trust' would be an issue here, as Portier only handles authentication (no authorization or accounting, and no access to the resource/data being protected) and so only sees the user's email address and user-agent details. From a compliance perspective, it would probably be no different in practice to a "log in with Google/Microsoft/Facebook/...".
We needed custom branding (emails and the landing page), which made self-hosting necessary; operationally it would have been impossible, unwise and impolite to rely on your demo/public service for our services. Self-hosting also let us put other controls into place, such as origins and domain allow/block lists, etc. That said, a managed service would also have been great, as I would not have had to plumb in SendGrid or spend my time following up on why a receiver blocked one of SendGrid's IPs due to poor reputation.
I did look around for comparable third-party services before building out the self-hosted serverless GCP solution, but they tended to be more of a whole IDM kitchen sink, expect all users to be pre-registered, and charge a per-seat/account fee. I do a lot of work in the RADIUS world and balk at costs that are not percentile (i.e. 95th percentile) request-rate oriented.
Wonderfully, for both self-hosted and managed, Portier is now formalised as something the organisation supports, and there is no barrier to internal developer adoption.
I have built a bespoke email-loop system before and it was tedious and time-consuming, whilst Portier presents as an OAuth2 service, which crucially means I no longer have to do 'bespoke' knowledge transfers to others. I cannot emphasise enough how important it is to be able to say "this is just OAuth2 with ID tokens" and not "@jimdigriz's custom-sauce programming black magic". :)
Cheers
> * Hard coding of the Redis port to `6379/tcp`
>   * workaround: `Reference [$(ref.portier-europe-west4-redis.port)], was not of type string but [NUMBER], cannot replace inside ["redis://:[email protected]:$($(ref.portier-europe-west4-redis.port))/0\n"]`
> * GCP's Deployment Manager is mostly awful, so when GCP throws you lemons, it provides zero tools (or documentation) to make lemonade
>   * there is no cast operator available that works on `$(ref ...)`
Would it help if we implemented separate `BROKER_REDIS_HOST`, `BROKER_REDIS_PORT`, etc.?
Unfortunately not, as GCP barfs at the attempt to shoehorn a number into a string field in the YAML template. IIRC all fields are strings at the templating phase, and it is only later parsed as YAML (and converted back to numbers), similar to the situation in a Jinja template... though of course there numbers are automatically cast to strings.
Fortunately, in practice the hardcoding is not a real problem, and I only document it to stop other developers thinking I must have been an idiot to do it this way and then wasting their time rediscovering this. It could only ever become a problem if the administrator ran multiple Redis instances in the same GCP project, which I suspect no one would ever want to do; it costs nothing to keep Portier in its own isolated GCP project (similarly, in AWS, a separate account) and gain the security benefits of doing so.
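For illustration, a minimal sketch of the failing versus working approach in a Deployment Manager template, loosely following the error message quoted above; the `portier-redis` resource name and the `BROKER_REDIS_URL` variable layout are assumptions, not the actual template from the branch.

```yaml
# Hypothetical Deployment Manager fragment; names are illustrative.
# Fails: $(ref.portier-redis.port) resolves to a NUMBER, and Deployment
# Manager refuses to substitute it into a string field:
#   value: "redis://:PASSWORD@$(ref.portier-redis.host):$(ref.portier-redis.port)/0"
# Works: the host reference resolves to a string, so only the port
# needs to be hardcoded to the Redis default:
env:
  - name: BROKER_REDIS_URL
    value: "redis://:PASSWORD@$(ref.portier-redis.host):6379/0"
```

Since there is no cast operator that works on `$(ref ...)`, hardcoding `6379` is the only substitution that survives template expansion.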
@jimdigriz I'd like to merge this. Is it still WIP in your opinion?
I've had this in production for a client for a few months now and have not needed to make any major changes since the end of October. I did have to move to a single region (instead of multi-region), as in production I found the C2S geo-targeting hit a different region to the S2S communication, so the Redis server in that region had no idea what was going on. Everything has been working fine since. I think it is ready to merge.
WARNING: Work in Progress, do not merge.
Some client work I was doing included making Portier deployable to GCP, so I took the opportunity to generalise the solution so that others should be able to use it too. The deployment is described in the documentation in my branch at https://github.com/jimdigriz/portier-broker/tree/gcp/contrib/gcp and includes a list of issues/limitations; these are unlikely to be solved by myself, as most require a Rust coder to work on the actual broker.
Probably of most interest to the project is the (non-awful) conditional build feature slipped into the `Dockerfile` that lets the user deploy a non-forked version of the project in their environment and pull in the data directory as an externally sourced tarball from an HTTP server.

As for the GCP deployment, it is the usual meat and potatoes you would expect... both the good and bad...
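The conditional build described above might look roughly like the following; this is a sketch, not the actual `Dockerfile` from the branch, and the `DATA_URL` build arg and `data` directory layout are assumptions for illustration.

```dockerfile
# Sketch only: the real Dockerfile in the gcp branch may differ.
FROM rust:1

# If set, the bundled data directory is replaced by an externally
# sourced tarball fetched over HTTP, so the image can be built from
# an unmodified (non-forked) checkout of the project.
ARG DATA_URL=
WORKDIR /src
COPY . .
RUN if [ -n "$DATA_URL" ]; then \
      rm -rf data && mkdir data && \
      curl -fsSL "$DATA_URL" | tar xz -C data; \
    fi \
 && cargo build --release
```

A user would then build with something like `docker build --build-arg DATA_URL=https://example.com/portier-data.tar.gz .`, while omitting the arg keeps the in-tree defaults.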
The project is far enough along to start seeking feedback from the maintainers and, if there is an appetite for it, what changes would be necessary for consideration of inclusion in the project.