Ensure Whitehall assets are being served from the asset host #348

Closed
chrisroos opened this issue Dec 15, 2017 · 9 comments

@chrisroos
Contributor

chrisroos commented Dec 15, 2017

Assets in Whitehall can be accessed using both www.gov.uk and assets.publishing.service.gov.uk. These two URLs respond with the same image of Theresa May, for example:

The majority of asset URLs in the wild use the asset host but there are some using www.gov.uk. The "Heathrow expansion: revised draft Airports National Policy Statement" consultation contains a link to this response form that uses www.gov.uk, for example.

We want to serve all assets using the asset host only so that we can configure things like caching and routing in a single place.

I've extracted this from issue #297 (Migrate Consultation Response Forms to Asset Manager) as it affects more than just consultation response forms.

@chrisroos
Contributor Author

chrisroos commented Dec 15, 2017

I was originally planning to do this by adding a redirect in the Whitehall routes file, but @danielroseman has suggested that we might be able to do it by adding a redirect route to the router instead.

I opened alphagov/govuk-puppet#6890 in preparation for adding the redirect to Whitehall and my WIP branch is https://github.com/alphagov/whitehall/tree/redirect-asset-requests-to-asset-host.

@chrisroos
Contributor Author

I've investigated implementing this redirect by adding routes to the router and it certainly looks possible. I think we'd need to add a route for each of the following folders on disk that are currently being served by the PublicUploadsController in Whitehall:

  • /government/uploads/government/uploads/system/uploads/attachment_data
  • /government/uploads/government/uploads/system/uploads/edition_organisation_image_data
  • /government/uploads/government/uploads/system/uploads/image_data
  • /government/uploads/government/uploads/system/uploads/person
  • /government/uploads/system/uploads/attachment
  • /government/uploads/system/uploads/classification_featuring_image_data
  • /government/uploads/system/uploads/consultation_response_form
  • /government/uploads/system/uploads/consultation_response_form_data
  • /government/uploads/system/uploads/default_news_organisation_image_data
  • /government/uploads/system/uploads/edition_organisation_image_data
  • /government/uploads/system/uploads/edition_world_location_image_data
  • /government/uploads/system/uploads/feature
  • /government/uploads/system/uploads/image_data
  • /government/uploads/system/uploads/news_article
  • /government/uploads/system/uploads/organisation
  • /government/uploads/system/uploads/person
  • /government/uploads/system/uploads/promotional_feature_item
  • /government/uploads/system/uploads/take_part_page
  • /government/uploads/system/uploads/topical_event
  • /government/uploads/uploaded/hmrc
  • /government/uploads/uploaded/number10

Note that we don't have to add a redirect for /government/uploads/system/uploads/attachment_data as that will continue to be served by Whitehall's AttachmentsController in the short term.
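
Registering one route per folder could be scripted along these lines. This is a minimal sketch under assumptions: the `add_redirect_route` arguments mirror the call used in the integration test below, `UPLOAD_PREFIXES` is a shortened subset of the list above, and `ASSET_HOST`, `EXCLUDED_PREFIXES`, `add_asset_redirects` and `RecordingRouter` are hypothetical names. The recording stub stands in for `GdsApi::Router` so that the snippet runs without the gem or a router-api instance.

```ruby
# Assumed destination host, taken from the integration test in this comment.
ASSET_HOST = 'https://assets-origin.integration.publishing.service.gov.uk'

# Illustrative subset of the folders listed above.
UPLOAD_PREFIXES = [
  '/government/uploads/government/uploads/system/uploads/attachment_data',
  '/government/uploads/system/uploads/attachment_data',
  '/government/uploads/system/uploads/person',
  '/government/uploads/uploaded/number10'
].freeze

# Still served by Whitehall's AttachmentsController, so no redirect is added.
EXCLUDED_PREFIXES = ['/government/uploads/system/uploads/attachment_data'].freeze

# Hypothetical stand-in for GdsApi::Router: records calls instead of sending them.
class RecordingRouter
  attr_reader :calls

  def initialize
    @calls = []
  end

  def add_redirect_route(path, type, destination, redirect_type, options = {})
    @calls << [path, type, destination, redirect_type, options]
  end
end

# Adds one prefix redirect per folder, skipping the excluded folders.
def add_asset_redirects(router, prefixes, asset_host)
  (prefixes - EXCLUDED_PREFIXES).each do |prefix|
    router.add_redirect_route(prefix, 'prefix', "#{asset_host}#{prefix}", 'permanent',
                              segments_mode: 'preserve', commit: true)
  end
end

router = RecordingRouter.new
add_asset_redirects(router, UPLOAD_PREFIXES, ASSET_HOST)
router.calls.each { |call| puts "#{call[0]} -> #{call[2]}" }
```

Passing the client in as an argument keeps the loop easy to dry-run against a stub before pointing it at a real router client.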

I've tested this in integration by adding a redirect for people images:

$ ssh backend-1.backend.integration
$ govuk_app_console asset-manager

irb> require 'gds_api/router'
irb> router_api = GdsApi::Router.new(Plek.find('router-api'))
irb> router_api.add_redirect_route('/government/uploads/system/uploads/person', 'prefix', 'https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person', 'permanent', segments_mode: 'preserve', commit: true)
# Note that I've stripped the basic authentication username and password from the curl command

$ curl -v "https://www-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg"

> GET /government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg HTTP/1.1
> Host: www-origin.integration.publishing.service.gov.uk
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: nginx
< Date: Mon, 18 Dec 2017 15:38:21 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 153
< Connection: keep-alive
< Cache-Control: max-age=1800, public
< Expires: Mon, 18 Dec 2017 16:08:21 UTC
< Location: https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg
< Accept-Ranges: bytes
< X-Varnish: 2059978690
< Age: 0
< Via: 1.1 varnish
< X-Cache: MISS
< Strict-Transport-Security: max-age=31536000
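
The `Location` header above is consistent with a prefix redirect that appends the rest of the request path to the destination. A minimal sketch of that mapping follows, assuming this is what `segments_mode: 'preserve'` does (inferred from the observed response, not from the router's source); `preserved_location` is a hypothetical helper.

```ruby
# Returns the redirect target for a request path under a prefix route,
# or nil when the path does not fall under the prefix.
def preserved_location(request_path, prefix, destination)
  under_prefix = request_path == prefix || request_path.start_with?("#{prefix}/")
  return nil unless under_prefix

  # Keep whatever follows the prefix and append it to the destination.
  destination + request_path[prefix.length..-1]
end

prefix = '/government/uploads/system/uploads/person'
destination = 'https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person'
path = '/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg'

puts preserved_location(path, prefix, destination)
```

With the values from the test above, the helper returns the same URL as the `Location` header in the response.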

@chrisroos
Contributor Author

I've opened alphagov/whitehall#3627 to do this redirection in the Whitehall app.

@chrisroos
Contributor Author

I've merged alphagov/whitehall#3627 and will test the behaviour once it's been deployed to integration.

@chrisroos
Contributor Author

chrisroos commented Dec 20, 2017

This has been deployed to integration and I've confirmed that it's working as expected:

# Note that I've removed the username/password from the curl command below.

$ curl -v -s "https://www-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg" > /dev/null

< HTTP/1.1 302 Found
< Server: nginx
< Date: Wed, 20 Dec 2017 17:49:17 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 203
< Connection: keep-alive
< Cache-Control: no-cache
< Location: https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg
< Strict-Transport-Security: max-age=31536000
< Via: 1.1 router
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-Frame-Options: DENY
< X-Request-Id: 055aa8ed-1be9-499d-b7fd-50348a982002
< X-Xss-Protection: 1; mode=block
< Accept-Ranges: bytes
< X-Varnish: 2060347186
< Age: 0
< Via: 1.1 varnish
< X-Cache: MISS

I'll aim to get it deployed to staging and production tomorrow.
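
A check like the one above could be automated by parsing the response lines that `curl -v` prefixes with `< `. This is a minimal sketch rather than anything used in this work: `parse_curl_response` and `redirects_to?` are hypothetical helpers, and the sample is abbreviated from the output above. Note that curl writes its verbose output to stderr, so it would need capturing from there.

```ruby
# Extracts the status code and headers from `curl -v` output.
# Only lines starting with "< " (the response) are considered.
# When a header is repeated, the last value wins.
def parse_curl_response(verbose_output)
  lines = verbose_output.lines.map(&:chomp)
  lines = lines.select { |line| line.start_with?('< ') }.map { |line| line[2..-1] }

  status_line = lines.shift
  status = status_line.to_s.split(' ')[1].to_i

  headers = {}
  lines.each do |line|
    name, value = line.split(':', 2)
    next if value.nil?

    headers[name.downcase] = value.strip
  end

  { status: status, headers: headers }
end

# True when the response is a 301 or 302 pointing at the expected URL.
def redirects_to?(response, expected_location)
  [301, 302].include?(response[:status]) &&
    response[:headers]['location'] == expected_location
end

sample = <<~OUTPUT
  < HTTP/1.1 302 Found
  < Server: nginx
  < Cache-Control: no-cache
  < Location: https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg
  < Via: 1.1 router
OUTPUT

response = parse_curl_response(sample)
expected = 'https://assets-origin.integration.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg'
puts response[:status]
puts redirects_to?(response, expected)
```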

@floehopper
Contributor

@chrisroos:

> I'll aim to get it deployed to staging and production tomorrow.

I've just merged #364 and I'd like to test those changes on integration before they are deployed to staging/production.

@chrisroos
Contributor Author

chrisroos commented Dec 21, 2017

I've realised that alphagov/whitehall#3627 will also redirect requests for HMRC's Basic PAYE Tools files. We don't know how the Basic PAYE Tools software handles redirects, so I've opened alphagov/whitehall#3636 to explicitly avoid redirecting these requests.

We're tracking the migration of HMRC assets in issue #217.
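
The exemption can be pictured as a path check made before redirecting. This is only a sketch of the idea, not the code in alphagov/whitehall#3636: `HMRC_PREFIX` comes from the folder list earlier in this thread, while `PRODUCTION_ASSET_HOST`, `UPLOADS_PREFIX` and `asset_host_redirect_for` are hypothetical names, and other exceptions (such as attachment data) are left out.

```ruby
PRODUCTION_ASSET_HOST = 'https://assets.publishing.service.gov.uk'
UPLOADS_PREFIX = '/government/uploads'
HMRC_PREFIX = '/government/uploads/uploaded/hmrc'

# Returns the asset-host URL to redirect to, or nil when the request
# should continue to be served without a redirect.
def asset_host_redirect_for(path)
  return nil unless path.start_with?("#{UPLOADS_PREFIX}/")
  # HMRC files are exempt because we don't know how their client handles redirects.
  return nil if path == HMRC_PREFIX || path.start_with?("#{HMRC_PREFIX}/")

  "#{PRODUCTION_ASSET_HOST}#{path}"
end

hmrc_path = '/government/uploads/uploaded/hmrc/realtimepayetools-update.xml'
person_path = '/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg'

puts asset_host_redirect_for(hmrc_path).inspect
puts asset_host_redirect_for(person_path)
```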

@chrisroos
Contributor Author

The changes in alphagov/whitehall#3627 and alphagov/whitehall#3636 have now been deployed to production.

I've confirmed that non-HMRC assets are now being redirected to the asset host, and that HMRC assets are still served from www.gov.uk, as expected:

# Requests for an HMRC asset continue to be served from www.gov.uk
$ curl -I "https://www.gov.uk/government/uploads/uploaded/hmrc/realtimepayetools-update.xml"
HTTP/1.1 200 OK
<snipped>

# Requests for a non-HMRC asset are redirected to the asset host
$ curl -I "https://www.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg"
HTTP/1.1 302 Found
Location: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/person/image/6/s216_PM_portrait_960x640.jpg
<snipped>

While chatting to @floehopper, we realised that we're using a temporary (302) redirect where we should probably be using a permanent (301) redirect. I've captured this in issue #366.

@chrisroos
Contributor Author

I think this is all done for now. Closing.
