Publication Services Microservice for Publishing and Information Project

hmcts/pip-publication-services

pip-publication-services

Overview

pip-publication-services is a microservice that sends notification emails to verified users, admin users and publication subscribers using GOV.UK Notify, and forwards publications to third-party publishers.

It is part of the Court and Tribunal Hearings Service (hereafter CaTH) and is written in Java with Spring Boot.

In practice, the service is usually containerised and runs in a hosted Kubernetes environment within Azure.

All interactions with pip-publication-services are performed through the API (specified in API Documentation) either as a standalone service or via connections to other microservices.

Features and Functionality

  • Generation of the following emails using the email templates stored in GOV.UK Notify:
    • Welcome emails relating to the creation of new and existing verified and admin user accounts, as well as duplicate media accounts.
    • Emails to inactive verified, admin and HMCTS IdAM users, requesting them to verify or sign in to their accounts.
    • Subscription-related emails to verified users when new publications are uploaded to CaTH, or when a user’s subscriptions are being deleted as a result of routine maintenance of locations within CaTH.
    • Notification emails to all system admin users when selected audit actions are performed by other system admins.
    • CaTH service team emails containing reports of current data contents and statistics, and alerts when a publication is uploaded with an unidentified location.
    • Emails containing a one-time password (OTP) used for user verification in the Azure B2C password reset flow.
  • Forwarding of newly uploaded publications to third-party publishers.
  • Rate limiting of the number of emails that can be sent to a user within a set interval.
  • Secure/Insecure Mode: Use of bearer tokens for authentication with the secure instance (if desired).
  • OpenAPI Spec/Swagger-UI: Documents and allows users or developers to access API resources within the browser.

GOV.UK Notify

GOV.UK Notify allows government departments to send emails, text messages and letters to their users. To use this Notify service, you will first need to create an account on its website, if you don't have one. You will also need to request access from a member of the CaTH team in order to view/edit the information on the GOV.UK Notify CaTH account. The CaTH team member will need to log in to GOV.UK Notify to invite you to join and configure your access policies according to your roles.

pip-publication-services integrates with GOV.UK Notify through its Java client library, authenticating with an API key. The API keys are stored in Microsoft Azure key vaults, which can be accessed from the Azure portal with the relevant permissions and subscriptions. There are 3 API keys configured for CaTH:

  • gov-uk-notify-api-key - Live key for sending of official CaTH emails.
  • gov-uk-notify-team-key - Key to be used by the CaTH team for local development or on non-prod environments.
  • gov-uk-notify-test-api-key - Test key with limited allowances to use for testing and local development.

To use the GOV.UK Notify email sending service, reusable email templates need to be pre-configured on the CaTH account using the Notify website. Each template contains placeholder fields which are replaced with the actual contents generated by pip-publication-services. The service uses the unique ID associated with an email template to identify and link to that template on GOV.UK Notify. When an email is requested, CaTH sets the placeholder values dynamically and sends them to the Notify service together with the intended recipient email addresses. GOV.UK Notify handles both the replacement of placeholders and the actual sending of the emails.
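
To make the placeholder mechanism concrete, here is a minimal local sketch of the substitution step. It is illustrative only: in CaTH the replacement is performed by GOV.UK Notify itself, not by the service. The class name TemplatePreview, the template text and the placeholder names are all hypothetical; the only detail taken from Notify is that placeholders are written in double brackets, such as ((first_name)).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical local preview of the substitution GOV.UK Notify performs server-side. */
final class TemplatePreview {
    // Notify marks placeholders with double brackets, e.g. ((first_name)).
    private static final Pattern PLACEHOLDER = Pattern.compile("\\(\\(([^()]+)\\)\\)");

    private TemplatePreview() { }

    /** Replaces every ((key)) in the template with its personalisation value. */
    static String render(String templateBody, Map<String, String> personalisation) {
        Matcher matcher = PLACEHOLDER.matcher(templateBody);
        return matcher.replaceAll(match -> {
            String key = match.group(1).trim();
            String value = personalisation.get(key);
            if (value == null) {
                // Assumption for this sketch: a missing value is treated as an error.
                throw new IllegalArgumentException("Missing personalisation: " + key);
            }
            // Quote the value so '$' and '\' are inserted literally.
            return Matcher.quoteReplacement(value);
        });
    }

    public static void main(String[] args) {
        String body = "Dear ((first_name)), a new publication for ((location)) is available.";
        System.out.println(render(body, Map.of("first_name", "Sam", "location", "Example Court")));
    }
}
```

The personalisation map here plays the role of the placeholder values that the service sends to Notify alongside the template ID and the recipient address.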

For any update to a GOV.UK Notify email template, such as adding new fields or changing existing placeholders, first create a copy of the existing template and make the relevant changes in the copy. Then update the code in the service to point to the new template for local testing. Once the changes have been verified and deployed, apply the changes to the main template and point the code back to the original template.

Subscription Fulfillment

CaTH verified users are able to add email subscriptions to new publications using the court name, case reference number, case name or case URN. When a publication matching a user's subscriptions is uploaded to CaTH, the user receives an email notification telling them the publication is available to view from the CaTH frontend.

pip-publication-services fulfills the subscription with the following process:

  • When a new publication is uploaded, pip-subscription-management determines all the subscribers to that publication and notifies pip-publication-services, passing in all the subscribers' emails alongside the publication's unique reference ID.
  • pip-publication-services then retrieves the publication metadata from pip-data-management and determines whether the publication has been uploaded as a flat file or in JSON format.
  • If the publication was uploaded as a flat file, it will retrieve the uploaded file from Azure blob storage through pip-data-management.
  • If the publication was uploaded in JSON format and the JSON payload is less than the set limit (currently 2MB by default), it will:
    • generate the publication summary information through pip-data-management.
    • retrieve the files in alternative publishing formats (PDF and/or Excel spreadsheet) through pip-data-management. These files are pre-generated and stored in Azure blob storage during the upload process.
  • The email will be personalised with the information above by means of placeholders. Any required files will be uploaded to GOV.UK Notify and links to download the files will be provided in the emails.
  • All the required email information will be sent to GOV.UK Notify to generate the emails for the subscribers.
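
The branching in the steps above can be sketched as a small decision function. This is not the service's actual code: the class, enum and method names are hypothetical. The behaviour for JSON payloads at or above the limit is not described here, so the sketch assumes that no summary or alternative-format files are attached in that case.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of what a subscription email is personalised with. */
final class SubscriptionEmailPlanner {
    /** Illustrative labels for the content gathered through pip-data-management. */
    enum Content { FLAT_FILE, SUMMARY, ALTERNATIVE_FORMAT_FILES }

    // 2MB, mirroring the default limit quoted in the steps above.
    static final long DEFAULT_MAX_JSON_BYTES = 2L * 1024 * 1024;

    private SubscriptionEmailPlanner() { }

    static Set<Content> plan(boolean uploadedAsFlatFile, long jsonPayloadBytes, long maxJsonBytes) {
        if (uploadedAsFlatFile) {
            // Flat file: the uploaded file itself is retrieved from blob storage.
            return EnumSet.of(Content.FLAT_FILE);
        }
        if (jsonPayloadBytes < maxJsonBytes) {
            // JSON under the limit: summary plus the pre-generated PDF and/or Excel files.
            return EnumSet.of(Content.SUMMARY, Content.ALTERNATIVE_FORMAT_FILES);
        }
        // Assumption: nothing extra is attached for JSON payloads at or over the limit.
        return EnumSet.noneOf(Content.class);
    }

    public static void main(String[] args) {
        System.out.println(plan(false, 500_000, DEFAULT_MAX_JSON_BYTES));
    }
}
```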

Third Party Publisher

When a new publication is uploaded to CaTH, pip-subscription-management retrieves all the third-party publishers for that publication. The allowed third-party users (also known as channels) for that publication are determined using the user roles in relation to the list type being published (press, CFT or crime lists). It then notifies pip-publication-services, passing in the artefact ID and the third-party API destination, so the publication can be sent to its destination.

Currently Courtel is the only third party publisher configured for CaTH but the service is able to send publications to multiple third party channels if we have more publishers.

If the publication was uploaded as a flat file, the same file will be retrieved from Azure blob storage through pip-data-management, and forwarded to Courtel in multipart/form-data format.

If the publication was uploaded in JSON format, the original content will be sent to Courtel in JSON format. The service will also retrieve the stored PDF from Azure blob storage through pip-data-management, and make a second call to the Courtel API attaching the PDF in multipart/form-data format.
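
The two cases above can be summarised as a list of outbound calls. The sketch below is illustrative: the class and record names are hypothetical, and the application/json content type for the JSON call is an assumption (the text only says the content is sent "in JSON format").

```java
import java.util.List;

/** Hypothetical summary of the calls made to a third-party publisher per publication. */
final class ThirdPartyForwardingPlan {
    /** One outbound call: what is sent and with which content type. */
    record Call(String payload, String contentType) { }

    private ThirdPartyForwardingPlan() { }

    static List<Call> callsFor(boolean uploadedAsFlatFile) {
        if (uploadedAsFlatFile) {
            // Flat file: a single call carrying the file retrieved from blob storage.
            return List.of(new Call("uploaded file", "multipart/form-data"));
        }
        // JSON: the original content first, then a second call attaching the stored PDF.
        return List.of(
            new Call("original JSON content", "application/json"),
            new Call("stored PDF", "multipart/form-data"));
    }

    public static void main(String[] args) {
        callsFor(false).forEach(System.out::println);
    }
}
```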

Courtel will also be notified if publications sent to them have been deleted by CaTH.

Rate Limiting

The number of emails that can be sent to a given user is controlled using rate limiting. This guards against denial-of-service attacks using mass mailing, as there is no limit on email sending in GOV.UK Notify. The email capacity used for rate limiting in Publication Services is divided into 2 groups: the standard capacity email group and the high capacity email group.

Standard capacity is used for the common email types which tend to be generated just once or a few times a day, such as account creation emails and media application reporting emails. For endpoints which tend to generate a high number of emails per day, high capacity is used for rate limiting. This includes the system admin update emails and the subscription emails.

Rate limiting is implemented using the token bucket algorithm, which is based on the analogy of a fixed-capacity bucket into which tokens of a predetermined size are added at a fixed rate. The bandwidth of the rate limit, i.e. the number of tokens the bucket is allowed to hold and the period within which the tokens will be fully regenerated, is pre-configured. The staging environment is set to have a much higher capacity due to the running of end-to-end tests and manual testing.
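
For readers unfamiliar with the token bucket algorithm, here is a minimal, self-contained sketch of it. It is not the implementation or library used by the service, and the class name TokenBucket is hypothetical. The caller passes in the current time in nanoseconds so that the behaviour is easy to reason about. Each email consumes one token, and the bucket regenerates from empty to full over the configured period.

```java
import java.time.Duration;

/** Minimal token bucket: `capacity` tokens, fully regenerated over `refillPeriod`. */
final class TokenBucket {
    private final long capacity;
    private final long refillPeriodNanos;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, Duration refillPeriod, long nowNanos) {
        if (capacity <= 0 || refillPeriod.isZero() || refillPeriod.isNegative()) {
            throw new IllegalArgumentException("capacity and refill period must be positive");
        }
        this.capacity = capacity;
        this.refillPeriodNanos = refillPeriod.toNanos();
        this.tokens = capacity;          // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available; returns false when the limit has been reached. */
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Adds tokens in proportion to the time elapsed, never exceeding the capacity.
    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed <= 0) {
            return;
        }
        double regenerated = (double) elapsed / refillPeriodNanos * capacity;
        tokens = Math.min(capacity, tokens + regenerated);
        lastRefillNanos = nowNanos;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(10, Duration.ofMinutes(30), System.nanoTime());
        System.out.println("First email allowed: " + bucket.tryConsume(System.nanoTime()));
    }
}
```

With a capacity of 10 and a 30-minute period (the standard-capacity defaults listed under Configuration), a user who has received 10 emails in quick succession regains one token roughly every 3 minutes.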

Redis is used to store the email count for users in a hashed cache created during application startup. If the user's email count has reached the bucket capacity, the endpoint normally returns a 429 Too Many Requests status. Where the endpoint is sending to a list of email addresses, an OK status is returned and the emails that could not be sent are logged as errors.
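
The difference between the single-recipient and multi-recipient cases can be sketched as follows. All names are hypothetical, and the Predicate stands in for the Redis-backed capacity check; the sketch only illustrates which status is returned, not how the service is actually structured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of the response status chosen when a rate limit is hit. */
final class RateLimitedSender {
    /** Outcome of a multi-recipient send: the HTTP status and the addresses skipped. */
    record BatchResult(int status, List<String> rejected) { }

    private RateLimitedSender() { }

    /** Single recipient: 429 Too Many Requests when the user has no capacity left. */
    static int sendToOne(String email, Predicate<String> hasCapacity) {
        return hasCapacity.test(email) ? 200 : 429;
    }

    /** List of recipients: always 200; addresses over their limit are logged and skipped. */
    static BatchResult sendToMany(List<String> emails, Predicate<String> hasCapacity) {
        List<String> rejected = new ArrayList<>();
        for (String email : emails) {
            if (!hasCapacity.test(email)) {
                // Stand-in for the error log entry written by the service.
                System.err.println("Rate limit reached, email not sent to " + email);
                rejected.add(email);
            }
        }
        return new BatchResult(200, List.copyOf(rejected));
    }

    public static void main(String[] args) {
        Predicate<String> hasCapacity = email -> !email.startsWith("busy");
        System.out.println(sendToOne("busy@example.com", hasCapacity));
        System.out.println(sendToMany(List.of("a@example.com", "busy@example.com"), hasCapacity));
    }
}
```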

The Redis cache entries are removed if they have not been accessed for a period of time. The cache expiry time, the email capacity and the bucket refill interval can all be controlled and overridden with environment variables.

Roles

Any endpoint that requires authentication needs to be annotated with @IsAdmin, at either controller or endpoint level.
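
The sketch below illustrates the two placement options using a stand-in annotation declared locally. It is not the service's real @IsAdmin, which is enforced by the security framework rather than by reflection; the controller classes and the AdminCheck helper are hypothetical and exist only to show that an annotation on the class covers every endpoint, while an annotation on a method covers that endpoint alone.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Stand-in for the service's annotation, declared here so the sketch compiles alone. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface IsAdmin { }

/** Controller-level placement: every endpoint in the class requires authentication. */
@IsAdmin
class WholeControllerSecured {
    public String sendEmail() { return "sent"; }
}

/** Endpoint-level placement: only the annotated method requires authentication. */
class SingleEndpointSecured {
    @IsAdmin
    public String sendEmail() { return "sent"; }

    public String health() { return "up"; }
}

final class AdminCheck {
    private AdminCheck() { }

    /** True when the method or its declaring class carries the annotation. */
    static boolean requiresAdmin(Class<?> controller, String methodName) {
        try {
            Method endpoint = controller.getMethod(methodName);
            return endpoint.isAnnotationPresent(IsAdmin.class)
                || endpoint.getDeclaringClass().isAnnotationPresent(IsAdmin.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Unknown endpoint: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(requiresAdmin(SingleEndpointSecured.class, "health"));
    }
}
```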

Architecture Diagram

Architecture Diagram for pip-publication-services

Getting Started

Prerequisites

General

  • Java JDK 21 - this is used throughout all of our services.
  • REST client of some description (e.g. Curl, Insomnia, Postman). Swagger-UI can also be used to send requests.

Local development

  • Azurite - Local Azure emulator used along with Azure Storage Explorer for local storage.
  • Azure Storage Explorer - Used for viewing and storing blobs within an Azurite instance locally.

Nice-to-haves

  • pip-dev-env - This repo provides a development environment which ensures that all microservices, as well as external services (e.g. postgres & redis), are running in tandem. It eases the development process and is particularly helpful when working with cross-service communication, as it also reduces strain on local performance from having many separate IDE windows open.

Installation

  • Clone the repository.
  • Ensure all required environment variables have been set.
  • In the newly created directory, build using the command ./gradlew clean build
  • Start the service using the command ./gradlew bootRun

Configuration

Environment Variables

Environment variables are used by the service to control its behaviour in various ways.

These variables can be found within several separate CaTH Azure key vaults. You may need to obtain access to these via a support ticket.

  • Runtime secrets are stored in pip-ss-{env}-kv, where {env} is the environment in which the given instance is running (e.g. production, staging, test, sandbox).
  • Test secrets are stored in pip-bootstrap-{env}-kv with the same convention.

Get environment variables with Python scripts

Python scripts to quickly grab all environment variables (subject to Azure permissions) are available for both runtime and test secrets.

Runtime secrets

Below is a table of the environment variables currently used for starting the service, along with a description of their purpose and whether they are optional or required.

| Variable | Description | Required? |
|---|---|---|
| SPRING_PROFILES_ACTIVE | If set to dev, the application will run in insecure mode (i.e. no bearer token authentication required for incoming requests). Note - if you wish to communicate with other services, you will need to set them all to run in insecure mode in the same way. | No |
| APP_URI | Uniform Resource Identifier - the location where the application expects to receive bearer tokens after a successful authentication process. The application then validates received bearer tokens using the AUD parameter in the token. | No |
| CLIENT_ID | Unique ID for the application within Azure AD. Used to identify the application during authentication. | No |
| TENANT_ID | Directory unique ID assigned to our Azure AD tenant. Represents the organisation that owns and manages the Azure AD instance. | No |
| CLIENT_SECRET | Secret key for authentication requests to the service. | No |
| ACCOUNT_MANAGEMENT_URL | URL used for connecting to the pip-account-management service. Defaults to staging if not provided. | No |
| DATA_MANAGEMENT_URL | URL used for connecting to the pip-data-management service. Defaults to staging if not provided. | No |
| SUBSCRIPTION_MANAGEMENT_URL | URL used for connecting to the pip-subscription-management service. Defaults to staging if not provided. | No |
| ACCOUNT_MANAGEMENT_AZ_API | Used as part of the scope parameter when requesting a token from Azure. Used for service-to-service communication with the pip-account-management service. | No |
| DATA_MANAGEMENT_AZ_API | Used as part of the scope parameter when requesting a token from Azure. Used for service-to-service communication with the pip-data-management service. | No |
| SUBSCRIPTION_MANAGEMENT_AZ_API | Used as part of the scope parameter when requesting a token from Azure. Used for service-to-service communication with the pip-subscription-management service. | No |
| NOTIFY_API_KEY | Used in the authorisation header for interaction with the GOV.UK Notify client. The API key follows the format {key_name}-{iss-uuid}-{secret-key-uuid}. When running the service locally, do not use the live key; only the test or team API key should be used. | Yes |
| PI_TEAM_EMAIL | The email address to which CaTH reporting emails (e.g. media applications, MI reports, unidentified blobs) are sent. | No |
| THIRD_PARTY_CERTIFICATE | A trust store containing the certificate for Courtel as the trusted third-party publisher. | No |
| REDIS_HOST | Hostname of the Redis instance used for rate limiting. Defaults to localhost. | No |
| REDIS_PORT | Port that the Redis instance is running on. Defaults to 6379. | No |
| REDIS_PASSWORD | Password used to connect to the Redis instance. Defaults to nothing. | No |
| RATE_LIMIT_CACHE_EXPIRY | The expiry duration (in minutes) of a rate-limiting Redis cache entry, based on the last time it was accessed. Defaults to 30 minutes. | No |
| STANDARD_MAX_EMAILS | The maximum number of emails allowed to be sent to a user in a given interval for standard capacity email types. Used for rate limiting. Defaults to 10 (per 30 minutes) for all environments except staging, where it is set to 100. | No |
| HIGH_CAPACITY_MAX_EMAILS | The maximum number of emails allowed to be sent to a user in a given interval for high capacity email types. Used for rate limiting. Defaults to 200 (per 30 minutes) for all environments except staging, where it is set to 1000. | No |
| EMAIL_RATE_LIMIT_INTERVAL | The rate limiting interval in minutes. Defaults to 30 minutes. | No |
| SUMMARY_MAX_INBOUND_SIZE | The maximum size of the input for the generation of the email summary. Defaults to 256kb. | No |

Additional Test secrets

Secrets required for getting integration tests to run correctly can be found in the below table:

| Variable | Description |
|---|---|
| CLIENT_ID | As above |
| CLIENT_SECRET | As above |
| APP_URI | As above |
| TENANT_ID | As above |
| ACCOUNT_MANAGEMENT_AZ_API | As above |
| DATA_MANAGEMENT_AZ_API | As above |
| SUBSCRIPTION_MANAGEMENT_AZ_API | As above |
| NOTIFY_API_KEY | As above. Only the test API key should be used when running integration tests. |
| PI_TEAM_EMAIL | As above |
| CLIENT_ID_FT | Client ID of external service used for authentication with publication-services application in the functional tests. |
| CLIENT_SECRET_FT | Client secret of external service used for authentication with publication-services application in the functional tests. |
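
As an illustration of how the optional rate-limiting variables fall back to their documented defaults, here is a small sketch that reads them from an environment map. The record name RateLimitSettings and its helper are hypothetical; this is only a standalone illustration of the defaults listed in the runtime secrets section, not how the service itself resolves them.

```java
import java.util.Map;

/** Hypothetical holder for the rate-limiting settings and their documented defaults. */
record RateLimitSettings(String redisHost, int redisPort, int cacheExpiryMinutes,
                         int standardMaxEmails, int highCapacityMaxEmails, int intervalMinutes) {

    /** Builds the settings from an environment map such as System.getenv(). */
    static RateLimitSettings from(Map<String, String> env) {
        return new RateLimitSettings(
            env.getOrDefault("REDIS_HOST", "localhost"),
            intOrDefault(env, "REDIS_PORT", 6379),
            intOrDefault(env, "RATE_LIMIT_CACHE_EXPIRY", 30),
            intOrDefault(env, "STANDARD_MAX_EMAILS", 10),
            intOrDefault(env, "HIGH_CAPACITY_MAX_EMAILS", 200),
            intOrDefault(env, "EMAIL_RATE_LIMIT_INTERVAL", 30));
    }

    // Falls back to the default when the variable is unset or blank.
    private static int intOrDefault(Map<String, String> env, String name, int fallback) {
        String raw = env.get(name);
        return (raw == null || raw.isBlank()) ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(from(System.getenv()));
    }
}
```

Passing the map in, rather than calling System.getenv() inside the method, keeps the defaults easy to check without touching the real environment.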

Application.yaml files

The service can also be adapted using the yaml files found in the following locations:

API Documentation

Our full API specification can be found within our Swagger-UI page. It can be accessed locally by starting the service and going to http://localhost:8081/swagger-ui/swagger-ui/index.html. Alternatively, if you're on our VPN, you can access the Swagger endpoint at our staging URL (ask a teammate to give you this).

Examples

As mentioned, the full API documentation can be found within Swagger-UI, but some of the most common operations are highlighted below.

Most communication with this service benefits from secure authentication. While it is possible to stand the service up locally in insecure mode, secure mode better simulates a production environment. Before sending any requests to the service, you'll need to obtain a bearer token using the following approach:

Requesting a bearer token

To request a bearer token, send a POST request following this template:

curl --request POST \
  --url https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token \
  --header 'Content-Type: multipart/form-data' \
  --form client_id={CLIENT_ID_FOR_ANOTHER_SERVICE} \
  --form scope={APP_URI}/.default \
  --form client_secret={CLIENT_SECRET_FOR_ANOTHER_SERVICE} \
  --form grant_type=client_credentials

You can copy the above curl command into either Postman or Insomnia, and it will automatically be converted to the relevant format for that program.

Note - the _FOR_ANOTHER_SERVICE variables need to be extracted from another registered microservice within the broader CaTH umbrella (e.g. pip-data-management).
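
For those who prefer to request the token from Java, the sketch below builds the equivalent request with the JDK's built-in HTTP client. The class name TokenRequestBuilder is hypothetical. Note one deliberate difference from the curl template: the body is sent as application/x-www-form-urlencoded, the content type documented for the Microsoft identity platform token endpoint, rather than multipart/form-data.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that builds a client-credentials token request. */
final class TokenRequestBuilder {
    private TokenRequestBuilder() { }

    /** URL-encoded form body with the same four fields as the curl template. */
    static String formBody(String clientId, String clientSecret, String appUri) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("client_id", clientId);
        fields.put("scope", appUri + "/.default");
        fields.put("client_secret", clientSecret);
        fields.put("grant_type", "client_credentials");
        return fields.entrySet().stream()
            .map(field -> encode(field.getKey()) + "=" + encode(field.getValue()))
            .collect(Collectors.joining("&"));
    }

    /** Builds (but does not send) the POST request to the tenant's token endpoint. */
    static HttpRequest build(String tenantId, String clientId, String clientSecret, String appUri) {
        URI tokenUri = URI.create(
            "https://login.microsoftonline.com/" + tenantId + "/oauth2/v2.0/token");
        return HttpRequest.newBuilder(tokenUri)
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(clientId, clientSecret, appUri)))
            .build();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the bearer token is the access_token field of the JSON response.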

Using the bearer token

You can use the bearer token in the Authorization header when making requests. Here is an example using the endpoint that sends a publication to a third-party publisher.

curl --request POST \
  --url http://localhost:8081/notify/api \
  --header 'Authorization: Bearer {BEARER_TOKEN_HERE}' \
  --header 'Content-Type: application/json' \
  --data-raw '[
    {
    	"apiDestination": "<thirdPartyUrl>",
    	"artefactId": "3fa85f64-5717-4562-b3fc-2c963f66afa6"
    }
  ]'
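
The same call can be sketched in Java with the JDK's built-in HTTP client. The class name ThirdPartyNotifyRequest is hypothetical, and the JSON body is assembled by hand to keep the example dependency-free, so it rejects destination values that would need JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.UUID;

/** Hypothetical helper that builds the authenticated POST to /notify/api. */
final class ThirdPartyNotifyRequest {
    private ThirdPartyNotifyRequest() { }

    /** Hand-built JSON array with a single entry, matching the curl payload. */
    static String jsonBody(String apiDestination, UUID artefactId) {
        if (apiDestination.contains("\"") || apiDestination.contains("\\")) {
            // Kept simple on purpose: this sketch does not implement JSON escaping.
            throw new IllegalArgumentException("apiDestination would need JSON escaping");
        }
        return "[{\"apiDestination\":\"" + apiDestination
            + "\",\"artefactId\":\"" + artefactId + "\"}]";
    }

    /** Builds (but does not send) the request, with the token in the Authorization header. */
    static HttpRequest build(String baseUrl, String bearerToken,
                             String apiDestination, UUID artefactId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/notify/api"))
            .header("Authorization", "Bearer " + bearerToken)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody(apiDestination, artefactId)))
            .build();
    }
}
```

Sending it works the same way as any other request: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).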

Deployment

We use Jenkins as our CI/CD system. Deployment can be controlled from within this repository using the various Jenkinsfile-prefixed files in its root directory.

Our builds run against our dev environment during the Jenkins build process. As this is a microservice, the build process involves standing up the service in a Docker container in a Kubernetes cluster with the current staging master copies of the other interconnected microservices.

If your debugging leads you to conclude that you need to implement a pipeline fix, this can be done in the CNP Jenkins repo.

Monitoring and Logging

We use Azure Application Insights to store our logs. Ask a teammate for the specific Azure resource to access them. Locally, we use Log4j.

In addition, this service is monitored in the production and staging environments by Dynatrace. Ask a team member for the URL of our Dynatrace instance.

Application Insights

Application Insights is configured via the lib/applicationinsights.json file. Alongside this, the Dockerfile is configured to copy in this file and also download the App Insights client.

At runtime, the client is attached as a javaagent, which allows it to send the logs to App Insights.

A connection string is used to connect to App Insights. This is configured to be read from the key vault secret mounted inside the pod.

It is possible to connect to App Insights locally, although it is somewhat tricky. The easiest way is to get the connection string from Azure, set it as an environment variable (APPLICATIONINSIGHTS_CONNECTION_STRING), and add the javaagent as a VM argument. You will also need to remove or comment out the connection string line in the config.

Security & Quality Considerations

We use a few automated tools to ensure quality and security within the service. A few examples can be found below:

  • SonarCloud - provides automated code analysis, finding vulnerabilities, bugs and code smells. Quality gates ensure that test coverage, code style and security are maintained where possible.
  • DependencyCheckAggregate - Ensures that dependencies are kept up to date and that those with known security vulnerabilities (based on the National Vulnerability Database (NVD)) are flagged to developers for mitigation or suppression.
  • JaCoCo Test Coverage - Produces code coverage metrics which allow developers to determine which lines of code are covered (or not) by unit testing. This also makes up one of SonarCloud's quality gates.
  • PMD - Static code analysis tool providing code quality guidance and identifying potential issues relating to coding standards, performance or security.
  • CheckStyle - Enforces coding standards and conventions such as formatting, naming conventions and structure.

Test Suite

This microservice is comprehensively tested using unit, integration and functional tests.

Unit tests

Unit tests can be run on demand using ./gradlew test.

Integration tests

Integration tests can be run on demand using ./gradlew integration.

For our integration tests, we are using Square's MockWebServer library. This allows us to test the full HTTP stack for our service-to-service interactions.

The mock server interacts with external CaTH services on staging.

Functional tests

Functional tests can be run on demand using ./gradlew functional.

Functional testing is performed on the stood-up publication-services instance on the dev pod (during pull request) or on staging (when running on the master branch).

This publication-services instance interacts with external CaTH services on staging.

Fortify

We use Fortify to scan for security vulnerabilities. This is run as part of our nightly pipelines.

Contributing

We are happy to accept third-party contributions. See .github/CONTRIBUTING.md for more details.

License

This project is licensed under the MIT License - see the LICENSE file for details.
