Commit

polish the doc
thiennn committed Oct 20, 2023
1 parent 3a9a864 commit 0f5814d
Showing 1 changed file with 20 additions and 16 deletions.
36 changes: 20 additions & 16 deletions docs/README.md
@@ -18,11 +18,11 @@ https://github.com/nashtech-garage/yas

## Repo setup

Entire the source code of Yas project is hosted publicly in GitHub as a monorepo and the source code of each micro-services are put in its own folder. Generally, there are two ways to organize the source code for microservices projects: multi-repos and monorepo. Multi-repos means that there are multiple repositories to host the project, each micro-service hosted in its own repo. We chose monorepo for simplicity, by this way we can have only one issue tracker to watch for entire the project. Some features or bugs require code change in multiple micro-services, with monorepo we can create one commit that can span multiple micro-services. The code is visible to everyone, so we don’t have the need of a separate access control for each micro-service.
The source code of the Yas project is publicly hosted on GitHub as a monorepo, where each microservice has its own folder. There are two common ways to organize the source code for microservices projects: multi-repos and monorepo. Multi-repos means that each microservice has its own repository, while monorepo means that all microservices share a single repository. We chose monorepo for simplicity, as it allows us to have a single issue tracker for the entire project. Some features or bugs may require code changes in multiple microservices, and with monorepo we can make a single commit that covers them all. The code is visible to everyone, so we do not need a separate access control for each microservice.

## Continuous Integration

In Yas, we use GitHub Actions, which are totally free for open-source project, to build the continuous integration pipeline. All the GitHub Actions workflows are put in `/.github/workflows` folder. Each micro-services will have its own workflow. Let look at the first part of the typical workflow: product
In Yas, we use GitHub Actions to build the continuous integration pipeline. It is free for open-source projects. All the GitHub Actions workflows are put in the `/.github/workflows` folder. Each microservice has its own workflow. Let's look at the first part of a typical workflow, the one for the product service:

```yaml
name: product service ci
@@ -44,9 +44,9 @@ on:
workflow_dispatch:
```
We use the `on` keyword to specify what event will trigger our workflow. Here we trigger the workflow when there are pushes to the main branch. As we organized the project in a single monorepo, we need to specify the paths, the workflow only run when there are changes in that paths. That mean developers push code to the order folders doesn’t trigger the workflow of the product. Next, we also would like to run the workflows in the pull requests to make sure the code changes in pull request pass all the requirement before being merged. Finally, with `workflow_dispatch` we allow the workflow can be trigger manually in GitHub UI.
We use the `on` keyword to specify which events trigger our workflow. Here we trigger the workflow when there are pushes to the main branch. Since we organized the project as a single monorepo, we need to specify the paths that are relevant to each workflow. This way, the workflow only runs when there are changes in those paths. For example, pushing code to the order folder will not trigger the product workflow. We also want to run the workflows on pull requests to make sure that the code changes pass all the requirements before being merged. Finally, `workflow_dispatch` allows the workflow to be triggered manually from the GitHub UI.
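
As a rough sketch of such a trigger block (not copied from the repository), the following assumes the product service lives in a `product` folder and that its workflow file is named `product-ci.yaml`. Both names are placeholders chosen for illustration.

```yaml
# Illustrative trigger block; folder and file names are assumptions.
on:
  push:
    branches: ["main"]
    # Only run when files under these paths change.
    paths:
      - "product/**"
      - ".github/workflows/product-ci.yaml"
  pull_request:
    branches: ["main"]
    paths:
      - "product/**"
      - ".github/workflows/product-ci.yaml"
  # Adds a "Run workflow" button in the GitHub UI.
  workflow_dispatch:
```

Listing the workflow file itself under `paths` is a common choice, so that edits to the pipeline are also exercised by the pipeline.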

Next, we will define jobs in our workflow. One GitHub Actions workflow can contain many jobs which run parallel by default. Each job runs inside its own virtual machine (runner) specified by `run-on`. In our case we only need one job. In the job we can have many steps. Each step is either a shell script or an action. Steps are executed in order and are dependent on each other. Since each step is executed on the same runner, you can share data from one step to another. For example, you can have a step that builds your application followed by a step that tests the application that was built.
Next, we will define jobs in our workflow. A GitHub Actions workflow can contain multiple jobs that run in parallel by default. Each job runs inside its own virtual machine (runner) specified by `runs-on`. In our case, we only need one job. Within the job we can have multiple steps. Each step is either a shell script or an action. Steps are executed in order and depend on each other. Since all steps run on the same runner, we can share data from one step to another. For instance, we can have a step that builds our application followed by a step that tests the application that was built.
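
To make the point about steps sharing a runner concrete, here is a minimal sketch of a single job. The job name, step names and file path are made up for illustration and do not come from the Yas workflows.

```yaml
# Minimal illustrative job; all names and paths are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Writes a file into the runner's workspace.
      - name: Build
        run: |
          mkdir -p target
          echo "built artifact" > target/app.txt
      # Runs afterwards on the same runner, so the file is still there.
      - name: Test
        run: test -s target/app.txt
```

If the two steps were split into separate jobs, each would get a fresh runner and the file would have to be passed along explicitly, for example as an artifact.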

```yaml
jobs:
@@ -89,27 +89,27 @@ jobs:
tags: ghcr.io/nashtech-garage/yas-product:latest
```

The first step in our workflow is checkout the source code. This is done by using the action `actions/checkout` version 3. The next steps we reuse some actions defined in the https://github.com/nashtech-garage/yas/blob/main/.github/workflows/actions/action.yaml file which will setup Java SDK 17 and some caches to improve the build time. We build the source code by run Maven command; run test, and export test result to be showed in the UI. There is a limitation that if there are many GitHub Action workflows are triggered by one git push, the test report is showed only in the first workflow. The issue is reported here https://github.com/dorny/test-reporter/issues/67
The first step in our workflow checks out the source code using the `actions/checkout` action, version 3. The next steps reuse some actions defined in https://github.com/nashtech-garage/yas/blob/main/.github/workflows/actions/action.yaml, which set up Java SDK 17 and some caching to improve the build time. We then build the source code by running a Maven command, run the tests, and export the test results to be shown in the GitHub UI. There is a limitation: if several GitHub Actions workflows are triggered by one git push, the test report is shown only in the first workflow. The issue is reported at https://github.com/dorny/test-reporter/issues/67

![yas-unit-test](images/yas-ci.png)


We use SonarCloud to analyze the source code. SonarCloud is free for open-source projects. To authenticate with SonarCloud, we will need the SONAR_TOKEN. After register an account on SonarCloud and add our GitHub repo to SonarCloud, we can get the SONAR_TOKEN. This SONAR_TOKEN needs to be added to repository secret in GitHub. In the repository, go to Settings –> Security –> Secrets and variables –> Actions and add new repository secret. Because the security reason, the SONAR_TOKEN is not available in pull requests from forked repos. We added the `if:` statement so that this step only run on the main branch or pull requests created from within our repo not from a fork. The SonarCloud bot will add the scanning report to every pull request as image below.
We use SonarCloud to analyze the source code. SonarCloud is free for open-source projects. To authenticate with SonarCloud, we need the SONAR_TOKEN. After registering an account on SonarCloud and adding our GitHub repo to SonarCloud, we can get the SONAR_TOKEN. This SONAR_TOKEN needs to be added as a repository secret in GitHub. In the repository, go to Settings –> Security –> Secrets and variables –> Actions and add a new repository secret. For security reasons, the SONAR_TOKEN is not available in pull requests from forked repos. We added the `if:` statement so that this step only runs on the main branch or on pull requests created from within our repo, not from a fork. The SonarCloud bot adds the scanning report to every pull request, as shown in the image below.

![yas-pr-check](images/yas-ci-check.png)
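
A condition of this kind could look roughly like the step below. This is a sketch rather than the exact step from the repository. The step name and the Maven invocation are assumptions, while `github.ref`, `github.repository` and `github.event.pull_request.head.repo.full_name` are standard GitHub Actions context values.

```yaml
# Illustrative step; name and Maven command are assumptions.
- name: Analyze with SonarCloud
  # True on the main branch, or on a pull request whose head branch
  # lives in this repository. For a pull request from a fork the
  # head repo name differs, so the step is skipped.
  if: ${{ github.ref == 'refs/heads/main' || github.event.pull_request.head.repo.full_name == github.repository }}
  env:
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
  run: mvn -B verify org.sonarsource.scanner.maven:sonar-maven-plugin:sonar
```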


The final steps are login to GitHub Packages, build and push the docker image to there. We only build and push docker image when the workflow is run in the main branch not on pull requests.
The final steps log in to GitHub Packages, then build and push the Docker images. We only build and push Docker images when the workflow runs on the main branch, not on pull requests.
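
As an illustration, these steps might be written along the following lines. The action versions, the step names and the `context` path are assumptions. The image tag is the one that appears in the workflow excerpt above.

```yaml
# Illustrative steps; versions, names and context path are assumptions.
- name: Log in to the GitHub Container Registry
  if: ${{ github.ref == 'refs/heads/main' }}
  uses: docker/login-action@v2
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push the Docker image
  # Same guard as above, so pull requests never publish an image.
  if: ${{ github.ref == 'refs/heads/main' }}
  uses: docker/build-push-action@v3
  with:
    context: ./product
    push: true
    tags: ghcr.io/nashtech-garage/yas-product:latest
```

When `GITHUB_TOKEN` is used for the login, the job typically also needs the `packages: write` permission.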

To improve the code quality of the project, we require every pull request to meet certain conditions: the build succeeds, the Sonar quality gate passes, and at least 2 developers have reviewed and approved it. Otherwise, the Merge button is blocked.

## Authentication and Authorization

Authentication is hard. Many developers have been struggling to find better ways to secure the browser-based application, especially with SPAs. Traditionally, websites use cookies to authenticate user requests, then with SPAs people moved to using token for authentication. Lets review how the cookies and token authentication works and the differences between them
Authentication is a challenging task for many developers who want to secure their browser-based applications, especially SPAs. Traditionally, websites use cookies to authenticate user requests, but with SPAs, people moved to using tokens for authentication. Let's review how cookie and token authentication work and how they differ.

#### Cookies-based authentication

Cookies are small pieces of data created by a web server and placed on the user’s web browser. The browser will automatically send them for subsequence requests in the same domain. Authentication cookies are used by web servers to authenticate that a user is logged in.
Cookies are small pieces of data created by a web server and placed on the user’s web browser. The browser will automatically send them with subsequent requests to the same domain. Authentication cookies are used by web servers to verify that a user is logged in.

##### The advantages

@@ -118,28 +118,28 @@ Cookies are small pieces of data created by a web server and placed on the user

##### The downside

- It is vulnerable to cross-site request forgery attacks (XSRF or CSRF). Although there are workarounds to mitigate this threat, the risk still there. Recently the major browsers have introduced SameSite attribute that allow us to decide whether cookies should be sent to third-party websites using the Strict or Lax setting.
- It is vulnerable to cross-site request forgery attacks (XSRF or CSRF). Although there are workarounds to mitigate this threat, the risk is still there. Recently, the major browsers have introduced the SameSite attribute, which allows us to decide whether cookies should be sent to third-party websites using the Strict or Lax setting.
- Cookies are not friendly to REST APIs
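
For a Spring Boot WebFlux application (the stack Spring Cloud Gateway runs on), the SameSite setting of the session cookie can usually be configured through properties. The fragment below is a sketch under that assumption. It is not taken from the Yas configuration, and the property names should be checked against the Spring Boot reference for the version in use.

```yaml
# Illustrative Spring Boot settings; assumes a WebFlux-based application.
server:
  reactive:
    session:
      cookie:
        same-site: lax    # "strict" is the more restrictive option
        http-only: true   # hide the cookie from JavaScript
        secure: true      # only send the cookie over HTTPS
```

With Lax, the cookie is still sent on top-level navigations from other sites, while Strict withholds it on every cross-site request.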

#### Token-based authentication

The web browser will receive a token from the web server after it has verified the user’s login detail. Then in subsequent requests, that token will be sent to server as an authentication header.
The web browser will receive a token from the web server after it has validated the user’s login details. Then in subsequent requests, that token will be sent to the server in the `Authorization` header.

##### The advantages

- Unlike cookies, token is not automatically received or sent to server. It has to be done by JavaScript. Therefore, it is invulnerable to cross-site request forgery attacks (CSRF)
- Unlike cookies, a token is not automatically stored or sent to the server by the browser. It has to be handled by JavaScript. Therefore, it is not vulnerable to cross-site request forgery attacks (CSRF)
- Tokens work well with REST APIs

##### The disadvantages

- Because the token must be read and sent by JavaScript so it is vulnerable to cross-site scripting (XSS)
- Granting, storing and renewing token is complicated. In 2012 when the OAuth2 RFC was released, the implicit flow is the recommended way for SPAs. However, it has many drawbacks, the main concern is that the access token is delivered to browser via a query string in the redirect URI, which is visible in the browser’s address bar, the browsers history. The access token can also be maliciously injected. The implicit flow is deprecated by code flow with PKCE. Regarding to any approaches, the token has to be stored in the browser, and this is a risk.
- Because the token must be read and sent by JavaScript, it is vulnerable to cross-site scripting (XSS)
- Granting, storing and renewing tokens is complicated. In 2012, when the OAuth2 RFC was released, the implicit flow was the recommended way for SPAs. However, it has many drawbacks, the main concern being that the access token is delivered to the browser in the fragment of the redirect URI, which is visible in the browser’s address bar and in the browser’s history. The access token can also be maliciously injected. The implicit flow has been superseded by the authorization code flow with PKCE. Whatever the approach, the token has to be stored in the browser, which is a risk.

In Yas, we use SameSite cookies and token together with backend for frontend (BFF) pattern with Spring Cloud Gateway. We also use Keycloak as the authentication provider.
In Yas, we use SameSite cookies and tokens together with the backend for frontend (BFF) pattern with Spring Cloud Gateway. We also use Keycloak as the authentication provider.

![spa-authetnication-bff-yas](images/yas-authen-bff.png)

The BFF work as a reverse proxy for both Next.js and resource servers behind. The authentication between Browser and BFF is done by cookies. The BFF takes the OAuth2 client role and authenticate with Keycloak by OAuth2 code flow using spring-boot-starter-oauth2-client. When received the access token, BFF keeps it in memory and automatically append it along with api requests to resource servers. With this implementation, we can take out the risk of storing token in the browsers. Renewing tokens also handled automatically by the OAuth2 client. Below is the excerpt of the pom.xml of the backoffice-bff
The BFF acts as a reverse proxy for both Next.js and the resource servers behind it. The authentication between the browser and the BFF is done by cookies. The BFF plays the role of the OAuth2 client and authenticates with Keycloak through the OAuth2 authorization code flow, using spring-boot-starter-oauth2-client. When it receives the access token, the BFF keeps it in memory and automatically attaches it to API requests sent to the resource servers. With this implementation, we can eliminate the risk of storing tokens in the browser. Renewing tokens is also handled automatically by the OAuth2 client. Below is an excerpt of the pom.xml of the backoffice-bff:

```xml
<dependency>
@@ -187,6 +187,10 @@ spring:

## Change Data Capture (CDC) with Debezium

We use Debezium to capture changes in some tables, and those changes are pushed to Kafka topics. A background job listens to those topics, receives the ids of the products whose data has changed, calls the product REST API to get the product information, and updates Elasticsearch.

Debezium acts as a source connector for Kafka Connect. It captures row-level inserts, updates, and deletes that were committed to a PostgreSQL database. The connector generates data change event records and streams them to Kafka topics.

## Product searching with Elasticsearch

## Duplicating data to improve performance