
Adding Disaster Recovery Support to Mushop Catalogue #444

Open · wants to merge 8 commits into base: master
Conversation


@naikvenu naikvenu commented Oct 2, 2024

This change aims to support the MuShop disaster recovery use case. It is done only for the catalogue service, due to the effort and complexity involved with other services like carts, orders, and user; showcasing this capability with catalogue should serve the immediate purpose.

  • The code changes are done for catalogue: a failover condition between primary and standby was added.
  • This use case relies on TNS_ADMIN, and sqlnet.ora needs to be changed from:

    WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY="?/network/admin")))
    SSL_SERVER_DN_MATCH=yes

    to:

    WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY="${TNS_ADMIN}")))

    This is a manual step that will be added to the docs for users who want to try the DR use case.

  • All the values for catalogue are separated out and shown as an example under values-dr.yaml.

  • Under deploy/complete/helm-chart/mushop/charts/catalogue/values.yaml, I need your suggestion on whether this image should be separated under a :dr tag or be part of the latest build itself. If it goes into the latest build, then the wallet changes need to be done for catalogue; I will provide the steps for that.

@naikvenu naikvenu requested a review from junior as a code owner October 2, 2024 21:24
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Oct 2, 2024

@junior junior left a comment


Main issues:

  • Oracle client downgraded
  • Hardcoded wallet version location in multiple places
  • Code changes on the service should come first and be tested before the Kubernetes changes
  • The same service is reused by different applications (monolith and microservice)
  • DR is not the main use case
  • The service image location should be the common location
  • The service should be versioned, with the build automated and pushed to the registry (you can use your fork and configure it to push to your own container registry before approval and submission to the main repo)

{{- $secretPrefix := (and .Values.osb.atp .Chart.Name) | default (and $globalOsb.atp ($globalOsb.instanceName | default "mushop")) | default .Chart.Name -}}
{{- $connectionSecret := (and $usesOsbDb (printf "%s-oadb-connection" $secretPrefix)) | default .Values.oadbConnectionSecret | default (.Values.global.oadbConnectionSecret | default (printf "%s-oadb-connection" $secretPrefix)) -}}
{{- $credentialSecret := (and $usesOsbDb (printf "%s-oadb-credentials" $secretPrefix)) | default .Values.oadbUserSecret | default (printf "%s-oadb-credentials" $secretPrefix) -}}
{{- $primaryWalletPath := .Values.env.primary_oadb_wallet_path | default "/usr/lib/oracle/19.3/client64/lib/network/admin" -}}

The wallet location cannot be hardcoded, as the version can change; the use of 19.10 will change very soon. Also, why downgrade from 19.10 to 19.3? Remember that the Oracle Client needs to support both amd64 and arm64, as MuShop can use Ampere shapes today.
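One way to address this comment (a sketch only; the value name oadb_wallet_path is illustrative, not the agreed design) is to move the full wallet path into the chart values, so neither the client version nor the architecture is baked into the templates:

```yaml
# charts/catalogue/values.yaml (illustrative sketch)
env:
  # Full client wallet path, overridable per deployment so a client upgrade
  # or an arm64 image does not require editing the templates.
  oadb_wallet_path: /usr/lib/oracle/19.10/client64/lib/network/admin
```

Templates and volume mounts would then reference `{{ .Values.env.oadb_wallet_path }}` instead of repeating the versioned path in multiple places.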

secretKeyRef:
name: {{ $connectionSecret }}
key: oadb_service
- name: PRIMARY_OADB_WALLET_PATH

You are assuming that ALL users will be using a DR environment. We should not rename everything to "PRIMARY_xxx"; just leave it as normal and include the option of a standby.
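A sketch of what this comment suggests for the deployment env block (the value and variable names for the standby are illustrative, not a settled API): the existing variable stays unchanged, and the standby one is added only when configured.

```yaml
# deployment template sketch: keep the default name, make standby optional
env:
  - name: OADB_WALLET_PATH            # unchanged for the regular single-DB case
    value: {{ .Values.env.oadb_wallet_path }}
  {{- if .Values.env.standby_oadb_wallet_path }}
  - name: STANDBY_OADB_WALLET_PATH    # only rendered for the DR use case
    value: {{ .Values.env.standby_oadb_wallet_path }}
  {{- end }}
```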

Comment on lines 131 to 134
mountPath: /usr/lib/oracle/19.3/client64/lib/network/admin/
readOnly: true
- name: standby-wallet
mountPath: /usr/lib/oracle/19.3/client64/lib/network/admin/standby

Again, the location (mainly the version) cannot be hardcoded.


Also, why downgrade from 19.10 to 19.3?

Comment on lines 4 to 6
# This is temporary repo where i have built the container.
repository: syd.ocir.io/apaccpt03/demo/mushop-catalogue
tag: latest

We cannot have the images separated. The official image is iad.ocir.io/oracle/ateam/mushop-catalogue; you should just version the service. We cannot use an image from Sydney in a random tenancy, and we do not use the latest tag precisely because of this kind of change. As this is a major update, it should be 2.0.0. Also, the source code should be on the repo before the image.
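Following this comment, the chart would keep the common registry and pin a version instead of latest (a sketch, using the 2.0.0 version the reviewer suggests):

```yaml
image:
  repository: iad.ocir.io/oracle/ateam/mushop-catalogue
  tag: 2.0.0
```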

@@ -27,6 +28,8 @@ resources:

env:
zipkin:
primary_oadb_wallet_path: /usr/lib/oracle/19.3/client64/lib/network/admin/

We should not call it "primary", as if it were the standard wallet. 99% of usage will be with a single regular DB. The standby is extra and optional, just for the DR use case.


The code changes on the service should be in a separate PR, prior to the configuration changes on Kubernetes. Remember that the same service is used by MuShop basic (compute), MuShop complete (Kubernetes), and standalone in several labs.


// Check if DB connection can be made, only for logging purposes, should not fail/exit
// Test the connection to the primary database

It should be clear that this is the main or primary DB, as most use cases will not use DR.

db, err := sqlx.Open("godror", *connectString)
if err != nil {
logger.Log("err", err)
os.Exit(1)
logger.Log("Error", "Failed to open Primary Database connection", "CONNECTSTRING", *connectString)

We should not say "primary" in the logs, as it can cause confusion.

err = db.Ping()
if err != nil {
logger.Log("Error", "Unable to connect to Database", "CONNECTSTRING", connectString)
logger.Log("Error", "Unable to connect to Primary Database", "CONNECTSTRING", *connectString)

Same here. Also, the log should show that it tried the main DB and is then trying the standby.

@naikvenu (Contributor Author)

> Main issues:
>
>   • Oracle client downgraded
>   • Hardcoded wallet version location in multiple places
>   • Code changes on the service should come first and be tested before the Kubernetes changes
>   • The same service is reused by different applications (monolith and microservice)
>   • DR is not the main use case
>   • The service image location should be the common location
>   • The service should be versioned, with the build automated and pushed to the registry

I have made changes in such a way that both the normal and DR setups work together. With these changes I am now handling the DB initialization part as well; there was a flag missing in catalogue-job.yaml, which I have added.
