[Escalation] SPIKE: e2e pipeline blocker #30822

Closed
sfreudenthaler opened this issue Dec 2, 2024 · 6 comments · Fixed by #30932

Comments

@sfreudenthaler
Contributor

sfreudenthaler commented Dec 2, 2024

TIMEBOX 1 day (fix if you can)

Placeholder

Raised by @bryanboza: his e2e rock is blocked because he's having trouble getting the e2e tests into the pipeline. This needs significant refinement before we can pick it up.

Possibly a spike if it's not an easy fix.

We need to pair on it live with someone because there are knowledge gaps on the team.

@sfreudenthaler sfreudenthaler moved this to Current Sprint Backlog in dotCMS - Product Planning Dec 2, 2024
@sfreudenthaler sfreudenthaler moved this from Current Sprint Backlog to Next 1-3 Sprints in dotCMS - Product Planning Dec 2, 2024
@spbolton
Contributor

spbolton commented Dec 2, 2024

Configuration Discrepancies and Testing Strategy

Context

Currently, there is a discrepancy between the environment configuration used during CI (integration/Postman tests) and the Docker Compose setup provided for running dotCMS. While Maven-based configurations ensure consistency and repeatability in CI, the Docker Compose file, used for local development or example setups, does not undergo similar testing. This raises concerns about configuration drift, reliability, and how the environment aligns with what customers might use.


Key Issues and Questions

1. Configuration Consistency Across Testing and Deployment

  • Problem: The Maven configuration in CI is distinct from the Docker Compose setup. Without testing the latter, there’s no assurance that it works as expected with default configurations.
  • Questions:
    • Should integration and Postman tests use the same configuration as a default customer setup (via Docker Compose)?
    • If different configurations are needed, where and how should these be tested?
    • Are we tailoring the tests to ensure they pass, instead of reflecting real-world setups?

2. Centralized Source of Truth for Configuration

  • Problem: Configuration exists in multiple places—Maven POM, Docker Compose, and potentially individual developer setups. This creates the risk of divergence and inconsistencies.
  • Questions:
    • Can we generate a Docker Compose file dynamically from the Maven configuration to establish a single source of truth?
    • Alternatively, can the Maven configuration rely on a base Docker Compose file and extend it for testing purposes?

3. Alignment Between Parent POM and Runtime Behavior

  • Problem: The parent POM defines a base configuration for running dotCMS, extended in submodules like Postman and E2E tests. However, these configurations may differ from runtime behavior in Docker Compose.
  • Questions:
    • Should we standardize around default values in the parent POM and remove overrides in test modules?
    • How do these runtime behaviors differ from Docker Compose, and what adjustments are necessary?

4. Testing the Delivered Docker-Compose Configuration

  • Problem: The Docker Compose file provided to customers is not explicitly tested. There is no mechanism to ensure that this configuration remains functional as changes are introduced.
  • Questions:
    • Do we need separate testing to validate the Docker Compose setup?
    • Should E2E tests be used to verify this configuration, or should it be tested through another mechanism?
    • How do we ensure the Docker Compose file is always in a working state?

Key Decisions to Address

  1. Validity of Current Testing Configurations

    • Are we testing with a configuration that reflects real-world use cases or customer environments? If not, adjustments should be prioritized to improve reliability.
  2. Synchronization of Configuration

    • How do we ensure synchronization between CI, Docker Compose, and development environments?
    • Centralizing configuration management (e.g., environmental profile-based setups) could eliminate duplication and reduce complexity.
  3. Eliminating Custom Configurations

    • Can we remove one-off configurations in favor of standardized, profile-driven setups that adapt based on the environment (e.g., CI vs. local development)?
  4. Testing the Docker Compose Configuration

    • Should we introduce automated tests (e.g., E2E or smoke tests) to validate the Docker Compose configuration as part of CI/CD?
    • Alternatively, should there be a dedicated process or task to ensure the Docker Compose file is functional and in sync with Maven-based configurations?

Possible Solutions

1. Unified Testing Approach

  • Extend integration and Postman tests to include scenarios run against Docker Compose.
  • Define a process to validate default Docker Compose setups in CI.

2. Centralized Configuration

  • Introduce a script or mechanism to generate Docker Compose files from Maven configurations (or vice versa) to maintain a single source of truth.
  • Modify the Maven setup to extend from a baseline Docker Compose configuration.

3. Environmental Profiles

  • Adopt a profile-based configuration system where defaults are defined centrally, and overrides are applied for specific environments (e.g., CI, local dev).
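
One way to picture the profile idea is a single table of defaults plus small per-environment override tables. The sketch below is only an illustration: the keys, values, and profile names are hypothetical and are not taken from the dotCMS POM or Docker Compose file.

```python
from typing import Dict, Mapping

# Hypothetical central defaults; these keys and values are placeholders,
# not real dotCMS settings.
DEFAULTS: Dict[str, str] = {
    "DB_HOST": "db",
    "SEARCH_HOST": "opensearch",
    "STARTUP_TIMEOUT_S": "120",
}

# Hypothetical per-environment overrides. Each profile lists only what differs.
PROFILES: Dict[str, Dict[str, str]] = {
    "local": {},
    "ci": {"STARTUP_TIMEOUT_S": "300"},
}


def resolve_config(
    profile: str,
    defaults: Mapping[str, str] = DEFAULTS,
    profiles: Mapping[str, Mapping[str, str]] = PROFILES,
) -> Dict[str, str]:
    """Return the defaults with the named profile's overrides applied on top."""
    overrides = profiles[profile]
    # Reject overrides for keys that have no default, so a typo in a profile
    # cannot silently create a setting that exists in one environment only.
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"profile {profile!r} overrides unknown keys: {sorted(unknown)}")
    return {**defaults, **overrides}
```

Rejecting unknown keys is a design choice in this sketch: it keeps every override traceable to a centrally defined default.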

4. Testing Workflow Refinement

  • Establish a task to review and synchronize configurations between Maven and Docker Compose during development.
  • Automate checks for consistency to avoid manual tracking.
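
An automated consistency check could be as small as a key-by-key comparison that fails the build when the two configurations disagree. The sketch below assumes both sides have already been extracted into flat key/value mappings; how to extract them from the POM and the Compose file is left out, and the function name is hypothetical.

```python
from typing import Dict, Mapping, Optional, Tuple


def config_drift(
    maven: Mapping[str, str],
    compose: Mapping[str, str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (maven_value, compose_value)} for every key that differs.

    None marks a key that is missing on that side.
    """
    drift: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for key in sorted(set(maven) | set(compose)):
        left, right = maven.get(key), compose.get(key)
        if left != right:
            drift[key] = (left, right)
    return drift
```

A CI step could call `config_drift(...)` and exit non-zero when the result is not empty, which replaces manual tracking with a failing check.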

5. Testing Delivered Docker Compose

  • Option 1: Use E2E tests to validate the Docker Compose configuration explicitly. This ensures that the customer-delivered setup works as expected.
  • Option 2: Create a lightweight testing process (e.g., smoke tests) focused solely on verifying that the Docker Compose file is functional and can start services correctly.
  • Option 3: Include Docker Compose validation as part of the CI/CD pipeline to ensure every release is tested against the delivered configuration.
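
A lightweight smoke test along the lines of Option 2 might look like the sketch below. It assumes Docker Compose v2 (`docker compose up --wait` waits for services to be running or healthy) and that the Compose file defines health checks; the function name and the `check` callback are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def smoke_test_compose(
    compose_file: str,
    check: Callable[[], bool],
    run: Runner = _run,
) -> bool:
    """Start the stack, run one functional check, and always tear it down."""
    base = ["docker", "compose", "-f", compose_file]
    try:
        # --wait blocks until services are running/healthy or startup fails.
        if run(base + ["up", "--detach", "--wait"]) != 0:
            return False
        return check()
    finally:
        # Remove containers and volumes so the next run starts clean.
        run(base + ["down", "--volumes"])
```

The `run` parameter exists only so the command sequence can be exercised without Docker installed; a real pipeline would use the default.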

Questions to Resolve

  • Which configuration should be the primary source of truth—Maven or Docker Compose?
  • How much flexibility is needed in configurations to accommodate different testing and runtime environments?
  • Should developers' local environments default to Docker Compose, with CI ensuring compatibility?
  • How do we best integrate testing for the Docker Compose file into our workflow to ensure it remains functional and reliable?
  • How do we maintain a consistent, composable design that allows structured overrides of default values only where needed, and only in a limited set of environments, to prevent configuration inconsistencies and an inability to change?

@sfreudenthaler sfreudenthaler changed the title e2e pipeline blocker [Escalation] e2e pipeline blocker Dec 3, 2024
@sfreudenthaler sfreudenthaler changed the title [Escalation] e2e pipeline blocker [Escalation] SPIKE: e2e pipeline blocker Dec 10, 2024
@spbolton spbolton moved this from Next 1-3 Sprints to Current Sprint Backlog in dotCMS - Product Planning Dec 11, 2024
@spbolton spbolton self-assigned this Dec 11, 2024
@spbolton spbolton linked a pull request Dec 11, 2024 that will close this issue

@sfreudenthaler sfreudenthaler moved this from Current Sprint Backlog to In Review in dotCMS - Product Planning Dec 11, 2024
@spbolton
Contributor

I think this PR resolved the e2e test issues: https://github.com/dotCMS/core/actions/runs/12172665490. As I suspected, there seems to have been a timing issue: the requests were being made before the system was running. As I mentioned to Bryan, though, adding a delay or waiting for state may stop the tests from breaking, but it may also indicate that on startup we get into a state where the server is accepting requests but is not really ready yet. Whether we hit this state may have a level of randomness and depend on server performance. Without digging in more it is hard to debug this exactly. I have a suspicion, from indications in the log, that we may be asking for system notifications very early during startup and getting an error back; this error may be breaking the JS, causing side effects like the login button not being enabled.
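
As a rough illustration of "waiting for state" rather than sleeping for a fixed time, the sketch below polls a probe and only reports ready after several consecutive successes, which is one way to guard against a server that accepts requests before it is really up. This is not what the PR does; the function names, the thresholds, and any URL passed to `http_ok` are assumptions.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    required_successes: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll probe until it succeeds required_successes times in a row."""
    deadline = clock() + timeout_s
    streak = 0
    while clock() < deadline:
        # A single failure resets the streak, so a flapping server is not ready.
        streak = streak + 1 if probe() else 0
        if streak >= required_successes:
            return True
        sleep(interval_s)
    return False
```

A test setup step could call `wait_until_ready(lambda: http_ok(url))` with whatever endpoint best reflects real readiness; which endpoint that is remains an open question here.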

Using e2e to test some of the complex behavior here, when it relies on the server having just started up, may be a little tricky. This is where we may be able to use JMeter to simulate the requests that throw errors in the logs, validate the startup timing, and validate the UI impact of a failure. We seem to be asking for the notifications quite a lot (I know in the cloud this is cached by Bunny). Maybe the client should impose a minimum delay between requests, but it is also possible we should handle failure better here, so that a failure in notifications does not break other JS in the browser and just looks like there are none (if in fact this is the case).
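
The two client-side ideas above (a minimum delay between requests, and treating a failed request as "no notifications") can be sketched as follows. The real client is browser JavaScript; this Python version only illustrates the pattern, and the class name, the `fetch` callback, and the 30-second default are all hypothetical.

```python
import time
from typing import Callable, List, Optional


class AnnouncementsClient:
    """Wraps a fetch function with a minimum interval and a safe fallback."""

    def __init__(
        self,
        fetch: Callable[[], List[dict]],
        min_interval_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_call: Optional[float] = None
        self._cached: List[dict] = []

    def get(self) -> List[dict]:
        now = self._clock()
        # Within the minimum interval, reuse the last result instead of
        # sending another request to the server.
        if self._last_call is not None and now - self._last_call < self._min_interval_s:
            return self._cached
        self._last_call = now
        try:
            self._cached = list(self._fetch())
        except Exception:
            # A failed request is reported as "no announcements" so that the
            # caller (for example, the login page) keeps working.
            self._cached = []
        return self._cached
```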

It does seem, at least, that we may not be blocked by this issue any more.

@spbolton
Contributor

Login failure before the change, showing login requests and requests for announcements, for reference.

[DOTCMS]20:39:42.326  INFO  servlet.ServletToolboxManager - Trying config file '/WEB-INF/toolbox.xml'
[DOTCMS]20:39:42.441  INFO  servlet.ServletToolboxManager - Toolbox setup complete.
[DOTCMS]20:39:42.456  INFO  apps.SecretsKeyStoreHelper - KeyStore loaded successfully after `1` tries.
[DOTCMS]20:39:42.457  INFO  apps.SecretsKeyStoreHelper - KeyStore loaded successfully after `1` tries.
[DOTCMS]20:39:44.522  INFO  ajax.BrowserAjax - currentLoggedUser: Admin User - iddotcms.org.1
[ES][2024-12-04T20:39:51,968][INFO ][o.o.i.i.ManagedIndexCoordinator] [f9666f53549c] Cancel background move metadata process.
[ES][2024-12-04T20:39:51,969][INFO ][o.o.i.i.ManagedIndexCoordinator] [f9666f53549c] Performing move cluster state metadata.
[ES][2024-12-04T20:39:51,969][INFO ][o.o.i.i.MetadataService  ] [f9666f53549c] Move metadata has finished.
[DOTCMS]20:39:52.022  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:39:52.131  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:39:52.270  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:39:59.091  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:39:59.198  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:39:59.314  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:06.284  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:06.401  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:06.508  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:13.929  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:14.041  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:14.144  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:20.340  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:20.445  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:20.582  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:27.151  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:27.255  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:27.379  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:35.388  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:35.527  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:35.651  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:44.087  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : User dotcms.org.1 has successfully login from IP: 172.17.0.1 -- ip:172.17.0.1,user:Admin User [ID: dotcms.org.1][email:[email protected]]
[DOTCMS]20:40:44.180  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:44.296  INFO  announcements.AnnouncementsHelperImpl - loading announcements from [http://localhost:8080/api/content/render/false/query/+contentType:Announcement%20+languageId:1%20+deleted:false%20+live:true%20/orderby/Announcement.announcementDate%20desc]
[DOTCMS]20:40:50.654  INFO  util.SecurityLogger - class com.dotcms.cms.login.LoginServiceAPIFactory$LoginServiceImpl : An invalid attempt to login as dotcms.org.1 has been made from IP: 172.17.0.1 -- ip:172.17.0.1,user:null
[DOTCMS]20:40:52.040  ERROR ejb.UserManagerImpl - User '[email protected]' does not exist.
[DOTCMS]20:40:55.847  INFO  util.PropertyMessageResources - Loading all possible messages for locale es_ES
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): dot.common.message.no.workflow.schemes=No hay ninguna acción de flujo de trabajo disponible para este contenido en su paso actual del flujo de trabajo. Para activar las acciones de flujo de trabajo para este contenido, asegúrese de que se ha asignado un esquema de flujo de trabajo a este tipo de contenido y que el esquema asignado al tipo de contenido tiene acciones de flujo de trabajo disponibles para el paso de flujo de trabajo en el que está actualmente este contenido. ==> No hay ninguna acción de flujo de trabajo disponible para este contenido en su paso actual del flujo de trabajo. Para activar las acciones de flujo de trabajo para este contenido, asegúrese de que se ha asignado un esquema de flujo de trabajo a este tipo de contenido y que el esquema asignado al tipo de contenido tiene acciones de flujo de trabajo disponibles para el paso de flujo de trabajo en el que está actualmente este contenido.
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.actions_header.label=Acciones ==> Acciones
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.key_header.label=Clave ==> Clave
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.key_input.placeholder=Ingresar Clave ==> Tecla intro
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.value_header.label=Valor ==> Valor
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.value_input.placeholder=Ingresar Valor ==> Introducir valor
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.value_no_rows.label=No Se Encontraron Registros ==> No se encontraron registros
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): keyValue.error.duplicated.variable=La clave Variable "{0}" ya existe ==> Ya existe una clave con el nombre de variable "{0}"
[DOTCMS]20:40:55.855  WARN  struts.MultiMessageResources - Duplicate resource property definition (key=was ==> is now): Unpublished=Despublicado ==> Sin publicar
[INFO] ·×F··········Assertion failed
[DOTCMS]20:40:56.927  INFO  util.PropertyMessageResources - Loading all possible messages for locale it_IT
[INFO] ·Assertion failed
[DOTCMS]20:40:58.027  INFO  util.PropertyMessageResources - Loading all possible messages for locale fr_FR
[INFO] ·Assertion failed
[DOTCMS]20:40:59.059  INFO  util.PropertyMessageResources - Loading all possible messages for locale de_DE
[INFO] ·Assertion failed

@spbolton
Contributor

Waiting on PR #30932; if it works in CI then we are good for now.

@sfreudenthaler
Contributor Author

Sounds like @bryanboza is unblocked. It came up in a 1:1 discussion with him earlier today.

@github-project-automation github-project-automation bot moved this from In Review to Done in dotCMS - Product Planning Dec 12, 2024
@sfreudenthaler sfreudenthaler linked a pull request Dec 13, 2024 that will close this issue
1 task
Projects
Status: Done
2 participants