[ENH] Backup & Restore initial architecture development #2650

Closed
Tracked by #2648
viniciusdc opened this issue Aug 28, 2024 · 4 comments
Labels
area:backup-and-restore 💾 All items related to backup and restore needs: follow-up 📫 Someone needs to get back to this issue or PR project: JATIC Work item needed for the JATIC project type: enhancement 💅🏼 New feature or request

Comments

@viniciusdc
Contributor

viniciusdc commented Aug 28, 2024

Original RFD

Most of the topics presented in this issue serve as a general overview and summary of the ideas and discussions originally presented in RFD#49-Backup-Restore, which also includes a more extended overview of the RESTful interface approach.

Architecture Overview

Nebari's backup-restore service is designed to help Nebari administrators manage backups and restorations, especially during migrations or disaster recovery. It focuses on the three main services currently available on Nebari that hold users' data:

  • Keycloak: Manages user credentials and role-based access control across the Nebari ecosystem;
  • Conda-Store: Manages and builds shareable Conda environments for each user;
  • NFS: Stores the user and group data usually present in the JupyterLab instances.

Nebari's backup-restore service follows a hexagonal architecture, which enhances maintainability and future expandability. It's composed of the following modules:

  • adapters: Act as the interface between the core and external clients, allowing different ways to interact with the system:
    • API Adapter (FastAPI): Exposes the system's functionality through HTTPS endpoints, allowing external communication.
  • core: The main gateway for all services. It encapsulates the logic of each lower-level service component and aggregates it into smaller controllers called managers for easier state management. The core performs actions such as full-system backup or restoration, lists the available backups and storage options, and provides the general information the adapters need to report status properly.
  • services: Represent the individual Nebari services whose backup and restoration processes are supported. They include, but are not limited to, Keycloak and Conda-Store. Their primary responsibilities are exposing import/export functionality for each service and providing ways to handle data conflicts and reconciliation.
flowchart TB

    subgraph Components
        RestoreManager
        StorageManager
        BackupManager
    end

    subgraph Core
        Components
    end

    subgraph Adapters
        API[API Adapter]
    end

    subgraph Services
        direction TB
        Service1
        Service2
    end

    subgraph Storage
        direction TB
        Local
        S3
    end

    Client <-->|HTTPS requests| Adapters

    API --> Core

    RestoreManager <--> StorageManager
    BackupManager <--> StorageManager

    Components --> Services
    Components --> Storage
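
To make the layering in the diagram concrete, here is a minimal sketch of how the managers, services, and storage could fit together. Every class and method name below (`Service`, `StorageBackend`, `StorageManager`, `export_data`, `import_data`, and so on) is hypothetical and chosen for illustration only; it is not taken from the actual implementation, and the API adapter is left out so the example stays dependency-free.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Service(Protocol):
    """Hypothetical port each backed-up service (e.g. Keycloak) would implement."""

    def export_data(self) -> bytes: ...
    def import_data(self, payload: bytes) -> None: ...


class StorageBackend(Protocol):
    """Hypothetical port for a storage target (e.g. Local or S3)."""

    def write(self, key: str, payload: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...


@dataclass
class InMemoryStorage:
    """Stand-in backend used only to keep the sketch runnable."""

    blobs: Dict[str, bytes] = field(default_factory=dict)

    def write(self, key: str, payload: bytes) -> None:
        self.blobs[key] = payload

    def read(self, key: str) -> bytes:
        return self.blobs[key]


@dataclass
class FakeService:
    """Stand-in service whose whole state is a single bytes blob."""

    state: bytes = b""

    def export_data(self) -> bytes:
        return self.state

    def import_data(self, payload: bytes) -> None:
        self.state = payload


@dataclass
class StorageManager:
    """Routes reads and writes to a named backend."""

    backends: Dict[str, StorageBackend]

    def save(self, backend: str, key: str, payload: bytes) -> None:
        self.backends[backend].write(key, payload)

    def load(self, backend: str, key: str) -> bytes:
        return self.backends[backend].read(key)


@dataclass
class BackupManager:
    storage: StorageManager
    services: Dict[str, Service]

    def backup(self, service: str, backup_id: str, backend: str) -> str:
        # Export from the service, then hand the payload to storage.
        key = f"{backup_id}/{service}"
        self.storage.save(backend, key, self.services[service].export_data())
        return key


@dataclass
class RestoreManager:
    storage: StorageManager
    services: Dict[str, Service]

    def restore(self, service: str, backup_id: str, backend: str) -> None:
        # Load the stored payload and feed it back into the service.
        key = f"{backup_id}/{service}"
        self.services[service].import_data(self.storage.load(backend, key))
```

In this sketch both managers depend only on the two protocols, so adding a new service or storage target would not require touching the core.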

Reasonings and Considerations

The referenced RFD provides a general overview of the pros and cons of this approach. However, it's essential to highlight the following key points:

  1. The primary goal of this organization was to enhance maintainability for Nebari maintainers by concentrating components and development within each service's own module.

  2. When presenting this service to our internal community, it was crucial to consider the requirement for a solution that could operate both within and outside our infrastructure. This dual capability ensures flexibility in the event of a central disaster scenario. Consequently, the internal organization of each service module, particularly the Importers and Exporters, was designed to facilitate the development of custom reconciliation strategies by leveraging the available endpoints.

  3. Given the extensive control this service exerts over the internal workings of a Nebari deployment, security was a primary consideration. This led to integrating a storage manager solution to ensure that data remains within the service, allowing each cloud provider to manage that data using their own security solutions.

  4. Additionally, each service's modular configuration was designed to accommodate specific requirements, such as intrinsic authorization within the target services (e.g., Keycloak, Conda-Store). This approach minimizes the control footprint required by the general adapters, ensuring a more secure and modular system.
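
As an illustration of the custom reconciliation strategies mentioned in point 2, the sketch below merges exported records with the current state and delegates conflicts to a pluggable function. The record shape (dictionaries keyed by an ID) and the names `reconcile`, `keep_existing`, and `prefer_backup` are assumptions made for this example, not part of the service.

```python
from typing import Callable, Dict

Record = Dict[str, str]
Records = Dict[str, Record]
# A strategy receives (existing, incoming) and returns the record to keep.
Strategy = Callable[[Record, Record], Record]


def keep_existing(existing: Record, incoming: Record) -> Record:
    """Hypothetical strategy: the live deployment wins on conflict."""
    return existing


def prefer_backup(existing: Record, incoming: Record) -> Record:
    """Hypothetical strategy: the backed-up record wins on conflict."""
    return incoming


def reconcile(current: Records, backup: Records, on_conflict: Strategy) -> Records:
    """Merge backup records into the current state without mutating either."""
    merged = dict(current)
    for key, incoming in backup.items():
        if key in merged and merged[key] != incoming:
            # Same ID, different content: let the strategy decide.
            merged[key] = on_conflict(merged[key], incoming)
        else:
            merged[key] = incoming
    return merged


current = {"alice": {"role": "admin"}, "bob": {"role": "dev"}}
backup = {"alice": {"role": "dev"}, "carol": {"role": "dev"}}
restored = reconcile(current, backup, on_conflict=keep_existing)
```

With `keep_existing`, `alice` stays an admin, `bob` is untouched, and `carol` is added from the backup; swapping in `prefer_backup` would instead overwrite `alice` with the backed-up record.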

@viniciusdc viniciusdc changed the title initial architecture development Backup & Restore initial architecture development Aug 28, 2024
@viniciusdc
Contributor Author

viniciusdc commented Aug 28, 2024

This is already done; I just need to populate the issue with its description and state. It was created just to help organize the current development process. Completed.

@viniciusdc viniciusdc changed the title Backup & Restore initial architecture development [ENH] Backup & Restore initial architecture development Aug 28, 2024
@viniciusdc viniciusdc added type: enhancement 💅🏼 New feature or request project: JATIC Work item needed for the JATIC project area:backup-and-restore 💾 All items related to backup and restore labels Aug 30, 2024
@viniciusdc viniciusdc added this to the BackUp & Restore milestone Aug 30, 2024
@viniciusdc viniciusdc added the needs: follow-up 📫 Someone needs to get back to this issue or PR label Aug 30, 2024
@kcpevey
Contributor

kcpevey commented Sep 5, 2024

This is complete, @viniciusdc just needs to clean up the architecture drawing and post here.

@viniciusdc
Contributor Author

viniciusdc commented Sep 5, 2024

Below is a general overview of all the service components outlined above, using Keycloak as an example of a service.
[Figure: Backup & Restore (B&R) diagram, page 3]

A Keycloak backup lifecycle:

  • [1] The client requests, via the API (async), that a backup be initiated for all Keycloak components;
  • [2] The backup manager validates the request, communicates with the chosen storage, and calls the internal backup process of the selected service, while returning all metadata related to this backup process to the client;
  • [3] During the backup operation, all export endpoints are reached from within the service's internal process, and the data required for the backup is uploaded to the available storage.

The client can then inspect the data directly in the storage location or by reaching the list or status endpoints later.
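
The lifecycle above can be sketched with plain `asyncio`: the request is validated, metadata is returned immediately, and the export and upload run in the background until the status can be queried. This is only an illustration under stated assumptions: `AsyncBackupManager`, `request_backup`, `export_keycloak`, the status strings, and the dictionary standing in for storage are all hypothetical, and the real service exposes this flow through its FastAPI adapter rather than direct method calls.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict

# Hypothetical exporter signature: an async callable returning the backup payload.
Exporter = Callable[[], Awaitable[bytes]]


@dataclass
class AsyncBackupManager:
    exporters: Dict[str, Exporter]
    storage: Dict[str, bytes] = field(default_factory=dict)  # stand-in for Local/S3
    status: Dict[str, str] = field(default_factory=dict)
    tasks: Dict[str, asyncio.Task] = field(default_factory=dict)

    def request_backup(self, service: str) -> Dict[str, str]:
        # [2] Validate the request before doing any work.
        if service not in self.exporters:
            raise ValueError(f"unknown service: {service}")
        backup_id = uuid.uuid4().hex
        self.status[backup_id] = "pending"
        # Schedule the export in the background; requires a running event loop.
        self.tasks[backup_id] = asyncio.create_task(self._run(service, backup_id))
        # Metadata goes back to the client right away.
        return {"backup_id": backup_id, "service": service, "status": "pending"}

    async def _run(self, service: str, backup_id: str) -> None:
        # [3] Reach the service's export, then upload the result to storage.
        self.status[backup_id] = "running"
        try:
            payload = await self.exporters[service]()
            self.storage[f"{backup_id}/{service}"] = payload
            self.status[backup_id] = "completed"
        except Exception:
            self.status[backup_id] = "failed"


async def export_keycloak() -> bytes:
    """Placeholder exporter; a real one would call Keycloak's export endpoints."""
    await asyncio.sleep(0)
    return b'{"realm": "nebari"}'


async def main():
    manager = AsyncBackupManager({"keycloak": export_keycloak})
    meta = manager.request_backup("keycloak")  # [1] client request
    await manager.tasks[meta["backup_id"]]     # wait only so the demo can inspect the result
    return manager, meta
```

Calling `asyncio.run(main())` returns the manager and the metadata handed to the client; the client-side status lookup then corresponds to reading `manager.status[backup_id]`.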

In this example, a client that wants to retrieve or schedule* a backup can (once authenticated) follow one of two routes, chosen according to the client's goal:

  • If the reconciliation workflow (how the data is handled and presented back to the services) is to be managed outside of the backup-restore application, and the target service supports serializing its data* (see discussion https://github.com/nebari-dev/governance/issues/49##serial-data-export-versus-backup for extra details), then the routes to consider are those marked with the suffix A or B.
  • On the other hand, if you expect Nebari to handle the whole backup and restoration workflow (our default reconciliation strategy), then the usual routes [1]-[4] or [5]-[8] are the ones to take.

@viniciusdc
Contributor Author

Completed. The implementation details and specifics of each service are to be handled in #2651.
