
[Architecture] Infrastructure Deployment API & Plugin development #23

Closed
tsebastiani opened this issue Jan 13, 2023 · 8 comments
Labels: enhancement (New feature or request)

@tsebastiani
Member

tsebastiani commented Jan 13, 2023

The aim of this issue is to describe the redesign of the CRC-Cloud architecture drafted in PR #16, which decouples the infrastructure provisioning from the OpenShift initialization and setup.

AS-IS

Originally the script was developed to deploy OpenShift instances on AWS, so everything was packed into a single script that did the job. Since the project started to gain interest from other folks, a discussion has begun about supporting other cloud infrastructures and provisioning technologies.

Assumptions

  • the current monolithic design does not facilitate teamwork
  • the current programming language (Bash scripting), if not properly structured, will soon become unmaintainable
  • there is a wide variety of IaC technologies that could be supported, and more will come in the near future

Benefits

  • codebase quality and maintainability will be dramatically improved
  • multiple Infrastructure deployment technologies could be supported without impacting the project logic
  • the community could provide other deployment strategies, increasing the project's value

Enhancements to the current implementation

  • support git repository links as modules from the CLI in order to use externally hosted plugins.
  • add to cluster_infos.json the name of the plugin that performed the deployment, so the teardown can automatically select it and avoid conflicts between creation and teardown (see the sketch after this list).
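
For illustration, one possible way to record and reuse the deployer name in cluster_infos.json (the `deployer` field and the variable names are assumptions, not a settled format):

```bash
# At creation time: record which plugin deployed the cluster
# (field name "deployer" is hypothetical).
jq --arg plugin "$DEPLOYER_NAME" '. + {deployer: $plugin}' cluster_infos.json \
    > cluster_infos.json.tmp && mv cluster_infos.json.tmp cluster_infos.json

# At teardown time: select the same plugin automatically.
DEPLOYER_NAME=$(jq -r '.deployer' cluster_infos.json)
```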

Design

The design is loosely inspired by the Inversion of Control design pattern widely adopted in MVC frameworks: the program engine, in our case the crc-cloud.sh script, automatically loads the plugin code, expects it to implement the API interface, and calls the interface methods from the main program flow, which is fully decoupled from the plugin logic. Since Bash scripting is not an object-oriented programming language, compliance of the plugins with the API cannot be enforced at compile time; it is instead verified at runtime by purpose-built methods defined in the API (a sketch of the intended flow follows the figure below).

[Figure: infrastructure deployer API (macro) finite state machine]
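
A minimal sketch of the intended flow, assuming the method names listed in the API sections below (paths and variable names are illustrative only):

```bash
# crc-cloud.sh drives the plugin exclusively through the API methods,
# without knowing anything about the plugin internals.
api_load_deployer "$DEPLOYER_NAME"    # validates and sources plugins/$DEPLOYER_NAME/main.sh
deployer_create                       # plugin-specific infrastructure creation
EIP=$(deployer_get_eip)               # public IP of the created VM
api_wait_instance_readiness "$EIP"    # generic SSH readiness check, then OpenShift setup
```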

API

api_load_deployer ()

Loads the selected Infrastructure Deployer plugin. Before loading it, several integrity checks are performed on the plugin script (a sketch follows the list):

  • Does the plugin name contain only lowercase letters and underscores?
  • Does the plugin folder contain a folder with the same name as the selected plugin?
  • Does that folder contain a file named main.sh?
  • Does main.sh implement all the interface methods?
    • deployer_load_dependencies()
    • deployer_usage()
    • deployer_get_eip()
    • deployer_get_iip()
    • deployer_create()
    • deployer_teardown()
  • Do the other methods implemented start with _<plugin_name>_?
  • Do the other methods implemented in the scripts included by main.sh start with _<plugin_name>_?
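
A hedged sketch of how these runtime checks could look in Bash (variable names such as PLUGINS_FOLDER are assumptions, not the final implementation):

```bash
api_load_deployer() {
    local plugin="$1"
    local plugin_dir="$PLUGINS_FOLDER/$plugin"

    # plugin name: lowercase letters and underscores only
    [[ "$plugin" =~ ^[a-z_]+$ ]] || { echo "invalid plugin name: $plugin"; exit 1; }

    # plugin folder and entry point must exist
    [[ -f "$plugin_dir/main.sh" ]] || { echo "plugin $plugin has no main.sh"; exit 1; }

    source "$plugin_dir/main.sh"

    # every interface method must be implemented
    local fn
    for fn in deployer_load_dependencies deployer_usage deployer_get_eip \
              deployer_get_iip deployer_create deployer_teardown; do
        declare -F "$fn" > /dev/null || { echo "plugin $plugin does not implement $fn"; exit 1; }
    done
}
```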

api_wait_instance_readiness ()

Checks whether the created VM is ready to accept an SSH connection and then starts the OpenShift instance setup.
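
For example, the readiness loop could look roughly like this (the ssh user, key path, timeout and retry values are assumptions):

```bash
api_wait_instance_readiness() {
    local eip="$1"
    local retries=60
    until ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no \
              -i "$PRIVATE_KEY" "core@$eip" true 2> /dev/null; do
        (( retries-- > 0 )) || { echo "instance $eip never became reachable"; exit 1; }
        sleep 10
    done
    # the VM accepts SSH connections: hand over to the OpenShift setup
}
```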

Infrastructure Deployer API

deployer_load_dependencies ()

All the plugin dependency loading should be done in this method; this guarantees that PLUGIN_ROOT_FOLDER has been correctly set and that the dependencies are loaded without breaking the application flow.
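
For instance, a hypothetical AWS plugin could load its helper scripts like this (file names are illustrative):

```bash
deployer_load_dependencies() {
    # PLUGIN_ROOT_FOLDER is set by the engine before this method is called,
    # so the plugin can be sourced from any working directory.
    source "$PLUGIN_ROOT_FOLDER/_aws_commons.sh"
    source "$PLUGIN_ROOT_FOLDER/_aws_teardown.sh"
}
```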

deployer_usage()

Shows the plugin description and usage, and presents the plugin-specific arguments (if needed).

deployer_get_eip()

Returns the public IP address of the VM created by the plugin.

deployer_get_iip()

Returns the internal IP address of the VM created by the plugin.

deployer_create ()

This is the entry point for the infrastructure creation.

deployer_teardown()

This is the entry point for the infrastructure teardown.
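
Putting the interface together with deployer_load_dependencies shown above, a minimal plugin main.sh could be sketched as follows (the aws plugin name, the private helpers and the $WORKDIR files are hypothetical, used only to illustrate the _<plugin_name>_ naming convention):

```bash
deployer_usage() {
    echo "aws plugin options: --region <aws-region> --instance-type <type>"
}

deployer_get_eip() { cat "$WORKDIR/eip"; }   # public IP saved by deployer_create
deployer_get_iip() { cat "$WORKDIR/iip"; }   # internal IP saved by deployer_create

deployer_create() {
    _aws_create_network
    _aws_run_instance      # writes $WORKDIR/eip and $WORKDIR/iip
}

deployer_teardown() {
    _aws_terminate_instance
    _aws_delete_network
}
```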

@lmilbaum
Contributor

Thank you for putting this together. I like the approach of decoupling the bootstrapping code from the provisioning. It is a good engineering pattern/practice.

If I understand correctly, the suggested architecture enables support for multiple bootstrapping technologies. I would like us to consider the following:

  1. Was this requirement requested by users? How was it formed? Which pain point does it solve?
  2. Which technology (one or more) will be adopted by the core project?
  3. Please describe the flow of a contributor who wants to update the bootstrapping code when multiple bootstrapping technologies are being used
  4. Please describe how the CI tests will be structured when multiple bootstrapping technologies are being used

@praveenkumar
Member

> Thank you for putting this together. I like the approach of decoupling the bootstrapping code from the provisioning. It is a good engineering pattern/practice.
>
> If I understand correctly, the suggested architecture enables support for multiple bootstrapping technologies. I would like us to consider the following:

> 1. Was this requirement requested by users? How was it formed? Which pain point does it solve?

As of now there is no user base for this project, so this comes from our current expectation that we are all trying out different tech stacks (Pulumi/Terraform, etc.). On the draft side, I can see that for Terraform, as of now, the deployer only makes sense for the infrastructure creation part, and the provisioning can be used the same way as what is currently done by the shell script.

> 2. Which technology (one or more) will be adopted by the core project?

I think as of now this architecture doesn't specify any tech; it is more like, if you are thinking about how it should look, then this is one design to achieve it (the Bash part is just for sample purposes, because as of now that is what works).

> 3. Please describe the flow of a contributor who wants to update the bootstrapping code when multiple bootstrapping technologies are being used

I am still not in favor of using multiple bootstrapping technologies; from a PoC point of view it is good and can be experimented with, but not for the long term.

> 4. Please describe how the CI tests will be structured when multiple bootstrapping technologies are being used

Same as point 3, but we already have a CI issue to discuss that part.

Bottom line: this issue is a discussion point about how the project architecture should be. If we decide to use a single bootstrap technology in the end, then multiple deployers are not needed, but the flow is still needed.

@adrianriobo has a different viewpoint on using IaC, since he wants to maintain every resource with the tool, even the provisioning part. I will ask him to describe a flow here for that so we are all on the same page.

@lmilbaum
Contributor

@praveenkumar Let me just clarify one tiny thing. I am a user :-). My plans are to consume this project in the CI of my projects. The suggested architecture doesn't support the approach you have shared. It just made me more confused.

@praveenkumar
Member

> @praveenkumar Let me just clarify one tiny thing. I am a user :-). My plans are to consume this project in the CI of my projects. The suggested architecture doesn't support the approach you have shared. It just made me more confused.

@lmilbaum Please share your concerns about what doesn't work with this architecture and, if possible, outline the issue with it or share what you think would be good for your use case.

@lmilbaum
Contributor

Here is how we are using it now - https://github.com/platform-engineering-org/elk/blob/main/provision/up.yml.
Be aware that this playbook is still WIP.
For me, spinning up the environment should be a black box. I don't care which technology is used for provisioning+bootstrapping. It should be one simple command.

Here is a short list of my user stories:

  1. As a user, I want to have one command to spin the environment (AWS instance with CRC)
  2. As a user, I want the spinning time to be as short as possible
  3. As a user, I want the command to work on the following operating systems: RHEL 8, Ubuntu
  4. As a user, I don't want to install any tools which are required by the provided command (in step 1)

@adrianriobo
Contributor

adrianriobo commented Jan 18, 2023

So here are my thoughts on this:

My point of view is pretty well aligned with @lmilbaum's. My final goal would be to include it as a step in a pipeline, acting as a black box which creates the cluster and then gives me back what I need to connect remotely to the cluster, plus the assets required to add a final step to the pipeline to clean up the cluster (destroy the resources on the provider). Actually, we already have something like this, especially with the container holding the current state of openspotng.

Besides this, my PoC is with Pulumi and I promote that option based on several points:

  • Pulumi basically copied the model introduced by Terraform, and the only real difference is moving from a DSL syntax (Terraform) to writing code (Pulumi)

  • One thing I also promote is handling the state of the setup as any other resource managed by the IaC framework of choice, as this brings benefits like storing the state of the execution and delegating the orchestration of the execution to the framework (basically the execution is another resource added to the graph in the plan, with dependencies on the other resources, like IPs, keys, or whatever)

  • Writing code has many advantages over DSLs, like easier debugging and code tracing, better error handling, and the ability to create interfaces and code structures aligned with the design introduced by @tsebastiani; it also makes the statement "the compliance enforcement of the plugins to the API can be made at compile time" possible for the interfaces defined for the supported providers

  • Another interesting aspect of using Pulumi is the easy integration with the testing part of the project; with a framework like Ginkgo you can directly use the code to test the functionality, all within the same project

  • About dependencies:

> As a user, I want the command to work on the following operating systems: RHEL 8, Ubuntu
> As a user, I don't want to install any tools which are required by the provided command (in step 1)

This is solved by using the container image; otherwise, it will always be necessary to install certain tools. Also, beyond these dependencies there are others, like user permissions on the target cloud providers (this could be another thread).

So basically all these points are implemented in the PoC I did for Pulumi (+ I added the import operation #10). I uploaded #26 (do not freak out, the +2k files are because of the vendor folder xD) just in case someone wants to check it, but I have one missing thing for the import to be fully functional (create and destroy are fully functional).

Here is also a small diagram of all of this (related to the cmd API #21):

[Diagram: crc-cloud.drawio]

@tsebastiani
Member Author

I really do like that Pulumi! Which programming language are you using for the PoC?

@adrianriobo
Contributor

It is Go.

@tsebastiani tsebastiani unpinned this issue Feb 7, 2023
@gbraad gbraad closed this as completed Mar 3, 2023