f7t-base image no longer exists? #165

Open
lmagdanello opened this issue Jul 29, 2022 · 18 comments

@lmagdanello

I was trying to run the project demo but it doesn't identify the f7t-base image anymore. I looked in docker.io and didn't find it.

Test environment:

Vagrant>

  • Vagrant version: 2.2.19
  • VirtualBox version: 5.1-5.1.38_122592

VMs>

  • CentOS 7.9
  • Kernel 3.10.0-1160.42.2.el7

[vagrant@mngt0-1 ~]$ git clone https://github.com/eth-cscs/firecrest.git
Cloning into 'firecrest'...
remote: Enumerating objects: 5516, done.
remote: Counting objects: 100% (1440/1440), done.
remote: Compressing objects: 100% (415/415), done.
remote: Total 5516 (delta 1102), reused 1250 (delta 1008), pack-reused 4076
Receiving objects: 100% (5516/5516), 1.69 MiB | 0 bytes/s, done.
Resolving deltas: 100% (3279/3279), done.
[vagrant@mngt0-1 ~]$ cd firecrest/deploy/demo/
[vagrant@mngt0-1 demo]$ sudo docker-compose up -d --build
[+] Building 1.7s (23/23) FINISHED
 => [demo_utilities internal] load build definition from Dockerfile                                                                                                  0.3s
 => => transferring dockerfile: 565B                                                                                                                                 0.3s
 => [demo_utilities internal] load .dockerignore                                                                                                                     0.2s
 => => transferring context: 2B                                                                                                                                      0.2s
 => [demo_reservations internal] load build definition from Dockerfile                                                                                               0.2s
 => => transferring dockerfile: 406B                                                                                                                                 0.2s
 => [demo_cluster internal] load build definition from Dockerfile                                                                                                    0.2s
 => => transferring dockerfile: 4.07kB                                                                                                                               0.2s
 => [demo_reservations internal] load .dockerignore                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => ERROR [demo_reservations internal] load metadata for docker.io/library/f7t-base:latest                                                                           1.5s
 => [f7t-base internal] load .dockerignore                                                                                                                           0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_cluster internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_storage internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [f7t-base internal] load build definition from Dockerfile                                                                                                        0.1s
 => => transferring dockerfile: 433B                                                                                                                                 0.1s
 => [demo_client internal] load .dockerignore                                                                                                                        0.1s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_tasks internal] load .dockerignore                                                                                                                         0.1s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_status internal] load .dockerignore                                                                                                                        0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_certificator internal] load .dockerignore                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_compute internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [demo_storage internal] load build definition from Dockerfile                                                                                                    0.1s
 => => transferring dockerfile: 934B                                                                                                                                 0.1s
 => [demo_client internal] load build definition from Dockerfile                                                                                                     0.1s
 => => transferring dockerfile: 493B                                                                                                                                 0.1s
 => [demo_tasks internal] load build definition from Dockerfile                                                                                                      0.1s
 => => transferring dockerfile: 645B                                                                                                                                 0.1s
 => [demo_status internal] load build definition from Dockerfile                                                                                                     0.1s
 => => transferring dockerfile: 550B                                                                                                                                 0.1s
 => [demo_certificator internal] load build definition from Dockerfile                                                                                               0.1s
 => => transferring dockerfile: 780B                                                                                                                                 0.1s
 => [demo_compute internal] load build definition from Dockerfile                                                                                                    0.1s
 => => transferring dockerfile: 638B                                                                                                                                 0.1s
 => CANCELED [demo_certificator internal] load metadata for docker.io/library/centos:7                                                                               1.2s
 => CANCELED [f7t-base internal] load metadata for docker.io/library/python:3.8.12-slim                                                                              1.0s
------
 > [demo_reservations internal] load metadata for docker.io/library/f7t-base:latest:
------
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
@aledabin aledabin self-assigned this Jul 29, 2022
@aledabin
Collaborator

Hi @lmagdanello,
f7t-base is the first container docker-compose should build. I can't reproduce the problem. Are you using a Mac, by any chance? If so, could you check whether docker/compose#8449 (comment) or the subsequent workarounds solve the issue?
There may also be an issue building the cluster image. You can try replacing the build section for the cluster in docker-compose.yml (lines 186-189) with image: "adabin/f7t-demo-cluster".

@lmagdanello
Author

Nope, I'm using a virtual machine (Vagrant + VirtualBox) with CentOS 7.9.

Let me try changing to adabin/f7t-demo-cluster.

@lmagdanello
Author

Change the dummy cluster?

  # dummy cluster
  cluster:
    container_name: cluster
    build:
      context: ../test-build
      dockerfile: ./cluster/Dockerfile
      network: host
    networks:
      firecrest-internal:
        ipv4_address: 192.168.220.12
    hostname: cluster
    volumes:
      - ./logs/cluster/:/var/log/slurm/:delegated

@lmagdanello
Author

BTW, I would like to use the FirecREST API with an existing Slurm. Can I remove this dummy cluster and use FirecREST with my own Slurm instead? I tried following the README.md and changing the SYSTEMS_ and MACHINES_ parameters in the files, but because of this problem with f7t-base I haven't been able to test it yet.

@lmagdanello
Author

I changed the dummy cluster but the problem with f7t-base still occurs. I believe I can remove the cluster build since I'm going to use my own Slurm, correct?
But even that won't make a difference, because the error with f7t-base will still occur.

@aledabin
Collaborator

Yes, to use your own Slurm cluster/machine you have to change some configuration settings. I'll write them here later.
Could you copy the output of docker-compose --verbose build f7t-base?

@lmagdanello
Author

[vagrant@mngt0-1 demo]$ pwd
/home/vagrant/firecrest/deploy/demo
[vagrant@mngt0-1 demo]$ sudo docker-compose --verbose build f7t-base
[+] Building 24.9s (8/9)
 => [internal] load build definition from Dockerfile                                                                                                                 0.0s
 => => transferring dockerfile: 433B                                                                                                                                 0.0s
 => [internal] load .dockerignore                                                                                                                                    0.0s
 => => transferring context: 2B                                                                                                                                      0.0s
 => [internal] load metadata for docker.io/library/python:3.8.12-slim                                                                                                1.3s
 => [1/4] FROM docker.io/library/python:3.8.12-slim@sha256:a2d8844be9a3d5df8cd64c11bba476156cbfe5991db643c83e88ae383c15b5d0                                         23.5s
 => => resolve docker.io/library/python:3.8.12-slim@sha256:a2d8844be9a3d5df8cd64c11bba476156cbfe5991db643c83e88ae383c15b5d0                                          0.0s
 => => sha256:08935196cb79d643967462c812670918f45fac4f72b1222b6351877ce6f07f40 10.77MB / 10.77MB                                                                     5.3s
 => => sha256:a2d8844be9a3d5df8cd64c11bba476156cbfe5991db643c83e88ae383c15b5d0 1.86kB / 1.86kB                                                                       0.0s
 => => sha256:45d5f0d52b630ac0828f7f822e93303aa5b9061d13cb9c60e2d82da2d6144a9a 1.37kB / 1.37kB                                                                       0.0s
 => => sha256:513da2530098611bdf3e39269c89782f680662f5fbb76fe35d7a966f4c390528 7.56kB / 7.56kB                                                                       0.0s
 => => sha256:f7a1c6dad28192bd417b78079d6ddc03cbca6d5ea46bba12769b235b6353c00c 31.37MB / 31.37MB                                                                     7.8s
 => => sha256:92c59ec44e08abbd071efa42d226c5bdb8396ee193a53e02c4232296ef855372 1.08MB / 1.08MB                                                                       1.2s
 => => sha256:7aa958b41dfba14f3e45e7d62c56fc7d17075653f451400fbff5a91958effdcc 234B / 234B                                                                           1.5s
 => => sha256:86c7a279e30496fecd928449673d3e3e4792642d9f7a6fd5bb8c6a0248f02932 2.64MB / 2.64MB                                                                       3.6s
 => => extracting sha256:f7a1c6dad28192bd417b78079d6ddc03cbca6d5ea46bba12769b235b6353c00c                                                                            1.9s
 => => extracting sha256:92c59ec44e08abbd071efa42d226c5bdb8396ee193a53e02c4232296ef855372                                                                            0.1s
 => => extracting sha256:08935196cb79d643967462c812670918f45fac4f72b1222b6351877ce6f07f40                                                                            0.6s
 => => extracting sha256:7aa958b41dfba14f3e45e7d62c56fc7d17075653f451400fbff5a91958effdcc                                                                            0.0s
 => => extracting sha256:86c7a279e30496fecd928449673d3e3e4792642d9f7a6fd5bb8c6a0248f02932                                                                            0.3s
 => [internal] load build context                                                                                                                                    0.0s
 => => transferring context: 526B                                                                                                                                    0.0s
 => [2/4] RUN pip3 install --upgrade pip                                                                                                                             3.3s
 => [3/4] ADD deploy/docker/base/requirements.txt base/requirements.txt                                                                                              0.0s
 => [4/4] RUN pip3 install -r base/requirements.txt                                                                                                                  8.4s
 => exporting to image                                                                                                                                               0.5s
 => => exporting layers                                                                                                                                              0.5s
 => => writing image sha256:ef7cf2aea61345c81cdf97249531902760851b5116b8eb18cc69666084f02341                                                                         0.0s
 => => naming to docker.io/library/f7t-base                                                                                                                          0.0s
DEBU[0000] serving grpc connection
DEBU[0000] stopping session                              span="load buildkit capabilities"
DEBU[0000] serving grpc connection
DEBU[0025] stopping session

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[vagrant@mngt0-1 demo]$ sudo docker images | grep f7t-base
f7t-base                   latest                         ef7cf2aea613   About a minute ago   170MB

@lmagdanello
Author

Hmm? It looks like it was built

@aledabin
Collaborator

Yes, it seems so. Does the standard docker-compose up -d --build work now?

@aledabin
Collaborator

Currently it requires some work to connect this demo to another machine.

FirecREST uses SSH certificates to connect to the cluster/Slurm machine, so first: can you modify the SSH server configuration on the machine you want to connect to? If yes, and for testing purposes (not secure):

  • copy deploy/test-build/environment/keys/ca-key to /etc/ssh/ca-key.pub on the target machine
  • copy the last Match section from deploy/test-build/cluster/ssh/sshd_config into your /etc/ssh/sshd_config
    • modify the match address accordingly
    • maybe comment out MaxAuthTries and PubkeyAcceptedKeyTypes so you can connect from the same IP as the one where this demo is running
    • uncomment AllowUsers and list the appropriate usernames
    • restart the service
  • either create a user 'test1' on the Slurm machine or add a user on Keycloak (how to access is described in README.md) with a different password (it is only to authenticate to Keycloak, not to your Slurm machine)

If you can't make those changes, next week I can post a patch to use standard SSH keys, but then it becomes limited to one user.

  • on deploy/demo/common/common.env modify:
    • F7T_SYSTEMS_INTERNAL_COMPUTE, F7T_SYSTEMS_INTERNAL_STORAGE, F7T_SYSTEMS_INTERNAL_UTILITIES : IP or DNS of your Slurm machine
    • F7T_COMPUTE_BASE_FS : change only if users are on a different location than /home
    • F7T_USE_SPANK_PLUGIN=False
    • F7T_SSH_CERTIFICATE_WRAPPER=False
    • F7T_OPA_USE=False
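
The common.env changes listed above can be scripted. The following is only a sketch under assumptions: the helper names `set_env_var` and `point_demo_at_slurm` and the host `slurm.example.org` are hypothetical, and it assumes common.env uses plain unquoted `KEY=value` lines. Check the existing entries in the file first (for example, whether they include a port) and match their format. F7T_COMPUTE_BASE_FS is left untouched on purpose.

```shell
# set_env_var FILE KEY VALUE
# Replace the "KEY=..." line in FILE, or append one if KEY is not present.
set_env_var() {
  sev_file=$1
  sev_key=$2
  sev_value=$3
  if grep -q "^${sev_key}=" "$sev_file"; then
    sev_tmp=$(mktemp) || return 1
    # Rewrite through a temp file; portable alternative to "sed -i".
    awk -v k="$sev_key" -v v="$sev_value" \
      'index($0, k "=") == 1 { print k "=" v; next } { print }' \
      "$sev_file" > "$sev_tmp" &&
      cat "$sev_tmp" > "$sev_file"
    sev_status=$?
    rm -f "$sev_tmp"
    return "$sev_status"
  else
    printf '%s=%s\n' "$sev_key" "$sev_value" >> "$sev_file"
  fi
}

# point_demo_at_slurm ENV_FILE HOST
# Apply the overrides from the list above; HOST is the IP or DNS name
# of your own Slurm machine.
point_demo_at_slurm() {
  pds_file=$1
  pds_host=$2
  for pds_key in F7T_SYSTEMS_INTERNAL_COMPUTE \
                 F7T_SYSTEMS_INTERNAL_STORAGE \
                 F7T_SYSTEMS_INTERNAL_UTILITIES; do
    set_env_var "$pds_file" "$pds_key" "$pds_host" || return 1
  done
  set_env_var "$pds_file" F7T_USE_SPANK_PLUGIN False &&
    set_env_var "$pds_file" F7T_SSH_CERTIFICATE_WRAPPER False &&
    set_env_var "$pds_file" F7T_OPA_USE False
}
```

From the repository root, this would be called as `point_demo_at_slurm deploy/demo/common/common.env slurm.example.org`. Keeping a backup copy of common.env beforehand is advisable.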

@lmagdanello
Author

> Yes, it seems so. Does the standard docker-compose up -d --build work now?

Yeah! Now it's working.

About the ssh changes:
"uncomment AllowUsers and list the appropriate usernames":

I have some problems with these changes. For example, I have a lot of users and groups that access the server, so in that case I couldn't use AllowUsers. Users and groups are administered with LDAP 😞

@aledabin
Collaborator

> Yeah! Now it's working.

Great! Maybe it tried to build everything in parallel and so it failed for most of the microservices.
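
If the parallel build is indeed the cause, the sequence that worked in this thread can be wrapped in a small script: build f7t-base on its own first, then bring up the rest. This is a sketch, not a confirmed fix. The `run` and `build_demo` helpers and the `COMPOSE` and `DRY_RUN` variables are hypothetical names introduced here, and only the two docker-compose commands come from the thread.

```shell
# Compose command to use; override COMPOSE if yours is invoked differently.
# (The session above ran it with sudo; run this script accordingly.)
COMPOSE=${COMPOSE:-docker-compose}

# run CMD...: execute the command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the shared base image first so the other images can find it locally,
# then build and start the remaining services.
build_demo() {
  run "$COMPOSE" build f7t-base &&
    run "$COMPOSE" up -d --build
}
```

Run `build_demo` from firecrest/deploy/demo, or `DRY_RUN=1 build_demo` to print the two commands without executing them.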

About the SSH server: OK, but keep in mind that the SSH CA key can be used to generate a certificate for any user. That is why the key should be trusted only for a subset of IPs or users.

In any case, you can generate a new one (see deploy/test-build/README.md) and mount the private key into the Certificator microservice through this volume entry:

- ../test-build/environment/keys/ca-key:/ca-key:ro
The other pair of SSH keys can be shared as they are just auxiliary to the certificates.
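
To illustrate why the CA key is sensitive, the sketch below uses plain OpenSSH `ssh-keygen` to create a throwaway CA, sign an unrelated user key for an arbitrary login name, and print the certificate. It is not the procedure from deploy/test-build/README.md, and it does not show how the Certificator microservice works. The function name `demo_ca_signing`, the certificate identity `demo-cert`, and the principal `alice` are hypothetical.

```shell
# demo_ca_signing PRINCIPAL
# Create a throwaway CA and user key pair in a temp directory, sign the user
# key for PRINCIPAL, and print the resulting certificate details.
demo_ca_signing() {
  dcs_principal=$1
  dcs_dir=$(mktemp -d) || return 1

  # Throwaway CA key pair (stands in for a file like ca-key and its .pub).
  ssh-keygen -q -t ed25519 -N '' -C 'throwaway demo CA' -f "$dcs_dir/ca-key" &&
    # Any key pair; it carries no authority until the CA signs it.
    ssh-keygen -q -t ed25519 -N '' -C 'throwaway user key' -f "$dcs_dir/user-key" &&
    # -s: CA private key, -I: certificate identity, -n: principal (login name),
    # -V: validity window (here five minutes from now).
    ssh-keygen -q -s "$dcs_dir/ca-key" -I demo-cert -n "$dcs_principal" \
      -V +5m "$dcs_dir/user-key.pub" &&
    # Signing writes user-key-cert.pub next to the public key; -L prints it.
    ssh-keygen -L -f "$dcs_dir/user-key-cert.pub"
  dcs_status=$?

  rm -rf "$dcs_dir"
  return "$dcs_status"
}
```

Calling `demo_ca_signing alice` prints a certificate whose Principals section lists `alice`. Whoever holds the CA private key picks that name freely, which is the reason for restricting where the CA is trusted.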

@lmagdanello
Author

Hey @aledabin! 😃

Sorry for the delay, I'm still testing the recommended changes in the lab.

This may be a slightly stupid question on my part, but I need to ask: is the demo the only way to use/run FirecREST? Or can I run it with my Slurm some other way?

@aledabin
Collaborator

aledabin commented Aug 5, 2022

Hi, not a bad question at all. The demo shows the functionality of the FirecREST components and third-party integration. You can also run them in a different setup: running the Python processes directly (we'll soon publish a setup using Gunicorn instead of Flask), with containers (the Dockerfiles are at deploy/docker), or on K8s (there are Helm charts at deploy/k8s). In any of those cases, at least the environment variables with URLs and the Kong configuration have to be updated.

@lmagdanello
Author

Got it. I'll investigate more about the containers part. I want to try using FirecREST with the environment I have as an alternative to Slurm's native JWT. Mainly for the additional features.

@lmagdanello
Author

I can try to prepare some Ansible roles that do this setup (the environment variables) and post them here in this issue. Any tips before I start diving into deploying the containers (deploy/docker)?

@simonbray

FWIW I also saw the error with the f7t-base image; for me it was also transient 😕

@aledabin
Collaborator

Hi @lmagdanello, in the next month we'll provide Ansible scripts to run all the FirecREST components.

There is more detailed documentation on installation at doc/install.md and on configuration options at doc/configuration.md.

Please let us know if we can help you.
