Added yml config for Green Metrics Tool #589

Open
wants to merge 20 commits into
base: master

Conversation

mrchrisadams
Member

This is a clone of pull #587 that I am able to make changes to as I work on it (I was unable to make changes to James's PR, as it's on his repo):

#587

@mrchrisadams
Member Author

Ok, I've had a quick phone call with Arne, to check how best to run these tests.

It's helpful to approach getting measurements in two stages:

  1. figure out how to get readings from the GMT system, and work out what a usage_scenario.yml needs to look like to get any reading at all.
  2. figure out how to get accurate readings - once we have a way to run the system to test the scenarios we care about (visiting a webpage, hitting an API, filling out a form, etc), we then care about getting figures that are not polluted by information from other processes and software running on the system.

With step one, you care more about fast iterations and getting familiar with a new syntax than having precise accurate measurements, so installing a local version of the GMT system can be justified because running testing jobs locally will be faster than having to push to a CI system each time to see a test result.

Installing locally on my 2020 M1 MacBook was fairly straightforward when following these steps:

https://docs.green-coding.io/docs/installation/installation-macos/

The only point of confusion was me figuring out how to actually run a job once I have a local instance of the GMT system running.

How DO you run a test?

Once you have a running system, from the host machine, you need to run a script while inside the checked out GMT project. A bit like this:

cd PATH_TO_GREEN_METRICS_TOOL

# (activate the virtual environment if you aren't already in it)
source ./venv/bin/activate

python3 runner.py --uri PATH/TO/YOUR/PROJECT/ --name NAME-OF-YOUR-PROJECT

The easiest way to check this part works is to follow the steps in the docs page linked below for the simple example of the Stress Container One Core 5 Seconds scenario:

https://docs.green-coding.io/docs/measuring/interacting-with-applications/

You'll then get a readout a bit like this

python3 runner.py --uri /tmp/easiest-application --name testing-my-demo
Detected supplied folder:  /tmp/easiest-application
 
Running System Checks 
Checking db online                            : OK
Checking single energy scope machine provider : OK
Checking tmpfs mount                          : OK
Checking < 5% CPU utilization                 : WARN (Your system seems to be busy. Utilization is above 5%. Consider terminating some processes for a more stable measurement.)
Checking 1GB free hdd space                   : OK
Checking free memory                          : OK
Checking docker daemon                        : OK
Checking running containers                   : WARN (You have other containers running on the system. This is usually what you want in local development, but for undisturbed measurements consider going for a measurement cluster [See https://docs.green-coding.io/docs/installation/installation-cluster/].)
Checking utf file encoding                    : OK
 
Checking out repository 
 
Having Usage Scenario  Stress Example 
From:  Arne Tarara <[email protected]>
Description:  Stress container on one core for 5 seconds 

(snip - it's very long)

Saving logs to DB 
 
Starting cleanup routine 
Stopping metric providers
Stopping containers
29655cc6ee693235b9021373d4f4ccd68af201a301d1a43213a2c31397bd9c02
Removing network
GMT_default_tmp_network_5493047
 
Removing all temporary GMT images 
Untagged: alpine_gmt_run_tmp:latest
 

>>>> Warning: GMT is not instructed to prune docker images and build caches. 
We recommend to set --docker-prune to remove build caches and anonymous volumes, because otherwise your disk will get full very quickly. If you want to measure also network I/O delay for pulling images and have a dedicated measurement machine please set --full-docker-prune <<<<

 
 -Cleanup gracefully completed 
 -Cleanup gracefully completed 
 

>>>> MEASUREMENT SUCCESSFULLY COMPLETED <<<<

Visiting the local instance of the GMT tool should show a reading you can visit, that looks a bit like this. I'm working on a Mac, and I have a bajillion other apps running, so I'm not worried that the measurement is invalid here - I care that I'm able to run the test and see any output:

Screenshot 2024-05-15 at 13 55 27

Next comment shows me running this locally to check this repo.

@mrchrisadams
Member Author

OK, here's what I'm seeing now with the current usage_scenario.yml, that we haven't really worked on yet to make it fit the syntax.

It's failing, but that's ok - we'll probably need to fail a few times until we have figured out the correct syntax to use for the usage scenario to run properly, and at a minimum, simulate a client hitting one of the green web greencheck APIs, for example:

python3 runner.py --uri /Users/chrisadams/Code/tgwf/admin-portal --name testing-greenweb
Detected supplied folder:  /Users/chrisadams/Code/tgwf/admin-portal
 
Running System Checks 
Checking db online                            : OK
Checking single energy scope machine provider : OK
Checking tmpfs mount                          : OK
Checking < 5% CPU utilization                 : WARN (Your system seems to be busy. Utilization is above 5%. Consider terminating some processes for a more stable measurement.)
Checking 1GB free hdd space                   : OK
Checking free memory                          : OK
Checking docker daemon                        : OK
Checking running containers                   : WARN (You have other containers running on the system. This is usually what you want in local development, but for undisturbed measurements consider going for a measurement cluster [See https://docs.green-coding.io/docs/installation/installation-cluster/].)
Checking utf file encoding                    : OK
 
Checking out repository 
 
Capturing container logs 
 
Reading process stdout/stderr (if selected) and cleaning them up 

(snip)



<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0_o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Traceback (most recent call last):
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1672, in <module>
    run_id = runner.run()  # Start main code
             ^^^^^^^^^^^^
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1545, in run
    raise exc
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1468, in run
    self.initial_parse()
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 344, in initial_parse
    schema_checker.check_usage_scenario(self._usage_scenario)
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/lib/schema_checker.py", line 151, in check_usage_scenario
    raise SchemaError("The 'image' key under services is required when 'build' key is not present.")
schema.SchemaError: The 'image' key under services is required when 'build' key is not present.


<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0_o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Error: Base exception occured in runner.py

Exception (<class 'schema.SchemaError'>): The 'image' key under services is required when 'build' key is not present.
Run_id (<class 'uuid.UUID'>): a82ae06f-4178-43f1-afac-bb715a4c6516

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0_o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

At this point we now need to figure out:

  • a) which docker compose syntax to use in usage_scenario.yml to spin up our dockerized system
  • b) what to put into the flow section of usage_scenario.yml that lays out the steps to test
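The SchemaError in the traceback above comes down to a simple rule: every service needs either an image key or a build key. A minimal sketch of that rule follows. `check_services` is a hypothetical helper written for illustration, not GMT's actual schema_checker code, and it assumes the usage scenario has already been parsed into a dict.

```python
def check_services(usage_scenario: dict) -> list:
    """Return the names of services that have neither 'image' nor 'build'.

    Hypothetical helper: it mirrors the rule in the error message, not GMT's code.
    """
    missing = []
    for name, service in usage_scenario.get("services", {}).items():
        if "image" not in service and "build" not in service:
            missing.append(name)
    return missing


# Illustrative scenario: 'django' would trigger the error, 'db' would not.
scenario = {
    "services": {
        "django": {"ports": ["9000:9000"]},
        "db": {"image": "mariadb"},
    }
}
print(check_services(scenario))
```

Running a check like this before calling runner.py may save a full round trip through the system checks when iterating on the file.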

@mrchrisadams
Member Author

Brief update. It looks like we're getting through until we hit a build issue for django.

From what I can see, it's related to the .env parsing code in django-environ, which is expecting a database connection string.

I'm able to reproduce the error in the Dockerfile now, so I'll fix that.

@mrchrisadams
Member Author

OK, I have docker building, but I'm seeing an error when running this, that I think is related to the fact that the project directory I am using doesn't normally use docker. I'm seeing this:


(lots of logs of filenames... )



Only files from the supplied repository are allowed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1672, in <module>
    run_id = runner.run()  # Start main code
             ^^^^^^^^^^^^
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1569, in run
    raise exc
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1566, in run
    self.stop_metric_providers()
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/runner.py", line 1243, in stop_metric_providers
    df = metric_provider.read_metrics(self._run_id, self.__containers)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/metric_providers/powermetrics/provider.py", line 135, in read_metrics
    raise e
  File "/Users/chrisadams/Code/tgwf/green-metrics-tool/metric_providers/powermetrics/provider.py", line 129, in read_metrics
    data = plistlib.loads(data)
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/plistlib.py", line 892, in loads
    return load(fp, fmt=fmt, dict_type=dict_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/plistlib.py", line 884, in load
    return p.parse(fp)
           ^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/plistlib.py", line 186, in parse
    self.parser.ParseFile(fileobj)
xml.parsers.expat.ExpatError: no element found: line 1110, column 0


<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0_o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Error: Base exception occured in runner.py

Exception (<class 'xml.parsers.expat.ExpatError'>): no element found: line 1110, column 0
Run_id (<class 'uuid.UUID'>): bbc17e60-ceb7-4b05-8969-003a035cae2e

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 0_o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    

Working from a totally fresh checkout should also simulate how the hosted system would work.

@mrchrisadams
Member Author

OK, I think I have this in better shape, but I'm seeing an error in the usage scenario that I don't see when running docker compose. I'm wondering if it's linked to a difference in how environment variables are passed into the container between GMT and docker compose.

Here's the log output:

Setting up container:  django 
Resetting container
Creating container
Setting ports:  ['9000:9000']
Waiting for dependent container mariadb
State of container 'mariadb': running
Waiting for dependent container rabbitmq
State of container 'rabbitmq': running
Running docker run with: docker run -it -d --name django -v /Users/chrisadams/Code/tgwf/gmt-testing-admin-portal:/tmp/repo:ro -v .:/app -p 9000:9000 -e LOTS_OF_SETTINGS network greenwebapp_gmt_run_tmp
Stdout: 56b64e137d8ae9f5883dc3502e4f9ca4268cd9ff68fad2f6b04218c382aaf9d6
 
Setting up container:  test-container 
Resetting container
Creating container
Waiting for dependent container django
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
Waiting for 1 second

We see django stuck in a failing loop. The output logs show this:

56b64e137d8ae9f5883dc3502e4f9ca4268cd9ff68fad2f6b04218c382aaf9d6_:
stdout: /bin/sh: 1: gunicorn: not found

How does GMT handle paths and run the final docker command

In the docker container we set PATH like this - it's pretty much what the source ./venv/bin/activate script does when you set up a virtual environment in python:

ENV PATH=$VIRTUAL_ENV/bin:$PATH \
    PYTHONPATH=/app

This sets up the PATH so that when we call gunicorn, our web server, like this, we don't need to do the source ./venv/bin/activate thing.
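To make the PATH lookup concrete, here is a minimal sketch of how a command name resolves only when its directory is on PATH. The fake gunicorn script and the temporary directories are stand-ins created for illustration, and `find_on_path` is a hypothetical helper; nothing here touches the real container.

```python
import os
import shutil
import stat
import tempfile
from typing import Optional


def find_on_path(command: str, path: str) -> Optional[str]:
    """Resolve a command against an explicit PATH string, much as a shell would."""
    return shutil.which(command, path=path)


# Build a throwaway stand-in for a virtualenv's bin directory.
with tempfile.TemporaryDirectory() as venv_bin:
    fake_gunicorn = os.path.join(venv_bin, "gunicorn")
    with open(fake_gunicorn, "w") as handle:
        handle.write("#!/bin/sh\n")
    # Mark the stand-in as executable so the lookup accepts it.
    os.chmod(fake_gunicorn, os.stat(fake_gunicorn).st_mode | stat.S_IXUSR)

    empty_dir = tempfile.mkdtemp()
    # With the venv bin directory on PATH the command resolves...
    found = find_on_path("gunicorn", os.pathsep.join([venv_bin, empty_dir]))
    # ...and without it the lookup returns None ("not found").
    missing = find_on_path("gunicorn", empty_dir)
    os.rmdir(empty_dir)
```

This is the same mechanism that the ENV PATH line in the Dockerfile relies on: if the directory holding gunicorn is absent or hidden at run time, the shell reports "not found".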

The final line of the Dockerfile looks like this:

# Use the shell form of CMD, so we have access to our environment variables
# $GUNICORN_CMD_ARGS allows us to add additional arguments to the gunicorn command
CMD gunicorn greenweb.wsgi \
  --bind $GUNICORN_BIND_IP:$PORT \
  --config gunicorn.conf.py \
  $GUNICORN_CMD_ARGS

In this case here, we are using environment variables laid out in .env.docker, as referred to in part of the docker compose file (see the subsection below):

django:
    env_file:
    - path: ./.env.docker
    build:
      context: .
      dockerfile: Dockerfile
    container_name: greenweb-app
    image: greenweb-app
    expose:
      - 9000
    ports:
      - 9000:9000

@arne - any idea why this might be happening?

There is a /bin/sh binary in the docker container, but when I run it, or any other shell, I see that it is respecting the PATH:

docker compose run --rm django /bin/dash gunicorn # finds the binary
docker compose run --rm django /bin/sh gunicorn # finds the binary
docker compose run --rm django /bin/bash gunicorn # finds the binary

Yet when I run GMT, the output suggests a different shell, or different variables not being available:

stdout: /bin/sh: 1: gunicorn: not found

Any ideas how to resolve this?

@ArneTR
Contributor

ArneTR commented May 15, 2024

Compose understands the ENV directive but the docker CLI client does not.

When executing commands with GMT we use docker CLI.

Can you just change the command to use the actual full path?

Something like /bin/sh PATH_TO_VENV/venv/bin/gunicorn ?

@mrchrisadams
Member Author

sure, I'll give it a go now

@mrchrisadams
Member Author

Hmm.. this didn't seem to resolve it

stdout: /bin/sh: 1: /app/.venv/bin/gunicorn: not found

I have one question here @arne:

ENV directive but the docker CLI client does not

Is there a link you know that outlines which directives are or are not supported, and which version of docker is in use under the hood?

I'm asking as for the container to make it to the gunicorn stage, I think it must have been able to run other commands that would have relied on access to values defined in the ENV section, and it might just be the final command that works differently.

Ideally, I'd be able to get into the generated container, so I can see what is going on, but it's not obvious to me how.

Alternatively, if these problems are related to the build stage, another approach might be to fetch the built image from an image registry.

The published images are not on Docker Hub, but on Scaleway's image registry.

@ArneTR
Contributor

ArneTR commented May 15, 2024

We have a debugging mode especially for that. Just append the --debug flag. Then after every stage the GMT halts and you can just use a docker exec -it CONTAINER_NAME bash to peek into the container.

Also to speed up your iteration you can use --dev-no-build, --dev-no-metrics, --dev-no-sleeps which will speed up runs tremendously and get you faster to a point of failure.

If you want even more debugging you can use --dev-flow-timetravel which will create a kind of hot reload. The GMT will break when an error of any kind happens during the flows. Since all data from a local repository is just mounted into the containers, you can change the files on disk and the changes will appear directly inside of the container. => You have to be in the flow part already and not in the container boot phase ... but I thought I'd mention it still.

=> My offer still stands: If you just can't get it running I can also give it a look and see why the command is not working. I however want to encourage you to play more with the GMT ;) Feel free though to hit me up if you feel that your interest for learning is exhausted

@mrchrisadams
Member Author

mrchrisadams commented May 16, 2024

Thanks for these, @ArneTR. These are the commands I was running yesterday.

/Users/chrisadams/Code/tgwf/gmt-testing-admin-portal is a totally fresh checkout of this repo, and this branch.

python3 runner.py \
  --uri /Users/chrisadams/Code/tgwf/gmt-testing-admin-portal \
  --name testing-fresh-greenweb  \
  --allow-unsafe \
  --print-logs \
  --dev-no-metrics \
  --dev-no-sleeps
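When switching between fast iteration runs and debug runs, it can help to assemble the invocation in one place. A minimal sketch, assuming it is run from inside the GMT checkout: `build_runner_command` is a hypothetical helper, the project path is a placeholder, and the only flags used are ones that appear in this thread.

```python
import shlex


def build_runner_command(uri, name, debug=False, fast=True):
    """Assemble a runner.py invocation using only flags mentioned in this thread."""
    command = ["python3", "runner.py", "--uri", uri, "--name", name,
               "--allow-unsafe", "--print-logs"]
    if fast:
        # Skip metric collection and sleeps while the scenario syntax is in flux.
        command += ["--dev-no-metrics", "--dev-no-sleeps"]
    if debug:
        # Per the thread, GMT halts after every stage when this flag is set.
        command.append("--debug")
    return command


# The path below is a placeholder, not a real project location.
command = build_runner_command("/path/to/your/project", "testing-fresh-greenweb", debug=True)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues with paths that contain spaces.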

I had discovered --debug when looking through the code before, and had tried that, but you're right, I'll give it another go to try inspecting the state of the containers.

From what I can see, it looks like the build is happening with this kaniko-project/executor:latest image, which isn't the usual docker binary, and is a piece of software I'm less familiar with. I'm assuming kaniko-project is a reference to the project of the same name, which allows you to run docker builds inside an existing docker context.

I'm basing this on the code I see here in the runner:

context, dockerfile = self.get_build_info(service)
print(f"Building {service['image']}")
self.__notes_helper.add_note({'note': f"Building {service['image']}", 'detail_name': '[NOTES]', 'timestamp': int(time.time_ns() / 1_000)})

# Make sure the context docker file exists and is not trying to escape some root. We don't need the returns
context_path = join_paths(self._folder, context, 'directory')
join_paths(context_path, dockerfile, 'file')

docker_build_command = ['docker', 'run', '--rm',
    '-v', f"{self._folder}:/workspace:ro", # this is the folder where the usage_scenario is!
    '-v', f"{temp_dir}:/output",
    'gcr.io/kaniko-project/executor:latest',
    f"--dockerfile=/workspace/{context}/{dockerfile}",
    '--context', f'dir:///workspace/{context}',
    f"--destination={tmp_img_name}",
    f"--tar-path=/output/{tmp_img_name}.tar",
    '--no-push']

if self.__docker_params:
    docker_build_command[2:2] = self.__docker_params

print(' '.join(docker_build_command))

ps = subprocess.run(docker_build_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, encoding='UTF-8', check=False)

if ps.returncode != 0:
    print(f"Error: {ps.stderr} \n {ps.stdout}")
    raise OSError(f"Docker build failed\nStderr: {ps.stderr}\nStdout: {ps.stdout}")

# import the docker image locally
image_import_command = ['docker', 'load', '-q', '-i', f"{temp_dir}/{tmp_img_name}.tar"]
print(' '.join(image_import_command))
ps = subprocess.run(image_import_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, encoding='UTF-8', check=False)

This is from:
https://github.com/green-coding-solutions/green-metrics-tool/blob/main/runner.py#L566-L599

I'll try inspecting the running container some more to see if there is something I missed.

@mrchrisadams
Member Author

mrchrisadams commented May 16, 2024

It just occurred to me.

When I'm inspecting the container in the django build step, am I not actually inspecting the kaniko container here?

The build step doesn't seem to be calling docker build from the host machine:

docker_build_command = ['docker', 'run', '--rm',
    '-v', f"{self._folder}:/workspace:ro", # this is the folder where the usage_scenario is!
    '-v', f"{temp_dir}:/output",
    'gcr.io/kaniko-project/executor:latest',
    f"--dockerfile=/workspace/{context}/{dockerfile}",
    '--context', f'dir:///workspace/{context}',
    f"--destination={tmp_img_name}",
    f"--tar-path=/output/{tmp_img_name}.tar",
    '--no-push']

So presumably, my final django build artefact is in the tarball that gets loaded in next, right?

# import the docker image locally
image_import_command = ['docker', 'load', '-q', '-i', f"{temp_dir}/{tmp_img_name}.tar"]
print(' '.join(image_import_command))
ps = subprocess.run(image_import_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, encoding='UTF-8', check=False)

If that's the case that would explain why the container I inspected looked so different to the one I was expecting to see.

@mrchrisadams
Member Author

I think I'm able to finally inspect the django container to see how the build phase worked differently to how it worked with docker compose.

cp /tmp/green-metrics-tool/docker_images/greenwebapp_gmt_run_tmp.tar ./image_to_inspect.tar
docker image load -i ./image_to_inspect.tar
docker run -it --rm greenwebapp_gmt_run_tmp:latest bash

The contents of this look more familiar to me at least now.
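For checking only what PATH or CMD ended up in the built image, it may be possible to read that from the tarball without starting a container. A minimal sketch, assuming the tarball follows the `docker save` style layout (a top-level manifest.json that names a JSON config file in the same archive); `read_image_config` is a hypothetical helper, and the layout has not been verified against kaniko's output in this thread.

```python
import json
import tarfile


def read_image_config(tar_path: str) -> dict:
    """Return the 'config' section (Env, Cmd, ...) of the first image in a tarball.

    Assumes a `docker save` style layout: manifest.json lists each image and
    points at that image's JSON config file inside the archive.
    """
    with tarfile.open(tar_path) as archive:
        manifest = json.load(archive.extractfile("manifest.json"))
        config_name = manifest[0]["Config"]
        image = json.load(archive.extractfile(config_name))
    return image.get("config", {})
```

With the file copied as above, something like `read_image_config("./image_to_inspect.tar").get("Env")` would list the environment baked into the image, if the layout assumption holds.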

@mrchrisadams
Member Author

I've been able to inspect the image in the tarball that is generated by the kaniko container, the image is called greenwebapp_gmt_run_tmp:latest

Running it without any environment variables passed in gives me this response:

docker run -it --rm \
    greenwebapp_gmt_run_tmp:latest

Output is:

Error: '' is not a valid port number.

That's to be expected, as the final line is this, and it relies on environment variables being injected via the usage_scenario.yml file:

CMD /app/.venv/bin/gunicorn greenweb.wsgi --bind $GUNICORN_BIND_IP:$PORT --config gunicorn.conf.py $GUNICORN_CMD_ARGS

When I run it with the same environment variables as listed in the usage_scenario.yml file, I see it working

docker run -it --rm \
    -e VIRTUAL_ENV=/app/.venv \
    -e PYTHONPATH=/app \
    -e PATH=/app/.venv/bin:$PATH \
    -e PORT=9000 \
    -e GUNICORN_BIND_IP=0.0.0.0 \
    -e PYTHONDONTWRITEBYTECODE=1 \
    -e PYTHONUNBUFFERED=1 \
    -e DATABASE_URL=mysql://deploy:deploy@db:3306/greencheck \
    -e DATABASE_URL_READ_ONLY=mysql://deploy:deploy@db:3306/greencheck \
    -e RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/ \
    -e DJANGO_SETTINGS_MODULE=greenweb.settings.development \
    greenwebapp_gmt_run_tmp:latest
    

Output is what I expect:

[2024-05-16 09:04:21 +0000] [7] [INFO] Starting gunicorn 22.0.0
[2024-05-16 09:04:21 +0000] [7] [INFO] Listening at: http://0.0.0.0:9000 (7)
[2024-05-16 09:04:21 +0000] [7] [INFO] Using worker: sync
[2024-05-16 09:04:21 +0000] [8] [INFO] Booting worker with pid: 8
[2024-05-16 09:04:21 +0000] [9] [INFO] Booting worker with pid: 9
[2024-05-16 09:04:21 +0000] [10] [INFO] Booting worker with pid: 10
[2024-05-16 09:04:21 +0000] [11] [INFO] Booting worker with pid: 11
[2024-05-16 09:04:21 +0000] [12] [INFO] Booting worker with pid: 12
[2024-05-16 09:04:21 +0000] [13] [INFO] Booting worker with pid: 13
[2024-05-16 09:04:21 +0000] [14] [INFO] Booting worker with pid: 14
[2024-05-16 09:04:21 +0000] [15] [INFO] Booting worker with pid: 15


# this is me quitting the container and sending a signal to quit with ctrl C

^C[2024-05-16 09:04:35 +0000] [7] [INFO] Handling signal: int
[2024-05-16 09:04:35 +0000] [10] [INFO] Worker exiting (pid: 10)
[2024-05-16 09:04:35 +0000] [8] [INFO] Worker exiting (pid: 8)
[2024-05-16 09:04:35 +0000] [9] [INFO] Worker exiting (pid: 9)
[2024-05-16 09:04:35 +0000] [13] [INFO] Worker exiting (pid: 13)
[2024-05-16 09:04:35 +0000] [14] [INFO] Worker exiting (pid: 14)
[2024-05-16 09:04:35 +0000] [11] [INFO] Worker exiting (pid: 11)
[2024-05-16 09:04:35 +0000] [15] [INFO] Worker exiting (pid: 15)
[2024-05-16 09:04:35 +0000] [12] [INFO] Worker exiting (pid: 12)
[2024-05-16 09:04:36 +0000] [7] [INFO] Shutting down: Master

Here's the version number reported by docker when I call it:

docker --version
Docker version 24.0.6, build ed223bc

And this is where the binary lives:

which docker
/usr/local/bin/docker

I'll post some more when I look at it again in the afternoon.

@ArneTR
Contributor

ArneTR commented May 16, 2024

You sure are traversing an interesting path with the GMT that even I, I think, have never taken :)

Inspecting the tarball is something you should never need to do. When you use --debug you can just enter the container as it is running on the system and see any error it gives directly.

Maybe I have not really understood in which step you are stuck, so I am trying a more detailed breakdown of the debug options:

  • If the container does not build it is usually because something is wrong with the dockerfile, as we use docker build for building there should be at least no strong anomalies.
  • Once you have the container built you should be able to inspect it, as it exists on the host system. As mentioned before you might need to set the --dev-no-build flag, which will trigger no re-builds but rather keep a cached copy that you can enter from the command line.

Kaniko is not something to worry about as it is just a sandbox we put around the build process that ensures some reproducible builds in terms of benchmarking time and some security benefits. It will produce an image on the host system though. If you however expect the build process to somehow get information from the host system this will induce a problem as this is exactly what we are trying to isolate. Everything needed for the container must be in the Dockerfile and the filesystem context.

From what I understand, you can pass the build step but the container is not booting, correct?

You can see the command that is executed by GMT in order to start the container in the CLI logs. When I run with this command: python3 runner.py --uri ~/Sites/admin-portal --dev-no-build --dev-no-sleeps --dev-no-metrics --name test --skip-system-checks --skip-unsafe I get:

...
State of container 'rabbitmq': running
Running docker run with: docker run -it -d --name django -v /Users/arne/Sites/admin-portal:/tmp/repo:ro --mount type=bind,source=/Users/arne/Sites/admin-portal,target=/app,readonly -e VIRTUAL_ENV=/app/.venv -e PYTHONPATH=/app -e PATH=/app/.venv/bin:$PATH -e PORT=9000 -e GUNICORN_BIND_IP=0.0.0.0 -e PYTHONDONTWRITEBYTECODE=1 -e PYTHONUNBUFFERED=1 -e DATABASE_URL=mysql://deploy:deploy@db:3306/greencheck -e DATABASE_URL_READ_ONLY=mysql://deploy:deploy@db:3306/greencheck -e RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/ -e DJANGO_SETTINGS_MODULE=greenweb.settings.development --net greencheck-network greenwebapp_gmt_run_tmp

Setting up container:  test-container
Resetting container
Creating container
Waiting for dependent container django
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
Waiting for 1 second
State of container 'django': exited
...

So I can reproduce that the container fails to boot.

If you want to inspect the container though you can just copy the command from the GMT and boot it yourself:

docker run --rm -it --name django -v /Users/arne/Sites/admin-portal:/tmp/repo:ro --mount type=bind,source=/Users/arne/Sites/admin-portal,target=/app,readonly -e VIRTUAL_ENV=/app/.venv -e PYTHONPATH=/app -e PATH=/app/.venv/bin:$PATH -e PORT=9000 -e GUNICORN_BIND_IP=0.0.0.0 -e PYTHONDONTWRITEBYTECODE=1 -e PYTHONUNBUFFERED=1 -e DATABASE_URL=mysql://deploy:deploy@db:3306/greencheck -e DATABASE_URL_READ_ONLY=mysql://deploy:deploy@db:3306/greencheck -e RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/ -e DJANGO_SETTINGS_MODULE=greenweb.settings.development greenwebapp_gmt_run_tmp bash

Note that:

  • I added --rm to make it temporary
  • I removed -d as I want interactive
  • I removed --net greencheck-network as the network is not existent after GMT closes
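The three changes listed above are mechanical, so they can be sketched as a small transformation over the logged argument list. `make_interactive` is a hypothetical helper, not part of GMT, and it assumes the list starts with "docker", "run" and that the network is passed as a separate `--net NAME` pair as in the log above.

```python
def make_interactive(gmt_args, shell="bash"):
    """Turn the args of GMT's logged `docker run` into an interactive debug command.

    Drops -d and the --net pair, adds --rm, then appends a shell to run.
    Assumes gmt_args begins with ["docker", "run", ...].
    """
    result = []
    skip_next = False
    for arg in gmt_args:
        if skip_next:       # value that belonged to a dropped --net flag
            skip_next = False
            continue
        if arg == "-d":     # detached mode is not wanted when debugging
            continue
        if arg == "--net":  # the temporary network is gone after GMT exits
            skip_next = True
            continue
        result.append(arg)
    # Insert --rm right after "docker run" so the container is temporary.
    result.insert(2, "--rm")
    return result + [shell]


# Shortened version of the logged command, with the -e and -v options left out.
logged = ["docker", "run", "-it", "-d", "--name", "django",
          "--net", "greencheck-network", "greenwebapp_gmt_run_tmp"]
print(" ".join(make_interactive(logged)))
```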

I can then enter the container and see what is going on. Indeed the $PATH variable is not correctly set for the container and gunicorn cannot be found. Even when I start the venv and start gunicorn via CLI I get a broken response as some files seem to be missing.

The reason for that is that you are shadowing your filesystem by issuing a mount different to how you do it in the compose.yml

Note the different volume mounts:

# compose.yml
    volumes:
      - ./apps:/app/apps
      - ./greenweb:/app/greenweb

#usage_scenario.yml
  volumes:
    - .:/app

By mounting everything into /app you also shadow the .venv directory and thus gunicorn is unknown in the container.
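The shadowing effect can be checked with plain path logic: a bind mount hides any image path at or below its target. A minimal sketch follows; `shadowed_paths` is a hypothetical helper, and the gunicorn path is the one from the error message earlier in the thread.

```python
from pathlib import PurePosixPath


def shadowed_paths(mount_targets, image_paths):
    """Return image paths hidden by a bind mount at or above them."""
    hidden = []
    for image_path in map(PurePosixPath, image_paths):
        for target in map(PurePosixPath, mount_targets):
            # A mount hides its target and everything underneath it.
            if image_path == target or target in image_path.parents:
                hidden.append(str(image_path))
                break
    return hidden


venv_gunicorn = ["/app/.venv/bin/gunicorn"]
# usage_scenario.yml mount: hides the venv that was built into the image.
print(shadowed_paths(["/app"], venv_gunicorn))
# compose.yml mounts: only subdirectories are replaced, so the venv survives.
print(shadowed_paths(["/app/apps", "/app/greenweb"], venv_gunicorn))
```

The first call reports the gunicorn path as hidden and the second reports nothing, which matches the difference in behaviour between the two files.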

To my understanding you do not need the mounts at all, since you copy everything into the container in the Dockerfile anyway, right? I suppose the volume mounts in the compose.yml exist only for your local debugging purposes and for having files available to change in the running containers ...

So in summary:

  • Remove the volume loading for - .:/app from the usage_scenario.yml
  • Change the copy commands in the test-container to use the mountpoint of the repo:
    setup-commands:
      - cp /tmp/repo/green_metric_tests/greencheck_test.spec.js .
      - cp /tmp/repo/green_metric_tests/package.json .

Then for me at least I can reach the flow and the npm tests execute.

I am getting an error though which might be macOS related ... unsure. Might also be that the user in the container has wrong access rights ...?

Error:

Exception (<class 'RuntimeError'>): Process '['docker', 'exec', 'test-container', 'npm', 'test']' had bad returncode: 1. Stderr:
 Exception during run: Error: EPERM: operation not permitted, scandir '/proc/1/map_files/400000-5bdd000'
    at Object.readdirSync (node:fs:1509:26)
    at GlobSync._readdir (/node_modules/glob/sync.js:288:46)
    at GlobSync._readdirInGlobStar (/node_modules/glob/sync.js:267:20)
    at GlobSync._readdir (/node_modules/glob/sync.js:276:17)
    at GlobSync._processReaddir (/node_modules/glob/sync.js:137:22)
    at GlobSync._process (/node_modules/glob/sync.js:132:10)
    at GlobSync._processGlobStar (/node_modules/glob/sync.js:380:10)
    at GlobSync._process (/node_modules/glob/sync.js:130:10)
    at GlobSync._processGlobStar (/node_modules/glob/sync.js:383:10)
    at GlobSync._process (/node_modules/glob/sync.js:130:10)
    at GlobSync._processGlobStar (/node_modules/glob/sync.js:383:10)
    at GlobSync._process (/node_modules/glob/sync.js:130:10)
    at GlobSync._processGlobStar (/node_modules/glob/sync.js:383:10)
    at GlobSync._process (/node_modules/glob/sync.js:130:10)
    at new GlobSync (/node_modules/glob/sync.js:45:10)
    at Function.globSync [as sync] (/node_modules/glob/sync.js:23:10)
    at lookupFiles (/node_modules/mocha/lib/cli/lookup-files.js:90:15)
    at /node_modules/mocha/lib/cli/collect-files.js:36:39
    at Array.reduce (<anonymous>)
    at module.exports (/node_modules/mocha/lib/cli/collect-files.js:34:26)
    at singleRun (/node_modules/mocha/lib/cli/run-helpers.js:120:17)
    at exports.runMocha (/node_modules/mocha/lib/cli/run-helpers.js:190:10)
    at exports.handler (/node_modules/mocha/lib/cli/run.js:370:11)
    at /node_modules/yargs/build/index.cjs:443:71 {
  errno: -1,
  code: 'EPERM',
  syscall: 'scandir',
  path: '/proc/1/map_files/400000-5bdd000'
}

I hope this helps and you can continue from here.

P.S:

  • I added the info about debugging with --print-logs to https://docs.green-coding.io/docs/measuring/debugging/
  • The $PATH variable as you submit it will not be accepted by the GMT. It saw no issue with that at the moment but we do not allow any variables that contain a $. Since you already set an ENV in the Dockerfile there is no need to set it again in the compose.yml or usage_scenario.yml
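A quick pre-flight check for the `$` restriction mentioned above can be sketched as follows. `env_vars_with_dollar` is a hypothetical helper, not GMT's validation code, and it assumes the environment section has been parsed into a dict of names to values.

```python
def env_vars_with_dollar(environment: dict) -> list:
    """Return names of environment entries whose value contains a '$'."""
    return sorted(name for name, value in environment.items() if "$" in str(value))


# Values taken from the docker run command earlier in the thread.
environment = {
    "VIRTUAL_ENV": "/app/.venv",
    "PATH": "/app/.venv/bin:$PATH",
    "PORT": 9000,
}
print(env_vars_with_dollar(environment))
```

Any name this returns is a candidate for removal from usage_scenario.yml when the Dockerfile already sets it through ENV.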

@mrchrisadams
Member Author

mrchrisadams commented May 16, 2024

Ah, @ArneTR I think the shadowing of mounts sounds like it was the source of the issue.

It makes sense to me why that would happen. I'm also able to get to the same point, and reproduce the error you see with mocha / node js. I think that unblocks me 🎆

Thanks so much!

@mrchrisadams
Member Author

OK, I've got the usage_scenario.yml files setting up an environment and sending requests, and I'm seeing the readings in the running instance of GMT.

I think this is in shape to run in the calibrated setup, to at least give us some indicative readings via the hosted GMT service.

TLS is terminated by our reverse proxy server, Caddy
These are the equivalent of `-it` on the
command line in docker , but not compatible with
GMT
"description": "Integration tests for the Greencheck API, intended for use with the Green Metrics Tool from Green Code Berlin. (https://www.green-coding.io/projects/green-metrics-tool/)",
"main": "index.js",
"scripts": {
"test": "mocha '/*.spec.js'"
Member Author

When run in root, **/*.spec.js will try to read procfs on the default node container, which triggers the permissions error we saw
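The difference between a recursive and a non-recursive pattern can be shown with a small stand-in directory tree. This sketch uses Python's pathlib rather than the glob package that mocha uses, so it only illustrates the traversal behaviour; `find_specs` and the `proc_like` directory are made up for the example.

```python
import tempfile
from pathlib import Path


def find_specs(root: Path, recursive: bool) -> list:
    """List *.spec.js files under root, optionally descending into subdirectories."""
    pattern = "**/*.spec.js" if recursive else "*.spec.js"
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "greencheck_test.spec.js").write_text("// test\n")
    # Stand-in for a deep system directory that a recursive walk would enter.
    (root / "proc_like" / "1").mkdir(parents=True)
    (root / "proc_like" / "1" / "other.spec.js").write_text("// nested\n")

    flat = find_specs(root, recursive=False)
    deep = find_specs(root, recursive=True)

print(flat)
print(deep)
```

Only the recursive pattern walks into subdirectories, which is why running it from the filesystem root ends up scanning procfs.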

depends_on:
- django
setup-commands:
- cp /tmp/repo/green_metric_tests/greencheck_test.spec.js .
Member Author

We needed to add /tmp/repo to fetch this from the checked out repo on the GMT system.
