CEDA Near-Line Data Store

This is the server code for the CEDA Near-Line Data Store (NLDS), consisting of an HTTP API and a cluster of RabbitMQ consumer microservices. The NLDS client is required to communicate with the API, either via its command line interface or its Python client library.

The NLDS is a unified storage solution, allowing easy use of disk, S3 object storage, and tape from a single interface. It utilises object storage as a cache for the tape backend, allowing for low-latency backup.

The NLDS server is built upon FastAPI for the API, RabbitMQ for the message broker, MinIO for the S3 client, SQLAlchemy for the database client and XRootD for the tape interactions.

Documentation can be found here.

Installation

If installing locally, we strongly recommend using a virtual environment to manage the dependencies.

Create a Python virtual environment:
```
python3 -m venv nlds-venv
```
Activate the nlds-venv:
```
source nlds-venv/bin/activate
```
You can either install the nlds package in editable mode from a locally cloned copy of this repo (note the editable flag -e), e.g.
```
pip install -e ~/Coding/nlds
```
or install it directly from GitHub:
```
pip install git+https://github.com/cedadev/nlds.git
```
(Optional) Several additional requirements files are defined:
- requirements-dev.txt - contains development-specific (i.e. not production-appropriate) dependencies. Currently this consists of the precompiled binary build of the psycopg2 Python package, for interacting with PostgreSQL from a local NLDS instance.
- requirements-deployment.txt - contains deployment-specific dependencies, excluding XRootD. Currently this consists of the psycopg2 package but built from source instead of a precompiled binary.
- requirements-tape.txt - contains tape-specific dependencies, notably XRootD.
- tests/requirements.txt - contains the dependencies for the test suite.
- docs/requirements.txt - contains the dependencies required for building the documentation with sphinx.
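
After either installation route above, it can be useful to confirm that the package is visible inside the active virtual environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical, and the distribution name "nlds" is an assumption based on the package name used above; it may differ from the name declared in setup.py.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "nlds") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is not installed.

    Hypothetical helper; the default name "nlds" is an assumption.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("nlds does not appear to be installed in this environment")
    else:
        print(f"nlds {version} is installed")
```

Run it with the virtual environment activated, so that the check is made against the environment the package was installed into.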

Server Config

To interface with the JASMIN accounts portal for OAuth2 authentication, a .server_config file has to be created. This file contains infrastructure information and so is not included in the GitHub repository. See the relevant documentation and examples for more information.

A Jinja2 template for the .server_config file can also be found in the templates/ directory.

Running the Server

The NLDS API needs an ASGI server to serve it; in a local development environment this is usually uvicorn:
```
uvicorn nlds.main:nlds --reload
```
This will create a local NLDS API server at http://127.0.0.1:8000/. FastAPI displays automatically generated documentation for the REST API; to browse it, go to http://127.0.0.1:8000/docs/
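
As a quick sanity check that the development server is responding, the sketch below requests the documentation URL given above using only the Python standard library. The helper name `api_is_up` and the timeout value are illustrative choices and are not part of the NLDS codebase.

```python
import urllib.request

# URL of the auto-generated FastAPI docs for a local uvicorn run, as given above.
DOCS_URL = "http://127.0.0.1:8000/docs/"


def api_is_up(url: str = DOCS_URL, timeout: float = 2.0) -> bool:
    """Return True if `url` answers with an HTTP 2xx status (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all
        # subclasses of OSError, so any of them means "not reachable".
        return False


if __name__ == "__main__":
    print("NLDS API reachable:", api_is_up())
```
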
To run the microservices, you have three options:
1. In individual terminals, after activating the virtual environment (e.g. source ~/nlds-venv/bin/activate), start each of the microservice consumers:
```
nlds_q
index_q
catalog_q
transfer_put_q
transfer_get_q
logging_q
monitor_q
archive_put_q
archive_get_q
```
  This will send the output of each consumer to its own terminal (as well as whatever is configured in the logger).
2. Alternatively, you can use the scripts in the test_run/ directory: start_test_run.py to start and stop_test_run.py to stop. This starts a screen session containing all 8 processors (+ API server), sending each one's output to a log in the ./nlds_log/ directory. This method is good for getting a whole NLDS infrastructure up and running quickly, but is less suited to debugging.
3. Also in the test_run/ directory is a script called nlds-up, which activates the virtual environment and runs one of the microservices listed above. It requires the environment variable NLDS_VENV to point to the virtual environment created above (e.g. ~/nlds-venv/). User paths have to be expanded, however, so on a Mac this would be /Users/<your_username>/nlds-venv. It is recommended to set the environment variable in your shell profile. For bash this is ~/.bash_profile, and the variable can be set using the line:
```
export NLDS_VENV="/Users/<your_username>/python-venvs/nlds-venv"
```
nlds-up then takes the name of one of the microservices above as its argument, e.g. nlds-up catalog_q
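
The path handling that option 3 depends on can be sketched in Python. The snippet below reads NLDS_VENV, expands a leading ~ into a full home directory path, and builds the location of a named consumer inside the virtual environment. The helper name `consumer_path` is hypothetical, and the assumption that each consumer is installed as an executable under the virtual environment's bin/ directory is illustrative; it is not taken from the nlds-up script itself.

```python
import os
from pathlib import Path

# Consumer names as listed under option 1 above.
CONSUMERS = (
    "nlds_q", "index_q", "catalog_q", "transfer_put_q", "transfer_get_q",
    "logging_q", "monitor_q", "archive_put_q", "archive_get_q",
)


def consumer_path(name: str, env=None) -> Path:
    """Return the expected path of consumer `name` inside the NLDS_VENV environment.

    Hypothetical helper: assumes a POSIX-style venv layout with executables in <venv>/bin.
    """
    env = os.environ if env is None else env
    if name not in CONSUMERS:
        raise ValueError(f"unknown consumer: {name!r}")
    venv = env.get("NLDS_VENV")
    if not venv:
        raise RuntimeError("NLDS_VENV is not set")
    # os.path.expanduser() replaces a leading "~" with the user's home directory.
    return Path(os.path.expanduser(venv)) / "bin" / name


if __name__ == "__main__":
    # Pass an explicit mapping so the example runs even if NLDS_VENV is unset.
    print(consumer_path("catalog_q", {"NLDS_VENV": "~/nlds-venv"}))
```

Calling `consumer_path("catalog_q")` with no second argument reads NLDS_VENV from the real environment, and raises an error if the variable has not been exported.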

Tests

The NLDS uses pytest for its unit test suite. Once the dependencies in tests/requirements.txt have been installed, you can run the tests with

pytest

in the root directory. Pytest is also used for integration testing in the separate nlds-test repo.

The pytest test-coverage report can (hopefully) be found here.

License

The NLDS is available under a BSD 2-Clause License; see the license for more info.

Acknowledgements

NLDS was developed at the Centre for Environmental Data Analysis and supported through the ESiWACE2 project. The project ESiWACE2 has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 823988.
