Containerize and publish gaia-core and gaia-db #340
base: main
Thanks for these fantastic contributions, Jared!
From what I can tell, both of the images build correctly. Thanks for fixing the gaia-core Dockerfile and updating the README in gaia-db.
Looks good to merge, please feel free to do so when you're ready.
(Ah, I now see a TODO... )
Hi @jshoughtaling and @kzollove, I just successfully built both images on a PC (no 'sudo'). This looks really cool, and it seems we could integrate it with some of the work in the /inst/gaiadb directory (the existing PostGIS Dockerfile) and inst/repository (some other work I was doing with the catalog). I also have a more up-to-date version of some of these files, including a start on networking between gaia-core, gaia-db, and the catalog. It is perhaps worth merging at some point, and considering a refactor of where to put the Dockerfiles in the repo. I am also thinking about Helm charts and Rancher for more cloud-native Kubernetes installs; this approach would be in addition to the Docker Compose approach. More on this later :) Thoughts?
@jshoughtaling to message Lee Evans ASAP about Broadsea integrations.
Thoughts on the current (working) process flow:
@jshoughtaling I am still ending up with Broadsea and Hades installed in R, but no gaiaCore. I am following the instructions above fairly closely:
docker build -t gaia-db-test -f docker/gaia-db/Dockerfile .;\
docker build -t gaia-core-test -f docker/gaia-core/Dockerfile .;\
docker run -itd --rm --env POSTGRES_PASSWORD=SuperSecret --name gaia-db gaia-db-test;\
docker run --rm -e USER="ohdsi" -e=PASSWORD="mypass" -itd -p 8787:8787 --name gaia-core gaia-core-test
I've tried this on WSL2 (on a Tufts laptop) and on a personal laptop (Intel Mac). I'm assuming it quietly fails when the image is built. When I try to install from inside the container, I get the following (sorry for the terrible formatting):
> remotes::install_github("OHDSI/GIS")
Downloading GitHub repo OHDSI/GIS@HEAD
Installing 2 packages: sf, rpostgis
Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/sf_1.0-17.tar.gz'
Content type 'binary/octet-stream' length 3863868 bytes (3.7 MB)
==================================================
downloaded 3.7 MB
trying URL 'https://packagemanager.rstudio.com/cran/__linux__/focal/latest/src/contrib/rpostgis_1.5.1.tar.gz'
Content type 'application/x-tar' length 1672998 bytes (1.6 MB)
==================================================
downloaded 1.6 MB
* installing *source* package ‘sf’ ...
** package ‘sf’ successfully unpacked and MD5 sums checked
** using staged installation
configure: CC: gcc
configure: CXX: g++ -std=gnu++14
checking for gdal-config... /usr/bin/gdal-config
checking gdal-config usability... yes
configure: GDAL: 3.0.4
checking GDAL version >= 2.0.1... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... none needed
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for gdal.h... yes
checking GDAL: linking with --libs only... yes
checking GDAL: /usr/share/gdal/pcs.csv readable... no
checking GDAL: checking whether PROJ is available for linking:... yes
checking GDAL: checking whether PROJ is available for running:... yes
configure: GDAL: 3.0.4
configure: pkg-config proj exists, will use it
configure: using proj.h.
configure: PROJ: 6.3.1
checking PROJ: checking whether PROJ and sqlite3 are available for linking:... yes
checking for geos-config... /usr/bin/geos-config
checking geos-config usability... yes
configure: GEOS: 3.8.0
checking GEOS version >= 3.4.0... yes
checking for geos_c.h... yes
checking geos: linking with -L/usr/lib/x86_64-linux-gnu -lgeos_c... yes
configure: Package CPP flags: -DHAVE_PROJ_H -I/usr/include/gdal -I/usr/include
configure: Package LIBS: -lproj -L/usr/lib -lgdal -L/usr/lib/x86_64-linux-gnu -lgeos_c
configure: creating ./config.status
config.status: creating src/Makevars
** libs
g++ -std=gnu++14 -I"/usr/local/lib/R/include" -DNDEBUG -DHAVE_PROJ_H -I/usr/include/gdal -I/usr/include -I'/usr/local/lib/R/site-library/Rcpp/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++14 -I"/usr/local/lib/R/include" -DNDEBUG -DHAVE_PROJ_H -I/usr/include/gdal -I/usr/include -I'/usr/local/lib/R/site-library/Rcpp/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c bbox.cpp -o bbox.o
g++ -std=gnu++14 -I"/usr/local/lib/R/include" -DNDEBUG -DHAVE_PROJ_H -I/usr/include/gdal -I/usr/include -I'/usr/local/lib/R/site-library/Rcpp/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c gdal.cpp -o gdal.o
gdal.cpp: In function ‘Rcpp::NumericVector CPL_transform_bounds(Rcpp::NumericVector, Rcpp::List, int)’:
gdal.cpp:713:9: error: ‘ret’ was not declared in this scope
  713 |   return ret;
      |         ^~~
make: *** [/usr/local/lib/R/etc/Makeconf:177: gdal.o] Error 1
ERROR: compilation failed for package ‘sf’
* removing ‘/usr/local/lib/R/site-library/sf’
ERROR: dependency ‘sf’ is not available for package ‘rpostgis’
* removing ‘/usr/local/lib/R/site-library/rpostgis’
The downloaded source packages are in ‘/tmp/Rtmpoc3MTG/downloaded_packages’
Running `R CMD build`...
* checking for file ‘/tmp/Rtmpoc3MTG/remotes15154f767f/OHDSI-GIS-71e4b83/DESCRIPTION’ ... OK
* preparing ‘gaiaCore’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
Omitted ‘LazyData’ from DESCRIPTION
* building ‘gaiaCore_0.0.0.9000.tar.gz’
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
ERROR: dependencies ‘rpostgis’, ‘sf’ are not available for package ‘gaiaCore’
* removing ‘/usr/local/lib/R/site-library/gaiaCore’
Warning messages:
1: In i.p(...) : installation of package ‘sf’ had non-zero exit status
2: In i.p(...) : installation of package ‘rpostgis’ had non-zero exit status
3: In i.p(...) : installation of package ‘/tmp/Rtmpoc3MTG/file1515702ae0/gaiaCore_0.0.0.9000.tar.gz’ had non-zero exit status
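One way to tell whether the image build silently skipped gaiaCore is to probe the built image directly instead of starting RStudio. This is a minimal sketch, assuming the gaia-core-test tag from the commands above and that Rscript is on the image's PATH; check_r_package is a hypothetical helper name, not something the repository provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns non-zero if an R package is missing from an image.
# Assumes Rscript is on the image's PATH (not confirmed in this thread).
check_r_package() {
  local image="$1" pkg="$2"
  # --entrypoint replaces the image's default startup command, so Rscript
  # runs once and the container exits with Rscript's status.
  docker run --rm --entrypoint Rscript "$image" \
    -e "stopifnot(requireNamespace('${pkg}', quietly = TRUE))"
}
```

For example, `check_r_package gaia-core-test gaiaCore && echo "gaiaCore is installed"` prints the message only when the package can be loaded, which makes a quiet build failure visible right after `docker build`.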
Launch build process
Fix issues blocking simple loadVariable operations
* Fix lat/lon on EPA datasources
* Add createExposure function
Pull TuftsCTSI/containerize into OHDSI/containerize
* modifications for Broadsea builds from github
* remove repository docker from inst
Soooo ... after playing for a while, both from Tufts/Broadsea_GISland#gaia-core and directly from OHDSI/GIS#containerize, here's a network observation and an error. But first: I can build the images from either repo with no errors, I can run R and import gaiaCore no problem, and (after a network tweak from OHDSI/GIS) I can connect to the database from R. Whoot!! As always, the last note at the very bottom is perhaps the most important ...
Network tweak for running containers from OHDSI/GIS
After the builds (directly from Kyle's note above), but before running the images, we need to create a docker network:
then we need to run the containers referencing that network:
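The commands for these two steps did not survive in the thread. The following is only a sketch of what such a setup could look like, assuming the image tags and credentials from the commands quoted earlier; the network name gaia-net and the function name start_gaia_on_network are placeholders, not the names actually used.

```shell
#!/usr/bin/env bash
# Placeholder network name (an assumption; the original name is not preserved).
GAIA_NET="${GAIA_NET:-gaia-net}"

start_gaia_on_network() {
  # Create the user-defined bridge network only if it does not exist yet.
  docker network inspect "$GAIA_NET" >/dev/null 2>&1 \
    || docker network create "$GAIA_NET"

  # Database container, attached to the shared network.
  docker run -itd --rm --network "$GAIA_NET" \
    --env POSTGRES_PASSWORD=SuperSecret \
    --name gaia-db gaia-db-test

  # RStudio/gaiaCore container on the same network.
  docker run -itd --rm --network "$GAIA_NET" \
    -e USER="ohdsi" -e PASSWORD="mypass" \
    -p 8787:8787 \
    --name gaia-core gaia-core-test
}
```

Containers on the same user-defined network can resolve each other by container name, so from inside gaia-core the database should be reachable at host gaia-db on the default Postgres port 5432.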
Error on trying to load a variable
From R, after loading the libraries, we must set the database connector (NOTE: the OHDSI/GIS#containerize build has the database name as "postgres", not "gaiaDB"):
Then using loadVariable() still seems to hit a bug:
I have tried this on several different variables, always with the same error. The SQL error that gets dumped is:
And finally, there are two things that look puzzling:
In the SQL statement it looks like there might be a blank TYPE:
Hi @tibbben, thanks for testing this! The loadVariable bug was handled earlier, but the changes to gaia-core/Dockerfile (i.e. to install gaiaCore from OHDSI/containerize) did not propagate to OHDSI/containerize until just now; thanks for catching that. Upon testing that update, I was getting a weird error with Andromeda. It looks like the Broadsea Hades image has not been updated in years. I fixed this in gaia-core/Dockerfile by just updating Andromeda, but I wonder if we shouldn't ask the Broadsea team to update that image? Anyway, I am going to push the current OHDSI/containerize to main at EOD unless anyone objects. We can continue to develop, but I want to change gaia-core/Dockerfile to install from OHDSI/main sooner rather than later. Also, see OHDSI/containerize/README.md/Getting Started for the most up-to-date install instructions, and feel free to add to it when ready.
It all works!! I loaded a variable from the catalog container (in Broadsea). Let me know when you move to the main branch and I can with the URL in OHDSI. I will add some README study soon (to the containerize branch).
This PR containerizes the main elements of the OHDSI GIS repository, namely gaia-core and gaia-db.
Note
The proposed CI/CD pipeline to build and push images with GitHub Actions relies on a user account with elevated permissions. The following lines will need to be updated in the workflow yml files to complete the build and push action:
* build_gaia_core.yml: 40
* build_gaia_db.yml: 44
A GitHub Actions secret named GH_TOKEN, containing a scoped personal access token for the user above, needs to be added to the repository so it can be referenced in the action.
Moreover, the process assumes these images will be hosted on the GitHub Container Registry (ghcr.io) rather than on Docker Hub. I have seen OHDSI images hosted in both places and am not sure what the current status or approach is in that regard. GHCR is cleaner with regard to authentication, IMO, but the code can be easily adapted to the docker.io space if needed. In either case, the image build pipeline requires elevated credentials.
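For anyone who wants to try the publish step by hand before wiring up the workflows, the following is a rough sketch of a manual build and push to GHCR. It is not the content of the workflow files. The function name push_to_ghcr and the GHCR_USER variable are hypothetical, GH_TOKEN is assumed to hold a personal access token allowed to write packages, and the Dockerfile paths follow the local build commands in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper; expects GH_TOKEN and GHCR_USER in the environment.
push_to_ghcr() {
  local owner image tag ref
  # Registry paths must be lowercase, so normalize the owner (e.g. OHDSI).
  owner="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  image="$2"
  tag="${3:-latest}"
  ref="ghcr.io/${owner}/${image}:${tag}"

  # Pass the token on stdin so it does not show up in the process list.
  printf '%s' "$GH_TOKEN" \
    | docker login ghcr.io -u "$GHCR_USER" --password-stdin || return 1

  # Build from the repository top level, then publish the tagged image.
  docker build -t "$ref" -f "docker/${image}/Dockerfile" . || return 1
  docker push "$ref"
}
```

Called as `push_to_ghcr OHDSI gaia-core`, this would tag and push `ghcr.io/ohdsi/gaia-core:latest`; the owner shown here is only an example, since where the images will finally live is still an open question above.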
GAIA-CORE
This image simply installs the gaia R package into a pinned (4.2.1) ohdsi/broadsea-hades base image. I have not added any sort of entrypoint; more work needs to be done to script processes that can load and subsequently geocode data in a single workflow. I will discuss further with @kzollove.
GAIA-DB
This image is based on the Alpine-flavored PostGIS base image. The initialization script for the database combines and modifies existing SQL scripts used in both the catalog initialization (via the backbone schema) and the vocabulary integration.
Once deployed and auto-initialized, the containerized Postgres database includes:
* the backbone schema
* the vocabulary schema
* the tiger schema
In order to build the images locally, clone this branch of the repo and run the following commands at the top level:
sudo docker build -t gaia-db-test -f docker/gaia-db/Dockerfile .
sudo docker build -t gaia-core-test -f docker/gaia-core/Dockerfile .
In order to run them locally after building, you can execute the following:
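The run commands themselves are not preserved at this point in the page. The sketch below reuses the run commands quoted in the review comments and adds a quick schema check; it assumes the default postgres superuser and the database name "postgres" reported later in the thread, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Start both containers using the image tags from the build step.
run_gaia_local() {
  docker run -itd --rm --env POSTGRES_PASSWORD=SuperSecret \
    --name gaia-db gaia-db-test
  docker run -itd --rm -e USER="ohdsi" -e PASSWORD="mypass" \
    -p 8787:8787 --name gaia-core gaia-core-test
}

# List schema names, one per line, to confirm that initialization finished.
# Assumes user "postgres" and database "postgres" (not confirmed in the PR text).
list_gaia_schemas() {
  docker exec gaia-db psql -U postgres -d postgres -At \
    -c "SELECT schema_name FROM information_schema.schemata ORDER BY 1;"
}
```

After `run_gaia_local`, RStudio should be available on port 8787, and `list_gaia_schemas` should include the schemas listed above once the initialization script has completed.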
Warning
I needed to update the staging vocabulary csv files in order to create a relationally consistent vocabulary schema. This meant adding certain entries and assigning 2B+ concept id values. I will need to discuss in more detail with @p-talapova to make sure these changes align with her efforts.
TODO
* docker-compose.yml file with both services and their associated configuration parameters
* CONCEPT_RELATIONSHIP entities here that reference codes in proprietary vocabularies