# Apache Solr Setup Guide for OHDSI WebAPI

The following guide describes how to set up and configure [Apache Solr](https://solr.apache.org/) for use with the OHDSI WebAPI. Apache Solr is used to improve the performance of the vocabulary search capabilities of WebAPI/Atlas.

## Install the Solr Windows Service

We will use Apache Procrun to wrap Solr 8.11.2 in a Windows service so that we can control startup and shutdown like other services. To do this, follow these steps:

- Download and install Apache Solr 8.11.2 from https://lucene.apache.org/solr/downloads.html. For this guide, we will use `E:\solr\solr-8.11.2` as the install location.
- Download Apache Commons Procrun for Windows (Procrun): http://www.apache.org/dist/commons/daemon/binaries/windows/. Extract the install to `E:\temp\procrun`.
- Copy `prunsrv.exe` from the procrun directory to `E:\solr\solr-8.11.2\bin`.
- Create a new file called `service.bat` in `E:\solr\solr-8.11.2\bin` and copy in the following text:

````
@echo off

set SERVICE_NAME=Solr
set SERVICE_HOME=E:\solr\solr-8.11.2
set PR_INSTALL=%SERVICE_HOME%\bin\prunsrv.exe

@REM Service Log Configuration
set PR_LOGPREFIX=%SERVICE_NAME%
set PR_LOGPATH=%SERVICE_HOME%\logs
set PR_STDOUTPUT=auto
set PR_STDERROR=auto
set PR_LOGLEVEL=Debug

@REM Startup Configuration
set PR_STARTUP=auto
set PR_STARTMODE=exe
set PR_STARTIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STARTPARAMS=start

@REM Shutdown Configuration
set PR_STOPMODE=exe
set PR_STOPIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STOPPARAMS=stop;-all

%PR_INSTALL% //IS/%SERVICE_NAME% ^
  --Description="Apache Solr 8.11.2" ^
  --DisplayName="%SERVICE_NAME%" ^
  --Install="%PR_INSTALL%" ^
  --Startup="%PR_STARTUP%" ^
  --LogPath="%PR_LOGPATH%" ^
  --LogPrefix="%PR_LOGPREFIX%" ^
  --LogLevel="%PR_LOGLEVEL%" ^
  --StdOutput="%PR_STDOUTPUT%" ^
  --StdError="%PR_STDERROR%" ^
  --StartMode="%PR_STARTMODE%" ^
  --StartImage="%PR_STARTIMAGE%" ^
  ++StartParams="%PR_STARTPARAMS%" ^
  --StopMode="%PR_STOPMODE%" ^
  --StopImage="%PR_STOPIMAGE%" ^
  ++StopParams="%PR_STOPPARAMS%"

if not errorlevel 1 goto installed
echo Failed to install "%SERVICE_NAME%" service. Refer to log in %PR_LOGPATH%
exit /B 1

:installed
echo The Service "%SERVICE_NAME%" has been installed
exit /B 0
````

**NOTE**: Adjust the `SERVICE_HOME` setting to match your install location.

Run a Windows Command Prompt in Administrator mode and then run `E:\solr\solr-8.11.2\bin\service.bat`. The service will be installed in the list of Windows Services as "Solr". Before moving forward, confirm that the service has been created but is not running.

## Creating the Solr core for WebAPI vocabulary search

**NOTE**: The name of the Solr core **must** match the vocabulary version you plan to use in ATLAS & WebAPI, with the space replaced by an underscore. For this example, the vocabulary version is "v5.0 17-JUN-19" and the corresponding folder name to hold this vocabulary is "v5.0_17-JUN-19". You should verify your vocabulary version by running the following query on the CDM(s) you plan to use with WebAPI:

`select vocabulary_version from vocabulary where vocabulary_id = 'None';`

If your vocabulary version and core name do not match, WebAPI will not find the Solr core and will continue to use the database when querying the vocabulary.

### Solr core creation

- Verify that the JAR files for your RDBMS are located in `E:\solr\solr-8.11.2\server\lib`; otherwise you will face issues when attempting to build the Solr core.
- Create 2 directories for the core:
  - `E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19`
  - `E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19\data`
- Copy the contents of `WebAPI\src\main\resources\solr` into `E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19`.

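The naming rule from the note above (the vocabulary version with its space replaced by an underscore) can be sketched as a small shell helper. This is only an illustration: the function name `core_name_for` is hypothetical and not part of WebAPI or Solr, and it assumes a POSIX shell is available (for example on the Linux host, or Git Bash on Windows).

```shell
# Hypothetical helper (not part of WebAPI or Solr): derive the Solr core name
# from a vocabulary_version value by replacing each space with an underscore.
core_name_for() {
  printf '%s\n' "$1" | tr ' ' '_'
}
```

For example, `core_name_for "v5.0 17-JUN-19"` prints `v5.0_17-JUN-19`, which is the directory and core name used in this guide.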

Next, make the following edits to the files in `E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19`:

- `data-config.xml`: Edit the data source & references in the query to match the database holding your vocabulary.
- `core.properties`: Edit the name to match the directory: `v5.0_17-JUN-19`
- `conf\solrconfig.xml`: Edit the connection information in the `/dataimport` request handler block to provide the details needed to connect to the database holding your vocabulary.

### Building the Solr core

Next, start the Solr Windows service and verify that you can connect to http://localhost:8983. If there are problems starting the service, review the logs found in `E:\solr\solr-8.11.2\server\logs`.

- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the `v5.0_17-JUN-19` core from the dropdown, and then in the sub-menu that appears, select `Dataimport` and click the 'Execute' button. The service will build the core from the concepts returned by the query in `data-config.xml`.
- Once the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is: `query:metformin`

#### WebAPI Configuration

In the `settings.xml` for WebAPI, add the following XML to your profile:

````
http://localhost:8983/solr
````

Recompile and deploy WebAPI.war. Once deployed, verify that the Solr service is available for search by going to the endpoint `WebAPI/info`.
The output will look similar to this:

````
{
  "version": "2.8.0",
  "buildInfo": {
    "artifactVersion": "WebAPI 2.8.0-SNAPSHOT",
    "build": "NA",
    "timestamp": "Thu Dec 12 16:53:22 UTC 2019"
  },
  "configuration": {
    "security": {
      "enabled": true
    },
    "vocabulary": {
      "cores": [
        "v5.0 17-JUN-19"
      ],
      "solrEnabled": true
    },
    "person": {
      "viewDatesPermitted": false
    },
    "heracles": {
      "smallCellCount": "5"
    }
  }
}
````

Note that in the JSON above, the `vocabulary` section shows that Solr is enabled and lists the core(s) that are available for search.

## Install the Solr Service on RedHat (tested on v7)

Download the binary and install it (sudo is needed):

```
cd /opt
# Download the binary (https://solr.apache.org/downloads.html) -- v8.11.1 tgz file
tar xzf solr-8.11.1.tgz solr-8.11.1/bin/install_solr_service.sh --strip-components=2
bash ./install_solr_service.sh solr-8.11.1.tgz
```

### Solr core creation

1. Verify that the JAR files for your RDBMS are located in `/opt/solr/server/lib`; otherwise you will face issues when attempting to build the Solr core.
2. Create 2 directories for the core:
   * `/var/solr/data/v5.0_12-FEB-21`
   * `/var/solr/data/v5.0_12-FEB-21/data`
3. Copy the contents of `WebAPI/src/main/resources/solr` into `/var/solr/data/v5.0_12-FEB-21`.
4. Next, make the following edits to the files in `/var/solr/data/v5.0_12-FEB-21`:
   - `data-config.xml`: Edit the data source & references in the query to match the database holding your vocabulary.
   - `core.properties`: Edit the name to match the directory: `v5.0_12-FEB-21`, and remove the `conf\\` prefix from the config and schema values.
   - `conf/solrconfig.xml`: Edit the connection information in the `/dataimport` request handler block to provide the details needed to connect to the database holding your vocabulary.
   - If using Spark, add this to the `/dataimport` block: ```true```

### Building the Solr core

Next, start the Solr service (`systemctl start solr`) and verify that you can connect to http://localhost:8983. If there are problems starting the service, review the logs found in `/var/solr/logs`.

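The connectivity check described above can also be done from the command line. The sketch below assumes Solr is listening on the default URL used in this guide and that `curl` is installed. The function names and the `SOLR_URL` variable are hypothetical. It calls Solr's CoreAdmin `STATUS` action and prints the response for inspection.

```shell
# Assumption: Solr is reachable at the default URL used in this guide.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build the CoreAdmin STATUS URL for one core (hypothetical helper name).
core_status_url() {
  printf '%s/admin/cores?action=STATUS&core=%s\n' "$SOLR_URL" "$1"
}

# Print the status response for a core.
# -f makes curl fail on HTTP errors; -sS hides progress but still shows errors.
# A successful response only proves that Solr is answering; read the returned
# status to confirm that the core itself was loaded.
show_core_status() {
  curl -fsS "$(core_status_url "$1")"
}
```

Once the service is running, `show_core_status v5.0_12-FEB-21` should return a status document. If the command fails, check the logs mentioned above.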

- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the `v5.0_12-FEB-21` core from the dropdown, and then in the sub-menu that appears, select Dataimport and click the 'Execute' button. The service will build the core from the concepts returned by the query in `data-config.xml`.
- Once the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. As a sample query, enter this in the q field: `query:metformin`

**Then, follow the rest of the WebAPI instructions in the [section above](#webapi-configuration).**

## Install SOLR using Docker

### On Windows/Mac:

Download the Docker Desktop app (https://www.docker.com/products/docker-desktop).

### On Debian/Ubuntu Linux:

```
sudo apt-get install docker.io
```

### On RHEL:

```
sudo yum install docker
```

#### Solr core creation

1. Create a directory to store the Dockerfile and configuration files. This directory is known as the Docker build context folder. Navigate to it in your command line interface.
2. Create a file named "Dockerfile" in this build context folder and use this as the content:

   ```
   FROM solr:8.11.1

   # argument variables to define
   ARG vocabulary_version
   ARG jdbc_file_name

   # copy the solr configset from WebAPI
   COPY --chown=solr /WebAPI/src/main/resources/solr /var/solr/data/$vocabulary_version

   # copy your JDBC file
   COPY --chown=solr $jdbc_file_name /opt/solr-8.11.1/server/lib/$jdbc_file_name
   ```

3. Clone the WebAPI Git repo (using your desired commit or release) into the Docker build context folder:

   ```
   git clone https://github.com/OHDSI/WebAPI.git
   ```

4. Copy your JDBC jar file (for connecting to your vocabulary's database platform) into the build context folder.
5. Next, make the following edits to the files in `/WebAPI/src/main/resources/solr`:
   - `data-config.xml`: Edit the data source & references in the query to match the database holding your vocabulary.
   - `core.properties`: Edit the name to match the vocabulary version (e.g. `v5.0_20-MAY-21`), and remove the `conf\\` prefix from the config and schema values.
   - `conf/solrconfig.xml`: Edit the connection information in the `/dataimport` request handler block to provide the details needed to connect to the database holding your vocabulary.
   - If using Spark, add this to the `/dataimport` block: ```true```

#### Build/create/start the docker container

6. Run the container build step, specifying the vocabulary version and the name of the JDBC file needed for connecting to your database platform:

   ```
   docker build --build-arg vocabulary_version=v5.0_20-MAY-21 --build-arg jdbc_file_name=SparkJDBC41.jar --no-cache -t solr .
   ```

7. Run the create step, which creates the container:

   ```
   docker create --restart=always --name=solr -p 8983:8983 -t solr
   ```

8. Start the container:

   ```
   docker start solr
   ```

9. Verify that the container has started by going to http://localhost:8983/solr/#/ (substitute the server name for localhost).

#### Building the Solr core

- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the vocabulary core from the dropdown, and then in the sub-menu that appears, select Dataimport and click the 'Execute' button. The service will build the core from the concepts returned by the query in `data-config.xml`.
- Once the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. As a sample query, enter this in the q field: `query:metformin`

**Then, follow the rest of the WebAPI configuration instructions in the [section above](#webapi-configuration).**
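
The core build and the sample query described in this section can also be driven over HTTP instead of through the Admin screen, which can be convenient for a container on a remote host. The sketch below is a rough command-line counterpart, not the guide's official procedure. It assumes the DataImportHandler is registered at `/dataimport` (as the `solrconfig.xml` edits above imply), that `curl` is installed, and that the core is named after the example vocabulary version. The function names and the `SOLR_URL` and `CORE` variables are hypothetical.

```shell
# Assumptions: default Solr URL from this guide and the example core name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-v5.0_20-MAY-21}"

# Build the URL for a DataImportHandler command such as full-import or status.
dataimport_url() {
  printf '%s/%s/dataimport?command=%s\n' "$SOLR_URL" "$CORE" "$1"
}

# Start a full import of the vocabulary into the core.
start_import() {
  curl -fsS "$(dataimport_url full-import)"
}

# Report the progress of a running or finished import.
import_status() {
  curl -fsS "$(dataimport_url status)"
}

# Run the sample query from this guide against the core.
# -G sends the data as a query string; --data-urlencode escapes it.
sample_query() {
  curl -fsS -G "$SOLR_URL/$CORE/select" --data-urlencode "q=query:metformin"
}
```

A typical sequence would be `start_import`, then `import_status` repeated until the import reports that it is finished, then `sample_query` to confirm that documents are returned.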