Apache Solr Setup Guide
This guide describes how to set up and configure Apache Solr for use with the OHDSI WebAPI. Apache Solr is used to improve the performance of the vocabulary search capabilities of WebAPI/Atlas.
We will use Apache Procrun to wrap Solr 8.11.2 in a Windows service so that startup and shutdown can be controlled like any other service. To do this, follow these steps:
- Download and install Apache Solr 8.11.2 from https://lucene.apache.org/solr/downloads.html. For this guide, we will use E:\solr\solr-8.11.2 as the install location.
- Download Apache Commons Procrun for Windows (Procrun): http://www.apache.org/dist/commons/daemon/binaries/windows/. Extract the install to E:\temp\procrun.
- Copy prunsrv.exe from the procrun directory to E:\solr\solr-8.11.2\bin.
- Create a new file called service.bat in E:\solr\solr-8.11.2\bin and copy in the following text:
@echo off
set SERVICE_NAME=Solr
set SERVICE_HOME=E:\solr\solr-8.11.2
set PR_INSTALL=%SERVICE_HOME%\bin\prunsrv.exe
@REM Service Log Configuration
set PR_LOGPREFIX=%SERVICE_NAME%
set PR_LOGPATH=%SERVICE_HOME%\logs
set PR_STDOUTPUT=auto
set PR_STDERROR=auto
set PR_LOGLEVEL=Debug
set PR_STARTUP=auto
set PR_STARTMODE=exe
set PR_STARTIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STARTPARAMS=start
@REM Shutdown Configuration
set PR_STOPMODE=exe
set PR_STOPIMAGE=%SERVICE_HOME%\bin\solr.cmd
set PR_STOPPARAMS=stop;-all
%PR_INSTALL% //IS/%SERVICE_NAME% ^
--Description="Apache Solr 8.11.2" ^
--DisplayName="%SERVICE_NAME%" ^
--Install="%PR_INSTALL%" ^
--Startup="%PR_STARTUP%" ^
--LogPath="%PR_LOGPATH%" ^
--LogPrefix="%PR_LOGPREFIX%" ^
--LogLevel="%PR_LOGLEVEL%" ^
--StdOutput="%PR_STDOUTPUT%" ^
--StdError="%PR_STDERROR%" ^
--StartMode="%PR_STARTMODE%" ^
--StartImage="%PR_STARTIMAGE%" ^
++StartParams="%PR_STARTPARAMS%" ^
--StopMode="%PR_STOPMODE%" ^
--StopImage="%PR_STOPIMAGE%" ^
++StopParams="%PR_STOPPARAMS%"
if not errorlevel 1 goto installed
echo Failed to install "%SERVICE_NAME%" service. Refer to log in %PR_LOGPATH%
exit /B 1
:installed
echo The Service "%SERVICE_NAME%" has been installed
exit /B 0
NOTE: Adjust the SERVICE_HOME setting to match your install location.
Run a Windows Command Prompt in Administrator mode and then run E:\solr\solr-8.11.2\bin\service.bat. The service will be installed in the list of Windows Services as "Solr". Before moving forward, confirm that the service has been created but is not running.
NOTE: The name of the Solr core must match the vocabulary version you plan to use in ATLAS & WebAPI, with the space replaced by an underscore. For this example, the vocabulary version is "v5.0 17-JUN-19" and the corresponding folder name to hold this vocabulary is "v5.0_17-JUN-19". You should verify your vocabulary version by running the following query on the CDM(s) you plan to use with WebAPI:
select vocabulary_version from vocabulary where vocabulary_id = 'None';
If your vocabulary version and core do not match, WebAPI will not find the Solr core and it will continue to use the DB when querying the vocabulary.
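The naming rule in the note above can be sketched as a small shell function. This is only an illustration: `core_name_from_version` is a hypothetical helper name, and it assumes the only transformation needed is replacing spaces with underscores, as in the example in this guide.

```shell
# Hypothetical helper: derive the Solr core (and folder) name from a
# vocabulary version string by replacing every space with an underscore.
core_name_from_version() {
  printf '%s\n' "$1" | tr ' ' '_'
}

# Example using the vocabulary version from this guide.
core_name_from_version "v5.0 17-JUN-19"
```

Pass in the value returned by the vocabulary_version query above, and use the result for both the core directory name and the name in core.properties.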
- Verify that the JDBC driver JAR files for your RDBMS are located in E:\solr\solr-8.11.2\server\lib; otherwise you will face issues when attempting to build the Solr core.
- Create 2 directories for the core:
E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19
E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19\data
- Copy the contents of WebAPI\src\main\resources\solr into E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19.
- Next make the following edits to the files in E:\solr\solr-8.11.2\server\solr\v5.0_17-JUN-19:
- data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary.
- core.properties: Edit the name to match the directory: v5.0_17-JUN-19
- conf\solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block to provide the details needed to connect to the database holding your vocabulary.
Next, start the Solr Windows service and verify that you can connect to http://localhost:8983. If there are problems starting the service, review the logs found in E:\solr\solr-8.11.2\server\logs.
- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the v5.0_17-JUN-19 core from the dropdown and then, in the sub-menu that appears, select Dataimport and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
- Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is:
query:metformin
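The same check can also be run from a command line instead of the Admin UI. The sketch below builds the request URL only. It assumes the core exposes Solr's usual /select request handler, which this guide does not confirm, and `solr_query_url` is a hypothetical helper name. The `query` field comes from the sample query above.

```shell
# Hypothetical helper: build a search URL against a core's /select handler.
# Assumes the search term contains no characters that need URL encoding.
solr_query_url() {
  base_url="$1"   # e.g. http://localhost:8983/solr
  core="$2"       # e.g. v5.0_17-JUN-19
  term="$3"       # e.g. metformin
  printf '%s/%s/select?q=query:%s\n' "$base_url" "$core" "$term"
}

# Usage (requires curl and a running Solr instance):
#   curl -s "$(solr_query_url http://localhost:8983/solr v5.0_17-JUN-19 metformin)"
```

A response listing matching documents suggests the core was indexed. An empty result set usually means the import has not run or did not find any concepts.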
In the settings.xml for WebAPI, add the following XML to your profile:
<solr.endpoint>http://localhost:8983/solr</solr.endpoint>
Recompile and deploy WebAPI.war. Once deployed, verify that the Solr service is available for search by going to the endpoint WebAPI/info. The output will look similar to this:
{
  "version": "2.8.0",
  "buildInfo": {
    "artifactVersion": "WebAPI 2.8.0-SNAPSHOT",
    "build": "NA",
    "timestamp": "Thu Dec 12 16:53:22 UTC 2019"
  },
  "configuration": {
    "security": {
      "enabled": true
    },
    "vocabulary": {
      "cores": [
        "v5.0 17-JUN-19"
      ],
      "solrEnabled": true
    },
    "person": {
      "viewDatesPermitted": false
    },
    "heracles": {
      "smallCellCount": "5"
    }
  }
}
Note that in the JSON above, the vocabulary section shows that Solr is enabled and lists the core(s) that are available for search.
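This check can be scripted as well. The sketch below is a plain text match on the `solrEnabled` flag rather than a real JSON parse, and `solr_enabled` is a hypothetical helper name. The WebAPI host and port in the usage comment are placeholders for your own deployment.

```shell
# Hypothetical helper: read the WebAPI/info JSON from stdin and succeed
# (exit status 0) only if it contains "solrEnabled": true.
solr_enabled() {
  grep -Eq '"solrEnabled"[[:space:]]*:[[:space:]]*true'
}

# Usage (placeholder URL; requires curl and a deployed WebAPI):
#   curl -s "http://localhost:8080/WebAPI/info" | solr_enabled \
#     && echo "Solr search is enabled"
```

If a JSON tool such as jq is available, querying `.configuration.vocabulary.solrEnabled` directly would be more robust than a text match.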
Linux setup: download the binary and install it (sudo is needed):
cd /opt
# Download the binary (https://solr.apache.org/downloads.html) -- v8.11.1 tgz file
tar xzf solr-8.11.1.tgz solr-8.11.1/bin/install_solr_service.sh --strip-components=2
bash ./install_solr_service.sh solr-8.11.1.tgz
- Verify that the JDBC driver JAR files for your RDBMS are located in /opt/solr/server/lib; otherwise you will face issues when attempting to build the Solr core.
- Create 2 directories for the core:
- /var/solr/data/v5.0_12-FEB-21
- /var/solr/data/v5.0_12-FEB-21/data
- Copy the contents of WebAPI/src/main/resources/solr into /var/solr/data/v5.0_12-FEB-21.
- Next make the following edits to the files in /var/solr/data/v5.0_12-FEB-21:
- data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary.
- core.properties: Edit the name to match the directory (v5.0_12-FEB-21) and remove the "conf\\" prefix from the config and schema values.
- conf/solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block to provide the details needed to connect to the database holding your vocabulary.
- If using Spark, add this to the /dataimport block:
<str name="autoCommit">true</str>
Next, start the Solr service (systemctl start solr) and verify that you can connect to http://localhost:8983. If there are problems starting the service, review the logs found in /var/solr/logs.
- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the v5.0_12-FEB-21 core from the dropdown and then, in the sub-menu that appears, select Dataimport and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
- Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is to enter this in the q field: query:metformin
Then, follow the rest of the WebAPI configuration instructions (settings.xml and the WebAPI/info check) in the section above.
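The directory and copy steps for the Linux install can be sketched as one shell function. This is a minimal illustration, not part of the official instructions: `prepare_core_dir` is a hypothetical helper name, and the paths in the usage comment are the example values from this guide.

```shell
# Hypothetical helper: create the core directory layout under a Solr data
# root and copy the WebAPI Solr configuration files into it.
prepare_core_dir() {
  solr_data_root="$1"   # e.g. /var/solr/data
  core_name="$2"        # e.g. v5.0_12-FEB-21
  config_src="$3"       # e.g. WebAPI/src/main/resources/solr
  # Creates both <root>/<core> and <root>/<core>/data in one call.
  mkdir -p "$solr_data_root/$core_name/data" || return 1
  # "src/." copies the directory contents, not the directory itself.
  cp -R "$config_src/." "$solr_data_root/$core_name/" || return 1
}

# Usage (run with sufficient privileges for /var/solr):
#   prepare_core_dir /var/solr/data v5.0_12-FEB-21 WebAPI/src/main/resources/solr
```

The manual edits to data-config.xml, core.properties and conf/solrconfig.xml are still needed afterwards. If Solr runs under a dedicated service account, the copied files may also need their ownership adjusted so that account can read and write them.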
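Docker setup: download the Docker Desktop app (https://www.docker.com/products/docker-desktop), or on Linux install Docker with your distribution's package manager. On Debian/Ubuntu:
sudo apt-get install docker.io
On RHEL/CentOS:
sudo yum install docker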
- Create a directory to store the Dockerfile and configuration files. This directory is known as the Docker build context folder. Navigate to it in your command line interface.
- Create a file named "Dockerfile" in this build context folder and use this as the content:
FROM solr:8.11.1
# argument variables to define
ARG vocabulary_version
ARG jdbc_file_name
# copy the solr configset from WebAPI
COPY --chown=solr /WebAPI/src/main/resources/solr /var/solr/data/$vocabulary_version
# copy your JDBC file
COPY --chown=solr $jdbc_file_name /opt/solr-8.11.1/server/lib/$jdbc_file_name
- Clone the WebAPI Git repo (using your desired commit or release) into the docker build context folder:
git clone https://github.com/OHDSI/WebAPI.git
- Copy your JDBC jar file (for connecting to your vocabulary's database platform) into the build context folder.
- Next make the following edits to the files in WebAPI/src/main/resources/solr (inside the build context folder):
- data-config.xml: Edit the data source & references in the query to match the database holding your vocabulary.
- core.properties: Edit the name to match the vocabulary version (e.g. v5.0_20-MAY-21) and remove the "conf\\" prefix from the config and schema values.
- conf/solrconfig.xml: Edit the connection information in the <requestHandler name="/dataimport" class="solr.DataImportHandler"> block to provide the details needed to connect to the database holding your vocabulary.
- If using Spark, add this to the /dataimport block:
<str name="autoCommit">true</str>
- Run the container build step, specifying the vocabulary version and the name of the JDBC file needed for connecting to your database platform:
docker build --build-arg vocabulary_version=v5.0_20-MAY-21 --build-arg jdbc_file_name=SparkJDBC41.jar --no-cache -t solr .
- Run the create step, which will then create the container:
docker create --restart=always --name=solr -p 8983:8983 -t solr
- Start the container:
docker start solr
- Test that the container has started by going to http://localhost:8983/solr/#/ (substitute the server name for localhost).
- From the Solr Admin screen (http://localhost:8983/solr/#/), build the core by using the 'Core Selector' dropdown in the left-hand menu. Select the vocabulary core from the dropdown and then, in the sub-menu that appears, select Dataimport and click the 'Execute' button. The service will build the core from the concepts returned by the query in data-config.xml.
- Once the execution of the core indexing is complete, you can use the Solr "Query" tool under the core sub-menu to make sure the core is working properly before moving to WebAPI. A sample query you can use is to enter this in the q field: query:metformin
Then, follow the rest of the WebAPI configuration instructions in the section above.
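For scripted or headless setups, where opening the Admin UI is inconvenient, the Dataimport step can be triggered over HTTP instead. The sketch below only builds the URL. It assumes the /dataimport request handler configured in solrconfig.xml and the DataImportHandler's `full-import` and `status` commands. `dataimport_url` is a hypothetical helper name, and the core name is the example value from this section.

```shell
# Hypothetical helper: build a DataImportHandler URL for a core.
# The third argument is the DIH command and defaults to full-import.
dataimport_url() {
  base_url="$1"                # e.g. http://localhost:8983/solr
  core="$2"                    # e.g. v5.0_20-MAY-21
  dih_command="${3:-full-import}"
  printf '%s/%s/dataimport?command=%s\n' "$base_url" "$core" "$dih_command"
}

# Usage (requires curl and a running Solr container):
#   curl -s "$(dataimport_url http://localhost:8983/solr v5.0_20-MAY-21)"
#   curl -s "$(dataimport_url http://localhost:8983/solr v5.0_20-MAY-21 status)"
```

The first call starts the import and returns immediately. Repeating the status call shows whether indexing is still in progress.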