Developer Guide

So you want to contribute code to OpenSearch k-NN? Excellent! We're glad you're here. Here's what you need to do.

Getting Started

Fork OpenSearch k-NN Repo

Fork opensearch-project/OpenSearch k-NN and clone locally.

Example:

git clone https://github.com/[your username]/k-NN.git

Install Prerequisites

JDK 14

OpenSearch builds using Java 14 at a minimum. You must have JDK 14 installed, with the JAVA_HOME environment variable pointing to the Java home of that installation, e.g. JAVA_HOME=/usr/lib/jvm/jdk-14.

One easy way to get Java 14 on *nix is to use sdkman.

curl -s "https://get.sdkman.io" | bash
source ~/.sdkman/bin/sdkman-init.sh
sdk install java 14.0.2-open
sdk use java 14.0.2-open
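Before building, it can help to confirm that the JDK the shell resolves meets the Java 14 minimum. The following is a minimal sketch, not part of the project's tooling: the helper names java_major and is_jdk14_or_newer are hypothetical, and the parsing assumes the usual `version "x.y.z"` line printed by java -version.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Handles both the modern scheme ("14.0.2") and the legacy one ("1.8.0_292").
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # legacy scheme: drop the leading "1."
  esac
  printf '%s\n' "${v%%[!0-9]*}"  # keep only the leading run of digits
}

# Hypothetical helper: succeed if the given version string is 14 or newer.
is_jdk14_or_newer() {
  [ "$(java_major "$1")" -ge 14 ]
}

# Example usage (left commented out because it needs an installed JDK):
#   version=$("$JAVA_HOME/bin/java" -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
#   is_jdk14_or_newer "$version" || echo "JDK 14 or newer is required, found: $version"
```

Checking through $JAVA_HOME rather than the java on PATH mirrors what the guide asks for, since JAVA_HOME is what must reference the JDK 14 installation.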

Use an Editor

IntelliJ IDEA

When importing into IntelliJ you will need to define an appropriate JDK. The convention is that this SDK should be named "14", and the project import will detect it automatically. For more details on defining an SDK in IntelliJ please refer to this documentation. Note that SDK definitions are global, so you can add the JDK from any project, or after project import. Importing with a missing JDK will still work, but IntelliJ will report a problem and refuse to build until it is resolved.

You can import the OpenSearch k-NN project into IntelliJ IDEA as follows.

  1. Select File > Open
  2. In the subsequent dialog navigate to the root build.gradle file
  3. In the subsequent dialog select Open as Project

Build

OpenSearch k-NN uses a Gradle wrapper for its build. Run gradlew on Unix systems, or gradlew.bat on Windows in the root of the repository.

Build OpenSearch k-NN using gradlew build

./gradlew build

JNI Library

The plugin relies on a JNI library to perform approximate k-NN search. For plugin installations from an archive (.zip), the .so file (Linux) or .jnilib file (Mac OS) must be present in the Java library path. To achieve this, either copy the .so/.jnilib file to $ES_HOME or manually add -Djava.library.path=<path_to_lib_files> to the jvm.options file.
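As a rough illustration of the second option, the sketch below checks that a directory actually contains a native library and prints the line that would go into jvm.options. The helper names has_jni_lib and jvm_library_option are hypothetical, and the directory in the usage comment is a placeholder, not a path the plugin defines.

```shell
# Hypothetical helper: succeed if DIR holds at least one .so (Linux)
# or .jnilib (Mac OS) file.
has_jni_lib() {
  dir="$1"
  for f in "$dir"/*.so "$dir"/*.jnilib; do
    [ -f "$f" ] && return 0
  done
  return 1
}

# Hypothetical helper: print the jvm.options line for DIR.
jvm_library_option() {
  printf '%s%s\n' '-Djava.library.path=' "$1"
}

# Example usage (the path is a placeholder; substitute your own):
#   lib_dir=/path/to/lib/files
#   has_jni_lib "$lib_dir" && jvm_library_option "$lib_dir"
```

Verifying the directory first avoids pointing java.library.path at a location that does not contain the library.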

To build the JNI Library, follow these steps:

cd jni
cmake .
make

The library will be placed in the jni/release directory.

To build an RPM or DEB of the JNI library, follow these steps:

cd jni
cmake .
make package

The artifacts will be placed in the jni/packages directory.

JNI Library Artifacts

We build and distribute binary library artifacts with OpenSearch. The library binary, RPM and DEB are built in this GitHub action, using CentOS 7 with g++ 4.8.5 to build the DEB, RPM and ZIP. To provide as much general compatibility as possible, we compile the library without optimized instruction sets enabled. Users who want to get the most out of the library should follow this section and build the library from source in their production environment, so that the build can take advantage of any optimized instruction sets the environment supports.

Run OpenSearch k-NN

Run Single-node Cluster Locally

Run OpenSearch k-NN using gradlew run.

./gradlew run

This builds OpenSearch and starts it, writing its log above Gradle's status message. A lot is logged on startup; the following lines indicate that the plugin is ready.

[2020-05-29T14:50:35,167][INFO ][o.e.h.AbstractHttpServerTransport] [runTask-0] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2020-05-29T14:50:35,169][INFO ][o.e.n.Node               ] [runTask-0] started

It's typically easier to wait until the console stops scrolling, and then run curl in another window to check whether the OpenSearch instance is running.

curl localhost:9200

{
  "name" : "runTask-0",
  "cluster_name" : "runTask",
  "cluster_uuid" : "oX_S6cxGSgOr_mNnUxO6yQ",
  "version" : {
    "number" : "1.0.0-SNAPSHOT",
    "build_type" : "tar",
    "build_hash" : "0ba0e7cc26060f964fcbf6ee45bae53b3a9941d0",
    "build_date" : "2021-04-16T19:45:44.248303Z",
    "build_snapshot" : true,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  }
}
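Instead of watching the console, the check above can be retried until the node answers. This is a minimal sketch: wait_until is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or the
# attempt budget runs out.
# usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0   # command succeeded, stop retrying
    i=$((i + 1))
    sleep "$delay"
  done
  return 1                             # budget exhausted
}

# Example usage (needs the cluster started with ./gradlew run):
#   wait_until 30 2 curl -sf localhost:9200 && echo "OpenSearch is up"
```

The probe is passed in as a command so the same loop can wait on any other readiness check.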

Run Multi-node Cluster Locally

It can be useful to test and debug on a multi-node cluster. To launch a 3-node cluster with the k-NN plugin installed, run the following command:

./gradlew run -PnumNodes=3
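To confirm that all three nodes joined the cluster, one option is to count the rows returned by the _cat/nodes API, which prints one line per node. The sketch below assumes the cluster is reachable on localhost:9200, as in the single-node example; count_nodes is a hypothetical helper.

```shell
# Hypothetical helper: count the non-empty lines on stdin, i.e. the
# number of nodes when fed the output of the _cat/nodes API.
count_nodes() {
  grep -c . || true   # grep exits non-zero on a count of 0; still print it
}

# Example usage (needs the multi-node cluster to be running):
#   curl -s localhost:9200/_cat/nodes | count_nodes
```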

To run the integration tests with a 3-node cluster, run this command:

./gradlew :integTest -PnumNodes=3

Debugging

Sometimes it is useful to attach a debugger to either the OpenSearch cluster or the integration test runner to see what's going on. To debug unit tests, click Debug in the IDE's gutter. To debug the OpenSearch cluster, first make sure that a debugger is listening on port 5005. Then run:

./gradlew :integTest -Dcluster.debug=1 # to start a cluster with debugger and run integ tests

OR

./gradlew run --debug-jvm # to just start a cluster that can be debugged

The OpenSearch server JVM will connect to a debugger attached to localhost:5005 before starting. If there are multiple nodes, the servers will connect to debuggers listening on ports 5005, 5006, ...
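For a multi-node cluster, the port each node expects can be listed up front so that a listener is ready for every node. This sketch assumes nodes are numbered from 0 and that ports are assigned consecutively starting at 5005, as the sequence above suggests; debug_port is a hypothetical helper.

```shell
# Hypothetical helper: debugger port for a node, assuming node 0 uses
# 5005 and each further node uses the next port.
debug_port() {
  echo $((5005 + $1))
}

# List the ports for a 3-node cluster.
for node in 0 1 2; do
  echo "node $node -> debugger listening on port $(debug_port "$node")"
done

# One way to open a listener without an IDE is the JDK's jdb tool,
# for example: jdb -listen 5005
```

Because the server JVM connects out to the debugger, each listener has to be started before the Gradle command is run.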

To debug code running in an integration test (which exercises the server from a separate JVM), first set up a remote debugger listening on port 8000, and then run:

./gradlew :integTest -Dtest.debug=1

The test runner JVM will connect to a debugger attached to localhost:8000 before running the tests.

Additionally, it is possible to attach one debugger to the cluster JVM and another debugger to the test runner. First, make sure one debugger is listening on port 5005 and the other is listening on port 8000. Then, run:

./gradlew :integTest -Dtest.debug=1 -Dcluster.debug=1

Submitting Changes

See CONTRIBUTING.