Neogrok is a frontend for zoekt, a fast and scalable code search engine. Neogrok exposes zoekt's search APIs in the form of a modern, snappy UI. Neogrok is a SvelteKit application running on Node.js and in the browser.
This repository contains some features not present in the fantastic original neogrok repository:
- Secured access to the search interface via Keycloak/OIDC
- Repository access control that limits search results to the git repositories the logged-in user has access to. This requires a custom version of zoekt, which uses the userId passed by neogrok to select the user-accessible shards.
- The ability to change the title to something other than 'neogrok'
Together, neogrok and zoekt provide:
- Fast, live code search with a syntax based on regular expressions.
- Easy deployments. A single deployment of zoekt can performantly index and serve thousands of source repositories, using one of the many available indexers like `zoekt-git-index` to produce binary index files (called "shards") that are served by `zoekt-webserver`. Neogrok is just a veneer on top of `zoekt-webserver`; the only necessary configuration for neogrok is the URL of a running `zoekt-webserver`.
- Low resource utilization. The demo (which just indexes the neogrok and zoekt repos) happily runs on the smallest instances Fly can provision. Indexing the Linux kernel produces about 2.7 GiB of index shards, and serving those shards uses just under 1 GiB of RAM.
Building from source is easy. Clone the repository, then run `yarn install && yarn run build && yarn run start`. You can of course run the server without intermediation by `yarn`, by doing whatever `yarn run start` does directly; but the relevant commands may change in the future, whereas `yarn run start` will not.
The demo deployment is configured in this repository. This configuration can serve as a guide for your own deployments of neogrok together with zoekt.
Neogrok may be configured using a JSON configuration file, or, where possible, environment variables. Configuration options defined in the environment take precedence over those defined in the configuration file.
The configuration file is read from `/etc/neogrok/config.json` by default, but the location may be customized using the environment variable `NEOGROK_CONFIG_FILE`. The file is entirely optional, as all of the required configuration may be defined in the environment. If it is present, the file's contents must be a JSON object with zero or more entries, whose keys correspond to the option names tabulated below.
Option name | Environment variable name | Required Y/N | Description |
---|---|---|---|
`zoektUrl` | `ZOEKT_URL` | Y | The base zoekt URL at which neogrok will make API requests, at e.g. `/api/search` and `/api/list` |
`neogrokTitle` | `NEOGROK_TITLE` | N | The title to be displayed on the front page, defaults to `NEOGROK` |
`openGrokProjectMappings` | N/A | N | An object mapping OpenGrok project names to zoekt repository names; see below |
Note that you can also configure, among other things, which ports/addresses will be bound, using SvelteKit's Node environment variables. See the list here.
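The precedence rules above can be illustrated with a minimal sketch. This is not neogrok's actual loader: the function name `resolveConfig`, the type names, and the example URLs are hypothetical, and the sketch assumes the configuration file has already been read and parsed into an object.

```typescript
// Hypothetical sketch of option resolution; neogrok's real loader may differ.
type NeogrokConfig = {
  zoektUrl: string;
  neogrokTitle: string;
  openGrokProjectMappings: Record<string, string>;
};

type ConfigEnv = Record<string, string | undefined>;

// Merge an already-parsed config file object with environment variables.
// Environment values take precedence over file values.
function resolveConfig(
  fileConfig: Partial<NeogrokConfig>,
  env: ConfigEnv,
): NeogrokConfig {
  const zoektUrl = env["ZOEKT_URL"] ?? fileConfig.zoektUrl;
  if (zoektUrl === undefined) {
    throw new Error(
      "zoektUrl is required: set ZOEKT_URL or add it to the config file",
    );
  }
  return {
    zoektUrl,
    // NEOGROK is the documented default title.
    neogrokTitle: env["NEOGROK_TITLE"] ?? fileConfig.neogrokTitle ?? "NEOGROK",
    // This option has no environment variable, so only the file can set it.
    openGrokProjectMappings: fileConfig.openGrokProjectMappings ?? {},
  };
}

// Example: the environment overrides the file's zoektUrl but not its title.
const resolved = resolveConfig(
  { zoektUrl: "http://localhost:6070", neogrokTitle: "Code Search" },
  { ZOEKT_URL: "http://zoekt.internal:6070" },
);
```

Here `resolved.zoektUrl` comes from the environment, while `resolved.neogrokTitle` falls back to the value from the file.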
This version of neogrok requires authentication in order to access the search interface.
So far, only Keycloak is supported, using OpenID Connect as the protocol. The underlying library, Auth.js, supports many other providers.
Authentication and authorization are configured through these environment variables:
Environment variable name | Required Y/N | Description |
---|---|---|
`AUTH_KEYCLOAK_ISSUER` | Y | The URL of your Keycloak issuer endpoint. E.g. `https://your.keycloak.com/realms/master` |
`AUTH_KEYCLOAK_REFRESH` | N | The URL of your refresh token endpoint. Defaults to `$AUTH_KEYCLOAK_ISSUER/protocol/openid-connect/token` |
`AUTH_KEYCLOAK_ID` | Y | The clientId configured in your Keycloak instance for this neogrok service |
`AUTH_KEYCLOAK_SECRET` | Y | The client secret in your Keycloak instance for this neogrok service |
`AUTH_KEYCLOAK_USER_ID_ATTRIBUTE` | N | The attribute in the profile claim holding the user name to be passed to zoekt for access control. Defaults to `preferred_username` |
`AUTH_KEYCLOAK_GROUPS_ATTRIBUTE` | N | The attribute in the access token holding the group memberships of the user. Only needed with `AUTH_KEYCLOAK_REQUIRED_GROUP`. Defaults to `groups` |
`AUTH_KEYCLOAK_REQUIRED_GROUP` | N | Optionally, the name of a group the user must be a member of, before access is granted. Used with `AUTH_KEYCLOAK_GROUPS_ATTRIBUTE` |
Note 1: The above only configures access to the neogrok web interface. Access control over the indexed repositories is implemented in this custom version of zoekt, which evaluates the userId passed from neogrok against an access control list, so that only search results from the user-accessible repositories are returned.
Note 2: In order to use authorization based on group membership, you may need to configure the client scope in your Keycloak instance to provide the user's group memberships as an attribute in the access token.
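As a rough illustration of how the defaults and the group requirement described above fit together, consider the following sketch. The helper names `refreshUrl` and `isGroupAllowed` are hypothetical and are not part of neogrok or Auth.js; the sketch also assumes the access token has already been verified and decoded into a plain claims object, with group memberships stored as an array of strings.

```typescript
// Hypothetical helpers; neogrok's real authentication code may differ.
type AuthEnv = Record<string, string | undefined>;

// Refresh endpoint: an explicit override, else derived from the issuer URL.
function refreshUrl(env: AuthEnv): string {
  const issuer = env["AUTH_KEYCLOAK_ISSUER"];
  if (!issuer) {
    throw new Error("AUTH_KEYCLOAK_ISSUER is required");
  }
  return (
    env["AUTH_KEYCLOAK_REFRESH"] ?? `${issuer}/protocol/openid-connect/token`
  );
}

// Group gate, evaluated against already-decoded access token claims.
function isGroupAllowed(
  claims: Record<string, unknown>,
  env: AuthEnv,
): boolean {
  const required = env["AUTH_KEYCLOAK_REQUIRED_GROUP"];
  if (!required) {
    return true; // no group restriction configured
  }
  const attribute = env["AUTH_KEYCLOAK_GROUPS_ATTRIBUTE"] ?? "groups";
  const groups = claims[attribute];
  // Assumption: memberships arrive as an array of group names.
  return Array.isArray(groups) && groups.some((g) => g === required);
}

// Example: a user in "developers" passes a "developers" requirement.
const allowed = isGroupAllowed(
  { groups: ["developers", "ops"] },
  { AUTH_KEYCLOAK_REQUIRED_GROUP: "developers" },
);
```

If the groups attribute is missing from the token, as can happen before the client scope from Note 2 is configured, this sketch denies access rather than granting it.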
Neogrok exports some basic Prometheus metrics on an opt-in basis, by setting a `PROMETHEUS_PORT` or `PROMETHEUS_SOCKET_PATH`, plus an optional `PROMETHEUS_HOST`. These variables have the exact same semantics as the above-described SvelteKit environment variables, but the port/socket must be different than those of the main application. When opting in with these variables, `/metrics` will be served at the location they describe.
`/metrics` is required to be served with a different port/socket so as to not expose it on the main site; serving one port to end users and another to the prometheus scraper is the easiest way to ensure proper segmentation of the neogrok site from internal infrastructure concerns, without having to run a particularly configured HTTP reverse proxy in front of neogrok.
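The opt-in and separation rules for the metrics endpoint could be checked along the following lines. This is only a sketch under stated assumptions, not neogrok's implementation: the function `metricsTarget` and its types are hypothetical, and it assumes the main application follows SvelteKit's Node adapter conventions (`PORT` defaulting to 3000, `HOST` defaulting to `0.0.0.0`, and `SOCKET_PATH` taking precedence over both when set).

```typescript
// Hypothetical sketch; neogrok's own handling of these variables may differ.
type MetricsEnv = Record<string, string | undefined>;

type ListenTarget =
  | { kind: "socket"; path: string }
  | { kind: "tcp"; host: string; port: number };

function parsePort(raw: string, name: string): number {
  const port = Number(raw);
  // NaN fails the modulo check, so non-numeric input is rejected too.
  if (port % 1 !== 0 || port < 0 || port > 65535) {
    throw new Error(`${name} must be an integer from 0 to 65535, got "${raw}"`);
  }
  return port;
}

// Returns where /metrics should listen, or undefined when not opted in.
function metricsTarget(env: MetricsEnv): ListenTarget | undefined {
  const socketPath = env["PROMETHEUS_SOCKET_PATH"];
  if (socketPath) {
    if (socketPath === env["SOCKET_PATH"]) {
      throw new Error("PROMETHEUS_SOCKET_PATH must differ from SOCKET_PATH");
    }
    return { kind: "socket", path: socketPath };
  }
  const rawPort = env["PROMETHEUS_PORT"];
  if (!rawPort) {
    return undefined; // metrics were not opted into
  }
  const port = parsePort(rawPort, "PROMETHEUS_PORT");
  // Assumption: the main app listens on PORT unless SOCKET_PATH is set.
  if (!env["SOCKET_PATH"]) {
    const mainPort = parsePort(env["PORT"] ?? "3000", "PORT");
    if (port === mainPort) {
      throw new Error("PROMETHEUS_PORT must differ from the main port");
    }
  }
  // Assumption: PROMETHEUS_HOST defaults like SvelteKit's HOST does.
  return { kind: "tcp", host: env["PROMETHEUS_HOST"] ?? "0.0.0.0", port };
}
```

With `PROMETHEUS_PORT=9100` and no other variables set, this sketch yields a TCP target on port 9100, while `PROMETHEUS_PORT=3000` is rejected because it collides with the assumed default main port.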
As an added bonus, neogrok can serve as a replacement for existing deployments of OpenGrok, a much older, more intricate, slower, and generally jankier code search engine than zoekt. Neogrok strives to provide URL compatibility with OpenGrok by redirecting OpenGrok URLs to their neogrok equivalents: simply deploy neogrok at the same origin previously hosting your OpenGrok instance, and everything will Just Work™. To the best of our ability, OpenGrok Lucene queries will be rewritten to the closest equivalent zoekt queries. (Perfect compatibility is not possible because the feature sets of the two search engines do not map one-to-one.)
If your OpenGrok project names are not identical to their equivalent zoekt repository names, you can run neogrok with the appropriate `openGrokProjectMappings` configuration, which maps OpenGrok project names to zoekt repository names. With this data provided, neogrok can rewrite OpenGrok queries that include project names appropriately.
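To make the shape of that mapping concrete, here is a small sketch. The project and repository names are invented examples, and `toZoektRepo` is a hypothetical helper showing one plausible lookup rule (unmapped project names pass through unchanged); neogrok's actual query rewriting is more involved.

```typescript
// Hypothetical mapping: keys are OpenGrok project names, values are zoekt
// repository names. All names here are invented for illustration.
const exampleMappings: Record<string, string> = {
  "linux-kernel": "github.com/example/linux",
  "search-ui": "github.com/example/search-ui",
};

// Look up a project; fall back to the project name itself when unmapped.
function toZoektRepo(
  project: string,
  mappings: Record<string, string>,
): string {
  // Own-property check so names like "constructor" are not mis-resolved.
  const mapped = Object.prototype.hasOwnProperty.call(mappings, project)
    ? mappings[project]
    : undefined;
  return mapped ?? project;
}

const repos = ["linux-kernel", "docs"].map((p) =>
  toZoektRepo(p, exampleMappings),
);
// repos is ["github.com/example/linux", "docs"]
```

An object of the same shape as `exampleMappings` would be the value of the `openGrokProjectMappings` key in the JSON configuration file, since this option has no environment variable.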