Local dockerized solr and RSpec configuration #112

Open
wants to merge 23 commits into base: master

Commits
b9acec7
add solr config :/
mksndz Oct 5, 2020
3f51b6c
move around solr conf files
mksndz Oct 7, 2020
953cbb8
revise BL sorl config file to look for dockerized Solr in dev and tes…
mksndz Oct 9, 2020
7834984
fix error in README.md
mksndz Oct 9, 2020
01b283c
install rspec-rails gem. remove obsolete test directory. minor solrco…
mksndz Oct 9, 2020
edf25ca
add files generated by rspec install
mksndz Oct 9, 2020
3a8e8e5
add initial specs. utilize cuprite to drive headless browser. add dra…
mksndz Oct 9, 2020
13fa6b6
remove sprockets manifest file from public/assets. this directory is …
mksndz Oct 9, 2020
c92ba8d
update README with instructions for starting solr in Docker and creat…
mksndz Oct 20, 2020
17d0f72
Merge remote-tracking branch 'origin/master' into spec
mksndz Oct 20, 2020
c163584
add landofile
mksndz Nov 12, 2020
1bce3cb
initial lando-ification
mksndz Nov 12, 2020
7f7d625
update README with better dev setup instructions and info about runni…
mksndz Nov 12, 2020
6c77a52
remove the solr config from project
mksndz Nov 16, 2020
013e301
add rake task to pull solr config from new repo and create collections
mksndz Nov 16, 2020
0090709
update README with local solr config instructions
mksndz Nov 16, 2020
910017e
ignore contents of solr_conf dir. handle error if franklin:solrconfig…
mksndz Nov 19, 2020
31df166
create solr_conf directory in rake task. add enhancement note.
mksndz Nov 20, 2020
5ce0fba
lib-ify shelled out lando commands to clean up franklin dev rake tasks
mksndz Nov 24, 2020
2d0a658
update README. add newline to lando lib file
mksndz Nov 24, 2020
7f72823
ignore entire solr_conf directory
mksndz Nov 24, 2020
73022d9
keep downloaded solr config in tmp
mksndz Dec 8, 2020
1449f40
use BROWSER_PATH to specify chrome executable for cuprite. update REA…
mksndz Dec 8, 2020
15 changes: 15 additions & 0 deletions .lando.yml
@@ -0,0 +1,15 @@
name: franklin
services:
  gibneysolr:
    type: compose
    scanner: false # solr isn't running after lando start
    services:
      user: root
      image: quay.io/upennlibraries/upenn_solr:7.7.0
      ports:
        - '8983:8983'
      environment:
        LANDO_DROP_USER: solr
proxy:
  gibneysolr:
    - franklin.solr.lndo.site:8983
1 change: 1 addition & 0 deletions .rspec
@@ -0,0 +1 @@
--require rails_helper
6 changes: 6 additions & 0 deletions Gemfile
@@ -41,4 +41,10 @@ end

group :development, :test do
  gem 'byebug', platforms: :ruby
  gem 'rspec-rails'
end

group :test do
  gem 'capybara'
  gem 'cuprite'
end
48 changes: 48 additions & 0 deletions Gemfile.lock
@@ -128,6 +128,15 @@ GEM
sprockets (>= 3.6.0)
builder (3.2.4)
byebug (11.0.1)
capybara (3.15.1)
addressable
mini_mime (>= 0.1.3)
nokogiri (~> 1.8)
rack (>= 1.6.0)
rack-test (>= 0.6.3)
regexp_parser (~> 1.2)
xpath (~> 3.2)
cliver (0.3.2)
coffee-rails (4.2.1)
coffee-script (>= 2.2.0)
railties (>= 4.0.0, < 5.2.x)
@@ -139,6 +148,9 @@ GEM
confstruct (1.0.2)
hashie (~> 3.3)
crass (1.0.6)
cuprite (0.11)
capybara (>= 2.1, < 4)
ferrum (~> 0.9.0)
debug_inspector (0.0.3)
deprecation (1.0.0)
activesupport
@@ -150,6 +162,7 @@ GEM
warden (~> 1.2.3)
devise-guests (0.5.0)
devise
diff-lcs (1.4.4)
dot-properties (0.1.3)
dotenv (2.7.6)
dotenv-rails (2.7.6)
@@ -162,6 +175,11 @@ GEM
nokogiri (~> 1.6)
faraday (1.0.1)
multipart-post (>= 1.2, < 3)
ferrum (0.9)
addressable (~> 2.5)
cliver (~> 0.3)
concurrent-ruby (~> 1.1)
websocket-driver (>= 0.6, < 0.8)
ffi (1.13.1)
font-awesome-rails (4.7.0.5)
railties (>= 3.2, < 6.1)
@@ -220,6 +238,7 @@ GEM
multi_xml (0.6.0)
multipart-post (2.1.1)
mysql2 (0.4.10)
nio4r (2.5.2)
nokogiri (1.10.10)
mini_portile2 (~> 2.4.0)
oga (3.3)
@@ -231,6 +250,8 @@ GEM
orm_adapter (0.5.0)
parslet (2.0.0)
public_suffix (4.0.6)
puma (4.3.5)
nio4r (~> 2.0)
rack (1.6.13)
rack-test (0.6.3)
rack (>= 1.0)
@@ -264,11 +285,29 @@ GEM
ffi (~> 1.0)
rdoc (4.3.0)
ref (2.0.0)
regexp_parser (1.7.1)
responders (2.4.1)
actionpack (>= 4.2.0, < 6.0)
railties (>= 4.2.0, < 6.0)
rsolr (1.1.2)
builder (>= 2.1.2)
rspec-core (3.9.2)
rspec-support (~> 3.9.3)
rspec-expectations (3.9.2)
diff-lcs (>= 1.2.0, < 2.0)
rspec-support (~> 3.9.0)
rspec-mocks (3.9.1)
diff-lcs (>= 1.2.0, < 2.0)
rspec-support (~> 3.9.0)
rspec-rails (4.0.1)
actionpack (>= 4.2)
activesupport (>= 4.2)
railties (>= 4.2)
rspec-core (~> 3.9)
rspec-expectations (~> 3.9)
rspec-mocks (~> 3.9)
rspec-support (~> 3.9)
rspec-support (3.9.3)
ruby-ll (2.1.2)
ansi
ast
@@ -333,6 +372,11 @@ GEM
binding_of_caller (>= 0.7.2)
railties (>= 4.0)
sprockets-rails (>= 2.0, < 4.0)
websocket-driver (0.7.3)
websocket-extensions (>= 0.1.0)
websocket-extensions (0.1.5)
xpath (3.2.0)
nokogiri (~> 1.8)
yell (2.2.2)

PLATFORMS
@@ -348,7 +392,9 @@ DEPENDENCIES
blacklight_solrplugins!
browserify-rails
byebug
capybara
coffee-rails (= 4.2.1)
cuprite
devise (~> 4)
devise-guests (= 0.5.0)
dotenv-rails
@@ -361,8 +407,10 @@ DEPENDENCIES
mysql2 (~> 0.4.10)
nokogiri (~> 1.10.0)
oga
puma
rails (~> 4.2.0)
rsolr (= 1.1.2)
rspec-rails
sass-rails (= 5.0.6)
sdoc (= 0.4.2)
sqlite3 (= 1.3.12)
78 changes: 28 additions & 50 deletions README.md
@@ -1,19 +1,13 @@

# Nouveau Franklin

Installation:
## Development Setup

- Checkout this repo.

- Make sure you have ruby 2.3.3 installed. It's recommended that you
use [rbenv](https://github.com/rbenv/rbenv), but it may be
quicker/easier to get running with [rvm](https://rvm.io/).
- Clone this repo
- Install Ruby `2.3.3` ([rbenv](https://github.com/rbenv/rbenv) recommended)
- You may have issues installing Ruby 2.3.3 in recent Linux distributions due to an OpenSSL version incompatibility. See [this guide](https://www.garron.me/en/linux/install-ruby-2-3-3-ubuntu.html) for help.

- Run `bundle install` to install all gem dependencies.

- Run `npm install` to install javascript libraries.

- Edit the `local_dev_env` file and populate the variables with
appropriate values. Then source it in your shell.

@@ -25,52 +19,23 @@ Installation:
also have to run this again whenever you pull code that includes new
migrations (if you forget, Rails will raise an exception when serving
requests because there are unloaded migrations.)

- If there isn't a Solr instance you can use, you'll need to install
Solr and add the
[solrplugins](https://github.com/upenn-libraries/solrplugins)
extensions to it. The following line should be added to the file
`solr-x.x.x/server/contexts/solr-jetty-context.xml` inside the
'Configure' tag:

```
<Set name="extraClasspath">/path/to/solrplugins-0.1-SNAPSHOT.jar</Set>
```

Add the solr core from the
[library-solr-schema](https://gitlab.library.upenn.edu/discovery/library-solr-schema)
repo. You can copy the core's directory into `solr-x.x.x/server/solr`

Load some test marc data into Solr:

```bash
bundle exec rake solr:marc:index_test_data
```

This pulls 30 sample records from
[the Blacklight-Data repository](https://github.com/projectblacklight/blacklight-data).

If the test data is successfully indexed, you should see output
something like:

```
2016-03-03T12:29:40-05:00 INFO Traject::SolrJsonWriter writing to 'http://127.0.0.1:8983/solr/blacklight-core/update/json' in batches of 100 with 1 bg threads
2016-03-03T12:29:40-05:00 INFO Indexer with 1 processing threads, reader: Traject::MarcReader and writer: Traject::SolrJsonWriter
2016-03-03T12:29:41-05:00 INFO Traject::SolrJsonWriter sending commit to solr at url http://127.0.0.1:8983/solr/blacklight-core/update/json...
2016-03-03T12:29:41-05:00 INFO finished Indexer#process: 30 records in 0.471 seconds; 63.8 records/second overall.
```

- Configure Solr
  - You can get the production Solr URL and use that, assuming you're on the Penn VPN
  - Otherwise, you can run Penn's custom Solr locally using Lando
    - `rake franklin:start` to pull and start the Solr container, load the `stable` Solr config from the [Franklin Solr Config repo](https://gitlab.library.upenn.edu/franklin/franklin-solr-config), and create dev and test collections
    - `rake franklin:solrconfig` can read the `SOLR_CONFIG_BRANCH` env var to pull the config from a different branch and initialize new collections
    - `rake franklin:stop` when you're done working
    - `rake franklin:clean` when things get weird and you need to start fresh
- Start the rails server:

```bash
bundle exec rails s
```

- Open up [localhost:3000](http://localhost:3000) in a browser. If
everything went well, you should see the generic Blacklight homepage
and have 30 faceted records to search.
everything went well, you should see the Franklin homepage.
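
The branch selection described for `franklin:solrconfig` can be sketched as below. This is a minimal illustration, not the rake task from this PR: the helper names `solr_config_branch` and `solr_setup_plan` are hypothetical, and the collection names are assumed from the `franklin-dev` and `franklin-test` URLs in `config/blacklight.yml`.

```ruby
# Hypothetical constants: `stable` is the default branch named in the README;
# the collection names are assumed from config/blacklight.yml.
DEFAULT_SOLR_CONFIG_BRANCH = 'stable'.freeze
SOLR_COLLECTIONS = %w[franklin-dev franklin-test].freeze

# Resolve which config branch to pull, falling back to the default
# when SOLR_CONFIG_BRANCH is unset or blank.
def solr_config_branch(env = ENV)
  branch = env['SOLR_CONFIG_BRANCH'].to_s.strip
  branch.empty? ? DEFAULT_SOLR_CONFIG_BRANCH : branch
end

# Describe the steps a setup task would take, without performing them.
def solr_setup_plan(env = ENV)
  branch = solr_config_branch(env)
  SOLR_COLLECTIONS.map do |name|
    "create collection #{name} from config branch #{branch}"
  end
end

puts solr_setup_plan('SOLR_CONFIG_BRANCH' => 'my-feature')
```

Keeping the branch lookup in a small pure function makes the fallback behavior easy to check without starting Lando.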

# Solr Indexing
## Solr Indexing

This repository also contains Traject code for indexing MARC records
into Solr. It isn't separate because we want to consolidate the MARC
@@ -84,7 +49,7 @@ The commands in this section can be run directly, or in an application
container. See the `run_in_container.sh` wrapper script in the ansible
repository.

## Full exports
### Full exports

Transfer the *.tar.gz files created by the Alma publishing job to the
directory where they will be preprocessed and indexed. Run these commands:
@@ -95,7 +60,7 @@ directory where they will be preprocessed and indexed. Run these commands:
./index_solr.sh /var/solr_input_data/alma_prod_sandbox/20170412_full/processed
```

## Incremental updates (OAI)
### Incremental updates (OAI)

This runs via a cron job, which fetches the updates available via OAI
since the last time the job was run.
@@ -113,7 +78,7 @@ you can do so like this:
./index_and_deletions.sh /var/solr_input_data/alma_prod_sandbox/oai/allTitles/2017_04_10_00_00 allTitles
```

# Docker
## Building Docker Images

There is a `build_docker_image.sh` script you can use to build docker
images from specific branches that have been freshly pulled from
Expand All @@ -131,6 +96,19 @@ See the
repository for Ansible scripts that build Docker images and deploy containers
to the test and production environments.

# Running Tests

Feature specs require a locally installed copy of Chrome.

The usual environment variables also need to be set, for now.

- Download Chrome from `https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Linux_x64/737173/`
- Extract it to `PATH_OF_YOUR_CHOOSING`
- Load the required environment variables (e.g., `source local_dev_env`)
- Set the `BROWSER_PATH` environment variable to the Chrome path: `BROWSER_PATH=PATH_OF_YOUR_CHOOSING`
- Start the dockerized UPenn Solr: `rake franklin:start`
- Run suite: `RAILS_ENV=test rspec`
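
The `BROWSER_PATH` step above could be consumed by the spec setup roughly as follows. This is a sketch under assumptions: the helper `cuprite_options` and the window size are hypothetical, and the actual `rails_helper.rb` in this PR is not shown here. The `browser_path`, `headless`, and `window_size` keys are options accepted by Cuprite's driver.

```ruby
# Hypothetical helper: build the options hash handed to the Cuprite driver.
# BROWSER_PATH is only passed through when it is actually set.
def cuprite_options(env = ENV)
  options = { headless: true, window_size: [1200, 800] }
  browser_path = env['BROWSER_PATH'].to_s.strip
  options[:browser_path] = browser_path unless browser_path.empty?
  options
end

# In spec/rails_helper.rb (needs the capybara and cuprite gems) the hash
# might be wired up like this:
#
#   require 'capybara/cuprite'
#   Capybara.register_driver(:cuprite) do |app|
#     Capybara::Cuprite::Driver.new(app, cuprite_options)
#   end
#   Capybara.javascript_driver = :cuprite

p cuprite_options('BROWSER_PATH' => '/opt/chrome-linux/chrome')
```

Leaving `browser_path` out when the variable is unset lets the driver fall back to its own Chrome lookup.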

# Auditing Secrets

You can use [Gitleaks](https://github.com/upenn-libraries/gitleaks) to check the repository for unencrypted secrets that have been committed.
16 changes: 2 additions & 14 deletions config/blacklight.yml
@@ -1,21 +1,9 @@
# = jetty_path key
# each environment can have a jetty_path with absolute or relative
# (to app root) path to a jetty/solr install. This is used
# by the rake tasks that start up solr automatically for testing
# and by rake solr:marc:index.
#
# jetty_path is not used by a running Blacklight application
# at all. In general you do NOT need to deploy solr in Jetty, you can deploy it
# however you want.
# jetty_path is only required for rake tasks that need to know
# how to start up solr, generally for automated testing.

development:
  adapter: solr
  url: <%= ENV['SOLR_URL'] || "http://127.0.0.1:8983/solr/blacklight-core" %>
  url: <%= ENV['SOLR_URL'] || "http://franklin.solr.lndo.site:8983/solr/franklin-dev" %>
test: &test
  adapter: solr
  url: <%= "http://127.0.0.1:#{ENV['TEST_JETTY_PORT'] || 8888}/solr/blacklight-core" %>
  url: <%= ENV['TEST_SOLR_URL'] || "http://franklin.solr.lndo.site:8983/solr/franklin-test" %>
production:
  adapter: solr
  url: <%= ENV['SOLR_URL'] || "http://127.0.0.1:8983/solr/blacklight-core" %>
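
To illustrate how the new URLs resolve, the sketch below renders a copy of the updated `development` and `test` entries through ERB and YAML, roughly as a Rails app would when loading the file. The `solr_url_for` helper and the inline template are assumptions for illustration only; Blacklight's own loading code is not reproduced here.

```ruby
require 'erb'
require 'yaml'

# Copy of the new development/test entries from config/blacklight.yml.
BLACKLIGHT_TEMPLATE = <<~CONFIG
  development:
    adapter: solr
    url: <%= ENV['SOLR_URL'] || "http://franklin.solr.lndo.site:8983/solr/franklin-dev" %>
  test:
    adapter: solr
    url: <%= ENV['TEST_SOLR_URL'] || "http://franklin.solr.lndo.site:8983/solr/franklin-test" %>
CONFIG

# Hypothetical helper: evaluate the ERB, parse the YAML, and return the
# Solr URL for the given environment name.
def solr_url_for(environment, template = BLACKLIGHT_TEMPLATE)
  rendered = ERB.new(template).result
  YAML.safe_load(rendered).fetch(environment).fetch('url')
end

puts solr_url_for('development')
puts solr_url_for('test')
```

With neither variable set, both environments point at the Lando proxy host; setting `SOLR_URL` or `TEST_SOLR_URL` overrides the corresponding entry.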
66 changes: 66 additions & 0 deletions lib/penn_lib/lando.rb
@@ -0,0 +1,66 @@
module PennLib
  # helper methods for interacting with Lando for development
  module Lando
    SOLR_CONTAINER_NAME = 'gibneysolr'.freeze
    SOLR_CONTAINER_USER = 'solr'.freeze
    class << self
      # Run a command in a Lando container
      # @param [String] command
      # @param [Hash] options
      def run(command, options = {})
        cmd = "lando ssh #{options[:container]}"
        cmd += " -u #{options[:user]}" if options[:user]
        cmd += " '-c #{command}'"
        system cmd
      end

      # Run a command in the Solr container as solr user
      # @param [String] command
      def solr_run(command)
        run command, solr_run_opts
      end

      # Start GibneySolr
      # @return [TrueClass, FalseClass, nil]
      def start_solr
        solr_run '/opt/solr/bin/solr start -c -m 2g -p 8983 -Dsolr.jetty.request.header.size=65536'
      end

      # Create a Solr collection
      # @param [String] name
      # @param [String] config
      # @return [TrueClass, FalseClass, nil]
      def create_collection(name, config)
        solr_run "/opt/solr/bin/solr create_collection -c #{name} -d #{config}"
      end

      # Delete an existing collection
      # @param [String] name
      # @return [TrueClass, FalseClass, nil]
      def delete_collection(name)
        solr_run "/opt/solr/bin/solr delete -c #{name}"
      end

      # Copy Solr config from Lando mount to Solr configset dir
      # @param [String] name
      # @return [TrueClass, FalseClass, nil]
      def copy_config(name)
        solr_run "cp -r /app/tmp/solr_conf/#{name} /opt/solr/server/solr/configsets/#{name}"
      end

      # Check if collections exist in solr container
      # @return [TrueClass, FalseClass]
      def collections_exist?
        output, _status = Open3.capture2e("lando ssh #{SOLR_CONTAINER_NAME} -u #{SOLR_CONTAINER_USER} -c '/opt/solr/bin/solr status'")
        output.include? '"collections":"2"'
      end

      private

      # @return [Hash{Symbol->String}]
      def solr_run_opts
        { container: SOLR_CONTAINER_NAME, user: SOLR_CONTAINER_USER }
      end
    end
  end
end
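
As a rough illustration of what `PennLib::Lando.run` sends to the shell, the sketch below rebuilds the same command string as a pure function, along with the substring check used by `collections_exist?`. The names `lando_ssh_command` and `two_collections?` are hypothetical and not part of this PR; the sample status text is an invented fragment, not real Solr output.

```ruby
# Hypothetical mirror of the string assembly in PennLib::Lando.run,
# returning the command instead of executing it with `system`.
def lando_ssh_command(command, container:, user: nil)
  cmd = "lando ssh #{container}"
  cmd += " -u #{user}" if user
  cmd + " '-c #{command}'"
end

# Hypothetical mirror of the check in collections_exist?: true only when
# the status text reports exactly two collections.
def two_collections?(status_output)
  status_output.include?('"collections":"2"')
end

puts lando_ssh_command('/opt/solr/bin/solr status',
                       container: 'gibneysolr', user: 'solr')
# prints: lando ssh gibneysolr -u solr '-c /opt/solr/bin/solr status'

puts two_collections?('{"liveNodes":"1","collections":"2"}')
# prints: true
```

Separating string assembly from execution makes the quoting visible and allows it to be checked in a spec without a running Lando environment.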