Merge branch 'development' into 'master'
Merge development into master; RDF generation changes

See merge request sdbm/sdbmss!39
emeryr-upenn committed Sep 1, 2022
2 parents d8698c2 + 0e655be commit ce96c5f
Showing 39 changed files with 862 additions and 311 deletions.
8 changes: 6 additions & 2 deletions .gitignore
@@ -4,8 +4,8 @@
# or operating system, you probably want to add a global ignore instead:
# git config --global core.excludesfile '~/.gitignore_global'

# ignore nginx server config
/nginx/nginx.conf
# ignore nginx configuration
/nginx/

# Ignore bundler config.
/.bundle
@@ -38,6 +38,9 @@
.ruby-version

VERSION
# don't track .env files
.env
.lando.env
.env*
*.key
*.crt
@@ -46,3 +49,4 @@ rdf_data
*.tgz
sdbm_data
*.bak
test.ttl*
2 changes: 1 addition & 1 deletion .gitlab-ci.yml
@@ -240,4 +240,4 @@ remove_nginx_image:
rules:
- when: always
tags:
- build
- build
2 changes: 1 addition & 1 deletion Gemfile.lock
@@ -309,7 +309,7 @@ GEM
rspec-mocks (~> 3.1.0)
rspec-support (~> 3.1.0)
rspec-support (3.1.2)
rubyzip (1.2.2)
rubyzip (1.3.0)
sanitize (5.0.0)
crass (~> 1.0.2)
nokogiri (>= 1.8.0)
179 changes: 179 additions & 0 deletions README-development.md
@@ -0,0 +1,179 @@
Installation
=====

**1. Clone the repository**

DEPRECATED -- USING ANSIBLE

git clone https://github.com/upenn-libraries/sdbmss.git

**2. Create a docker environment file `.env` in the root folder of the cloned repository. Define the following environment variables:**

This is the password for the public access point of the Jena server, used both for making updates and for downloading the RDF data file

ADMIN_PASSWORD=<jena_admin_password>

Login for private staging server deployment, if used

RAILS_USER=<rails_staging_username>
RAILS_PASS=<rails_staging_password>

MYSQL database setup. MYSQL_HOST is the name of the docker service defined in docker-compose.yml - everything else is up to you

MYSQL_HOST=db
MYSQL_DATABASE=<mysql_database_name>
MYSQL_ROOT_PASSWORD=<mysql_root_password>
MYSQL_USER=<mysql_username>
MYSQL_PASSWORD=<mysql_password>

RabbitMQ user and password; used for the RDF live-update messaging system. Define them here, then set the same values when instantiating the RabbitMQ service **(done later)**

RABBIT_USER=<rabbitmq_username>
RABBIT_PASSWORD=<rabbitmq_password>

Rails email setup. Depends on where the application is running and what mailing service is available

SDBMSS_APP_HOST=<email_app_host>
SDBMSS_SMTP_HOST=<email_smtp_host>
SDBMSS_EMAIL_FROM=<email_sender>
SDBMSS_NOTIFY_EMAIL=<notification_email_sender>
SDBMSS_NOTIFY_EMAIL_PASSWORD=<notification_email_password>
# email address
SDBMSS_EMAIL_EXCEPTIONS_TO=<send_exceptions_to_email>

Generate a separate key for each of the following by running the Rails `secret` task (for example, `bundle exec rake secret`).

SDBMSS_BLACKLIGHT_SECRET_KEY=<KEY>
SDBMSS_DEVISE_SECRET_KEY=<KEY>
SDBMSS_SECRET_KEY_BASE=<KEY>
SDBMSS_SECRET_TOKEN=<KEY>

URL for SOLR server, using relative location of docker service (using name 'solr' from docker-compose.yml)

SOLR_URL=http://solr:8982/solr/development
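
As a quick sanity check before building, the variables listed above can be compared against the contents of `.env`. The following is a minimal sketch, not part of the repository: the helper names `parse_env` and `missing_env_keys` are hypothetical, the key list is simply copied from this step, and `RAILS_USER`/`RAILS_PASS` are left out because they are only needed for a staging deployment.

```ruby
# Hypothetical helper, not part of sdbmss: report variables from step 2
# that are missing or empty in a .env file.
REQUIRED_KEYS = %w[
  ADMIN_PASSWORD
  MYSQL_HOST MYSQL_DATABASE MYSQL_ROOT_PASSWORD MYSQL_USER MYSQL_PASSWORD
  RABBIT_USER RABBIT_PASSWORD
  SDBMSS_APP_HOST SDBMSS_SMTP_HOST SDBMSS_EMAIL_FROM
  SDBMSS_NOTIFY_EMAIL SDBMSS_NOTIFY_EMAIL_PASSWORD SDBMSS_EMAIL_EXCEPTIONS_TO
  SDBMSS_BLACKLIGHT_SECRET_KEY SDBMSS_DEVISE_SECRET_KEY
  SDBMSS_SECRET_KEY_BASE SDBMSS_SECRET_TOKEN
  SOLR_URL
].freeze

# Parse KEY=value lines into a Hash, skipping blank lines and # comments.
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    env[key.strip] = value.to_s.strip
  end
end

# Return the required keys that are absent or have an empty value.
def missing_env_keys(text, required = REQUIRED_KEYS)
  env = parse_env(text)
  required.reject { |key| env.key?(key) && !env[key].empty? }
end

# Usage (assumes the file exists in the current directory):
#   puts missing_env_keys(File.read('.env')).inspect
```

An empty array means every variable named in this step has a value; it says nothing about whether the values are correct.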

**3. Build and Run (First Time)**

Start everything up:

docker-compose up --build

**4. RabbitMQ First Time Setup**

The first time, we need to create the user and grant permissions. Use the same values for USER/PASS as set in your `.env` file

docker-compose -f docker-compose-dev.yml exec rabbitmq /bin/bash
rabbitmqctl add_user <RABBIT_USER> <RABBIT_PASSWORD>
rabbitmqctl set_user_tags <RABBIT_USER> administrator
rabbitmqctl set_permissions -p / <RABBIT_USER> ".*" ".*" ".*"

Then restart dependent containers:

docker-compose -f docker-compose-dev.yml restart interface
docker-compose -f docker-compose-dev.yml restart rails


**5. First Time Setup: Rails and SOLR**

Set up the database:

docker exec $(docker ps -q -f name=sdbmss_rails) bundle exec rake db:setup

(Optional: Load data from .sql dump)

```bash
docker cp sdbm.sql.gz $(docker ps -q -f name=sdbmss_db):/tmp/sdbm.sql.gz
docker exec -it $(docker ps -q -f name=sdbmss_db) bash
cd /tmp
gunzip sdbm.sql.gz
mysql -u <MYSQL_USER> -p <MYSQL_DATABASE> < sdbm.sql
rm sdbm.sql # remove the sql file (it's very big)
exit # exit the MySQL container
docker exec $(docker ps -q -f name=sdbmss_rails) bundle exec rake db:migrate
```

**NOTE**: If you are importing from a data file that includes **Page** objects, the database records will be copied, but not the page files. You will need to move these manually to the appropriate place in the public/static folder (uploads/, tooltips/ or docs/)

```
docker cp docs $(docker ps -q -f name=sdbmss_rails):/usr/src/app/public/static/
docker cp tooltips $(docker ps -q -f name=sdbmss_rails):/usr/src/app/public/static/
docker cp uploads $(docker ps -q -f name=sdbmss_rails):/usr/src/app/public/static/
```

Index in Solr:

docker exec $(docker ps -q -f name=sdbmss_rails) bundle exec rake sunspot:reindex

**6. Jena First Time Setup**

```
docker exec -t $(docker ps -q -f name=sdbmss_rails) bundle exec rake sparql:test
```
The generated file (`test.ttl`) should be in the current directory (`.`); gzip it.

```
gzip test.ttl
```

Copy file to Jena container and gunzip it

```
docker cp test.ttl.gz $(docker ps -q -f name=sdbmss_jena):/tmp/
docker exec -t $(docker ps -q -f name=sdbmss_jena) gunzip /tmp/test.ttl.gz
```

Load the data into Jena

```
docker exec -t $(docker ps -q -f name=sdbmss_jena) sh -c 'cd /jena-fuseki && ./tdbloader --loc=/fuseki/databases/sdbm /tmp/test.ttl'
```

Clean up the files.

```
docker exec -t $(docker ps -q -f name=sdbmss_jena) rm /tmp/test.ttl
rm ~/deployments/sdbmss/test.ttl.gz
```

Create the dataset in Jena Fuseki.

Go here and create the sdbm dataset: <https://localhost/sparql/manage.html>

- Click 'add new data set'
- Enter 'sdbm'
- Select 'Persistent – dataset will persist across Fuseki restarts'
- Click 'create dataset'

Restart the services:

```
docker-compose -f docker-compose-dev.yml restart jena
docker-compose -f docker-compose-dev.yml restart rabbitmq
docker-compose -f docker-compose-dev.yml restart rails
```

Run the Jena verify task to confirm that it works. Be sure to hide the debugging output.

```
docker-compose -f docker-compose-dev.yml exec rails bundle exec rake jena:verify | grep -v DEBUG
```

NB: You may need to run the command more than once.

```
sdbm01[~]$ docker-compose -f docker-compose-dev.yml exec rails bundle exec rake jena:verify | grep -v DEBUG
Starting Queue Listening
No more messages in queue.
Remaining responses: 764
$ docker-compose -f docker-compose-dev.yml exec rails bundle exec rake jena:verify | grep -v DEBUG
Starting Queue Listening
Parsed contents: {"id"=>300122, "code"=>"200", "message"=>"OK"}
Jena Update was Successful!
Parsed contents: {"id"=>300211, "code"=>"200", "message"=>"OK"}
Jena Update was Successful!
Parsed contents: {"id"=>300212, "code"=>"200", "message"=>"OK"}
Jena Update was Successful!
Parsed contents: {"id"=>300213, "code"=>"200", "message"=>"OK"}
Jena Update was Successful!
# ... etc.
```
4 changes: 2 additions & 2 deletions README.md
@@ -75,7 +75,7 @@ DEPRECATED -- USING ANSIBLE

URL for SOLR server, using relative location of docker service (using name 'solr' from docker-compose.yml)

SOLR_URL=http://solr:8983/solr/development
SOLR_URL=http://solr:8982/solr/development

**3. (Optional) Create a file VERSION in the root folder of the cloned repository.**

@@ -103,7 +103,7 @@ DEPRECATED -- USING ANSIBLE

**5. First Time Setup: Rails and SOLR**

Before you begin, have the folliwing:
Before you begin, have the following:

- Database backup from previous version sdbm.sql.gz

66 changes: 62 additions & 4 deletions app/models/concerns/tell_bunny.rb
@@ -13,15 +13,73 @@ module TellBunny
has_many :jena_responses, as: :record
end

SINGLE_QUOTE_REGEXP = Regexp.new "'"

##
# Prepare the string for ingestion into Jena:
#
# - Remove control characters Jena can't process
# - Escape leading and trailing +'+ as +\'+
#
#
# @param value [String] a string value
# @return [String]
def rdf_string_prep value
return unless value.present?
# the next two steps have been moved here from `export_rdf.rake`
value.gsub!("\r\n", '') # remove CrLf
value.gsub!("\\", '') # remove "\" to avoid illegal control characters
return value unless value =~ SINGLE_QUOTE_REGEXP
value.strip!
# replace initial or final "'" with "\'"
value.gsub(%r{^'|'$}, %q{\\\'})
end

##
# If +value+ is present, return a formatted rdf object string based on +data_type+;
# otherwise +nil+ is returned.
#
# format_triple_object 3, :integer # => "'3'^^xsd:integer"
# url_base = 'https://sdbm.library.upenn.edu/names/'
# format_triple_object 22104, :uri, url_base # => "<https://sdbm.library.upenn.edu/names/22104>"
#
# Note that when the data type is +uri+, +url_base+ is required and +value+ is appended to +url_base+.
#
# @param [Object] value the value of the property object
# @param [Symbol] data_type one of +:integer+, +:decimal+, +:boolean+,
# +:string+, or +:uri+
# @param [String] url_base a string like +https://sdbm.library.upenn.edu/names/+;
# required if +data_type+ is +:uri+
# @return [String, nil] returns a formatted RDF object string or +nil+ if +value+ is blank
# @raise [RuntimeError] if +data_type+ is +:uri+ and +url_base+ is blank
def format_triple_object value, data_type, url_base=nil
return unless value.present?
return if value =~ %r{\A[^[:alnum:]]*\z}
case data_type
when :integer
"'#{value}'^^xsd:integer"
when :decimal
"'#{value}'^^xsd:decimal"
when :boolean
"'#{value}'^^xsd:boolean"
when :string
"'''#{rdf_string_prep value.to_s}'''"
when :uri
raise "No `url_base` supplied for #{value}" unless url_base.present?
"<#{url_base}#{value}>"
else
raise "Unknown triple object data_type: '#{data_type}'; expected one of :integer, :decimal, :boolean, :string, :uri"
end
end

# inherited and overridden by relevant models.
# NOTE: use triple single quotes to enclose string literals, to avoid confusion with quotes in the strings themselves

def to_rdf
%Q(
# sdbm:names/#{id} sdbm:names_id #{id}
# sdbm:names/#{id} sdbm:names_id #{id}
)
end

#private

def update_bunny(jena_response_id = nil)
@@ -54,7 +112,7 @@ def update_bunny(jena_response_id = nil)

rescue Bunny::TCPConnectionFailed => e
#puts "(Update) - Connection to RabbitMQ server failed"
self.jena_responses.destroy_all
self.jena_responses.destroy_all
JenaResponse.create!(record: self, status: 0, message: "404: Failed to connect from Rails to RabbitMQ: #{e}")
#rescue StandardError => e
# self.jena_responses.destroy_all
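
The doc comments on `rdf_string_prep` and `format_triple_object` describe the formatting rules. The behaviour can be tried outside Rails with the standalone sketch below. It is an approximation under stated assumptions, not the committed code: `present?` here is a local stand-in for ActiveSupport's method, the regexp check is applied to `value.to_s`, and the string is copied instead of mutated in place (the committed version calls `gsub!` on its argument).

```ruby
# Standalone approximation of the two helpers added to TellBunny.
SINGLE_QUOTE = /'/

# Stand-in for ActiveSupport's `present?` (assumption: nil and
# empty or whitespace-only values count as blank).
def present?(value)
  !value.nil? && !value.to_s.strip.empty?
end

def rdf_string_prep(value)
  return unless present?(value)
  text = value.to_s.dup        # copy, so the caller's string is untouched
  text = text.gsub("\r\n", '') # remove CRLF
  text = text.gsub('\\', '')   # remove backslashes
  return text unless text =~ SINGLE_QUOTE
  # Escape a quote at the start or end of a line as \'
  text.strip.gsub(/^'|'$/) { "\\'" }
end

def format_triple_object(value, data_type, url_base = nil)
  return unless present?(value)
  return if value.to_s =~ /\A[^[:alnum:]]*\z/ # nothing alphanumeric
  case data_type
  when :integer, :decimal, :boolean
    "'#{value}'^^xsd:#{data_type}"
  when :string
    "'''#{rdf_string_prep(value)}'''"
  when :uri
    raise "No `url_base` supplied for #{value}" unless present?(url_base)
    "<#{url_base}#{value}>"
  else
    raise "Unknown triple object data_type: '#{data_type}'"
  end
end

puts format_triple_object(3, :integer)
# prints '3'^^xsd:integer
puts format_triple_object(22104, :uri, 'https://sdbm.library.upenn.edu/names/')
# prints <https://sdbm.library.upenn.edu/names/22104>
puts format_triple_object("'Tis a title'", :string)
# prints '''\'Tis a title\''''
p format_triple_object('---', :string)
# prints nil
```

Only a quote at the very start or end of a line is escaped; quotes inside the string are left alone, which is why the result is wrapped in triple single quotes.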
6 changes: 3 additions & 3 deletions app/models/dericci_link.rb
@@ -19,9 +19,9 @@ def to_rdf
fields: {}
}

map[:fields][:dericci_record_id] = "<https://sdbm.library.upenn.edu/dericci_records/#{dericci_record_id}>" if dericci_record_id.present?
map[:fields][:name_id] = "<https://sdbm.library.upenn.edu/names/#{name_id}>" if name_id.present?
map[:fields][:other_info] = "'''#{other_info}'''" if other_info.present?
map[:fields][:dericci_record_id] = format_triple_object dericci_record_id, :string
map[:fields][:name_id] = format_triple_object name_id, :uri, 'https://sdbm.library.upenn.edu/names/'
map[:fields][:other_info] = format_triple_object other_info, :string

map
end
22 changes: 11 additions & 11 deletions app/models/dericci_record.rb
@@ -8,8 +8,8 @@ class DericciRecord < ActiveRecord::Base

has_many :bookmarks, as: :document, dependent: :destroy
has_many :comments, as: :commentable
include Watchable

include Watchable
include UserFields
include TellBunny

@@ -48,15 +48,15 @@ def to_rdf
fields: {}
}

map[:fields][:dates] = "'''#{dates}'''" if dates.present?
map[:fields][:name] = "'''#{name}'''" if name.present?
map[:fields][:place] = "'''#{place}'''" if place.present?
map[:fields][:url] = "'''#{url}'''" if url.present?
map[:fields][:cards] = "'#{cards}'^^xsd:integer" if cards.present?
map[:fields][:size] = "'''#{size}'''" if size.present?
map[:fields][:other_info] = "'''#{other_info}'''" if other_info.present?
map[:fields][:senate_house] = "'''#{senate_house}'''" if senate_house.present?
map[:fields][:out_of_scope] = "'#{out_of_scope}'^^xsd:boolean" unless out_of_scope.nil?
map[:fields][:dates] = format_triple_object dates, :string
map[:fields][:name] = format_triple_object name, :string
map[:fields][:place] = format_triple_object place, :string
map[:fields][:url] = format_triple_object url, :string
map[:fields][:cards] = format_triple_object cards, :integer
map[:fields][:size] = format_triple_object size, :string
map[:fields][:other_info] = format_triple_object other_info, :string
map[:fields][:senate_house] = format_triple_object senate_house, :string
map[:fields][:out_of_scope] = format_triple_object out_of_scope, :boolean

map
end
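
One point that may be worth checking in review: the replaced `out_of_scope` line emitted a value `unless out_of_scope.nil?`, whereas `format_triple_object` returns early `unless value.present?`. If `present?` is ActiveSupport's (which the Rails context suggests), `false.present?` is `false`, so a `false` flag would no longer produce a triple. The sketch below illustrates the difference; `blank?` is a local stand-in for ActiveSupport's behaviour and both method names are hypothetical.

```ruby
# Local stand-in for ActiveSupport's blank?: nil, false, and empty or
# whitespace-only strings are blank.
def blank?(value)
  return true if value.nil? || value == false
  return value.strip.empty? if value.is_a?(String)
  value.respond_to?(:empty?) ? value.empty? : false
end

# Guard in the style of format_triple_object: skip anything blank.
def boolean_object_present_guard(value)
  return if blank?(value)
  "'#{value}'^^xsd:boolean"
end

# Guard in the style of the replaced line: skip only nil.
def boolean_object_nil_guard(value)
  return if value.nil?
  "'#{value}'^^xsd:boolean"
end

[true, false, nil].each do |flag|
  puts "#{flag.inspect}: present-guard=#{boolean_object_present_guard(flag).inspect} " \
       "nil-guard=#{boolean_object_nil_guard(flag).inspect}"
end
```

Under these assumptions the two guards agree for `true` and `nil` and differ only for `false`. Whether dropping `false` values is intended is for the authors to confirm.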