The Earudite data flow server is the central piece of the back-end. It handles requests to read and modify data in a set of common, well-defined ways, such as pre-screening audio recordings of question transcripts, selecting a question to answer or record, and processing audio asynchronously. It is written in Python 3.8 with the Flask framework and uses Firebase Storage to store audio files and the MongoDB Quizzr Atlas to store data.
Prior to installing the server, either through Docker (see Using Docker) or directly onto your machine, make sure you have addressed the following prerequisites:
- Python 3.8 with the latest version of pip installed (does not apply with Docker).
- A MongoDB Atlas with the following (see Get Started with Atlas, parts 1-5, for more information):
  - An Atlas account;
  - A cluster on version 4.4.x with a database that contains the "Audio", "RecordedQuestions", "UnrecordedQuestions", "Games", "LeaderboardArchive", and "Users" collections, all of which are not capped;
  - A database user with permission to read and write to all collections in the database; and
  - A connection through the database user with a Python driver version 3.6 or later.
- A Firebase project with the Cloud Storage service enabled and a connection to the project through an Admin SDK set up (see Cloud Storage for Firebase and Add the Firebase Admin SDK to your server respectively for more information).
- Clone this repository.
- Install all the necessary dependencies by executing pip install -r requirements.txt in the folder of the repository. It may be a good idea to set up a virtual environment prior to doing this step to avoid conflicts with already installed packages.
- Install Gentle by following the instructions in the associated README.md document. If you are installing it through the source code on a Linux operating system, you may need to change install_deps.sh to be based on your distribution.
- Create a directory for the instance path of the server. By default, it is ~/quizzr_server, but it can be overridden by the Q_INST_PATH environment variable or the test_inst_path parameter in the app factory function, create_app. In the instance path, create another directory called secrets.
- Generate a private key for the Firebase Admin SDK service account and store it at secrets/firebase_storage_key.json.
- Create a JSON file named email_credentials.json located in the secrets directory. Replace <email_address> with the email address to use when sending emails to registering users and <password> with the corresponding password (most likely either an app password or the ordinary password for the account).
{
  "user": "<email_address>",
  "pass": "<password>"
}
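The instance-path steps above can also be scripted. The following is a minimal sketch, not part of the server: the function names resolve_instance_path and prepare_instance are hypothetical, and the only details taken from this document are the Q_INST_PATH variable, the ~/quizzr_server default, the secrets directory, and the email_credentials.json layout.

```python
import json
import os
from pathlib import Path


def resolve_instance_path(env=None):
    """Return the instance path: Q_INST_PATH if set, else ~/quizzr_server."""
    env = os.environ if env is None else env
    return Path(env.get("Q_INST_PATH") or Path.home() / "quizzr_server")


def prepare_instance(inst_path, email_address, password):
    """Create the secrets directory and write email_credentials.json into it."""
    secrets_dir = Path(inst_path) / "secrets"
    secrets_dir.mkdir(parents=True, exist_ok=True)
    credentials_file = secrets_dir / "email_credentials.json"
    # json.dumps takes care of separators and quoting in the credentials file.
    credentials_file.write_text(
        json.dumps({"user": email_address, "pass": password}, indent=2)
    )
    return credentials_file
```

Calling prepare_instance(resolve_instance_path(), "<email_address>", "<password>") would create the default layout. The Firebase private key still has to be downloaded and placed in the secrets directory by hand.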
To update the repository on your machine, either use git pull (requires you to commit your changes) or reinstall the repository.
To uninstall this repository, simply delete its directory and the contents defining its associated virtual environment, along with the instance path.
Creating a JSON file named sv_config.json in the config subdirectory of the instance path allows for specifying a set of overrides to merge on top of the default configuration.
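To illustrate what merging overrides on top of the defaults can look like, here is a minimal sketch. The function name load_config is hypothetical, and the shallow merge is an assumption: this document does not say whether the server merges nested objects key by key or replaces them whole, so check server.py before relying on either behavior.

```python
import json
from pathlib import Path


def load_config(defaults, inst_path):
    """Return defaults with any overrides from config/sv_config.json applied."""
    config = dict(defaults)
    override_file = Path(inst_path) / "config" / "sv_config.json"
    if override_file.is_file():
        overrides = json.loads(override_file.read_text())
        # Assumption: shallow merge. An overridden nested object replaces
        # the default object as a whole rather than being merged into it.
        config.update(overrides)
    return config
```

With this approach, a missing sv_config.json simply leaves the defaults untouched.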
All configuration fields must use purely capital letters to be recognized by the server. The following is a list of configuration fields and their descriptions:
- UNPROC_FIND_LIMIT: The maximum number of unprocessed audio documents to find in a single batch.
- DATABASE: The name of the database to use in MongoDB.
- BLOB_ROOT: The name of the root folder to use in Firebase Storage.
- Q_ENV: The type of environment to use. A value of development or testing makes the server identify unauthenticated users as dev and allows access to the /uploadtest endpoint.
- SUBMISSION_FILE_TYPES: The file extensions to look for when deleting submissions.
- DIFFICULTY_DIST: The fractional distribution of recordings by difficulty. Example: [0.6, 0.3, 0.1] makes the 60% least difficult recordings have a "0" difficulty, followed by the next 30% at difficulty "1", and the rest at difficulty "2".
- VERSION: The version of the software. Used in audio document definitions for cases where the schema changes.
- MIN_ANSWER_SIMILARITY: The program marks a given answer at the /answer GET endpoint as correct if the similarity between the answer and the correct answer exceeds this value.
- PROC_CONFIG: Configuration for the recording processor. Includes:
  - checkUnk: Check for unknown words along with unaligned words when calculating accuracy.
  - unkToken: The value of the aligned word to look for when detecting out-of-vocabulary words.
  - minAccuracy: The minimum acceptable accuracy of a submission.
  - queueLimit: The maximum number of submissions to pre-screen at once.
  - standardizedSampleRate: Sets the sample rate of incoming submissions to the given value.
  - convertToMono: If true, convert all incoming submissions to mono audio.
- DEV_UID: The default user ID of an unauthenticated user in a development environment.
- LOG_PRIVATE_DATA: Redact sensitive information, such as Firebase ID tokens, in log messages.
- VISIBILITY_CONFIGS: A set of configurations that determine which collection to retrieve a profile from and what projection to apply. Projections are objects with the key being the field name and the value being 1 or 0, representing whether to include or exclude the field.
- USE_ID_TOKENS: A configuration option specifying whether to use a Firebase ID token or to use the raw contents for identifying the user. Has no effect when TESTING is False.
- MAX_LEADERBOARD_SIZE: The maximum number of entries allowable on the leaderboard.
- DEFAULT_LEADERBOARD_SIZE: The default number of entries on the leaderboard.
- MAX_USERNAME_LENGTH: The maximum allowable number of characters in a username.
- USERNAME_CHAR_SET: A string containing all allowable characters in a username.
- DEFAULT_RATE_LIMITS: An array containing request rate limits (in a string format) for all server endpoints. Examples: "200 per day", "50 per hour", "1/second".
- ENDPOINT_RATE_LIMITS: For each event used in the server's code, the maximum number of calls per unit time. See the default configuration for more details.
- REGISTRATION_EMAIL: Configuration for the system that sends emails to registering users. Contains:
  - enabled: If true, send emails to people who have registered. All fields below must be properly configured in order for the mailer to run.
  - subject: The subject line for the email.
  - textBodyPath: The path to a file containing a text body.
  - htmlBodyPath: The path to a file containing an HTML body. At least one of these paths must be configured.
  - sender: Information about the sender:
    - display: Display name.
    - email: Display email address.
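The DIFFICULTY_DIST field is easiest to understand with a worked example. The sketch below is one possible reading of its description, not the server's actual code: the function name assign_difficulties is hypothetical, and it assumes the recordings are already sorted from least to most difficult.

```python
def assign_difficulties(sorted_ids, distribution):
    """Map each ID (sorted from least to most difficult) to a difficulty index."""
    total = len(sorted_ids)
    difficulties = {}
    start = 0
    cumulative = 0.0
    for level, fraction in enumerate(distribution):
        cumulative += fraction
        # The last bucket takes every remaining item to absorb rounding error.
        if level == len(distribution) - 1:
            end = total
        else:
            end = round(cumulative * total)
        for item_id in sorted_ids[start:end]:
            difficulties[item_id] = level
        start = end
    return difficulties
```

With ten recordings and a distribution of [0.6, 0.3, 0.1], the first six receive difficulty 0, the next three receive difficulty 1, and the last one receives difficulty 2.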
It is also possible to override configuration fields through environment variables or through a set of overrides passed into the test_overrides argument for the app factory function. Currently, overrides with environment variables only work with fields that have string values.
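The string-only limitation on environment variable overrides can be pictured with the following sketch. It is illustrative only: the function name apply_env_overrides is hypothetical, and it assumes each environment variable has the same name as the configuration field it overrides.

```python
import os


def apply_env_overrides(config, env=None):
    """Return a copy of config with string-valued fields overridden from env."""
    env = os.environ if env is None else env
    result = dict(config)
    for key, value in config.items():
        # Only fields whose current value is a string are considered,
        # mirroring the limitation described above.
        if isinstance(value, str) and key in env:
            result[key] = env[key]
    return result
```

Under these assumptions, setting DATABASE in the environment would take effect, while setting UNPROC_FIND_LIMIT would be ignored because its default value is a number.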
The following JSON data shows the default values of each configuration field. You may also view the default configuration in server.py.
{
  "UNPROC_FIND_LIMIT": 32,
  "DATABASE": "QuizzrDatabase",
  "BLOB_ROOT": "production",
  "Q_ENV": "production",
  "SUBMISSION_FILE_TYPES": ["wav", "json", "vtt"],
  "DIFFICULTY_DIST": [0.6, 0.3, 0.1],
  "VERSION": "0.2.0",
  "MIN_ANSWER_SIMILARITY": 50,
  "PROC_CONFIG": {
    "checkUnk": true,
    "unkToken": "<unk>",
    "minAccuracy": 0.5,
    "queueLimit": 32,
    "standardizedSampleRate": 44100,
    "convertToMono": true
  },
  "DEV_UID": "dev",
  "LOG_PRIVATE_DATA": false,
  "VISIBILITY_CONFIGS": {
    "basic": {
      "projection": {"pfp": 1, "username": 1, "usernameSpecs": 1},
      "collection": "Users"
    },
    "leaderboard": {
      "projection": {"pfp": 1, "username": 1, "usernameSpecs": 1, "ratings": 1},
      "collection": "Users"
    },
    "leaderboardAudio": {
      "projection": {"pfp": 1, "username": 1, "usernameSpecs": 1, "recordingScore": 1},
      "collection": "Users"
    },
    "public": {
      "projection": {"pfp": 1, "username": 1, "usernameSpecs": 1},
      "collection": "Users"
    },
    "private": {
      "projection": null,
      "collection": "Users"
    }
  },
  "USE_ID_TOKENS": true,
  "MAX_LEADERBOARD_SIZE": 200,
  "DEFAULT_LEADERBOARD_SIZE": 10,
  "MAX_USERNAME_LENGTH": 16,
  "USERNAME_CHAR_SET": "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
  "DEFAULT_RATE_LIMITS": [],
  "ENDPOINT_RATE_LIMITS": {
    "/audio POST": {"maxCalls": 1, "unitTime": 20}
  },
  "REGISTRATION_EMAIL": {
    "enabled": true,
    "subject": null,
    "textBodyPath": null,
    "htmlBodyPath": null,
    "sender": {
      "display": null,
      "email": null
    }
  }
}
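The projections in VISIBILITY_CONFIGS follow the MongoDB convention of mapping field names to 1 (include) or 0 (exclude). The sketch below approximates that behavior on a plain dictionary to show what each visibility level would expose. It is not how the server applies projections (MongoDB does that itself), the function name apply_projection is hypothetical, and two simplifications are assumed: a null projection returns the whole document, and the special handling MongoDB gives the _id field is ignored.

```python
def apply_projection(document, projection):
    """Apply a flat include/exclude projection to a profile dictionary."""
    if projection is None:
        # Assumption: a null projection means "return every field".
        return dict(document)
    included = {field for field, flag in projection.items() if flag}
    if included:
        # Inclusion mode: keep only the fields marked with 1.
        return {k: v for k, v in document.items() if k in included}
    # Exclusion mode: drop every field marked with 0.
    excluded = set(projection)
    return {k: v for k, v in document.items() if k not in excluded}
```

For example, applying the "basic" projection to a profile that also contains a ratings field would return only pfp, username, and usernameSpecs.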
Prior to running the server, get the connection string for the MongoDB Client from the Quizzr Atlas (accessed through the Quizzr Google account).
To start the server, enter the following commands into the terminal, replacing your-connection-string with the connection string you obtained earlier and passwd with a long and secure password:
$ export CONNECTION_STRING=your-connection-string
$ export DF_SERVER_AUDIO_PSWD=passwd
$ export FLASK_APP=server
$ flask run
Alternatively, you can run the server.py module (see python3 server.py -h for more information):
$ export CONNECTION_STRING=your-connection-string
$ export DF_SERVER_AUDIO_PSWD=passwd
$ python3 server.py
You can view the website through http://127.0.0.1:5000/.
Stop the server using Ctrl + C.
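Before starting the server, it can help to confirm that the environment variables exported in the commands above are actually set. The following is a small, optional pre-flight sketch. The names REQUIRED_VARIABLES and missing_variables are hypothetical, and treating both variables as mandatory is an assumption based only on the commands shown in this section.

```python
import os

# The two variables exported in the start-up commands of this section.
REQUIRED_VARIABLES = ("CONNECTION_STRING", "DF_SERVER_AUDIO_PSWD")


def missing_variables(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

A wrapper script could call missing_variables() and exit with an error message listing the returned names before invoking flask run.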
To run the server in debug mode, set FLASK_ENV to development in the terminal. By default, the debugger is enabled. To disable the debugger, add --no-debugger to the run command.
There is a separate repository for running automated tests on the server. See the quizzr-server-test repository for more information.
There is a Dockerfile that you can use to build the Docker image for this repository. Alternatively, you can pull from the Docker Hub repository for the image.
The following command includes notable arguments for running this image:
$ docker run -p 5000:5000 \
-v <config-volume>:/root/quizzr_server/config \
-v <secrets-volume>:/root/quizzr_server/secrets \
-v <storage-volume>:/root/quizzr_server/storage \
-e CONNECTION_STRING=<your-connection-string> \
<quizzr-server-image-name>
Notes:
- Each volume specified is either a named volume or a path for a bind mount, and <your-connection-string> is the connection string for the MongoDB Client (see Running the Server).
- You will need to have the private Firebase key in the mounting directory (see Installation).
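If you launch the container from a script, the docker run command above can be assembled as an argument list, which avoids shell quoting problems in the connection string. This is a sketch only: the function name docker_run_args is hypothetical, and the container paths are copied from the command shown in this section.

```python
def docker_run_args(image, config_vol, secrets_vol, storage_vol, connection_string):
    """Build the argument list for the docker run command shown above."""
    inst = "/root/quizzr_server"  # instance path inside the container
    return [
        "docker", "run",
        "-p", "5000:5000",
        "-v", f"{config_vol}:{inst}/config",
        "-v", f"{secrets_vol}:{inst}/secrets",
        "-v", f"{storage_vol}:{inst}/storage",
        "-e", f"CONNECTION_STRING={connection_string}",
        image,
    ]
```

The resulting list can be passed to subprocess.run(args, check=True). Each volume argument may be either a named volume or a host path, as in the notes above.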
The following contains potential problems you may encounter while installing or running this server:
- The installation execution for Gentle fails due to the certificate expiring: Modify the wget command in install_models.sh to include the --no-check-certificate flag.
- The execution for installing Gentle fails to create kaldi.mk: Run ./configure --enable-static, make, and make install in gentle/ext/kaldi/tools/openfst. Then, run install.sh again.
A recent update has added the requirement for a batchUUID field in segmented audio documents. A script has been added in the maintenance folder to retroactively add this field to old audio documents.
NOTICE: Endpoint documentation will no longer be maintained until this software exits initial development. All documentation for the endpoints has been moved to reference/backend.yaml, which is in an OpenAPI format. You can view it with the Swagger UI or a similar OpenAPI GUI generator.