-
Notifications
You must be signed in to change notification settings - Fork 7
Select runs and get values
Contents:
To experiment with the examples on this page, one can download daily recreated SQLite database: https://halldweb.jlab.org/dist/rcdb.sqlite
Using connection string:
sqlite:///<path to file>/rcdb.sqlite
Or connect to readonly mysql:
mysql://[email protected]/rcdb
Suppose, one wants to get all event_count-s and beam_current-s for production runs:
import rcdb
# Connect to database
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
# Select runs and get values
table = db.select_runs("@is_production")\
.get_values(['event_count', 'beam_current'], insert_run_number=True)
print table
As the result one gets something like:
[
[1023, 3984793, 0.145]
[1024, 4569873, 0.230]
...
]
The first column is a run number (we set insert_run_number=True
above). The other two columns are 'event_count' and 'beam_current' as we gave it above.
If run number is not needed insert_run_number
may be skipped:
table = db.select_runs("@is_production")
.get_values(['event_count', 'beam_current'])
A nice way to iterate the values:
for row in table:
event_count, beam_current = tuple(row)
print event_count, beam_current
If one wants to apply a run range, say for a particular run period:
table = db.select_runs("@is_production", 10000, 20000)\
.get_values(['event_count', 'beam_current'], True)
To get values for all runs without filtration a search pattern may be skipped:
table = db.select_runs(run_min=10000, run_max=20000)\
.get_values(['event_count', 'beam_current'], insert_run_number=True)
(note that parameter names are used here, so the python could figure function parameters out)
The table is always sorted by run number. It is just a 'feature' of getting runs DB query (that is under the hood). However, the order in with run numbers are sorted could be changed:
table = db.select_runs(run_min=10000, run_max=20000, sort_desc=True)\
.get_values(['event_count', 'beam_current'], insert_run_number=True)
sort_desc=True
- makes rows to be sorted by descending run_number
select_runs
function returns RunSelectionResult
object that contains all selected runs and some other information about how the runs where selected. The RunSelectionResult implements list
interface returning ```Run`-s. Thus one can do:
import rcdb
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
result = db.select_runs("@is_production")
for run in result:
print run.number
As one could guess the selected run numbers are printed as the result.
Run
has the next useful functions:
def get_conditions_by_name(self):
# Get all conditions and returns dictionary of condition.name -> condition
def get_condition(self, condition_name):
# Gets Condition object by name if such name condition exist or None
def get_condition_value(self, condition_name):
# Gets the condition value if such condition exist or None
So one can iterate selected runs and get any desired condition:
import rcdb
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
result = db.select_runs("@is_production")
for run in result:
print run.get_condition_value('event_count')
In the performance point of view, the fastest way to get values by using
db.select_runs(...).get_values(...)
Because get_values
makes just a single database call to retrieve all values for selected runs.
In case of iterating:
result = db.select_runs("@is_production")
for run in result:
print run.get_condition_value('event_count')
Database is queried on each get_condition_value
Getting started & basic usage:
- Installation
- Select values tutorial (python)
- Query syntax
- Add data (python)
- CLI Basics
RCDB Explained:
- Connection
- DB and APIs structure
- SQL examples
- Creating condition types
- Adding condition values
- Saving files
- SQLAlchemy
- Logging
- Performance
Command line tools:
DAQ: