Skip to content

Select runs and get values

Dmitry Romanov edited this page Mar 17, 2016 · 23 revisions
  • [Selecting runs and getting values])(#selecting-runs-and-getting-values)

To experiment with the examples on this page, one can download daily recreated SQLite database: https://halldweb.jlab.org/dist/rcdb.sqlite

Using connection string:

sqlite:///<path to file>/rcdb.sqlite

Or connect to readonly mysql:

mysql://[email protected]/rcdb


### Selecting runs and getting values
####Get values fast Suppose, one wants to get all event_count-s and beam_current-s for production runs:
import rcdb
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
table = db.select_runs("@is_production").get_values(['event_count', 'beam_current'], insert_run_number=True)
print table

As the result one gets something like:

[
[1023, 3984793, 0.145]
[1024, 4569873, 0.230]
...
]

The first column is a run number (we set insert_run_number=True above), and other columns are the conditions in a given order.


####Run range If one wants to apply a run range, say for a particular run period: ```python table = db.select_runs("@is_production", run_min=10000, run_max=20000)\ .get_values(['event_count', 'beam_current'], insert_run_number=True) ```
####No filtration To get values for all runs without filtration a search pattern may be skipped: ```python table = db.select_runs(run_min=10000, run_max=20000)\ .get_values(['event_count', 'beam_current'], insert_run_number=True) ```

(note that parameter names are used here, so the python could figure function parameters out)


####Sort order The table is always sorted by run number. It is just a 'feature' of getting runs DB query (that is under the hood). However, the order in with run numbers are sorted could be changed: ```python table = db.select_runs(run_min=10000, run_max=20000, sort_desc=True)\ .get_values(['event_count', 'beam_current'], insert_run_number=True) ```

sort_desc=True - makes rows to be sorted by descending run_number



### Iterating over runs and selecting any values
####Getting runs ```select_runs``` function returns ```RunSelectionResult``` object that contains all selected runs and some other information about how the runs where selected. The RunSelectionResult implements ```list``` interface returning ```Run`-s. Thus one can do:
import rcdb
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
result = db.select_runs("@is_production")
for run in result:
    print run.number

As one could guess the selected run numbers are printed as the result.

Run has the next useful functions:

def get_conditions_by_name(self):              
    # Get all conditions and returns dictionary of condition.name -> condition
def get_condition(self, condition_name):       
    # Gets Condition object by name if such name condition exist or None
def get_condition_value(self, condition_name): 
    # Gets the condition value if such condition exist or None

####Get any condition of the run So one can iterate selected runs and get any desired condition:
import rcdb
db = rcdb.RCDBProvider("mysql://[email protected]/rcdb")
result = db.select_runs("@is_production")
for run in result:
    print run.get_condition_value('event_count')



##Performance

In the performance point of view, the fastest way to get values by using

db.select_runs(...).get_values(...)

Because get_values makes just a single database call to retrieve all values for selected runs.

In case of iterating:

result = db.select_runs("@is_production")
for run in result:
    print run.get_condition_value('event_count')

Database is queried on each get_condition_value