Query optimization in ExistDB
vogelsgesang edited this page Apr 24, 2014 · 4 revisions
One problem we faced with eXist-db was query execution time. To solve it, we applied range indexing to our MARC21 documents. Without a range index, eXist has to do a full scan over the context nodes to look up an element value, which severely limits performance and scalability.
The configuration.xconf file is shown here:
<collection xmlns="http://exist-db.org/collection-config/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <index xmlns:MARC21="http://www.loc.gov/MARC21/slim">
        <fulltext default="none"/>
        <!-- Range indexes -->
        <range>
            <create qname="MARC21:controlfield" type="xs:string"/>
            <create qname="MARC21:datafield">
                <!-- Field paths are relative to MARC21:datafield -->
                <field name="tag" match="@tag" type="xs:string"/>
                <field name="subfield" match="MARC21:subfield" type="xs:string"/>
                <!-- @code is an attribute of MARC21:subfield, not of MARC21:datafield -->
                <field name="code" match="MARC21:subfield/@code" type="xs:string"/>
            </create>
        </range>
    </index>
</collection>
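An index definition only applies to documents stored after it is in place, so documents already in the collection need to be reindexed. The following is a minimal sketch, assuming the configuration above has been stored under /db/system/config/db/book (the conventional location for the configuration of /db/book) and that the query is run by a user with sufficient rights:

```xquery
xquery version "3.0";

import module namespace xmldb = "http://exist-db.org/xquery/xmldb";

(: Rebuild all indexes for the collection so that documents stored
   before the range index was configured are indexed as well.
   The collection path is taken from the queries on this page. :)
xmldb:reindex("/db/book")
```

On a large collection reindexing can take a while, so it is best done once after the configuration is final.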
We tested the following queries. With the range index in place, execution time dropped from 13 seconds to milliseconds.
First query:
xquery version "3.0";
declare namespace MARC21 = "http://www.loc.gov/MARC21/slim";
let $col := collection("/db/book")
return
    $col/MARC21:collection/MARC21:record
        [MARC21:datafield[@tag = "653"]/MARC21:subfield[. = 'algebraic'][@code = 'a']]
Second query:
xquery version "3.0";
declare default element namespace 'http://www.loc.gov/MARC21/slim';
let $col := collection("/db/book")
return
    $col/collection/record
        [controlfield[@tag = '003'][. = 'SzGeCERN']]
        [datafield[@tag = '260']
            [subfield[@code = 'a'][. = 'Rockville, MD']]
            [subfield[@code = 'b'][. = 'Computer Science Press']]]
Third query:
declare default element namespace 'http://www.loc.gov/MARC21/slim';
let $col := collection("/db/book")
let $records :=
    $col/collection/record
        [datafield[subfield = 'Rockville, MD'][subfield = 'Computer Science Press']]
return
    $records[controlfield[@tag = '003'][. = 'SzGeCERN']]
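To compare timings with and without the index, a query can be wrapped in a simple timer. This is only a sketch, not the method used to obtain the figures above: it reuses the first query and eXist's util:system-dateTime() function, and the timing element and its attribute names are made up for illustration.

```xquery
xquery version "3.0";

import module namespace util = "http://exist-db.org/xquery/util";

declare namespace MARC21 = "http://www.loc.gov/MARC21/slim";

(: Wall-clock time before the query runs :)
let $start := util:system-dateTime()
let $records :=
    collection("/db/book")/MARC21:collection/MARC21:record
        [MARC21:datafield[@tag = "653"]/MARC21:subfield[. = 'algebraic'][@code = 'a']]
(: Counting the result forces the path expression to be evaluated :)
let $count := count($records)
let $end := util:system-dateTime()
return
    (: elapsed is an xs:dayTimeDuration; "timing" is a hypothetical wrapper element :)
    <timing records="{$count}" elapsed="{$end - $start}"/>
```

Running the query several times and ignoring the first run gives a fairer comparison, since caching affects the first execution.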