
Search interface

Irene Vagionakis edited this page Feb 24, 2023 · 10 revisions

Search

Thanks to the functionality provided by Solr, EFES comes with a built-in search engine that can perform lemma searches (provided you have lemmatised your inscriptions before uploading them into EFES), grouped searches and Boolean queries. It includes:

  • Text search (supports wildcards such as *; the 'Search lemmatised text' option searches the lemmatised text rather than the full text)
  • Faceted browse search
  • Date slider

To customise the search interface, you can edit webapps/ROOT/assets/templates/search.xml and webapps/ROOT/stylesheets/solr/results-to-html.xsl.

Facets

Facets allow us to do complex searches on our data using filters that group the inscriptions into categories by a certain feature (such as the origin, support material etc.). By applying multiple filters we can narrow down the results we get from our search and thus quickly find what we're looking for.

EFES comes with the following facets already created:

  • Author
  • Publication date
  • Found provenance
  • Mentioned people
  • Mentioned places
  • Place of origin
  • Source repository
  • Support material
  • Support object - type

Facets can be optionally linked to authority lists. Some of the pre-existing facets - e.g. support material, place of origin - require the use of authority lists; to use them without authority lists, edit them in tei-to-solr.xsl (see below).

Removing facets

To remove an existing facet, comment out or delete its <facet.field> element in facet_query.xml, which lives in webapps/ROOT/assets/queries/solr/. This file defines the search query sent to Solr, so once a <facet.field> is removed from it, Solr no longer facets on that field.
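
As a minimal sketch of commenting a facet out, assume that the field behind the 'Source repository' facet is called source_repository (the name that appears in the translation key example under 'Renaming facets'). The rest of the file is omitted here and may look different in your installation:

```xml
<!-- webapps/ROOT/assets/queries/solr/facet_query.xml (fragment only) -->

<!-- Disabled: while this element is commented out, Solr is not asked
     to facet on the field.
<facet.field>source_repository</facet.field>
-->
```

Commenting the element out rather than deleting it makes the facet easy to restore later.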

Renaming facets

To rename a facet, provide a translation in webapps/ROOT/assets/translations/messages_xx.xml, setting @key to 'facet-' followed by the facet name, e.g. <message key="facet-source_repository">Repository</message>.
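
For instance, assuming an English translation file named messages_en.xml (taking the xx in the file name to stand for a language code) and the inscription_category field created in the walkthrough under 'Creating facets', the entries might look like the sketch below. The second label text is illustrative, and the rest of the file is omitted:

```xml
<!-- webapps/ROOT/assets/translations/messages_en.xml
     (fragment only; the file name is an assumption) -->

<!-- Label displayed for the existing source_repository facet -->
<message key="facet-source_repository">Repository</message>

<!-- Label for a newly created facet: the key is 'facet-' followed by
     the facet name, assumed here to be the Solr field name -->
<message key="facet-inscription_category">Inscription category</message>
```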

Creating facets

These are the steps you need to follow when creating a new facet:

  1. Inspect the encoding of your inscriptions for the precise markup representing the data you want to include in your facets.
  2. In webapps/solr/conf/schema.xml define a new field for indexing.
  3. RESTART EFES.
  4. In tei-to-solr.xsl (NOT the Kiln core one, but the one in webapps/ROOT/stylesheets/solr), open the empty extra_fields template and put the <xsl:call-template> inside it (don't forget to name and match!).
  5. Index through Admin.
  6. In webapps/ROOT/assets/queries/solr/facet_query.xml create a new <facet.field>.
  7. When using authority files, add the new field name in <rdf-facet-lookup-fields> in webapps/ROOT/sitemaps/config.xmap (c. line 70).
  8. Change authority-to-rdf.xsl if the markup has not been harvested before.

Here are all the steps explained in detail:

  1. Decide on the facet you want to create and determine the XML markup that identifies the elements in that category. Let's assume we want to create a new facet for the inscription category. In our sample IOSPE files this information is in the @corresp attribute of a <summary> element.

  2. schema.xml: you need to change Solr's schema and create a new field for storing your data. This file lives in webapps/solr/conf/ and the list of fields in it starts at approximately line 111. Make a copy of any existing field and change the attribute values as needed. For the inscription category we need to create the following field:

      <field name="inscription_category" required="false" indexed="true" multiValued="true" stored="true" type="string"/>

     Here is a short description of each attribute's purpose and possible values:

    • name: the field's name is arbitrary, but for your own convenience it should be descriptive while containing as few words as possible. Be consistent in naming: the name you give your field here is referenced several times in other documents as part of the process of creating a facet. Avoid hyphens (our example uses an underscore, inscription_category).
    • required: only fields that every document in the index has should be required="true"; otherwise an error will occur when indexing a document that lacks the field. For indices and facets it makes sense to always set this attribute to "false".
    • indexed — this attribute tells Solr whether the field is available for it to operate on. If we would like to be able to query on that field, sort by it or include it in facets, then the value of this attribute should be "true". For material that is only for display purposes, such as the instance location, for example, the value should be "false".
    • multiValued: this attribute indicates whether one index item can have multiple things of the same type associated with it. For example, the value of this attribute for the field "location" is "true"; otherwise, if we had multiple occurrences of an index item within a single document, we would only get the first one. For some fields, however, it makes sense to only ever have one value: the field "index item name", for example, should only ever contain one value and hence be multiValued="false". The best way to choose the value for a new field is to ask whether it could appear with different values within the same document. In our example, one monument could have multiple inscriptions on it, each within its own category, so the field is multiValued="true".
    • stored: this attribute determines whether the field's value is kept so that it can be returned in query results and displayed; its value should therefore be "true" for all new fields representing indices or facets.
    • type: this attribute defines how Solr processes the field's contents internally. For the most part we need this attribute's value to be "string", unless we plan on manipulating the contents, for example by removing stop words or splitting on white space.
  3. RESTART EFES: stop the server running EFES with Ctrl + C in your command line / terminal, then start it again. This step is needed so that Solr picks up the new field we've created in the schema. Skipping it will result in unsuccessful indexing.

  4. tei-to-solr.xsl: we now need to change the stylesheet that converts our EpiDoc TEI documents into Solr index documents. There are two files with this name; one lives in the kiln folder and should never be modified. We need the file in webapps/ROOT/stylesheets/solr/, which imports the Kiln-core tei-to-solr stylesheet and allows us to customise it safely without risking breaking any of EFES' functionality. Let's take a look at what this stylesheet contains. Right below the namespace declarations is the import of the Kiln-core stylesheet; then there is a description of what this file is; further down there is a template matching on the root of the document, which we should leave as it is! Near the end, right under the comment, there is an empty template, <xsl:template name="extra_fields" />. This template is called in the Kiln-core tei-to-solr.xsl, so we need to open it and call inside it the templates responsible for all of our new facets. The three steps we need to follow here are naming, matching and calling.

    • Name your template under <xsl:template name="extra_fields" /> and above the closing </xsl:stylesheet>. The name should reflect the field you have already created in schema.xml; in our case the new template has name="field_inscription_category". Inside it, use xsl:apply-templates to select the XPath of the markup containing the data we want to make a facet for. It is useful to add a mode attribute here (with an arbitrary name; in this example, one reflecting that this is a facet of inscription categories) so that we avoid clashing templates. This is what we want to have by the end of this step:

      <xsl:template name="field_inscription_category">
        <xsl:apply-templates mode="facet_inscription_category" select="//tei:summary/@corresp"/>
      </xsl:template>
    • Match your new template below the template matching on "/" and above the comment. The match attribute should also contain the path to the markup containing your data, and we should again take the precaution of using the mode attribute. Inside the template, output a <field> element with the name defined in the schema, containing the value that will feed the facet. This is what we should get:

      <xsl:template match="tei:summary/@corresp" mode="facet_inscription_category">
        <field name="inscription_category">
          <xsl:value-of select="."/>
        </field>
      </xsl:template>
    • Call your new template by opening the empty template named extra_fields and adding a call to it inside:

      <xsl:template name="extra_fields" >
         <xsl:call-template name="field_inscription_category"/>
      </xsl:template>
  5. Index through Admin to collect the data for the facet.

  6. facet_query.xml: we've taken care of the indexing part of the process, and Solr has received and stored the data that will feed the facet. We now need to set up the querying part and ask Solr to return this data as a facet. The list of facet fields requested from Solr is in facet_query.xml, which lives in webapps/ROOT/assets/queries/solr. Create a new <facet.field> element containing the name of the field: <facet.field>inscription_category</facet.field>.

  7. If you are using authority files, you also need to add a new lookup field in config.xmap that lives in webapps/ROOT/sitemaps/. The list of fields is in the <rdf-facet-lookup-fields> element, c. line 70. Add the name of the field separated from the rest with a comma and no white-space.

  8. If that markup has not been harvested before, we need to change authority-to-rdf.xsl so that it harvests the RDF from the authority lists. Currently EFES supports the TEI markup used in IOSPE's authority lists (tei:place and children, tei:person and children, tei:org and children, tei:item with tei:term and tei:gloss, etc.).
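
The pieces that live outside the stylesheet can be summarised in one sketch for the inscription_category example. The three fragments below belong to three different files, as marked in the comments, and are not meant to be pasted together. The existing entry shown in <rdf-facet-lookup-fields> (source_repository) is only a placeholder for whatever your config.xmap already lists, and the last fragment applies only if you use authority files:

```xml
<!-- 1. webapps/solr/conf/schema.xml: the new field (step 2) -->
<field name="inscription_category" required="false" indexed="true"
       multiValued="true" stored="true" type="string"/>

<!-- 2. webapps/ROOT/assets/queries/solr/facet_query.xml:
     ask Solr to facet on the new field (step 6) -->
<facet.field>inscription_category</facet.field>

<!-- 3. webapps/ROOT/sitemaps/config.xmap: authority lookup (step 7).
     The new name is appended after a comma, with no white-space.
     'source_repository' stands in for the names already present. -->
<rdf-facet-lookup-fields>source_repository,inscription_category</rdf-facet-lookup-fields>
```

The same name, inscription_category, appears in every file, which is why consistent naming matters: a mismatch in any one of them is a likely reason for an empty or missing facet.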

Date slider

The date slider can be customised by modifying the HTML markup in webapps/ROOT/assets/templates/search.xml, particularly the attribute values on the div#date-slider-widget. data-range-min and data-range-max specify the bounds of the slider, while data-value-min and data-value-max specify the initial values of the slider. data-step specifies the increments of each slider handle, while data-label-prefix and data-label-suffix provide the text that goes before and after the label's indication of the current values.
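
As an illustration, the widget markup could look like the sketch below. The attribute names are the ones described above; every value is a made-up example, and the sketch assumes that years are given as plain integers with negative numbers for BC. Any other attributes or child elements that your search.xml already has on this div should be kept:

```xml
<!-- webapps/ROOT/assets/templates/search.xml
     (fragment only; all values are illustrative) -->
<div id="date-slider-widget"
     data-range-min="-500"
     data-range-max="500"
     data-value-min="-100"
     data-value-max="300"
     data-step="10"
     data-label-prefix="Date range: "
     data-label-suffix="">
</div>
```

With these example values the slider would span 500 BC to AD 500, start with its handles at 100 BC and AD 300, and move in steps of 10 years.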