Single and Multiple wild card search do not work as expected #494

Open
htpvu opened this issue Mar 18, 2022 · 3 comments

htpvu commented Mar 18, 2022

Related to UAT-3 (https://docs.google.com/document/d/1-ZGnCSrrXFJZTvBqm4ytGfMO-2LMaf0T02BgnJnmgVg/edit#), test cases 3.9 and 3.10.

When the wildcard is used in the middle of the search string, the results returned are not as expected (0-1 results vs. a few hundred expected). We suspect this behavior is caused not by the UI implementation or Solr, but by something in between.

jabrah commented Mar 23, 2022

Wildcard characters seem to work as expected only when they are at the end of a word, not when they are placed in the middle of a word, as described above.

https://solr.apache.org/guide/8_11/the-standard-query-parser.html#wildcard-searches

A few search layers:

  • idc-search component from the idc-ui-theme
    User input query terms should be unchanged, passed to the REST Export layer
  • Solr Search content view defines a REST Export view -- a simple REST endpoint at /search_rest_endpoint
  • Drupal's search_api / solr modules
  • Solr running in the backend somewhere
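One way to start narrowing down which layer alters the query is to confirm that wildcard characters survive URL encoding on the way to the REST Export view. The following is only a sketch: the endpoint path comes from this thread, but the parameter name `query` is a hypothetical placeholder, not the name the view actually uses.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "/search_rest_endpoint"  # path mentioned in this thread
PARAM = "query"  # hypothetical parameter name; check the view's exposed filter


def build_search_url(user_query: str) -> str:
    """Build a relative URL with the user's query percent-encoded."""
    return f"{ENDPOINT}?{urlencode({PARAM: user_query})}"


def extract_query(url: str) -> str:
    """Decode the query string the way a server-side parser would."""
    return parse_qs(urlsplit(url).query)[PARAM][0]


if __name__ == "__main__":
    # Each term should come back unchanged after an encode/decode round trip.
    for term in ["te?t", "test*", "t*st", '"farm goat"~10']:
        url = build_search_url(term)
        print(url, "->", extract_query(url))
        assert extract_query(url) == term
```

If the round trip is clean at this level, the next places to look would be the search_api / solr modules and the query string that actually reaches Solr.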

@jhu-alistair

Will discuss with stakeholders to understand the goal here. Having an indeterminate wildcard, as opposed to a truncation or character wildcard, in boolean search seems very strange.

@jhu-alistair

I think the authoritative statement of search behavior should be the search tips next to the search tools at https://digital.library.jhu.edu

Based on that description, "?" should replace only a single character, not any number of characters. Here is the whole set of tips.

Use double quotes (") to search as a phrase
Use the ? wildcard character to search for words with one alternate character. For example, te?t should match test and text
Use the * wildcard character to search for words with multiple alternate characters. Searching for test* should match test, tester, testing, etc
Use the proximity search syntax if you want to search for two terms within a certain number of words of each other or you can add a proximity search term on the Advanced Search page. The term "farm goat"~10, including the quotes, should match an item that has the words "farm" and "goat" within 10 words of each other
Use the AND, OR, NOT boolean operators in your searches, or combine search terms on the Advanced Search page. For example, in the global search, you can search for farm AND goat to look for items with both terms. If you manually enter these operators, make sure they are capitalized, as you see in this example
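As a reference for the expected behavior, the two wildcard tips can be modeled with a small matcher. This is a sketch of the semantics described in the tips ("?" is exactly one character, "*" is zero or more characters), not a description of how Solr evaluates wildcards; the helper names are made up for illustration, and real results also depend on how fields are analyzed at index time.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a term using ? and * into an equivalent regular expression."""
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(".")   # exactly one character
        elif ch == "*":
            parts.append(".*")  # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(term: str, word: str) -> bool:
    """Return True if the whole word matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(word) is not None


if __name__ == "__main__":
    words = ["test", "text", "tet", "tester", "testing", "toast"]
    for term in ["te?t", "test*", "t*st"]:
        print(term, [w for w in words if matches(term, w)])
```

Comparing a list like this against what the site returns for the same terms gives a concrete expected-versus-actual baseline for test cases such as a wildcard in the middle of a word.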
