Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sample SOLR 9 Upgrade for CKAN #3

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

JVickery-TBS
Copy link
Collaborator

misc(dev): temp solr 9 upgrade commit;

  • Sample of schema and config changes.

- Sample of schema and config changes.
@thriuin
Copy link

thriuin commented Jul 25, 2024

I didn't examine every line change in great detail because there is a lot of repetition, but this looks good in general to me. Please go ahead.

@@ -74,39 +104,37 @@
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="site_id" type="string" indexed="true" stored="true" required="true" />
<field name="title" type="text" indexed="true" stored="true" />
<field name="title_ngram" type="text_ngram" indexed="true" stored="true" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we storing the title_ngram for debugging purposes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know what the ngram is, these are just the core schemas from the ckan repo. So if ngram is for debug, we can remove with inline comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know what the ngram is, these are just the core schemas from the ckan repo. So if ngram is for debug, we can remove with inline comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looks like ngram in Solr is for better text matching? But there are apparently some performance issues using ngrams, so I don't think we really need them eh @thriuin ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless if we are using ngram fields in Django Searches? Might as well use it in CKAN too.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why you would use a ngram with a title field like this - I thought this was more for fixed width code strings. I am not using ngrams in Search so I think you could drop it

@@ -146,23 +173,13 @@
<dynamicField name="res_extras_*" type="text" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="vocab_*" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*" type="string" indexed="true" stored="false"/>

<!-- Extra Canada fields -->
<field name="catalog_type" type="string" indexed="true" stored="true" multiValued="true"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are used in the ckan search, why are they removed here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah there were a fair amount of solr schema changes. So will have to add these back in. I think I am most(ish) worried about the core_ati changes if you are able to check that one out?? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants