Search: index document by custom metadata #3174
Comments
Hmm, I'm unsure whether this is something that a lot of users need. Let's leave it open for a while and see whether there's some feedback from other users.
Could you provide some reproducible examples? I'd be curious to learn if we can improve the default settings. |
I want to generalize this issue to check whether users would find it helpful to index author-defined document metadata with the help of the search plugin. This would address the things mentioned in the OP and more, maybe something like:
|
It's a little tough to share the actual site that I'm working with. Let me try to create a small demo site that demonstrates the issue.
That makes sense to me! |
There's no response from other users, so I guess this is currently not worth pursuing. I may re-evaluate this in the future, though. Closing for now. |
This would be really useful for us at the section/heading level. For one example, our FAQ has many entries on each FAQ category's page. I'm writing an entry about disabling accounts. I don't want to say 'delete' anywhere in the text of the entry because I don't want the user to have any possibility to misunderstand and think the account actually ceases to exist. But I do want to help the user find this entry when they search the site for words like 'delete' or 'GDPR'. I'm envisioning something like |
Thanks for the input. We're actually considering making arbitrary metadata searchable with the next iteration of search. If somebody searches for "foo", but it's not contained in the text (so there's nothing to highlight), how would you imagine the interface to tell the user that it's a legit match? |
Here's my concrete issue where I think this will help me. I have a page with a filename My workaround is to put this invisible content at the top of my page:
which renders like this; it doesn't know that the paragraph is hidden, and I think that that result looks OK: I'd rather put keywords in the front matter / metadata at the top of the page, or just index the In terms of rendering, if I put this in my front matter:
I think showing that summary tag and the matching excerpt as the highlight would make it clear that it was a summary and wouldn't be super confusing when they land on the page itself. Sort of off-topic from this issue, but related: It would be great if the markdown document filename could also participate in the search index; I think that might save me from having to write keywords at all in this instance. |
It's a little bit inelegant, but the mkdocs-material folks seem to have a nicer solution in the works, so this will do for now. closes: #3305 refs: squidfunk/mkdocs-material#3174
We have the use case of searching for numbers. We have documentation where pages have an integer ID. When searching for the ID, let's say 10, you find all kinds of pages that contain a 10. We already boost the search importance of the page a bit, but we would like to boost that ID, so when you search for the ID, the page about that ID shows up on top. I would be fine with adding the ID as a keyword to the page and having search prioritize keywords somehow. Edit: Adding the ID to the page headline achieves this. However, I would prefer not to have the ID in the headline. |
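To make the request above concrete, here is a minimal Python sketch of field-level boosting. It is not how the search plugin scores results; the `score` function, the `keywords` field and the `keyword_boost` value are all assumptions chosen for illustration.

```python
def score(entry, query, keyword_boost=10.0):
    """Score an entry: body hits count once, keyword hits get a fixed boost.

    `keywords` is a hypothetical per-page field, not an existing option.
    """
    terms = query.lower().split()
    body_tokens = entry["text"].lower().split()
    keywords = {k.lower() for k in entry.get("keywords", [])}
    total = 0.0
    for term in terms:
        total += body_tokens.count(term)   # plain term frequency in the body
        if term in keywords:
            total += keyword_boost         # exact keyword match dominates
    return total


pages = [
    {"location": "page-10/", "text": "Details of this procedure.", "keywords": ["10"]},
    {"location": "limits/", "text": "Retry up to 10 times, 10 seconds apart.", "keywords": []},
]

# The page whose ID is listed as a keyword ranks first, even though
# the other page mentions "10" twice in its body.
ranked = sorted(pages, key=lambda p: score(p, "10"), reverse=True)
```

With this kind of scoring, the ID would not need to appear in the headline to win the ranking.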
@smartYSC could you create a minimal reproduction that showcases what you imagine? |
Sure, here is an example: If you search for For page I added a |
Thanks! We'll investigate when working on the new search. |
I've started working on the next big search update, which will include the ability to list and better select available metadata like tags, authors, etc. This will allow customizing the search in a way that wasn't possible before. Sneak peek here. |
That sneak peek looks pretty cool. Are you considering being able to tag a heading/section for search? Something like this, maybe?
|
Could be done. Could you elaborate on some use cases? Would the keywords be visible? Would they be a specific category? I'm very interested in learning about different use cases, so we can fulfill them all, at best. Edit: just scrolled through this issue and saw my earlier post from Nov 2021:
So I imagine that was what you're looking for 😅 |
Similar... but I believe you're talking there about keywords at the page level. I'm talking about per heading within a page. Similar to how you can exclude sections from search as well as whole pages, I'd like to be able to tag sections by a similar method of adding some pragma or other to the end of a Markdown heading. I could partially solve this by including a keywords paragraph with Examples from my situation:
In both these cases, yes, I would want it to show as a keyword somehow in the search results, something like how the tags do today, so that the user would know the result was actually relevant: |
I like @feasgal's idea; usage is also similar to the exclusion of sections already present. |
Great input, I also thought of that. Either via the attribute extension, or via custom blocks.
All search results are tied to headings, as documents are disassembled into sections. If you boost a custom field, e.g. However, I'm pretty sure that the new search will make boosting less necessary and should provide much more relevant results without much configuration. That's at least my goal, but use cases may differ, so providing degrees of freedom is absolutely essential.
I would take this one step further: we will add the ability to define precisely what metadata property will actually be shown to the user when searching. For example you would always want to show author, title, text, tags, but you would not want to show keywords. This will be completely configurable. We need to think of a good way to denote that a certain keyword matched an article, because we're not doing any highlighting if the term is not contained in the text but show it as a result nonetheless, but I'm very sure we'll find a good way to do this.
Same thing. The name of the old product could be defined as a keyword. E.g. "installation" will bring up the new product, but "installation foo" will bring up the old product before the new one. We might even extend the functionality to exclude certain entries when keywords are used, because now you could scope your search to specific parts of the documentation, but we'll leave that to after we've shipped the first few versions. I'm really excited about the new approach, because my testing already shows that it will be so much more powerful and customizable than what we currently have. I want to make it as awesome as possible with the help of you and other users after shipping the first iterations. It'll take some time, as it's a pretty big fish to fry, but I think it'll be worth it 😊 Same as with the new social cards, the second and third iterations are always way, way better than what we had before, offering tons of new options and flexibility that we previously didn't have, and didn't know we needed. |
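As an illustration of the "documents are disassembled into sections" point and the per-heading keyword idea, here is a minimal Python sketch. It is not the plugin's implementation: `split_sections` is a made-up helper, and `data-search-keywords` is a hypothetical attribute modeled on the attribute-list syntax mentioned above. The sketch also ignores fenced code blocks, so a `#` comment inside a fence would be mistaken for a heading.

```python
import re

# Heading line with an optional trailing attribute list, e.g. `## Title { ... }`.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*(?:\{(.*?)\})?\s*$")
# Hypothetical attribute; not an existing option of the search plugin.
KEYWORDS = re.compile(r'data-search-keywords="([^"]*)"')


def split_sections(markdown):
    """Split Markdown into one entry per heading, with optional keywords."""
    sections = []
    current = {"title": "", "keywords": [], "lines": []}
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append(current)
            found = KEYWORDS.search(match.group(3) or "")
            keywords = found.group(1).split() if found else []
            current = {"title": match.group(2), "keywords": keywords, "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)
    # Drop the empty preamble that precedes the first heading, if any.
    return [
        {"title": s["title"], "keywords": s["keywords"],
         "text": "\n".join(s["lines"]).strip()}
        for s in sections
        if s["title"] or any(l.strip() for l in s["lines"])
    ]


doc = """# Accounts

Intro text.

## Disable an account { data-search-keywords="delete GDPR" }

Disabling hides the account.
"""
sections = split_sections(doc)
```

Each resulting entry carries its own keywords, so a search for "delete" could surface the "Disable an account" section without the word appearing in its text.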
Sorry, are you saying I could do this now? Or in the future when you release the new search you're working on? Have I missed that I could attach a custom field to a heading and then boost that field? |
In the very near future. Currently only |
@squidfunk where does this search overhaul stand as of today? Just wondering since the last message here was 2 months ago. |
We're currently busy finishing the refactoring of the blog plugin. I'm sorry for the delay, but all the topics we're currently working on are pretty complex. Once the blog is stable (and 9.2 is out), we'll continue working on the search. |
Hi I created the issue https://wezfurlong.org/wezterm/config/lua/config/selection_word_boundary.html Do you agree that this would be the suitable feature to use to get a "snake_case" title searchable? Is work ongoing on search now or are there still other things to fix? (9.2 seems to be out?) |
@kaddkaka if you add
9.2 is out, yup (actually we're at 9.4 already), but we had to squeeze in some other stuff, particularly restructuring our documentation to account for the growing number of options (still ongoing), setting up our examples repository and preparing everything to grow our team. With the funds, we're able to add further people to our core team who help out on discussions, issues, and other things, but you can probably imagine that it's quite an effort to scale from 1 person to more people, given that processes need to be established and some technicalities need to be put into place. Additionally, day-to-day ops like bugfixing, refactoring and issue triage eat up quite a significant portion of my time. With more people, I'll be able to focus on the search again |
@squidfunk For the search, how long does it usually take before you go from prototype to making it available for all? |
@karengermond as I already mentioned in my previous comment, later this year. I'm actively working on it right now, but I also need to set aside time for answering questions like this, fixing bugs and keeping the project in shape. Right now, I don't have 100% of my time available to work on it. If the funding situation improved, we could hire more people to help. That being said, I have a working prototype, but there's still some way to go to make it production ready, because we need to support the 60 languages that we currently have + all functionality that we implemented in the current solution. Rebuilding a central feature like search from scratch takes time. If you need another solution now that solves your problems better, you may check paid solutions like Algolia, which several users have integrated successfully. Once we release our new version, trying it out and switching back is trivial – just add it back to mkdocs.yml. What specifically are you missing in the current implementation? |
Our doc set is huge and we're getting a lot of noise in our search
results. Nothing else specific.
|
@karengermond if you can share your docs, or at least how huge it is, i.e., some metrics, that would be very helpful. |
Sorry, our docs are enterprise and I can't share them. When I generate
them as a PDF, it's 1800 pages.
One specific issue people have mentioned is that, for a two-term search query,
results that are missing one of the terms show up first, while matches that
contain both terms appear later in the results.
We'd also like faceted search categories which would help limit the search
results. Not sure if you're looking into that.
Thanks! Love your stuff.
|
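The two-term ranking complaint above can be sketched in Python as follows. This is only an illustration of ranking by matched-term count before term frequency; the `rank` helper and the sample entries are assumptions, and it does not reflect how the current search actually scores results.

```python
def rank(entries, query):
    """Order hits by how many distinct query terms they contain, then by frequency."""
    terms = set(query.lower().split())

    def key(entry):
        tokens = entry["text"].lower().split()
        matched = sum(1 for t in terms if t in tokens)    # distinct terms found
        frequency = sum(tokens.count(t) for t in terms)   # total occurrences
        return (matched, frequency)

    hits = [e for e in entries if key(e)[0] > 0]
    return sorted(hits, key=key, reverse=True)


docs = [
    {"location": "a/", "text": "service service service overview"},
    {"location": "b/", "text": "restart the service account"},
    {"location": "c/", "text": "unrelated page"},
]

# "b/" contains both terms, so it ranks above "a/", which repeats
# only one of them; "c/" matches nothing and is dropped.
result = [e["location"] for e in rank(docs, "service account")]
```

Sorting on the number of matched terms first keeps a page that merely repeats one term from outranking a page that contains the whole query.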
We're working hard on all things you requested. On a side note, if your company is not a sponsor of the project, you might consider sponsoring the project on the organization tier, as this would directly help us to speed up development by allowing us to compensate other users to help out on issues, discussions and questions. |
Hi all :-) Interesting long read. Use case: we use the swagger-ui-tag plugin to embed swagger docs. But as it's just a tag in a markdown document, the search index is empty. Should I open another issue ticket for this? 🤔 |
@andy-apptweak please open a new ticket, as it has nothing to do with the matters discussed in this issue. It should be possible to implement, but we will definitely need a minimal reproduction that we can work with. Please also explain the workarounds you're currently undertaking, so we have a complete and clear picture. Thank you! |
Please see the announcement in #6307. |
Description
It would be really helpful if keywords could be included within the metadata of a page and used to help provide more accurate search results. Oftentimes a generic search term (e.g., "service" for me) will return a ton of irrelevant results, seemingly because a single occurrence of the word in a heading is overweighted.
I see that the boost feature does exist, however I'm looking for something more granular. It's not so much that I have one really important page that should always come up, but in certain contexts I want to ensure the right page does.
Tags are the other feature that is possibly related; however, I don't really want to group content together or make this information prominent on the page. Also, I don't want to give the impression that the rest of the content has been accurately tagged if folks try to click into them.
Example:
Use Cases
This would allow finer-grained control of search results. Coupled with search analytics (related: #3169), it would allow me to see what folks are searching and where they might have had trouble finding the right results.
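A minimal Python sketch of the idea, under stated assumptions: the `keywords` front matter key is hypothetical (it is what this issue proposes, not an existing option), the helper names are made up, and the tiny parser only handles a simple `- item` list rather than full YAML. The FAQ example is borrowed from the discussion in this thread.

```python
import re

# Front matter delimited by `---` lines at the very top of the page.
FRONT_MATTER = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)


def split_front_matter(text):
    """Return (metadata_lines, body) for a Markdown page."""
    match = FRONT_MATTER.match(text)
    if not match:
        return [], text
    return match.group(1).splitlines(), text[match.end():]


def read_keywords(meta_lines):
    """Read a hypothetical `keywords` list written as `- item` lines."""
    keywords, in_list = [], False
    for line in meta_lines:
        if line.startswith("keywords:"):
            in_list = True
        elif in_list and line.strip().startswith("- "):
            keywords.append(line.strip()[2:].strip())
        else:
            in_list = False
    return keywords


def build_entry(location, text):
    """Build a search entry whose keywords are indexed but not shown in the body."""
    meta, body = split_front_matter(text)
    return {"location": location, "text": body.strip(),
            "keywords": read_keywords(meta)}


page = """---
title: Disable an account
keywords:
  - delete
  - GDPR
---
You can disable an account from the settings page.
"""
entry = build_entry("faq/accounts/", page)
```

The keywords stay out of the rendered text, so they could feed the index without being displayed on the page or grouped like tags.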