Update 00-quick-start.ipynb #164

Closed

iuliaferoli

Added a few more beginner-friendly concepts and examples before introducing semantic search.
Added a mapping example that does not use the encoder.
Added indexing examples with both client.bulk and helpers.bulk, written as reusable functions.
Added 'summary', a feature that was missing from the dataset.
This responds to the issue requesting more examples: elastic/elasticsearch-py#2139
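
For readers without the notebook at hand, a mapping example of the kind described above might look roughly like the sketch below. The index name `book_index`, every field other than `summary`, the `title_vector` field, and the 384-dimension vector size are assumptions made for illustration; they are not taken from this PR. The `client` argument is assumed to be an `elasticsearch.Elasticsearch` instance from the 8.x Python client.

```python
def build_mappings(include_vector=False, dims=384):
    """Return a mappings body. All field names except 'summary' are hypothetical."""
    properties = {
        "title": {"type": "text"},
        "summary": {"type": "text"},  # the field this PR adds to the dataset
        "publish_date": {"type": "date"},
        "num_reviews": {"type": "integer"},
    }
    if include_vector:
        # Only needed once semantic search is introduced.
        properties["title_vector"] = {
            "type": "dense_vector",
            "dims": dims,  # must match the encoder's output size (assumed here)
            "index": True,
            "similarity": "cosine",
        }
    return {"properties": properties}


def create_index(client, index_name="book_index", include_vector=False):
    """Create an index with an explicit mapping instead of relying on dynamic mapping."""
    return client.indices.create(
        index=index_name,
        mappings=build_mappings(include_vector=include_vector),
    )
```

Calling `build_mappings()` gives the plain-text mapping, and `build_mappings(include_vector=True)` adds the dense vector field on top of it, so the same helper can serve both parts of a notebook.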

@miguelgrinberg
Collaborator

Since it is messy to insert feedback in the middle of a notebook's raw JSON, I'm going to add my feedback here:

  • You are creating a mapping for all the fields, which I think is not necessary since ES figures out the types from the data itself. Is there a reason? It also seems inconsistent that when we get into the second part the mapping is just the dense vector field.
  • What is the purpose of mentioning both client.bulk and helpers.bulk? This is actually something I do not know, why would you prefer one over the other? Wouldn't it make more sense to only show the best of the two? Or at least explain when to use one vs. the other?
  • The 01-keyword-querying-filter.ipynb notebook has an example of match. Is there a need to show another example that is nearly identical in this notebook?
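
To make the `client.bulk` versus `helpers.bulk` question concrete, here is a minimal sketch of the payload each call expects. The helper function names, the index name, and the sample documents are hypothetical. As far as the 8.x Python client goes, `client.bulk` is the direct wrapper around the Bulk API and sends one request, reporting per-item failures in its response, while `helpers.bulk` accepts an iterable of actions, splits it into chunks, and by default raises on failed items; treat that summary as an assumption to check against the client version in use.

```python
def to_bulk_operations(index_name, docs):
    """Payload for client.bulk: an action line followed by the document, alternating."""
    operations = []
    for doc in docs:
        operations.append({"index": {"_index": index_name}})
        operations.append(doc)
    return operations


def to_helper_actions(index_name, docs):
    """Payload for helpers.bulk: one dict per document, with metadata alongside _source."""
    for doc in docs:
        yield {"_index": index_name, "_source": doc}


def index_with_client_bulk(client, index_name, docs):
    # `client` is assumed to be an elasticsearch.Elasticsearch instance (8.x).
    return client.bulk(operations=to_bulk_operations(index_name, docs))


def index_with_helpers_bulk(client, index_name, docs):
    # Imported lazily so the payload builders above work without the package installed.
    from elasticsearch import helpers

    return helpers.bulk(client, to_helper_actions(index_name, docs))
```

The two payload shapes are not interchangeable: passing the alternating list to `helpers.bulk`, or the `_source` dicts to `client.bulk`, is the kind of mix-up that makes the two look like variants of one function.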

@iuliaferoli
Author

  • I agree that the mapping isn't necessary. However, we don't really have examples of defining and assigning a mapping with the Python client (this was the main example mentioned in the issue I referenced). I think the best place for it is the beginning of the first notebook, since it's one of the core concepts (this also aligns with the order in the Elasticsearch training courses).
  • I can add some more explanation (although I am not sure there's much to it other than preference, or whichever one people find online first). The issue I have is that we are indeed not consistent about which one we use across documents, so new users trying to figure out how to bulk index will always find "contradicting" examples. I am trying to illustrate that you can use either one and how the syntax differs. When I first ran into these myself, I thought they were different versions of the same function, and I kept trying to combine the two syntaxes.
  • Similar explanation to the above: I think a simple match / search query should be at the beginning of the first notebook in an example series. I think it's a bit confusing if the first search example ever referenced goes straight to semantic search / kNN, which has a whole different syntax. I don't think we need to separate examples, but I do strongly believe we should start with the basics and work up to more complex examples / edge cases. Maybe this would not be 100% necessary if it were just about search-labs (which focuses on semantic search); however, these labs are also among the main examples that the official elasticsearch-py docs send users to. Because of that, I would expect them to start with basic concepts first.
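
The difference in syntax between a basic match query and a kNN search can be sketched as follows. The function names, field names, and default values for `k` and `num_candidates` are hypothetical, and `client` is assumed to be an `elasticsearch.Elasticsearch` instance from the 8.x Python client, whose `search` method accepts `query` and `knn` keyword arguments.

```python
def match_query(field, text):
    """Body for a basic full-text match query."""
    return {"match": {field: text}}


def knn_clause(field, query_vector, k=10, num_candidates=100):
    """Body for a kNN search against a dense_vector field."""
    return {
        "field": field,
        "query_vector": list(query_vector),  # produced by an encoder in practice
        "k": k,
        "num_candidates": num_candidates,
    }


def keyword_search(client, index_name, field, text):
    # Plain text in, no encoder involved.
    return client.search(index=index_name, query=match_query(field, text))


def semantic_search(client, index_name, field, query_vector):
    # Requires the query text to be embedded into a vector first.
    return client.search(index=index_name, knn=knn_clause(field, query_vector))
```

The match query needs only a field name and some text, whereas the kNN clause depends on a vector field in the mapping and an embedding step, which is one reason to present the match query first.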

@elastic elastic deleted a comment from smith-kyle Feb 8, 2024
@iuliaferoli iuliaferoli deleted the iuliaferoli-update-00-notebook branch March 20, 2024 12:46