feat: Add notebook for VAIS automatic URL recrawl #161

Open
wants to merge 4 commits into
base: main

hossein-mansour
Contributor

Pull-Request Template

Thank you for your contribution! Please provide a brief description of your changes and ensure you've completed the checklist below.

Description

What does this PR do? Why is it necessary?

Fixes # (if applicable)

Checklist

  • Contribution Guidelines: I have read the Contribution Guidelines.
  • CLA: I have signed the CLA.
  • Authorship: I am listed as the author (if applicable).
  • Conventional Commits: My PR title and commit messages follow the Conventional Commits spec.
  • Code Format: I have run nox -s format to format the code.
  • Spelling: I have fixed any spelling errors, and added false positives to .github/actions/spelling/allow.txt if necessary.
  • Template: I have followed the aaie_notebook_template.ipynb if submitting a new Jupyter notebook.
  • Sync: My fork is synced with the upstream.
  • Documentation: I have updated the relevant documentation (if applicable) in the docs folder.

Automatically recrawl URLs in a JSON file upon upload to GCS via a Cloud Function
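
The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's code: the `{"urls": [...]}` file layout, the helper names `parse_urls` and `build_recrawl_requests`, the `default_collection` collection ID, and the batch size are all hypothetical, and the endpoint path is assumed to follow the Discovery Engine `recrawlUris` REST method. The sketch only builds the request payloads; authenticating and sending them from the Cloud Function is left out.

```python
from __future__ import annotations

import json

# Assumed root of the Discovery Engine REST API.
API_ROOT = "https://discoveryengine.googleapis.com/v1"


def parse_urls(json_text: str) -> list[str]:
    """Extract URLs from the uploaded file.

    Assumes a hypothetical layout of {"urls": ["https://...", ...]}.
    """
    payload = json.loads(json_text)
    urls = payload.get("urls", [])
    # Trim whitespace, skip non-string or empty entries,
    # and drop duplicates while keeping the original order.
    cleaned = (u.strip() for u in urls if isinstance(u, str) and u.strip())
    return list(dict.fromkeys(cleaned))


def build_recrawl_requests(
    project: str,
    location: str,
    data_store: str,
    urls: list[str],
    batch_size: int = 20,  # hypothetical batch size, not a documented limit
) -> list[tuple[str, dict]]:
    """Return (endpoint, JSON body) pairs, one per batch of URLs."""
    endpoint = (
        f"{API_ROOT}/projects/{project}/locations/{location}"
        f"/collections/default_collection/dataStores/{data_store}"
        "/siteSearchEngine:recrawlUris"
    )
    return [
        (endpoint, {"uris": urls[i : i + batch_size]})
        for i in range(0, len(urls), batch_size)
    ]
```

In a Cloud Function triggered by the GCS upload, the handler would read the uploaded object, pass its text to `parse_urls`, and POST each body returned by `build_recrawl_requests` with an authorized HTTP session.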

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will provide feedback shortly. In the meantime, here's a summary for you and other reviewers to quickly get up to speed:

This pull request, authored by hossein-mansour, adds a new Jupyter notebook demonstrating how to automatically recrawl URLs for Vertex AI Search Advanced Website Datastores. The intent is to enhance the documentation and provide a practical example of this functionality.

Here's a breakdown of the changes:

  • Added a new notebook, genai-on-vertex-ai/vertex_ai_search/manual_recrawl_urls_with_trigger.ipynb, which details the process of triggering automatic URL recrawls.
  • Updated docs/mkdocs.yml: Added a navigation entry (line 252) for the new notebook to the documentation site's navigation.
  • Updated genai-on-vertex-ai/vertex_ai_search/README.md: Added a link (line 24) to the new notebook in the README.

The changes are well-organized and clearly documented. The additions to the documentation make it easy to find and use the new notebook.


And now, a little haiku to lighten the mood:

Code flows like a stream,
New features add their bright hues,
Review, then merge it!

@code-review-assist code-review-assist bot left a comment

This pull request adds a new notebook demonstrating event-based triggering of manual recrawls for Vertex AI Search Advanced Website Datastores. The PR title and description clearly explain the purpose of the change. The addition of the notebook is valuable for users looking to implement this functionality. I have a few suggestions based on commonly used style guides for documentation and code, as no specific style guide was provided.

Here's a summary of the suggestions from the review:

  • Documentation: Ensure consistent capitalization and phrasing in documentation.
  • README Consistency: Update the README files to reflect the changes in the code and documentation.

@@ -249,6 +249,7 @@
- Inline Ingestion of Documents: genai-on-vertex-ai/vertex_ai_search/inline_ingestion_of_documents.ipynb
- Parsing and Chunking: genai-on-vertex-ai/vertex_ai_search/parsing_and_chunking_with_BYO.ipynb
- Query Level Boosting: genai-on-vertex-ai/vertex_ai_search/query_level_boosting_filtering_and_facets.ipynb
+ - Event-based Triggering of Manual Recrawl for Vertex AI Search Advanced Website Datastores: genai-on-vertex-ai/vertex_ai_search/manual_recrawl_urls_with_trigger.ipynb

low

For consistency with other entries, consider capitalizing "Event-based" and "Trigger". Also, consider rephrasing to be more concise, perhaps "Triggering Manual Recrawls for Vertex AI Search (Advanced Website Datastores)"

Suggested change
- - Event-based Triggering of Manual Recrawl for Vertex AI Search Advanced Website Datastores: genai-on-vertex-ai/vertex_ai_search/manual_recrawl_urls_with_trigger.ipynb
+ - Triggering Manual Recrawls for Vertex AI Search (Advanced Website Datastores): genai-on-vertex-ai/vertex_ai_search/manual_recrawl_urls_with_trigger.ipynb

@@ -21,3 +21,4 @@ We will try to use REST APIs which will hopefully make the codes easier to under
3. [Defining custom attributes based on URL patterns in Vertex AI Search Website Datastores](./custom_attributes_by_url_pattern.ipynb)
4. [Query-Level Boosting, Filtering, and Facets for Vertex AI Search Website Datastores](./query_level_boosting_filtering_and_facets.ipynb)
5. [Inline Ingestion of Documents into Vertex AI Search](./inline_ingestion_of_documents.ipynb)
+ 6. [Event-based Triggering of Manual Recrawl for Vertex AI Search Advanced Website Datastores](./manual_recrawl_urls_with_trigger.ipynb)

low

For consistency with other entries, consider capitalizing "Event-based" and "Trigger". Also, consider rephrasing to be more concise, perhaps "Triggering Manual Recrawls for Vertex AI Search (Advanced Website Datastores)"

Suggested change
- 6. [Event-based Triggering of Manual Recrawl for Vertex AI Search Advanced Website Datastores](./manual_recrawl_urls_with_trigger.ipynb)
+ 6. [Triggering Manual Recrawls for Vertex AI Search (Advanced Website Datastores)](./manual_recrawl_urls_with_trigger.ipynb)

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will provide my feedback shortly. In the meantime, here's a summary for you and other reviewers to quickly get up to speed:

This pull request, authored by hossein-mansour, adds a new Jupyter notebook demonstrating how to automatically recrawl URLs for Vertex AI Search Advanced Website Datastores. The intent is to provide a practical example of how to trigger manual recrawls based on events.

Here's a breakdown of the changes:

  • docs/mkdocs.yml: Added line 252, which links the new notebook in the documentation's navigation menu.
  • genai-on-vertex-ai/vertex_ai_search/README.md: The README file was updated on line 24 to include a link to the new notebook in the list of available notebooks.

The overall impact is the addition of a new, valuable resource for users wanting to learn how to perform automatic URL recrawls within Vertex AI Search.


And now, a little haiku to lighten the mood:

Code flows like a stream,
New features add to the source,
Reviews make it gleam.

Gemini encountered an error creating the review. You can try again by commenting @code-review-assist review.

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will provide feedback shortly. In the meantime, here's a summary for you and other reviewers to quickly get up to speed:

This pull request, authored by hossein-mansour, adds a new Jupyter notebook demonstrating how to automatically recrawl URLs for Vertex AI Search Advanced Website Datastores. The intent is to enhance the documentation and provide a practical example of this functionality.

Here's a breakdown of the changes:

  • Added a new notebook, genai-on-vertex-ai/vertex_ai_search/manual_recrawl_urls_with_trigger.ipynb, which details the process of triggering automatic URL recrawls.
  • Updated docs/mkdocs.yml: Lines 249-252 were modified to include the new notebook in the documentation navigation.
  • Updated genai-on-vertex-ai/vertex_ai_search/README.md: Lines 21-24 were updated to add a link to the new notebook in the README file.

The changes are well-organized and clearly documented. The addition of this notebook significantly improves the project's documentation by providing a concrete example of an important feature.


And now, a little haiku to lighten the mood:

Code flows like a stream,
New features, a welcome sight,
Reviews make it gleam.
