Issue with Indexing CSV Data from S3 and Using Text Splitter & Embedding Skills Directly on Azure Platform: Column/Key Mismatch Error #267

Open
uergash1 opened this issue Sep 3, 2024 · 2 comments

Comments

@uergash1

uergash1 commented Sep 3, 2024

I'm encountering an issue while trying to index CSV data from an S3 bucket directly through the Azure platform using Azure Search's capabilities, without resorting to Python code. Specifically, when I attempt to apply text splitter and embedding skills on top of the indexed data, I consistently receive a column/key mismatch error.

It seems that the current samples provided in this repository do not include a workflow or a hands-on example that demonstrates how to properly index CSV data from S3 in the Azure portal. Moreover, there is no clear guidance on how to correctly configure and use text splitter and embedding skills in this context.

Steps to Reproduce:

  1. Upload a CSV file to an S3 bucket.
  2. Attempt to index the CSV data directly through the Azure platform.
  3. Apply text splitter and embedding skills on the indexed data.
  4. Observe the column/key mismatch error.

Expected Behavior:

The platform should correctly index the CSV data from S3, and text splitter and embedding skills should function without any errors, allowing for seamless data processing.

Actual Behavior:

When applying text splitter and embedding skills, the platform throws a column/key mismatch error, indicating a potential issue with how the CSV data is being indexed or how the skills are being applied.
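One way to narrow this down, assuming the mismatch comes from a skill input that points at a column name the CSV header does not contain, is to compare the two offline before touching the portal. The sketch below is hypothetical: the `missing_columns` helper, the sample CSV, and the `/document/<column>` source paths are illustrative assumptions, not the configuration from this report.

```python
import csv
import io


def missing_columns(csv_text, skill_sources):
    """Return the skill input paths whose column is absent from the CSV header.

    Assumption: each source path looks like "/document/<column>", so the
    last path segment is compared against the header names exactly
    (case-sensitive, after trimming surrounding whitespace).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    missing = []
    for source in skill_sources:
        column = source.rsplit("/", 1)[-1]
        if column not in header:
            missing.append(source)
    return missing


# Hypothetical data: the header says "Review Text", the skill asks for "review_text".
sample_csv = "id,title,Review Text\n1,First,Some long text\n"
sources = ["/document/title", "/document/review_text"]
print(missing_columns(sample_csv, sources))  # ['/document/review_text']
```

A non-empty result would point at a naming difference (spaces, casing, or a missing header row) rather than at the skills themselves.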

Suggestion:

It would be beneficial to have a sample or a detailed guide included in this repository that demonstrates the correct process for indexing CSV data from S3, along with instructions on properly configuring and applying text splitter and embedding skills directly through the Azure platform.

Additional Context:

This issue is critical for those looking to leverage Azure Search's capabilities without diving into Python code, as the current documentation and samples do not adequately cover this use case.

@mattgotteiner
Member

Got it - my understanding is you want to do this through the portal only without running any code?

@uergash1
Author

Yes
