feat: add func to generate multiple quries #1009

e7217 · 2024-11-27T09:33:12Z

description
Currently, AutoRAG generates one query per corpus during the QA generation stage. To validate performance more thoroughly, I wanted to explore generating multiple query candidates. To this end, I have experimentally introduced a function called multiple_queries_gen, which takes a parameter n specifying how many queries to generate. I have confirmed that it produces satisfactory output on my local machine.

If there was a reason this feature had not been added, please let me know. Additionally, I acknowledge that the code I added may not be clean, so if there are any suggestions for improvement, I would be happy to implement them.

Please review it at your convenience. Thank you.

qa = initial_corpus.sample(
    random_single_hop,
    n=len(initial_corpus.data),
    random_state=random.randint(1, 100),
).map(
    lambda df: df.reset_index(drop=True),
).make_retrieval_gt_contents().batch_apply(
    multiple_queries_gen,  # query generation
    llm=llm,
    lang="ko",
    n=3,  # number of queries to generate
).batch_apply(
    make_basic_gen_gt,  # answer generation (basic)
    llm=llm,
    lang="ko",
)....
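
For illustration only, the idea of a step that asks an LLM for n queries and splits the reply could be sketched as below. This is not the code in this PR: the row key "passage", the "queries" output key, the prompt wording, and the fake_llm stand-in are all assumptions made so the example runs on its own.

```python
import asyncio
import re
from typing import Awaitable, Callable, Dict, List

# Hypothetical stand-in for an async LLM call: prompt in, raw text out.
LLMFn = Callable[[str], Awaitable[str]]

# Matches a leading list marker such as "1.", "2)", "-" or "*".
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+")


def parse_queries(raw: str, n: int) -> List[str]:
    """Split a newline-separated reply into at most n cleaned queries."""
    queries = []
    for line in raw.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries[:n]


async def multiple_queries_sketch(row: Dict, llm: LLMFn, n: int = 3) -> Dict:
    """Ask for n queries about one passage and attach them to the row."""
    prompt = (
        f"Write {n} distinct questions answerable from the passage, "
        f"one per line.\n\nPassage:\n{row['passage']}"  # assumed key
    )
    raw = await llm(prompt)
    return {**row, "queries": parse_queries(raw, n)}


async def fake_llm(prompt: str) -> str:
    """Canned reply so the sketch runs without a real model."""
    return "1. What is AutoRAG?\n2. Who maintains it?\n- Why use it?\n4. Extra question?"


if __name__ == "__main__":
    row = {"passage": "AutoRAG is a tool for evaluating RAG pipelines."}
    print(asyncio.run(multiple_queries_sketch(row, fake_llm, n=3)))
```

Truncating to n in parse_queries guards against a model that returns more lines than requested; a model that returns fewer would simply yield a shorter list.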

sample
before:

after:

etc

add .vscode/ to .gitignore

vkehfdl1 · 2024-11-27T12:10:28Z

@e7217 Hi!

Thanks for the contribution.

I like the idea, but if we implement this feature, I would like it to be usable in most of the query generation functions.
I think it will be okay to add this feature to llama_index_generate_base directly!
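
A minimal sketch of that idea, assuming a shared base function that every query generation function delegates to, so that n only has to be handled in one place. The names and signatures below are hypothetical and do not reflect the actual signature of llama_index_generate_base; the LLM is modelled as a plain synchronous callable to keep the example short.

```python
from typing import Callable, Dict, List

# Hypothetical types: a synchronous LLM call and a per-style prompt builder.
LLMFn = Callable[[str], str]
PromptBuilder = Callable[[str, int], str]


def generate_base(row: Dict, llm: LLMFn, build_prompt: PromptBuilder, n: int = 1) -> Dict:
    """Shared path: build the prompt, call the LLM, keep at most n queries."""
    if n < 1:
        raise ValueError("n must be at least 1")
    reply = llm(build_prompt(row["passage"], n))  # "passage" is an assumed key
    queries: List[str] = [line.strip() for line in reply.splitlines() if line.strip()]
    return {**row, "queries": queries[:n]}


def factoid_prompt(passage: str, n: int) -> str:
    return f"Write {n} factoid question(s), one per line.\n\nPassage:\n{passage}"


def concept_prompt(passage: str, n: int) -> str:
    return f"Write {n} concept question(s), one per line.\n\nPassage:\n{passage}"


# Each public generator only chooses a prompt; n is passed straight through.
def factoid_query_gen(row: Dict, llm: LLMFn, n: int = 1) -> Dict:
    return generate_base(row, llm, factoid_prompt, n=n)


def concept_query_gen(row: Dict, llm: LLMFn, n: int = 1) -> Dict:
    return generate_base(row, llm, concept_prompt, n=n)
```

Defaulting n to 1 keeps existing callers unchanged, which is the main appeal of putting the parameter in the base function rather than in a separate generator.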

e7217 · 2024-11-27T12:32:06Z

@vkehfdl1

Thank you for checking.

I agree that improving it to be usable in most query functions is a good idea. However, at this stage, I wanted to take a more conservative approach to minimize the impact on the existing code you've already implemented. Would it be okay to do the additional work in the next phase?

vkehfdl1 · 2024-11-27T14:14:05Z

@e7217
Okay, I agree. I will merge this PR after some adjustments (for the linter and formatting).

Just please do not close the issue in this PR.
Thank you!

vkehfdl1

Great :)
Thanks for waiting for my review.

e7217 · 2024-11-29T17:12:38Z

close #1006

feat: add func to generate multiple quries

c2fb24b

vkehfdl1 self-requested a review November 27, 2024 14:14

vkehfdl1 added 2 commits November 29, 2024 20:21

Merge branch 'main' into feat/multiple-question

d918325

formatting the code

d893296

vkehfdl1 approved these changes Nov 29, 2024

vkehfdl1 enabled auto-merge (squash) November 29, 2024 12:43

vkehfdl1 merged commit 7740a82 into Marker-Inc-Korea:main Nov 29, 2024
1 check passed

vkehfdl1 mentioned this pull request Nov 29, 2024

[Feature Request] Create multiple questions #1006

Closed
