Research Log - 2024-09-15 - Generative infinite high quality questions #37
Great spec sheet so far. Just curious: would human-in-the-loop aided backtracking, to find an optimal prompt or set of thinking steps per domain for generating the dataset, be feasible?
-
Video documentation: https://youtu.be/JZfzo4SrPIs?si=M9RerSV08rC4_pBu
Primary result: question generator: https://github.com/daveshap/raspberry_experiments/blob/main/generate_many_questions.py
Research Log: Raspberry Project - Automated Question Generation
Overview
In this project, I developed an automated question generation system for the Raspberry project. My goal was to create a system capable of generating complex, domain-specific questions suitable for AI benchmarking and training.
Process Development
I began by outlining the requirements for the question generation system. I wanted questions that were answerable without external resources, required multiple reasoning steps, spanned diverse fields, and targeted difficulty levels ranging from graduate student to world expert.
To achieve this, I designed a multi-step process for question generation. This included creating lists of main topics, generating subtopics, defining question parameters, and formulating the final questions. I created four key lists to provide randomization and diversity: main topics, difficulty levels, problem types, and conceptual connectors.
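Roughly, the parameter randomization step looks like the sketch below. The file names (topics.txt, difficulty.txt, problem_types.txt, connectors.txt) are illustrative placeholders rather than the exact names used in the repository.

```python
# Minimal sketch of the parameter randomization step. File names are
# placeholders; the real scripts read their own list files.
import random

def read_list(path: str) -> list[str]:
    """Read one list item per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def generate_question_parameters() -> dict:
    """Randomly combine one entry from each of the four lists."""
    return {
        "topic": random.choice(read_list("topics.txt")),
        "difficulty": random.choice(read_list("difficulty.txt")),
        "problem_type": random.choice(read_list("problem_types.txt")),
        "connector": random.choice(read_list("connectors.txt")),
    }
```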
Prompt Engineering
I developed two main prompts: one for subtopic generation and another for final question generation. I designed these prompts to guide Claude in creating specific, challenging subtopics and questions based on randomly selected parameters from my lists.
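The templates below are only a rough approximation of the shape of those two prompts; the wording in the actual scripts is longer and more detailed. The curly-brace fields map to the randomly selected parameters and the subtopic chosen in the first step.

```python
# Illustrative prompt templates only; not the exact wording in the repo.
SUBTOPIC_PROMPT = (
    "Within the field of {topic}, list ten specific, challenging subtopics "
    "that could each support a {difficulty}-level question."
)

QUESTION_PROMPT = (
    "Write one {difficulty} {problem_type} question about {subtopic}, "
    "connecting it to {connector}. The question must be answerable without "
    "external resources and require multiple reasoning steps."
)

def format_question_prompt(params: dict, subtopic: str) -> str:
    """Fill the final-question template from the sampled parameters."""
    return QUESTION_PROMPT.format(subtopic=subtopic, **params)
```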
API Integration and Script Development
I integrated the Anthropic API to interact with Claude for question generation. This involved setting up API calls, handling responses, and implementing error checking. I developed two main Python scripts: generate_question.py for generating a single question, and generate_many_questions.py for generating multiple questions in batch.
In these scripts, I included functions for reading lists from files, generating question parameters, formatting prompts, querying Claude, and saving the generated questions and logs. I also implemented error handling and added delays between API calls to avoid rate limiting.
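A condensed sketch of that query loop is below. The model name, retry policy, delay lengths, and output path are assumptions for illustration; the real scripts also write separate log files.

```python
# Hedged sketch of the Claude call loop with retries and rate-limit delays.
import time
from anthropic import Anthropic, APIError

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def query_claude(prompt: str, retries: int = 3) -> str:
    """Send one prompt to Claude and return the text of the reply."""
    for attempt in range(retries):
        try:
            message = client.messages.create(
                model="claude-3-5-sonnet-20240620",
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
            return message.content[0].text
        except APIError as exc:
            print(f"API error on attempt {attempt + 1}: {exc}")
            time.sleep(5)  # back off before retrying
    raise RuntimeError("Claude query failed after all retries")

def generate_many_questions(prompts: list[str], out_path: str = "questions.txt") -> None:
    """Batch loop that saves each question and pauses between API calls."""
    with open(out_path, "a", encoding="utf-8") as f:
        for prompt in prompts:
            f.write(query_claude(prompt).strip() + "\n\n")
            time.sleep(2)  # delay between calls to avoid rate limiting
```

The delay between calls in the batch loop is what keeps a long run under the API rate limit.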
Testing and Refinement
I went through several iterations of testing and refinement. I encountered and resolved issues such as API refusals and formatting problems. To improve the randomization process, I added random seed initialization to the scripts.
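The checks and reseeding added during refinement look roughly like this; the refusal markers and minimum length are illustrative simplifications, not the exact rules used.

```python
# Simplified sketch of output validation and random seed initialization.
import random
import time

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i apologize")

def looks_valid(question: str) -> bool:
    """Reject Claude refusals and obviously malformed or empty output."""
    text = question.strip().lower()
    if len(text) < 40:
        return False  # formatting problem: too short to be a real question
    return not any(marker in text for marker in REFUSAL_MARKERS)

# Random seed initialization so each run samples different parameters.
random.seed(time.time_ns())
```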
Results
The system successfully generated complex, domain-specific questions. I was particularly pleased with examples like a question about counterfactual European power dynamics involving Liechtenstein and another about reframing global power dynamics through fractal patterns in political science.
Future Directions
Looking ahead, I've identified several next steps for the project. I believe this work is crucial to solving the data problem for AI training, and I see parallels between my approach and methods potentially used by organizations like OpenAI. I'm excited to continue refining and expanding this system to generate even more sophisticated and diverse questions.