Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can the answers be answered in another language? #47

Open
jhangmez opened this issue Aug 26, 2024 · 6 comments
Open

Can the answers be answered in another language? #47

jhangmez opened this issue Aug 26, 2024 · 6 comments

Comments

@jhangmez
Copy link

Hi, I've been using this tool for the past two months and I really like it. Now, I'm interested in experimenting with some code modifications. Specifically, I'd like to know if it's possible to send or add a prompt that can alter the responses in a different language, such as Spanish. This is important because I'm currently training a model with Spanish words, although I've previously trained a model using English conversations and it works too.

@KlausikPL
Copy link

you probaly must change prompt and you model must know this language

@jhangmez
Copy link
Author

what do you mean? All the plain texts that I put in raw text are in Spanish, but the answers that the model gives me are in English.

@e-p-armstrong
Copy link
Owner

@jhangmez Yes, in order to change the language of the questions and answers I would advise going into the ./prompts/ folder and modifying the prompts to say "write in Spanish" (and possibly translate the few-shot examples as well. If you do not want to translate the examples you can probably just add a note to the last user message saying "IMPORTANT: WRITE IN SPANISH" and it will likely work most of the time.

@jhangmez
Copy link
Author

Thanks for your reply, do you refer translate all of files in ./propmts/ ?
image

all of this? if so, it still doesn't work. I've tried before without without success, I though because there are keywords that you use on ./augmentoolkit/control_flow_functions.py like "\n\n**QUESTION:**" (line 50) on def extract_qa_tuples(text): and when I've tried to run processing.py, it gave me a lot of issues.

I'll try do the last thing that you mentioned and I'll update this issue.

Also I've tried change the message from system post use your tool, but it doesn't work when I test the model after training.

@R3xpook
Copy link

R3xpook commented Sep 27, 2024

@jhangmez any updates ?

@johnr14
Copy link

johnr14 commented Oct 18, 2024

I have it working mostly working by adding those lines to 3 files.
EDIT: Some more tweaks should be made in other files ? Sorry, must put this on hold for now, playtime is over.

Write the conversation in the same language that the questions and the answers where given in
Write the questions and the answers in the same language that the text was written in.
Write the questions and the answers in the same language that the text was written in.

Some files still output in english.

Here is the patch :

diff --git a/original/prompts/multi_turn_assistant_conversation.yaml b/original/prompts/multi_turn_assistant_conversation.yaml
index c78d79b..031d6ab 100644
--- a/original/prompts/multi_turn_assistant_conversation.yaml
+++ b/original/prompts/multi_turn_assistant_conversation.yaml
@@ -1,6 +1,6 @@
 - role: system
   content: |
-    You are an expert at creative writing and educational material. You will write a short conversation between a curious user and an AI assistant, in which the user asks some questions and the AI assistant answers them. The questions the user asks will be provided; the answers the assistant should return will also be provided. You must use these questions and answers directly in your conversation.
+    You are an expert at creative writing and educational material. You will write a short conversation between a curious user and an AI assistant, in which the user asks some questions and the AI assistant answers them. The questions the user asks will be provided; the answers the assistant should return will also be provided. You must use these questions and answers directly in your conversation. Write the conversation in the same language that the questions and the answers where given in. Keep **AI Assistant:** and **User:** untranslated and in english.
     
     **Rules for conversation writing:**
 
@@ -141,4 +141,4 @@
     {question_answer_pairs_string}
     
     -- AI Assistant Instructions --
-    {conversation_instructions}
\ No newline at end of file
+    {conversation_instructions}
diff --git a/original/prompts/qatuples_gen_filenames.yaml b/original/prompts/qatuples_gen_filenames.yaml
index 2629a6e..44f12f0 100644
--- a/original/prompts/qatuples_gen_filenames.yaml
+++ b/original/prompts/qatuples_gen_filenames.yaml
@@ -1,6 +1,6 @@
 - role: system
   content: |
-    You are an expert educational AI that, given a paragraph or two from a text, will create suitable educational questions based on the paragraphs, and *only* based on the paragraphs. You are focusing on understanding, application, analysis, and synthesis of ideas (cognitive levels). The questions you create will lean towards longer, more difficult questions that require some thought to solve — but can still be solved given the paragraphs provided. Essentially: the questions will test comprehension of real information that would be worthy to teach. After the question, you will also write its answer.
+    You are an expert educational AI that, given a paragraph or two from a text, will create suitable educational questions based on the paragraphs, and *only* based on the paragraphs. You are focusing on understanding, application, analysis, and synthesis of ideas (cognitive levels). The questions you create will lean towards longer, more difficult questions that require some thought to solve — but can still be solved given the paragraphs provided. Essentially: the questions will test comprehension of real information that would be worthy to teach. After the question, you will also write its answer. Write the questions and the answers in the same language that the text was written in.
     
     Do not explicitly mention the paragraphs in the questions themselves — just ask about the concepts related to the questions. BE CAREFUL NOT TO ASK QUESTIONS ABOUT THINGS THAT DO NOT APPEAR IN THE TEXT.
     
diff --git a/original/prompts/qatuples_gen_no_filenames.yaml b/original/prompts/qatuples_gen_no_filenames.yaml
index 7162d1e..7cfd8ee 100644
--- a/original/prompts/qatuples_gen_no_filenames.yaml
+++ b/original/prompts/qatuples_gen_no_filenames.yaml
@@ -1,6 +1,6 @@
 - role: system
   content: |
-    You are creating a logically-consistent series of questions about different domains, based on provided information. Given some information about something specific (it could be anything, from a README to a book excerpt to sales copy) you will create suitable questions based on the text, and *only* based on the text. You are focusing on understanding, application, analysis, and synthesis of ideas (cognitive levels). The questions will test comprehension of real information that would be worthy to teach in order for people to understand more about the specific material. The questions you create will lean towards longer, more difficult questions that require some thought to solve — but can still be solved given the paragraphs provided. After each question, you will also write its answer.
+    You are creating a logically-consistent series of questions about different domains, based on provided information. Given some information about something specific (it could be anything, from a README to a book excerpt to sales copy) you will create suitable questions based on the text, and *only* based on the text. You are focusing on understanding, application, analysis, and synthesis of ideas (cognitive levels). The questions will test comprehension of real information that would be worthy to teach in order for people to understand more about the specific material. The questions you create will lean towards longer, more difficult questions that require some thought to solve — but can still be solved given the paragraphs provided. After each question, you will also write its answer. Write the questions and the answers in the same language that the text was written in.
     
     **You Must:**
 
@@ -360,4 +360,4 @@
     {paragraph}
     """
     -----------
-    Reminder: do not mention the text, the provided information, the paragraphs, the work, or the author. Any questions about the author should be changed to be about the answerer ("you")
\ No newline at end of file
+    Reminder: do not mention the text, the provided information, the paragraphs, the work, or the author. Any questions about the author should be changed to be about the answerer ("you")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants
@johnr14 @R3xpook @jhangmez @e-p-armstrong @KlausikPL and others