Finetuning Llama Agents #50

Neoathenian · 2024-09-27T18:13:45Z

I've been generating some data and ran a fine-tune to see what I'd get, and I thought the result was hilarious.
(I'm fine-tuning Llama on a 3090, so if anyone wants to play around with the project, it's at https://github.com/Neoathenian/CoT_finetuning_LLM)

Is 9.11 larger than 9.9?<|im_end|>
<|im_start|>assistant
Agent 1 (Reasoning): Okay, let's break down this question. We need to compare 9.11 and 9.9. Since we're dealing with numbers, we can use the comparison operator "greater than" to determine which one is larger.

Agent 2 (Verification): Hold on, let's make sure we're on the right track. We're comparing two decimal numbers, right?

Agent 1 (Reasoning): That's correct. We have 9.11 and 9.9, and we need to determine which one is larger.

Agent 2 (Verification): Okay, so what's the first step in comparing these numbers?

Agent 1 (Reasoning): Well, we can convert them to fractions. 9.11 can be written as 911/100, and 9.9 can be written as 990/100.

Agent 2 (Verification): That makes sense. Now, what's the next step?

Agent 1 (Reasoning): We can compare the fractions. Since 911/100 is larger than 990/100, we can conclude that 9.11 is larger than 9.9.

Agent 2 (Verification): I see. And what about the comparison operator? Are we using "greater than" correctly?

Agent 1 (Reasoning): Yes, we are. We're comparing two decimal numbers, and the comparison operator "greater than" is appropriate.

Agent 2 (Verification): Okay, I think we're good to go. Let's summarize our reasoning.

Reasoning: We can convert 9.11 and 9.9 to fractions, compare the fractions, and conclude that 9.11 is larger than 9.9.

Answer: 9.11<|im_end|>
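
For reference, the conversion to fractions in the output above is fine; the comparison step is where it goes wrong, since 911/100 is smaller than 990/100. A quick sanity check, as a minimal sketch using only Python's standard library (not taken from the linked repo):

```python
from fractions import Fraction

# Parse the decimals exactly, with no floating-point rounding involved.
a = Fraction("9.11")  # 911/100
b = Fraction("9.9")   # 99/10, which equals 990/100

print(a > b)   # is 9.11 larger than 9.9?
print(b - a)   # exact difference between the two
```

Over the common denominator of 100 the question reduces to comparing 911 with 990, so the answer should have been 9.9.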

z-xote · 2024-09-28T02:17:53Z

Well, firstly, if we asked a child this question, they would likely fail too, simply from not having enough experience with this kind of problem.
But I feel these models are still lacking something more.

I believe we should refrain from calling "verification" models "verifiers", because the model's high-dimensional representation of "verification" likely means something like "can you verify that the logic and reasoning I have laid out is correct?"

For lack of better terminology I'll stick to simple English, but this may lead a model to merely "verify" that some form of logic exists behind the answer, rather than check that the logic is actually right.

Perhaps dropping verifiers completely isn't the right choice, but there's a good chance we'd need "intuition" models that serve as a constant reminder of what needs to be done.

Kind of like that nagging thought at the back of your head whenever you really, really want to get something done.

It's difficult for me to put this into words, as English is not my first language, but I hope this turns on a lightbulb in someone's mind at least.

Please let me know / tag me if anyone is interested in trying this out. I'd love to see your results with an "intuition" model alongside a verifier and a reasoner. Another way to think of intuition might be as hypothesis models?

I don't know, but I would love for someone to test this out.
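
If anyone wants to try it, here is one rough way the training transcripts could be laid out with a third role. This is only a sketch under assumptions: the first two labels copy the format from the post above, while "Agent 3 (Intuition)", the `Turn` class, and `render_transcript` are hypothetical names made up for illustration, not anything from the repo.

```python
from dataclasses import dataclass

# Labels for the first two roles follow the transcript in the first post.
# "Agent 3 (Intuition)" is a hypothetical label for the proposed third role.
ROLE_LABELS = {
    "reasoning": "Agent 1 (Reasoning)",
    "verification": "Agent 2 (Verification)",
    "intuition": "Agent 3 (Intuition)",
}


@dataclass
class Turn:
    role: str  # one of the keys in ROLE_LABELS
    text: str


def render_transcript(turns: list[Turn], answer: str) -> str:
    """Format turns as 'Agent N (Role): text' blocks, then the final answer."""
    blocks = [f"{ROLE_LABELS[t.role]}: {t.text}" for t in turns]
    blocks.append(f"Answer: {answer}")
    return "\n\n".join(blocks)


example = render_transcript(
    [
        Turn("reasoning", "9.11 is 911/100 and 9.9 is 990/100."),
        Turn("intuition", "Reminder: the goal is to find the larger number, so compare 911 with 990."),
        Turn("verification", "911 < 990, so 9.9 is the larger number."),
    ],
    answer="9.9",
)
print(example)
```

The intuition turn here just restates the goal before the verifier signs off, which is the "nagging thought" idea in its simplest form; whether that helps after fine-tuning is exactly what would need testing.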
