Major Changes:
Added 2 experimental reasoning algorithms

Cleaned up the codebase to reduce redundant files while retaining functionality

Changed metrics to include the new experimental reasoning algorithms when they are used

Updated README.md

Updated LICENSE to 2025
frgmt0 committed Jan 8, 2025
1 parent d9cd8a2 commit 28c60fe
Showing 15 changed files with 872 additions and 554 deletions.
97 changes: 35 additions & 62 deletions README.md
@@ -2,32 +2,51 @@
A reasoning implementation for Claude Desktop that lets you use both Beam Search and Monte Carlo Tree Search (MCTS). tbh this started as a way to see if we could make Claude even better at complex problem-solving... turns out we definitely can.

### Current Version:
**v2.0.0**

#### What's New:

> Added 2 Experimental Reasoning Algorithms:
> - `mcts-002-alpha`
>   - Uses the A* Search Method along with an early *alpha* implementation of a Policy Simulation Layer
>   - Also includes early *alpha* implementations of an Adaptive Exploration Simulator and an Outcome-Based Reasoning Simulator
>   - *NOTE:* the implementation of these alpha simulators is not complete and is subject to change
> - `mcts-002alt-alpha`
>   - Uses the Bidirectional Search Method along with an early *alpha* implementation of a Policy Simulation Layer
>   - Also includes early *alpha* implementations of an Adaptive Exploration Simulator and an Outcome-Based Reasoning Simulator
>   - *NOTE:* the implementation of these alpha simulators is not complete and is subject to change
>
What happened to `mcts-001-alpha` and `mcts-001alt-alpha`?
> Quite simply: they were useless and behaved almost identically to the base `mcts` method. In initial testing, the thought processes they produced were nearly the same, which suggests that adding policy simulation on its own may not have much effect.

So why add a Policy Simulation Layer now?
> Well, i think it's important to incorporate Policy AND Search in tandem, since that's how most algorithms that use a policy layer implement them.
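for the curious, the factory diff further down registers these under `mcts_002_alpha` and `mcts_002_alt_alpha`, so switching to one of them looks roughly like this (a sketch only: the bare `Reasoner()` constructor call and the `thought` field are assumptions, not confirmed API):

```typescript
import { Reasoner } from './reasoner.js';

// Sketch based on processThought() in dist/reasoner.js; the
// `thought` field and no-arg constructor are assumptions.
const reasoner = new Reasoner();
const response = await reasoner.processThought({
  thought: "Compare the transformations across the example grids...",
  strategyType: "mcts_002_alpha", // or "mcts_002_alt_alpha"
  numSimulations: 50              // 1-150, same knob as base MCTS
});
```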
#### Previous Versions:
**v1.1.0**

> Added model control over search parameters:
>
> - `beamWidth` - lets Claude adjust how many paths to track (1-10)
> - `numSimulations` - fine-tune MCTS simulation count (1-150)
**v1.0.0**

> Initial release with base Beam Search and MCTS implementations
## Features
- Two search strategies that you can switch between:
- Beam search (good for straightforward stuff)
- MCTS (when stuff gets complex) with alpha variations (see above)
- Tracks how good different reasoning paths are
- Maps out all the different ways Claude thinks through problems
- Analyzes how the reasoning process went
- Follows the MCP protocol (obviously)

## Installation
```
git clone https://github.com/frgmt0/mcp-reasoner.git
# or clone the original:
git clone https://github.com/Jacck/mcp-reasoner.git
cd mcp-reasoner
npm install
npm run build
```

@@ -46,69 +65,23 @@

Add to Claude Desktop config:
```json
...
}
```
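the elided config above follows the usual Claude Desktop MCP server registration shape; a minimal sketch (the `mcp-reasoner` key and the install path are assumptions, adjust to wherever you cloned the repo):

```json
{
  "mcpServers": {
    "mcp-reasoner": {
      "command": "node",
      "args": ["/path/to/mcp-reasoner/dist/index.js"]
    }
  }
}
```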

## Search Strategies

### Beam Search
so beam search is pretty straightforward; it keeps track of the most promising solution paths as it goes. works really well when you've got problems with clear right answers, like math stuff or certain types of puzzles.

interesting thing i found while testing: when i threw 50 puzzles from the Arc AGI benchmark at it, it only scored 24%. like, it wasn't completely lost, but... not great. here's how i tested it:

- first, i'd check if claude actually got the pattern from the examples. if it seemed confused, i'd try to nudge it in the right direction (but dock points cause that's not ideal)
- then for the actual test cases, i had this whole scoring system:
- 5 points - nailed it
- 4 points - reasoning was solid but maybe i fucked up following the instructions
- 3 points - kinda got the pattern but didn't quite nail it
- 2 points - straight up failed
- 1 point - at least the initial reasoning wasn't completely off
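to make the "keeps track of the most promising paths" part concrete, here's a minimal sketch of the core loop (illustrative only, not the repo's actual implementation):

```typescript
// Beam search sketch: expand every path in the beam, score the
// candidates, and keep only the `beamWidth` best each round.
type Path = { steps: string[]; score: number };

function beamSearch(
  start: Path,
  expand: (p: Path) => Path[], // propose successor thoughts
  beamWidth: number,           // 1-10, as in the README
  maxDepth: number
): Path {
  let beam: Path[] = [start];
  for (let depth = 0; depth < maxDepth; depth++) {
    const candidates = beam.flatMap(expand);
    if (candidates.length === 0) break;
    candidates.sort((a, b) => b.score - a.score); // best first
    beam = candidates.slice(0, beamWidth);        // prune the rest
  }
  return beam[0]; // highest-scoring surviving path
}
```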

### Monte Carlo Tree Search
now THIS is where it gets interesting. MCTS absolutely crushed it compared to beam search - we're talking 48% on a different set of 50 Arc puzzles. yeah yeah, maybe they were easier puzzles (this isn't an official benchmark or anything), but doubling the performance? that's not just luck.

the cool thing about MCTS is how it explores different possibilities. instead of just following what seems best right away, it tries out different paths to see what might work better in the long run. claude spent way more time understanding the examples before diving in, which probably helped a lot.
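that "tries out different paths to see what might work better in the long run" behavior usually comes from the UCT selection rule; here's a textbook sketch of that step (the repo's exact scoring may differ):

```typescript
// UCT balances exploiting nodes with high average reward against
// exploring nodes that haven't been visited much.
interface MctsNode {
  visits: number;
  totalReward: number;
  children: MctsNode[];
}

function uctSelect(parent: MctsNode, c = Math.SQRT2): MctsNode {
  if (parent.children.length === 0) {
    throw new Error("uctSelect needs at least one child");
  }
  const uct = (n: MctsNode) =>
    n.visits === 0
      ? Infinity // always try unvisited children first
      : n.totalReward / n.visits +
        c * Math.sqrt(Math.log(parent.visits) / n.visits);
  return parent.children.reduce((best, child) =>
    uct(child) > uct(best) ? child : best
  );
}
```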

## Why This Matters
adding structured reasoning to claude makes it way better... no der, right? but what's really interesting is how different methods work for different types of problems.

why'd i test on puzzles instead of coding problems? honestly, claude's already proven itself on stuff like polyglot and codeforces. i wanted to see how it handled more abstract reasoning - the kind of stuff that's harder to measure.

## What's Next
got some cool stuff in the pipeline:

### IDDFS (Iterative Deepening Depth-First Search)
basically IDDFS is like... imagine you're exploring a maze but instead of going all in, you check everything 1 step deep, then 2 steps deep, and so on.

pros:
- uses way less memory than regular DFS (which is huge for complex problems)
- guaranteed to find the shortest path to a solution
- works really well when you don't know how deep you need to go

cons:
- might seem slower since you're re-exploring stuff
- not great if your solution is super deep in the tree
- can get stuck in loops if i'm not careful with the implementation, which is hard because i'm usually not careful hahaha

working on `iddfs-exp` right now and the theory is that it might handle certain types of puzzles better than MCTS, especially ones where the solution path isn't too deep but needs systematic exploration.
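here's roughly what that "1 step deep, then 2 steps deep" loop looks like (a generic sketch, not `iddfs-exp` itself):

```typescript
// IDDFS sketch: run depth-limited DFS at limit 1, 2, 3, ...
// Shallow nodes get re-explored each round, but memory stays
// proportional to the current depth, not the whole tree.
function iddfs<T>(
  root: T,
  expand: (node: T) => T[],
  isGoal: (node: T) => boolean,
  maxDepth: number
): T | null {
  function dls(node: T, limit: number): T | null {
    if (isGoal(node)) return node;
    if (limit === 0) return null;
    for (const child of expand(node)) {
      const found = dls(child, limit - 1);
      if (found) return found;
    }
    return null;
  }
  for (let limit = 1; limit <= maxDepth; limit++) {
    const found = dls(root, limit);
    if (found) return found; // first hit is a shallowest solution
  }
  return null; // note: no cycle check, the loop hazard from the cons
}
```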

### Alpha-Beta Pruning
ok this one's a bit of an experiment and may not work... it's traditionally used in game trees (like chess engines) but i think it could be interesting for reasoning too.

pros:
- super efficient at cutting off "dead end" reasoning paths
- works really well when you can evaluate how good a partial solution is
- could be amazing for problems where you need to consider opposing viewpoints

cons:
- needs a good evaluation function (which is hard af for general reasoning)
- might miss some creative solutions by cutting off paths too early
- really depends on the order you explore things

`alphabeta-exp` is definitely gonna be rough at first, but i'm curious to see if we can make it work for non-game-tree problems. might be especially interesting for scenarios where claude needs to reason about competing hypotheses.
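for reference, the classic game-tree version looks like this (a textbook sketch; `evaluate` is a stub, and adapting this to open-ended reasoning paths is exactly the hard part):

```typescript
// Alpha-beta sketch: stop exploring a branch as soon as it
// provably can't beat a line we've already found.
interface ABNode {
  children: ABNode[];
  evaluate: () => number; // heuristic value of a partial solution
}

function alphaBeta(
  node: ABNode,
  depth: number,
  alpha: number,
  beta: number,
  maximizing: boolean
): number {
  if (depth === 0 || node.children.length === 0) {
    return node.evaluate();
  }
  let value = maximizing ? -Infinity : Infinity;
  for (const child of node.children) {
    const childValue = alphaBeta(child, depth - 1, alpha, beta, !maximizing);
    if (maximizing) {
      value = Math.max(value, childValue);
      alpha = Math.max(alpha, value);
    } else {
      value = Math.min(value, childValue);
      beta = Math.min(beta, value);
    }
    if (alpha >= beta) break; // prune: the opponent won't allow this line
  }
  return value;
}
```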
also working on letting claude control the MCTS sampling parameters directly - could lead to some interesting adaptive behavior where it adjusts its exploration strategy based on how well it's understanding the problem.

will definitely share more test results as we implement these. if you're interested in helping test or have ideas for other algorithms that might work well, hit me up.

-frgmt0, Jacck

## Testing

[More Testing Coming Soon]

## Benchmarks

[Benchmarking will be added soon]

Key Benchmarks to test against:

- MATH500
- GPQA-Diamond
- GSM8K
- Maybe Polyglot &/or SWE-Bench

## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
24 changes: 0 additions & 24 deletions dist/engine.d.ts

This file was deleted.

78 changes: 0 additions & 78 deletions dist/engine.js

This file was deleted.

29 changes: 15 additions & 14 deletions dist/reasoner.js
@@ -6,8 +6,12 @@ export class Reasoner {
this.stateManager = new StateManager(CONFIG.cacheSize);
// Initialize available strategies
this.strategies = new Map();
// Initialize base strategies
this.strategies.set(ReasoningStrategy.BEAM_SEARCH, StrategyFactory.createStrategy(ReasoningStrategy.BEAM_SEARCH, this.stateManager, CONFIG.beamWidth));
this.strategies.set(ReasoningStrategy.MCTS, StrategyFactory.createStrategy(ReasoningStrategy.MCTS, this.stateManager, undefined, CONFIG.numSimulations));
// Initialize experimental MCTS strategies
this.strategies.set(ReasoningStrategy.MCTS_002_ALPHA, StrategyFactory.createStrategy(ReasoningStrategy.MCTS_002_ALPHA, this.stateManager, undefined, CONFIG.numSimulations));
this.strategies.set(ReasoningStrategy.MCTS_002_ALT_ALPHA, StrategyFactory.createStrategy(ReasoningStrategy.MCTS_002_ALT_ALPHA, this.stateManager, undefined, CONFIG.numSimulations));
// Set default strategy
const defaultStrategy = CONFIG.defaultStrategy;
this.currentStrategy = this.strategies.get(defaultStrategy) ||
@@ -16,19 +20,17 @@ export class Reasoner {
async processThought(request) {
// Switch strategy if requested
if (request.strategyType && this.strategies.has(request.strategyType)) {
    const strategyType = request.strategyType;
    // Create new strategy instance with appropriate parameters
    if (strategyType === ReasoningStrategy.BEAM_SEARCH) {
        this.currentStrategy = StrategyFactory.createStrategy(strategyType, this.stateManager, request.beamWidth);
    }
    else {
        // All MCTS variants (base and experimental) use numSimulations
        this.currentStrategy = StrategyFactory.createStrategy(strategyType, this.stateManager, undefined, request.numSimulations);
    }
    // Update strategy in map
    this.strategies.set(strategyType, this.currentStrategy);
}
// Process thought using current strategy
const response = await this.currentStrategy.processThought(request);
@@ -96,17 +98,16 @@ export class Reasoner {
if (!this.strategies.has(strategyType)) {
throw new Error(`Unknown strategy type: ${strategyType}`);
}
// Create new strategy instance with appropriate parameters
if (strategyType === ReasoningStrategy.BEAM_SEARCH) {
    this.currentStrategy = StrategyFactory.createStrategy(strategyType, this.stateManager, beamWidth);
}
else {
    // All MCTS variants (base and experimental) use numSimulations
    this.currentStrategy = StrategyFactory.createStrategy(strategyType, this.stateManager, undefined, numSimulations);
}
// Update strategy in map
this.strategies.set(strategyType, this.currentStrategy);
}
getAvailableStrategies() {
return Array.from(this.strategies.keys());
4 changes: 3 additions & 1 deletion dist/strategies/factory.d.ts
@@ -2,7 +2,9 @@ import { StateManager } from '../state.js';
import { BaseStrategy } from './base.js';
export declare enum ReasoningStrategy {
BEAM_SEARCH = "beam_search",
MCTS = "mcts"
MCTS = "mcts",
MCTS_002_ALPHA = "mcts_002_alpha",
MCTS_002_ALT_ALPHA = "mcts_002_alt_alpha"
}
export declare class StrategyFactory {
static createStrategy(type: ReasoningStrategy, stateManager: StateManager, beamWidth?: number, numSimulations?: number): BaseStrategy;
8 changes: 8 additions & 0 deletions dist/strategies/factory.js
@@ -1,9 +1,13 @@
import { BeamSearchStrategy } from './beam-search.js';
import { MonteCarloTreeSearchStrategy } from './mcts.js';
import { MCTS002AlphaStrategy } from './experiments/mcts-002-alpha.js';
import { MCTS002AltAlphaStrategy } from './experiments/mcts-002alt-alpha.js';
export var ReasoningStrategy;
(function (ReasoningStrategy) {
ReasoningStrategy["BEAM_SEARCH"] = "beam_search";
ReasoningStrategy["MCTS"] = "mcts";
ReasoningStrategy["MCTS_002_ALPHA"] = "mcts_002_alpha";
ReasoningStrategy["MCTS_002_ALT_ALPHA"] = "mcts_002_alt_alpha";
})(ReasoningStrategy || (ReasoningStrategy = {}));
export class StrategyFactory {
static createStrategy(type, stateManager, beamWidth, numSimulations) {
@@ -12,6 +16,10 @@ export class StrategyFactory {
return new BeamSearchStrategy(stateManager, beamWidth);
case ReasoningStrategy.MCTS:
return new MonteCarloTreeSearchStrategy(stateManager, numSimulations);
case ReasoningStrategy.MCTS_002_ALPHA:
return new MCTS002AlphaStrategy(stateManager, numSimulations);
case ReasoningStrategy.MCTS_002_ALT_ALPHA:
return new MCTS002AltAlphaStrategy(stateManager, numSimulations);
default:
throw new Error(`Unknown strategy type: ${type}`);
}
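For reference, wiring one of the new experimental strategies through the updated factory looks roughly like this (the cache size is an illustrative value, not the repo's `CONFIG.cacheSize`):

```typescript
import { StateManager } from './state.js';
import { StrategyFactory, ReasoningStrategy } from './strategies/factory.js';

// Hypothetical usage; 1000 is an illustrative cache size.
const stateManager = new StateManager(1000);
const strategy = StrategyFactory.createStrategy(
  ReasoningStrategy.MCTS_002_ALPHA,
  stateManager,
  undefined, // beamWidth only applies to beam search
  100        // numSimulations, shared by all MCTS variants
);
```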