# CaSIL: Cascade of Semantically Integrated Layers

CaSIL is a natural language processing system built around a four-layer semantic analysis architecture. It processes both user input and knowledge base content through progressive semantic layers, combining:
- Dynamic concept extraction and relationship mapping
- Adaptive semantic similarity analysis
- Context-aware knowledge graph integration
- Multi-layer processing pipeline
- Real-time learning and adaptation
## Prerequisites

- Python 3.10+
- An OpenAI-compatible LLM provider (e.g., Ollama)
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/severian42/Cascade-of-Semantically-Integrated-Layers.git
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables:

  ```bash
  cp .env.example .env
  # Edit .env with your configuration
  ```
- Start your local LLM server (e.g., Ollama)

- Run the system:

  ```bash
  python main.py --debug  # shows full internal insight and metrics
  ```
## Usage

```bash
# Start the system
python main.py --debug
```

Available commands:

```text
help      # Show available commands
graph     # Display knowledge graph statistics
concepts  # List extracted concepts
relations # Show concept relationships
exit      # Exit the program
```
## How It Works

CaSIL processes input through four layers:

### Layer 1: Initial Understanding
- Advanced concept extraction using TF-IDF
- Named entity recognition
- Custom stopword filtering
- Semantic weight calculation
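The TF-IDF concept extraction described above can be sketched with a toy scorer. This is stdlib-only and illustrative: `tfidf_concepts`, the stopword list, and the IDF smoothing are placeholders, not CaSIL's actual implementation.

```python
# Minimal TF-IDF concept extraction sketch (illustrative, not CaSIL's real code).
import math
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in"}

def tfidf_concepts(doc: str, corpus: list[str], threshold: float = 0.1) -> dict[str, float]:
    """Score each non-stopword term in `doc` by TF-IDF against `corpus`,
    keeping only terms whose weight exceeds `threshold`."""
    docs = [d.lower().split() for d in corpus]
    words = [w for w in doc.lower().split() if w not in STOPWORDS]
    tf = Counter(words)
    n = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(term in d for d in docs)            # document frequency
        idf = math.log((1 + n) / (1 + df)) + 1       # smoothed inverse doc frequency
        scores[term] = (count / len(words)) * idf
    return {t: s for t, s in scores.items() if s > threshold}

corpus = ["graphs store concept relationships", "semantic layers process text"]
concepts = tfidf_concepts("semantic graphs link concepts", corpus)
```

Terms absent from the corpus (like "link" here) score higher, which is how rarity translates into concept weight.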
### Layer 2: Relationship Analysis

- Dynamic similarity matrix computation
- Graph-based relationship discovery
- Concept clustering and community detection
- Temporal relationship tracking
### Layer 3: Contextual Integration

- Historical context weighting
- Knowledge base integration
- Dynamic context windows
- Adaptive threshold adjustment
- Multi-source information fusion
### Layer 4: Synthesis

- Style-specific processing
- Dynamic temperature adjustment
- Context-aware response generation
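The four layers above can be viewed as a function cascade, where each stage enriches a shared state. The sketch below is hypothetical: every function name and its toy logic are placeholders standing in for CaSIL's real processing.

```python
# Hedged sketch of the four-layer cascade (all names and logic are illustrative).
def initial_understanding(text: str) -> dict:
    # Layer 1: extract rough "concepts" (the real system uses TF-IDF + NER)
    return {"concepts": text.lower().split()}

def relationship_analysis(state: dict) -> dict:
    # Layer 2: pair up concepts as candidate relationships
    c = state["concepts"]
    return {**state, "relations": [(a, b) for i, a in enumerate(c) for b in c[i + 1:]]}

def contextual_integration(state: dict, context: list[str]) -> dict:
    # Layer 3: attach session / knowledge-base context
    return {**state, "context": context}

def synthesis(state: dict) -> str:
    # Layer 4: produce final output from the accumulated state
    return f"{len(state['concepts'])} concepts, {len(state['relations'])} relations"

def cascade(text: str, context: list[str]) -> str:
    state = initial_understanding(text)
    state = relationship_analysis(state)
    state = contextual_integration(state, context)
    return synthesis(state)
```

The key property this models is that each layer builds on the previous one's output rather than re-reading the raw input.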
### Graph Systems

CaSIL maintains two interconnected graph systems:

#### Session Graph
- Tracks temporary concept relationships
- Maintains conversation context
- Updates in real-time
- Handles recency weighting
#### Knowledge Graph

- Stores long-term concept relationships
- Tracks concept evolution over time
- Maintains relationship weights
- Supports community detection
### Multi-Stage Processing

- Input text → Concept Extraction → Relationship Analysis → Context Integration → Response
- Each stage maintains its own similarity thresholds and processing parameters
- Adaptive feedback loop adjusts parameters based on processing results
### Semantic Analysis Engine

```text
# Example concept extraction flow
text → TF-IDF Vectorization → Weight Calculation → Threshold Filtering → Concepts

# Relationship discovery process
concepts → Similarity Matrix → Graph Construction → Community Detection → Relationships
```
### Dynamic Temperature Control

```python
temperature = base_temp * layer_modifier['base'] * (
    1 + (novelty_score * layer_modifier['novelty_weight'])
    * (1 + (complexity_factor * layer_modifier['complexity_weight']))
)
```
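A runnable version of this formula is sketched below; the modifier values are illustrative placeholders, not CaSIL's actual defaults.

```python
# Dynamic temperature sketch (modifier values below are placeholders).
def dynamic_temperature(base_temp: float, novelty_score: float,
                        complexity_factor: float,
                        layer_modifier: dict[str, float]) -> float:
    """Raise temperature for novel/complex inputs; fall back to base otherwise."""
    return base_temp * layer_modifier["base"] * (
        1 + (novelty_score * layer_modifier["novelty_weight"])
        * (1 + complexity_factor * layer_modifier["complexity_weight"])
    )

mods = {"base": 1.0, "novelty_weight": 0.5, "complexity_weight": 0.3}
temp = dynamic_temperature(0.7, novelty_score=0.4, complexity_factor=0.6, layer_modifier=mods)
```

With zero novelty the expression collapses to `base_temp * layer_modifier["base"]`, so familiar inputs are answered deterministically.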
### Dual-Graph System

| Session Graph (Temporary) | Knowledge Graph (Persistent) |
| --- | --- |
| Short-term relationships | Long-term concept storage |
| Recency weighting | Relationship evolution |
| Context tracking | Community detection |
| Real-time updates | Concept metadata |
### Graph Update Process

```python
# Simplified relationship update flow
new_weight = (previous_weight + (similarity * time_weight)) / 2
graph.add_edge(concept1, concept2, weight=new_weight)
```
- Dynamic threshold adjustment based on:
  - Input complexity
  - Concept novelty
  - Processing layer
  - Historical performance
- Multi-dimensional similarity calculation:
  - Combined Similarity = (0.7 * cosine_similarity) + (0.3 * jaccard_similarity)
- Weighted by:
  - Term frequency
  - Position importance
  - Historical context
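A minimal stdlib sketch of that weighted combination; the `cosine`/`jaccard` helpers and the bag-of-words representation are illustrative.

```python
# Combined similarity sketch: 0.7 * cosine + 0.3 * Jaccard (illustrative).
import math

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity over sparse term-weight dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard(a: set[str], b: set[str]) -> float:
    """Set-overlap similarity over the vocabularies alone."""
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    return 0.7 * cosine(a, b) + 0.3 * jaccard(set(a), set(b))
```

Mixing the two measures means weight distributions dominate (cosine) while plain vocabulary overlap (Jaccard) still contributes even when weights disagree.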
Context Integration Flow:

```text
Input → Session Context → Knowledge Graph → External Knowledge → Response
  ↑                                                                 │
  └───────────────────────── Feedback Loop ─────────────────────────┘
```
```text
                      ┌────────────────┐
                      │ Knowledge Base │
                      └───────┬────────┘
                              │
                              ▼
User Input → Concept Extraction → Similarity Analysis → Graph Integration
    │               │                      │                     │
    │               ▼                      ▼                     ▼
    │         Session Graph ─────► Relationship Analysis ◄── Knowledge Graph
    │               │                      │                     │
    │               └──────────────────────┼─────────────────────┘
    │                                      │
    └───────────────────── Response ◄──────┘
```
## Performance Optimizations

- LRU cache for similarity calculations
- Concurrent processing with thread pooling
- Batched vectorizer updates
- Adaptive corpus management
## Key Features

- Progressive Semantic Analysis
  - Each layer builds upon previous insights
  - Maintains context continuity
  - Adapts processing parameters in real-time
- Dynamic Knowledge Integration
  - Combines session-specific and persistent knowledge
  - Real-time graph updates
  - Community-based concept clustering
- Adaptive Response Generation
  - Context-aware temperature adjustment
  - Style-specific processing parameters
  - Multi-source information fusion
## System Requirements

- CPU: Multi-core processor recommended
- RAM: 8GB minimum, 16GB recommended
- Storage: 1GB for base system
- Python: 3.10 or higher
- OS: Linux, macOS, or Windows
## Configuration

Set the following variables in `.env`:

```bash
DEBUG_MODE=true
USE_EXTERNAL_KNOWLEDGE=false
LLM_URL=http://0.0.0.0:11434/v1/chat/completions
LLM_MODEL=your_model_name
INITIAL_UNDERSTANDING_THRESHOLD=0.7
RELATIONSHIP_ANALYSIS_THRESHOLD=0.7
CONTEXTUAL_INTEGRATION_THRESHOLD=0.9
SYNTHESIS_THRESHOLD=0.8
```
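Reading those thresholds from the environment might look like the sketch below, assuming the variables are exported (or loaded from `.env`, e.g. via python-dotenv); `load_thresholds` is an illustrative helper, not part of CaSIL's API.

```python
# Sketch: read layer thresholds from the environment, with the documented defaults.
import os

DEFAULTS = {
    "INITIAL_UNDERSTANDING_THRESHOLD": 0.7,
    "RELATIONSHIP_ANALYSIS_THRESHOLD": 0.7,
    "CONTEXTUAL_INTEGRATION_THRESHOLD": 0.9,
    "SYNTHESIS_THRESHOLD": 0.8,
}

def load_thresholds() -> dict[str, float]:
    """Return each threshold as a float, falling back to the defaults above."""
    return {name: float(os.getenv(name, default)) for name, default in DEFAULTS.items()}
```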
## Python API Example

```python
from SemanticCascadeProcessing import CascadeSemanticLayerProcessor

processor = CascadeSemanticLayerProcessor(config)
processor.knowledge_base.load_from_directory("path/to/knowledge")

# Get graph statistics
processor.knowledge.print_graph_summary()

# Analyze concept relationships
processor.analyze_knowledge_graph()
```
## Contributing

Contributions are welcome! Please read our Contributing Guidelines first.

1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments

- Thanks to the open-source NLP community
- Special thanks to LocalLLaMA for never letting anything slip by them unnoticed, and for just being an awesome community overall