- Support for conversations with message history, including a new `message_history` parameter for LLM interactions.
- Ability to include system instructions and override them for specific invocations.
- Summarization of chat history to enhance query embedding and context handling.
- Updated LLM implementations to handle message history consistently across providers.
- The `id_prefix` parameter in the `LexicalGraphConfig` is deprecated.
- IDs for the Document and Chunk nodes in the lexical graph are now randomly generated and unique across multiple runs, fixing issues in the lexical graph where relationships were created between chunks that were created by different pipeline runs.
- Integrated `json-repair` package to handle and repair invalid JSON generated by LLMs.
- Introduced `InvalidJSONError` exception for handling cases where JSON repair fails.
- Ability to create a Pipeline or SimpleKGPipeline from a config file. See the example.
- Added `OllamaLLM` and `OllamaEmbeddings` classes to make Ollama support more explicit. Implementations using the `OpenAILLM` and `OpenAIEmbeddings` classes will still work.
- Updated LLM prompt for Entity and Relation extraction to include stricter instructions for generating valid JSON.
- Added schema functions to the documentation.
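
The JSON-repair entries above describe a parse, repair, then fail flow. A minimal sketch of that flow follows, using only the standard library. The repair step here is a deliberately naive stand-in and is not the algorithm used by the `json-repair` package. `InvalidJSONError` is redefined locally, and `parse_llm_json` is a hypothetical helper name.

```python
import json
import re


class InvalidJSONError(Exception):
    """Local stand-in named after the changelog entry; not the library's class."""


def parse_llm_json(text: str) -> object:
    """Parse LLM output as JSON, attempting one naive repair before giving up."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Naive repair (assumption): keep the outermost {...} span, which drops
    # surrounding prose or code fences, then remove trailing commas.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise InvalidJSONError("no JSON object found in LLM output")
    candidate = re.sub(r",\s*([}\]])", r"\1", text[start:end + 1])
    try:
        return json.loads(candidate)
    except json.JSONDecodeError as exc:
        raise InvalidJSONError("could not repair LLM output") from exc
```

The regular expression also rewrites a comma followed by a closing bracket inside string values, so this sketch is only suitable for illustration.
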
- Introduced optional lexical graph configuration for `SimpleKGPipeline`, enhancing flexibility in customizing node labels and relationship types in the lexical graph.
- Introduced optional `neo4j_database` parameter for `SimpleKGPipeline`, `Neo4jChunkReader` and `Text2CypherRetriever`.
- Ability to provide description and list of properties for entities and relations in the `SimpleKGPipeline` constructor.
- `neo4j_database` parameter is now used for all queries in the `Neo4jWriter`.
- Updated all examples to use `neo4j_database` parameter instead of an undocumented neo4j driver constructor.
- All `READ` queries are now routed to a reader replica (for clusters). This impacts all retrievers, the `Neo4jChunkReader` and `SinglePropertyExactMatchResolver` components.
- Made `relations` and `potential_schema` optional in `SchemaBuilder`.
- Added a check to prevent the use of deprecated Cypher syntax for Neo4j versions 5.23.0 and above.
- Added a `LexicalGraphBuilder` component to enable the import of the lexical graph (document, chunks) without performing entity and relation extraction.
- Added a `Neo4jChunkReader` component to be able to read chunk text from the database.
- Vector and Hybrid retrievers used with `return_properties` now also return the node labels (`nodeLabels`) and the node's element ID (`id`).
- `HybridRetriever` now filters out the embedding property index in `self.vector_index_name` from the retriever result by default.
- Removed support for neo4j.AsyncDriver in the KG creation pipeline, affecting Neo4jWriter and related components.
- Updated examples and unit tests to reflect the removal of async driver support.
- Resolved issue with `AzureOpenAIEmbeddings` incorrectly inheriting from `OpenAIEmbeddings`; it now inherits from `BaseOpenAIEmbeddings`.
- Introduced a `fail_if_exist` option to index creation functions to control behavior when an index already exists.
- Added Qdrant retriever in neo4j_graphrag.retrievers.
- Comprehensive rewrite of the README to improve clarity and provide detailed usage examples.
- Fix a bug where `openai` Python client and `numpy` were required to import any embedder or LLM.
- The value associated with the enum field `OnError.IGNORE` has been changed from "CONTINUE" to "IGNORE" to follow the convention and match the field name.
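
The practical effect of the `OnError.IGNORE` change is on code that looks the member up by value, for example when reading it from a configuration file. The sketch below uses a local stand-in enum. Only the `IGNORE` value is taken from the entry above, while the `RAISE` member and the `parse_on_error` helper are assumptions for illustration.

```python
from enum import Enum


class OnError(Enum):
    """Local stand-in for the library's enum."""

    RAISE = "RAISE"    # assumed member, shown for contrast
    IGNORE = "IGNORE"  # previously "CONTINUE"


def parse_on_error(raw: str) -> OnError:
    """Look up a member by its value, as a config loader might."""
    try:
        return OnError(raw)
    except ValueError as exc:
        raise ValueError(f"unknown on_error value: {raw!r}") from exc
```

With this stand-in, `parse_on_error("IGNORE")` succeeds and `parse_on_error("CONTINUE")` raises, so any stored configuration that still uses the old string would need updating. Access by name (`OnError.IGNORE`) is unaffected.
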
- Added `SinglePropertyExactMatchResolver` component allowing to merge entities with exact same property (e.g. name)
- Added the `SimpleKGPipeline` class, a simplified abstraction layer to streamline knowledge graph building processes from text documents.
- Added `SinglePropertyExactMatchResolver` component allowing to merge entities with exact same property (e.g. name)
- Added AzureOpenAILLM and AzureOpenAIEmbeddings to support Azure served OpenAI models
- Added `template` validation in `PromptTemplate` class upon construction.
- Examples demonstrating the use of Mistral embeddings and LLM in RAG pipelines.
- Added feature to include kwargs in `Text2CypherRetriever.search()` that will be injected into a custom prompt, if provided.
- Added validation to `custom_prompt` parameter of `Text2CypherRetriever` to ensure that `query_text` placeholder exists in prompt.
- Introduced a fixed size text splitter component for splitting text into specified fixed size chunks with overlap. Updated examples and tests to utilize this new component.
- Introduced Vertex AI LLM class for integrating Vertex AI models.
- Added unit tests for the Vertex AI LLM class.
- Added support for Cohere LLM and embeddings - added optional dependency to `cohere`.
- Added support for Anthropic LLM - added optional dependency to `anthropic`.
- Added support for MistralAI LLM - added optional dependency to `mistralai`.
- Added support for Qdrant - added optional dependency to `qdrant-client`.
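
The fixed size text splitter entry in the list above describes chunking text into fixed-size pieces with overlap. A minimal sketch of that idea follows. The function name `split_fixed_size` and its parameter names are assumptions for illustration and are not the component's actual interface.

```python
def split_fixed_size(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid a trailing overlap-only chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

For example, `split_fixed_size("abcdefghij", 4, 1)` advances three characters per chunk, so each chunk starts with the last character of the previous one.
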
- Resolved import issue with the Vertex AI Embeddings class.
- Fixed bug in `Text2CypherRetriever` using `custom_prompt` arg where the `search` method would not inject the `query_text` content.
- `custom_prompt` arg is now converted to `Text2CypherTemplate` class within the `Text2CypherRetriever.get_search_results` method.
- `Text2CypherTemplate` and `RAGTemplate` prompt templates now require `query_text` arg and will error if it is not present. Previous `query_text` aliases may be used, but will warn of deprecation.
- Resolved issue where Neo4jWriter component would raise an error if the start or end node ID was not defined properly in the input.
- Resolved issue where relationship types were not escaped in the insert Cypher query.
- Improved query performance in Neo4jWriter: created nodes now have a generic `__KGBuilder__` label and an index is created on the `__KGBuilder__.id` property. Moreover, insertion queries are now batched. Batch size can be controlled using the `batch_size` parameter in the `Neo4jWriter` component.
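
The batching mentioned for `Neo4jWriter` amounts to grouping rows so that each insertion query handles at most `batch_size` of them. The sketch below shows only that grouping step in plain Python. The `batched` helper is hypothetical and is not the writer's internal code.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(rows: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of rows, each holding at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]
```

Each yielded slice could then be sent as the parameter of one query, which reduces the number of round trips to the database compared with one query per row.
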
- Moved the Embedder class to the neo4j_graphrag.embeddings directory for better organization alongside other custom embedders.
- Removed query argument from the GraphRAG class' `.search` method; users must now use `query_text`.
- Neo4jWriter component now runs a single query to merge node and set its embeddings if any.
- Nodes created by the `Neo4jWriter` now have an extra `__KGBuilder__` label. Nodes from the entity graph also have an `__Entity__` label.
- Dropped support for Python 3.8 (end of life).
- Updated documentation links in README.
- Renamed deprecated package references in documentation.
- Introduction page to the documentation content tree.
- Introduced a new Vertex AI embeddings class for generating text embeddings using Vertex AI.
- Updated documentation to include OpenAI and Vertex AI embeddings classes.
- Added google-cloud-aiplatform as an optional dependency for Vertex AI embeddings.
- Make `pygraphviz` an optional dependency - it is now only required when calling `pipeline.draw`.
- Moved pygraphviz to optional dependencies under [tool.poetry.extras] in pyproject.toml to resolve an issue where pip install neo4j-graphrag incorrectly required pygraphviz as a mandatory dependency.
- Officially renamed neo4j-genai to neo4j-graphrag. For the final release version of neo4j-genai, please visit https://pypi.org/project/neo4j-genai/.
- The `neo4j-genai` package is now deprecated. Users are advised to switch to the new package `neo4j-graphrag`.
- Ability to visualise pipeline with `my_pipeline.draw("pipeline.png")`.
- `LexicalGraphBuilder` component to create the lexical graph without entity-relation extraction.
- Pipelines now return correct results when the same pipeline is run in parallel.
- Pipeline run method now returns a PipelineResult object.
- Improved parameter validation for pipelines (#124). Pipeline now raises an error before a run starts if:
  - the same parameter is mapped twice
  - or a parameter is defined in the mapping but is not a valid component input
- PDF-to-graph pipeline for knowledge graph construction in experimental mode
- Introduced support for Component/Pipeline flexible architecture.
- Added new components for knowledge graph construction, including text splitters, schema builders, entity-relation extractors, and Neo4j writers.
- Implemented end-to-end tests for the new knowledge graph builder pipeline.
- When saving the lexical graph in a KG creation pipeline, the document is also saved as a specific node, together with relationships between each chunk and the document they were created from.
- Corrected the hybrid retriever query to ensure proper normalization of scores in vector search results.
- Add optional custom_prompt arg to the Text2CypherRetriever class.
- `GraphRAG.search` method first parameter has been renamed `query_text` (was `query`) for consistency with the retrievers interface.
- Made `GraphRAG.search` method backwards compatible with the query parameter, raising warnings to encourage using query_text instead.
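
The rename with a backwards-compatible fallback described above follows a common deprecation pattern. A minimal sketch of that pattern is shown here on a free function. The function body, the default values, and the warning text are assumptions and do not reproduce `GraphRAG.search`.

```python
import warnings
from typing import Optional


def search(query_text: str = "", query: Optional[str] = None) -> str:
    """Hypothetical stand-in for a method whose first parameter was renamed."""
    if query is not None:
        # Old keyword still accepted, but callers are nudged to the new name.
        warnings.warn(
            "'query' is deprecated, use 'query_text' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if not query_text:
            query_text = query
    if not query_text:
        raise ValueError("query_text is required")
    return query_text
```

Passing `stacklevel=2` makes the warning point at the caller's line, which is where the old keyword needs to be replaced.
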
- Corrected initialization to allow specifying the embedding model name.
- Removed sentence_transformers from embeddings/__init__.py to avoid ImportError when the package is not installed.
- Stopped embeddings from being returned when searching with `VectorRetriever`. Added `nodeLabels` and `id` to the metadata of `VectorRetriever` results.
- Added `upsert_vector` utility function for attaching vectors to node properties.
- Introduced `Neo4jInsertionError` for handling insertion failures in Neo4j.
- Included Pinecone and Weaviate retrievers in neo4j_graphrag.retrievers.
- Introduced the GraphRAG object, enabling a full RAG (Retrieval-Augmented Generation) pipeline with context retrieval, prompt formatting, and answer generation.
- Added PromptTemplate and RagTemplate for customizable prompt generation.
- Added LLMInterface with implementation for OpenAI LLM.
- Updated project configuration to support multiple Python versions (3.8 to 3.12) in CI workflows.
- Improved developer experience by copying the docstring from the `Retriever.get_search_results` method to the `Retriever.search` method
- Support for specifying database names in index handling methods and retrievers.
- User Guide in documentation.
- Introduced result_formatter argument to all retrievers, allowing custom formatting of retriever results.
- Refactored import paths for retrievers to neo4j_graphrag.retrievers.
- Implemented exception chaining for all re-raised exceptions to improve stack trace readability.
- Made error messages in `index.py` more consistent.
- Renamed `Retriever._get_search_results` to `Retriever.get_search_results`
- Updated retrievers and index handling methods to accept optional database names.
- Removed Pinecone and Weaviate retrievers from __init__.py to prevent ImportError when optional dependencies are not installed.
- Moved few-shot examples in `Text2CypherRetriever` to the constructor for better initialization and usage. Updated unit tests and example script accordingly.
- Fixed regex warnings in E2E tests for Weaviate and Pinecone retrievers.
- Corrected HuggingFaceEmbeddings import in E2E tests.
- Introduced custom exceptions for improved error handling, including `RetrieverInitializationError`, `SearchValidationError`, `FilterValidationError`, `EmbeddingRequiredError`, `RecordCreationError`, `Neo4jIndexError`, and `Neo4jVersionError`.
- Retriever that integrates with a Weaviate vector database: `WeaviateNeo4jRetriever`.
- New return types that help with getting retriever results: `RetrieverResult` and `RetrieverResultItem`.
- Supported wrapper embedder object for sentence-transformers embeddings: `SentenceTransformerEmbeddings`.
- `Text2CypherRetriever` object which allows for the retrieval of records from a Neo4j database using natural language.
- Replaced `ValueError` with custom exceptions across various modules for clearer and more specific error messages.
- Updated documentation to include new custom exceptions.
- Improved the use of Pydantic for input data validation for retriever objects.
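
The entries above on replacing `ValueError` with custom exceptions can be illustrated with a small sketch that also chains the original error, so the low-level cause stays visible in the traceback. `SearchValidationError` is redefined locally as a stand-in, and `validate_top_k` is a hypothetical helper. Neither reproduces the library's classes or validation logic.

```python
class SearchValidationError(Exception):
    """Local stand-in named after the changelog entry; not the library's class."""


def validate_top_k(raw: str) -> int:
    """Convert raw to a positive integer or raise a domain-specific error."""
    try:
        top_k = int(raw)
    except (TypeError, ValueError) as exc:
        # "from exc" keeps the original error as __cause__ for the stack trace.
        raise SearchValidationError(
            f"top_k must be an integer, got {raw!r}"
        ) from exc
    if top_k < 1:
        raise SearchValidationError(f"top_k must be positive, got {top_k}")
    return top_k
```

Callers can then catch `SearchValidationError` specifically instead of a broad `ValueError`, while the chained cause remains available for debugging.
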