Enhances legal information extraction prompt
Improves the clarity and structure of the legal document analysis prompt by:
- Adding detailed instructions for different field types (boolean, enum, dates)
- Including validation and objectivity requirements
- Expanding schema with new result-related fields
- Providing more specific guidance on information extraction

Makes extraction more reliable and consistent by enforcing stricter rules for data extraction while maintaining language flexibility.
laugustyniak committed Dec 2, 2024
1 parent 203a37d commit c10d8e8
Showing 1 changed file with 30 additions and 7 deletions.
37 changes: 30 additions & 7 deletions juddges/prompts/information_extraction.py
@@ -28,20 +28,41 @@
 Format response as JSON:
 """

-EXTRACTION_PROMPT_TEMPLATE = """Act as a legal document tool that extracts information and answer questions based on judgements.
+EXTRACTION_PROMPT_TEMPLATE = """Act as a highly skilled legal analyst specializing in extracting structured information from court judgments.
-Instruction for extracting information from judgements:
-- Judgements are in {LANGUAGE} language, please extract information in {LANGUAGE}.
-- Do not provide information that are not explicitly mentioned in judgements. If you can't extract information from the text field, leave the field with empty string "".
+Your task is to carefully analyze the provided judgment text and extract specific information according to the schema provided.
-Follow the following YAML structure to extract information and answer questions based on judgements:
+Key instructions:
+- Language: Extract information in {LANGUAGE}, maintaining the original language of the judgment
+- Accuracy: Only extract information that is explicitly stated in the text
+- Empty fields: Use empty string "" when information cannot be found
+- Consistency: Ensure extracted values match the specified data types and enums
+- Context: Consider the full context when extracting information
+- Validation: Double-check that extracted values are supported by the text
+- Objectivity: Extract factual information without interpretation
+For boolean fields:
+- Only mark as true when explicitly confirmed in the text
+- Default to false when information is unclear or not mentioned
+For enum fields:
+- Only use values from the provided options
+- Use empty string if none of the options match exactly
+For date fields:
+- Use ISO 8601 format (YYYY-MM-DD)
+- Extract complete dates when available
+- Leave empty if date is partial or ambiguous
+Schema for extraction:
 {SCHEMA}
+Judgment text to analyze:
 ====
 {TEXT}
 ====
-Format response as JSON:
+Format response as JSON, ensuring all schema fields are included:
 """

 EXAMPLE_SCHEMA = """verdict_date: date as ISO 8601
@@ -95,7 +116,9 @@
 zabezpieczenie_udzielone: boolean, description: "Czy udzielono zabezpieczenia", example: true
 rodzaj_zabezpieczenia: string, description: "Rodzaj zabezpieczenia", example: "Wstrzymanie egzekucji"
 zabezpieczenie_pierwsza_instancja: boolean, description: "Czy zabezpieczenia udzielił sąd I instancji", example: true
-czas_trwania_sprawy: string, description: "Czas rozpoznania sprawy – od złożenia pozwu do wydania wyroku", example: "2 lata 3 miesiące"""
+czas_trwania_sprawy: string, description: "Czas rozpoznania sprawy – od złożenia pozwu do wydania wyroku", example: "2 lata 3 miesiące
+wynik_sprawy: enum [Wygrana kredytobiorcy, Wygrana banku, Częściowe uwzględnienie roszczeń obu stron], description: "Ocena, czy bank czy kredytobiorca wygrał sprawę", example: "Wygrana kredytobiorcy"
+szczegoły_wyniku_sprawy: string, description: "Szczegóły dotyczące wyniku sprawy", example: "Kredytobiorca wygrał, umowa uznana za nieważną"""


 def prepare_information_extraction_chain_from_user_prompt() -> RunnableSequence:

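The revised template exposes three placeholders: `{LANGUAGE}`, `{SCHEMA}` and `{TEXT}`. A minimal sketch of how they might be filled, assuming plain `str.format` substitution; the commit does not show how the repository's chain does this. `TEMPLATE` is a shortened stand-in for `EXTRACTION_PROMPT_TEMPLATE`, and `build_prompt` is a hypothetical helper, not part of the commit.

```python
# Shortened stand-in for EXTRACTION_PROMPT_TEMPLATE; only the placeholders matter here.
TEMPLATE = """Key instructions:
- Language: Extract information in {LANGUAGE}, maintaining the original language of the judgment
- Empty fields: Use empty string "" when information cannot be found

Schema for extraction:
{SCHEMA}

Judgment text to analyze:
====
{TEXT}
====

Format response as JSON, ensuring all schema fields are included:
"""


def build_prompt(language: str, schema: str, text: str) -> str:
    """Hypothetical helper: substitute the three placeholders with str.format."""
    return TEMPLATE.format(LANGUAGE=language, SCHEMA=schema, TEXT=text)


prompt = build_prompt(
    language="Polish",
    schema="verdict_date: date as ISO 8601",
    text="Wyrok z dnia 2 grudnia 2024 r.",
)
```

One caveat with this approach: `str.format` treats any literal `{` or `}` in the template as a placeholder, so a template that later gains inline JSON examples would need those braces doubled.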

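The boolean, enum and date rules in the new prompt are instructions to the model, so nothing in this commit enforces them on the model's output. A possible post-processing check that mirrors them is sketched below. `FIELD_SPECS`, `normalize_field` and `normalize_response` are all hypothetical: the commit stores the schema as plain text, so the dictionary here is a hand-written assumption covering only a few of the fields.

```python
import json
from datetime import date

# Hypothetical, hand-written field specs; not parsed from EXAMPLE_SCHEMA.
FIELD_SPECS = {
    "verdict_date": {"type": "date"},
    "zabezpieczenie_udzielone": {"type": "boolean"},
    "wynik_sprawy": {
        "type": "enum",
        "options": [
            "Wygrana kredytobiorcy",
            "Wygrana banku",
            "Częściowe uwzględnienie roszczeń obu stron",
        ],
    },
    "szczegoły_wyniku_sprawy": {"type": "string"},
}


def normalize_field(value, spec):
    """Apply the prompt's per-type rules to a single extracted value."""
    kind = spec["type"]
    if kind == "boolean":
        # Only an explicit JSON true counts; anything else defaults to False.
        return value is True
    if kind == "enum":
        # Exact match against the provided options, otherwise empty string.
        return value if value in spec["options"] else ""
    if kind == "date":
        try:
            # Round-trip so only the full YYYY-MM-DD form is accepted.
            return value if date.fromisoformat(value).isoformat() == value else ""
        except (TypeError, ValueError):
            return ""
    # Plain string fields: empty string when missing or of the wrong type.
    return value if isinstance(value, str) else ""


def normalize_response(raw_json: str) -> dict:
    """Return every schema field, filling missing or invalid values with defaults."""
    data = json.loads(raw_json)
    return {name: normalize_field(data.get(name), spec) for name, spec in FIELD_SPECS.items()}


raw = '{"verdict_date": "2024-12", "wynik_sprawy": "Wygrana kredytobiorcy", "zabezpieczenie_udzielone": true}'
result = normalize_response(raw)
```

Here the partial date `"2024-12"` is dropped to an empty string, and the missing `szczegoły_wyniku_sprawy` field is added as an empty string so that all schema fields appear in the result.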