1.3.8

@TimKoornstra TimKoornstra released this 19 Jan 10:18
6c80561

Release Notes for Loghi-HTR Version 1.3.8

Date: 2024-01-19

Overview

Version 1.3.8 of Loghi-HTR adds test list support, handles Out-of-Vocabulary (OOV) words more effectively, writes validation and test results to file, and separates the validation and evaluation datasets to give finer control over data normalization.

New Features

  • Enabling Test List Usage:

    • Added functionality to use a test_list for streamlined testing procedures.
  • OOV Handling:

    • Implemented handling for Out-of-Vocabulary (OOV) words.
    • Replaced [UNK] tokens with � (a rarely used character), so that each OOV token counts as a single character in Character Error Rate (CER) calculations.
  • Outputting Results to File:

    • Validation and test results can now be written to a .csv file in the output folder.
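
To illustrate why the OOV replacement matters for CER, here is a minimal sketch, not Loghi-HTR's actual implementation. The names `OOV_TOKEN`, `OOV_CHAR`, `edit_distance`, and `cer` are hypothetical and chosen for this example only. It assumes CER is the character-level Levenshtein distance divided by the reference length.

```python
# Hypothetical names for illustration; not taken from the Loghi-HTR codebase.
OOV_TOKEN = "[UNK]"
OOV_CHAR = "\ufffd"  # the replacement character shown as � in the notes


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(prediction: str, reference: str, collapse_oov: bool = True) -> float:
    """CER = edit distance / reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must not be empty")
    if collapse_oov:
        # Collapse the five-character token into a single character.
        prediction = prediction.replace(OOV_TOKEN, OOV_CHAR)
    return edit_distance(prediction, reference) / len(reference)


if __name__ == "__main__":
    print(cer("c[UNK]t", "cat"))                      # one substitution
    print(cer("c[UNK]t", "cat", collapse_oov=False))  # token counted as 5 chars
```

With the reference "cat" and the prediction "c[UNK]t", collapsing the token yields one substitution over three characters. Leaving the literal token in place yields five edits, so a single unknown character is penalized five times over.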

Enhancements

  • Data Normalization and Validation Process Updates:

    • Separated validation and evaluation datasets for more precise control:
      • validation_dataset: Used with the --do_validate option, not normalized.
      • evaluation_dataset: Used for evaluation during training, undergoes normalization.
  • Default OOV Token Settings:

    • OOV tokens are enabled by default for testing and validation, but not for training and evaluation.

Bug Fixes

  • General Stability and Performance Enhancements:
    • Addressed various minor issues to improve overall system stability and performance.

Contributors

  • @TimKoornstra: Responsible for the implementation of OOV handling, test list functionality, and enhancements to the data normalization and validation processes.

Full Changelog: 1.3.7...1.3.8