Reads a hathifiles text file one line at a time and creates HTRC Feature Extraction metadata json files

mfall3/HTRC-FE-metadata-formatter

HTRC-FE-metadata-formatter

Reads a hathifiles text file one line at a time and creates HTRC Feature Extraction metadata JSON files. These metadata files are used when generating Feature Extraction files as described at https://sandbox.htrc.illinois.edu/HTRC-UI-Portal2/Features.

Usage: python3 hathifiles2FE.py hathifile [outDirectory [startLine [endLine]]]

A log file named hathifiles2FE_log_[timestamp].txt will be created in outDirectory.
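As a rough illustration of the log naming above, the path could be built as follows. This is a minimal sketch, not the script's actual code: the function name make_log_path and the timestamp layout (YYYYMMDD_HHMMSS) are assumptions, since the README does not say how the timestamp is formatted.

```python
import os
from datetime import datetime


def make_log_path(out_directory, now=None):
    """Build a path of the form hathifiles2FE_log_[timestamp].txt."""
    # Assumption: the YYYYMMDD_HHMMSS layout is illustrative only; the
    # format used by hathifiles2FE.py is not documented in this README.
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return os.path.join(out_directory, f"hathifiles2FE_log_{stamp}.txt")
```

Passing a fixed datetime as the second argument makes the result reproducible, which is convenient when testing.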

Arguments are positional, so an optional argument can be omitted only if every argument after it is also omitted. For example, if startLine is provided, outDirectory must also be provided.
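The positional behaviour can be sketched with argparse, where each optional positional uses nargs="?". This is a hypothetical parser written for illustration, not the one in hathifiles2FE.py; it uses the defaults from the argument list and assumes that line numbers are integers starting at 1.

```python
import argparse


def parse_args(argv):
    """Parse hathifile plus up to three optional positional arguments."""
    parser = argparse.ArgumentParser(prog="hathifiles2FE.py")
    parser.add_argument("hathifile")
    # Each nargs="?" positional is filled left to right, so startLine can
    # only be given once outDirectory has been given.
    parser.add_argument("outDirectory", nargs="?", default=".")
    # Assumption: startLine is 1-based; None for endLine means "to the end".
    parser.add_argument("startLine", nargs="?", type=int, default=1)
    parser.add_argument("endLine", nargs="?", type=int, default=None)
    return parser.parse_args(argv)
```

For instance, parse_args(["hathi.txt", "out", "5"]) sets outDirectory to "out" and startLine to 5, and leaves endLine as None.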

Arguments:

  • hathifile is the filename of the tab-delimited text file containing metadata from HathiTrust

  • outDirectory is the destination directory for the output metadata JSON files

    • optional
    • default is the current directory
  • startLine is the first line of the hathifile to be processed

    • optional
    • default is the first line of the hathifile
  • endLine is the last line of the hathifile to be processed

    • optional
    • default is the last line of the hathifile
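The line-range processing described by these arguments can be sketched as below. This is a minimal illustration under stated assumptions, not the script's implementation: the names select_lines and write_metadata, the output file names, and the JSON shape are all hypothetical, because the Feature Extraction metadata schema is not described in this README. It also assumes that line numbers are 1-based and that endLine is inclusive.

```python
import itertools
import json
import os


def select_lines(lines, start_line=1, end_line=None):
    """Return (line_number, line) pairs for the requested range."""
    # Assumption: numbering is 1-based and end_line is inclusive.
    numbered = enumerate(lines, start=1)
    return itertools.islice(numbered, start_line - 1, end_line)


def write_metadata(hathifile, out_directory=".", start_line=1, end_line=None):
    """Write one JSON file per selected line; return the number written."""
    os.makedirs(out_directory, exist_ok=True)
    written = 0
    # Iterating over the file handle reads one line at a time.
    with open(hathifile, encoding="utf-8", newline="") as handle:
        for line_number, line in select_lines(handle, start_line, end_line):
            fields = line.rstrip("\r\n").split("\t")
            # Hypothetical record shape and file name, for illustration only.
            record = {"line": line_number, "fields": fields}
            out_path = os.path.join(out_directory, f"line_{line_number}.json")
            with open(out_path, "w", encoding="utf-8") as out:
                json.dump(record, out, ensure_ascii=False)
            written += 1
    return written
```

Using itertools.islice over the open file keeps memory use flat for large hathifiles, since lines outside the range are skipped without being stored.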
