Home
The METS reader/writer (metsrw) is a library to make it easier to work with METS XML files in Python.
- Easy to use API
- Abstract away the details of XML
- Read and write a METS file
- Validate a METS file against a METS profile
- Works with METS files from a variety of sources
- Plugins to work with other standards inside METS, e.g. PREMIS, Dublin Core
- Python 2 & 3 support
- Fully tested
- Modular
- Documented
Please add and update use cases! If you think a use case is a bad idea, please suggest it for removal. Please also add ideas of what an API for a use case could look like!
Read a METS file
mets = metsrw.METSDocument.fromfile('path/to/file') # Reads a file
mets = metsrw.METSDocument.fromstring('<mets document>') # Parses a string
mets = metsrw.METSDocument.fromtree(lxml.ElementTree) # Parses an lxml.Element or lxml.ElementTree
Create a new METS file
mets = metsrw.METSDocument()
Note: Find == return one file; filter == return multiple files
Find a file by FILEID, GROUPID, FLocat@href, structMap/div@LABEL
Filter files by fileGrp@USE, structMap/div@TYPE
Find/filter dmdSec/techMD/digiprovMD/sourceMD/rightsMD by tag attributes (e.g. a digiprovMD with DC metadata, the PREMIS:OBJECT, all rightsMDs)
Find/filter objects by above
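To make the find/filter distinction concrete, here is a minimal sketch of what the library would have to do underneath. It uses only the standard library's `xml.etree.ElementTree`, not metsrw itself; `find_file`, `filter_files` and the sample document are hypothetical names invented for illustration, and the file identifier is assumed to live in the `ID` attribute of `mets:file`.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical sample document, just large enough to exercise both helpers.
SAMPLE = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-1" GROUPID="group-1">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM" xlink:href="objects/a.tif"/>
      </mets:file>
      <mets:file ID="file-2" GROUPID="group-2">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM" xlink:href="objects/b.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-3" GROUPID="group-1">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM" xlink:href="objects/a.tiff"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def find_file(root, file_id):
    """Find: return the single mets:file element with this ID, or None."""
    for element in root.iterfind(".//mets:file", NS):
        if element.get("ID") == file_id:
            return element
    return None


def filter_files(root, use):
    """Filter: return every mets:file inside a fileGrp with this USE."""
    return [
        file_element
        for group in root.iterfind(".//mets:fileGrp", NS)
        if group.get("USE") == use
        for file_element in group.iterfind("mets:file", NS)
    ]


root = ET.fromstring(SAMPLE)
print(find_file(root, "file-2").get("GROUPID"))
print([f.get("ID") for f in filter_files(root, "original")])
```

A real API would presumably hide the namespaces and return richer objects than raw elements; the sketch only shows the one-result versus many-results contract.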
Add a dmdSec with DublinCore metadata
Add a dmdSec with other metadata
Add multiple dmdSecs for the same object
Replace a dmdSec with an updated one and version the change
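As a rough illustration of the XML these dmdSec use cases produce, the sketch below builds a `dmdSec` wrapping Dublin Core elements with the standard library's `xml.etree.ElementTree`. It is not the metsrw API: `add_dc_dmdsec` and the IDs are hypothetical, and the sketch simply appends to the root rather than placing the section where the METS schema expects it.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS)
ET.register_namespace("dc", DC)


def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return "{%s}%s" % (namespace, tag)


def add_dc_dmdsec(mets_root, dmd_id, fields):
    """Append a dmdSec holding simple Dublin Core elements (hypothetical helper)."""
    dmdsec = ET.SubElement(mets_root, q(METS, "dmdSec"), {"ID": dmd_id})
    mdwrap = ET.SubElement(dmdsec, q(METS, "mdWrap"), {"MDTYPE": "DC"})
    xmldata = ET.SubElement(mdwrap, q(METS, "xmlData"))
    for name, value in fields.items():
        ET.SubElement(xmldata, q(DC, name)).text = value
    return dmdsec


root = ET.Element(q(METS, "mets"))
# Two dmdSecs that could later be linked to the same object via DMDID.
add_dc_dmdsec(root, "dmdSec_1", {"title": "Example object", "creator": "Unknown"})
add_dc_dmdsec(root, "dmdSec_2", {"title": "Example object (revised)"})
print(ET.tostring(root, encoding="unicode"))
```

How a replacement dmdSec is versioned is left open here, since the use case above does not say which attributes should record the change.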
Add an amdSec
Add multiple amdSecs for the same object
Add a techMD/digiprovMD/sourceMD/rightsMD
Add multiple techMD/digiprovMD/sourceMD/rightsMD for the same object
Add techMD/digiprovMD/sourceMD/rightsMD to a specific amdSec if multiple
Replace a techMD/digiprovMD/sourceMD/rightsMD with an updated one and version the change
Add a techMD/digiprovMD/sourceMD/rightsMD containing PREMIS metadata (event, agent, object, etc.)
Add an mdWrap or mdRef inside a techMD/digiprovMD/sourceMD/rightsMD
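One detail an API for these amdSec use cases has to handle is ordering: the METS schema expects the subsections of an `amdSec` in the sequence techMD, rightsMD, sourceMD, digiprovMD. The sketch below shows one way to insert a new subsection at a valid position, using only `xml.etree.ElementTree`. `add_subsection` is a hypothetical helper, not part of metsrw, and the empty `xmlData` stands in for real PREMIS content.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS)

# Order in which the METS schema expects subsections inside an amdSec.
SUBSECTION_ORDER = ["techMD", "rightsMD", "sourceMD", "digiprovMD"]


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return "{%s}%s" % (METS, tag)


def add_subsection(amdsec, kind, section_id, mdtype):
    """Insert a new subsection after the last sibling of equal or earlier rank."""
    if kind not in SUBSECTION_ORDER:
        raise ValueError("unknown amdSec subsection: %s" % kind)
    rank = SUBSECTION_ORDER.index(kind)
    position = 0
    for index, child in enumerate(amdsec):
        local_name = child.tag.split("}")[-1]
        if local_name in SUBSECTION_ORDER and SUBSECTION_ORDER.index(local_name) <= rank:
            position = index + 1
    section = ET.Element(q(kind), {"ID": section_id})
    mdwrap = ET.SubElement(section, q("mdWrap"), {"MDTYPE": mdtype})
    ET.SubElement(mdwrap, q("xmlData"))  # placeholder for the wrapped metadata
    amdsec.insert(position, section)
    return section


amdsec = ET.Element(q("amdSec"), {"ID": "amdSec_1"})
add_subsection(amdsec, "digiprovMD", "digiprovMD_1", "PREMIS:EVENT")
add_subsection(amdsec, "techMD", "techMD_1", "PREMIS:OBJECT")
add_subsection(amdsec, "rightsMD", "rightsMD_1", "PREMIS:RIGHTS")
print([child.get("ID") for child in amdsec])
```

Subsections of the same kind keep their insertion order, which matters when several techMDs or digiprovMDs describe the same object.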
Add a new object
Add a new object with children
Set an object's: fileSec href, USE & CHECKSUM; structMap LABEL & TYPE; FILEID; GROUPID
Sensibly derive the above from a subset of the information
Set where DMDIDs and ADMIDs appear (structMap vs fileSec)
Define a structMap
Define a second structMap with a different structure
Define a second structMap with the same structure but different labels
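The structMap use cases can be sketched as building nested `div` elements from a plain description of the hierarchy, then reusing that description with a label mapping for the second map. This is an illustration with `xml.etree.ElementTree` only; `add_structmap`, `add_div`, the dictionary layout and the `TYPE` values are assumptions made for the example, not the metsrw API.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return "{%s}%s" % (METS, tag)


def add_div(parent, node, labels=None):
    """Recursively add a div (and an fptr for files) for one node of the tree."""
    labels = labels or {}
    label = labels.get(node["label"], node["label"])
    div = ET.SubElement(parent, q("div"), {"TYPE": node["type"], "LABEL": label})
    if "file_id" in node:
        ET.SubElement(div, q("fptr"), {"FILEID": node["file_id"]})
    for child in node.get("children", []):
        add_div(div, child, labels)
    return div


def add_structmap(mets_root, map_id, map_type, tree, labels=None):
    """Append a structMap built from a nested dict description (hypothetical helper)."""
    structmap = ET.SubElement(
        mets_root, q("structMap"), {"ID": map_id, "TYPE": map_type}
    )
    add_div(structmap, tree, labels)
    return structmap


# Hypothetical hierarchy: one directory holding two files.
TREE = {
    "label": "objects",
    "type": "Directory",
    "children": [
        {"label": "a.tif", "type": "Item", "file_id": "file-1"},
        {"label": "b.tif", "type": "Item", "file_id": "file-2"},
    ],
}

root = ET.Element(q("mets"))
add_structmap(root, "structMap_1", "physical", TREE)
# Same structure, different labels.
add_structmap(
    root, "structMap_2", "logical", TREE,
    labels={"a.tif": "Front cover", "b.tif": "Back cover"},
)
print(ET.tostring(root, encoding="unicode"))
```

A second structMap with a different structure would simply pass a different tree that points at the same `FILEID` values.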
Specify a METS profile to affect parsing & validation
Add a new METS profile
Validate a METS file against a profile
Write a METS file
mets.write('path/to/output') # Writes to a file
mets.tofile('path/to/output') # Writes to a file
mets.tostring() # Outputs a string
mets.serialize() # Outputs lxml.Element
mets.totree() # Outputs lxml.Element or lxml.ElementTree
METS can wrap other standards, notably PREMIS and Dublin Core. Since this is the METS reader/writer, support for these should be kept in optional plugins and used as needed.
Possible external libraries: