Add yapremisrw2, yet another PREMIS reader/writer plugin #34
base: master
After this is merged, a 0.2.1 release should be created so that artefactual/archivematica-storage-service#262 can require it.
Added yapremisrw, yet another PREMIS reader/writer plugin. This was based on previous work by mcantelon. It has been rebased against current master and modified minimally to make it a dependency (plugin) injectable into `metsrw.fsentry.FSEntry`.
This looks good to me. I'm happy to see how your plugin solution is working. Do you think we should try to pick @mcantelon's brain to review it? It's been almost two years since his original work but hey I'm sure he remembers. @mcantelon, are you watching this?
setup.py (outdated)
here = path.abspath(path.dirname(__file__))

# Get the long description from the relevant file
- with open(path.join(here, 'README.md'), encoding='utf-8') as f:
+ with open(path.join(HERE, 'README.md'), encoding='utf-8') as f:
You haven't renamed the global yet, right? It still reads `here`.
setup.py (outdated)
    long_description = f.read()


def get_version():
This was also done in archivematica-fpr-admin. It looks a bit different there. Which method do you prefer?
Yes, I think raising a `RuntimeError` if no version is retrievable is probably a good idea, as is allowing for UTF-8-encoded README and `__init__.py` files. Also, the regular expression approach is shorter. I'll copy that code here.
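A minimal sketch of the regular-expression approach discussed above, assuming the version is declared as `__version__ = '...'` in a package `__init__.py`. The function signature, the path argument and the exact pattern are illustrative assumptions, not the code from archivematica-fpr-admin.

```python
import codecs
import re


def get_version(init_path):
    """Return the version string declared in the file at ``init_path``.

    Assumption: the file contains a line such as ``__version__ = '0.2.1'``.
    """
    # codecs.open decodes UTF-8 on both Python 2 and Python 3.
    with codecs.open(init_path, encoding='utf-8') as f:
        contents = f.read()
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", contents, re.MULTILINE)
    if match is None:
        # Fail loudly rather than packaging with an empty version.
        raise RuntimeError('No version found in {}'.format(init_path))
    return match.group(1)
```

In a `setup.py` this might be called as `version=get_version(path.join(here, 'metsrw', '__init__.py'))`, where the package path is again an assumption.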
tests/plugins/dc/test_dublincore.py (outdated)
import tests.helpers as h


class TestDublinCoreXmlData(TestCase):
These tests live in `tests/plugins/dc`. Shouldn't it be `tests/plugins/dcrw`?
yes, yes it should.
@@ -0,0 +1,33 @@
# -*- coding: utf8 -*-
With you, coverage only gets better! 📈
I realize coveralls can be annoying when it fails the build just because the coverage decreased by 0.01% 😆, but if this gets in your way again you could also change the threshold settings. I'm not sure what the defaults are, but they're surely very small!
Good to know about the threshold settings. The test coverage actually decreased by about 0.5%, but yes it was annoying to have to add more tests to get the CI checks to pass.
tests/test_validate.py (outdated)
def mockisfile(path):
    if path == bad_path:
        return False
    return True
Just curious, is this the Pythonic way of `return not path == bad_path`? I like explicit, just wondering!
I think what you've written is equally Pythonic. Maybe I'll add some parens so nobody has to think about operator precedence: `return not (path == bad_path)`.
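As a quick check of the exchange above, the following sketch compares the explicit form with the parenthesized one-liner. The value of `bad_path` and the two function names are hypothetical; the real test defines its own.

```python
bad_path = '/no/such/file'  # hypothetical value, for illustration only


def mockisfile_explicit(path):
    # The form in the reviewed diff.
    if path == bad_path:
        return False
    return True


def mockisfile_compact(path):
    # The parenthesized one-liner suggested in the reply.
    return not (path == bad_path)


# Both forms agree for the bad path and for any other path.
for candidate in (bad_path, '/some/other/file'):
    assert mockisfile_explicit(candidate) == mockisfile_compact(candidate)
```

For ordinary strings, `return path != bad_path` would behave the same way.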
Also moved tests/plugins/dc to tests/plugins/dcrw to be consistent with metsrw/plugins.
metsrw/plugins/dcrw/dc.py (outdated)
    """
    DC_ELEMENTS = ['title', 'creator', 'subject', 'description', 'publisher', 'contributor', 'date', 'format', 'identifier', 'source', 'relation', 'language', 'coverage', 'rights']

    def __init__(self, title=None, creator=None, subject=None, description=None, publisher=None, contributor=None, date=None, format=None, identifier=None, source=None, relation=None, language=None, coverage=None, rights=None):
Comment by @Hwesta: This is a great use case for kwargs instead of accessing locals() directly. We could replace the list of parameters with **kwargs, which would allow us to accept an arbitrary number of inputs. kwargs is a dictionary. Then the loop could be setattr(self, element, kwargs[element])
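A minimal sketch of the `**kwargs` suggestion above. The class name `DublinCoreSketch` is hypothetical, and two details go beyond the comment and are assumptions: `kwargs.get(element)` is used so that omitted elements default to `None` instead of raising `KeyError`, and unknown keywords are rejected so that typos do not pass silently.

```python
class DublinCoreSketch(object):
    """Hypothetical stand-in for the plugin's Dublin Core class."""

    DC_ELEMENTS = [
        'title', 'creator', 'subject', 'description', 'publisher',
        'contributor', 'date', 'format', 'identifier', 'source',
        'relation', 'language', 'coverage', 'rights']

    def __init__(self, **kwargs):
        # Assumption: reject keywords that are not Dublin Core elements.
        unknown = sorted(set(kwargs) - set(self.DC_ELEMENTS))
        if unknown:
            raise TypeError(
                'Unexpected Dublin Core elements: {}'.format(
                    ', '.join(unknown)))
        for element in self.DC_ELEMENTS:
            # .get() keeps the "defaults to None" behaviour of the
            # original explicit parameter list.
            setattr(self, element, kwargs.get(element))


record = DublinCoreSketch(title='Annual report', creator='Records office')
```

Compared with reading `locals()`, this keeps the element list in one place (`DC_ELEMENTS`), at the cost of a less self-documenting signature.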
parser = etree.XMLParser(remove_blank_text=True)
if isinstance(document, six.string_types):
    self.document = etree.fromstring(document, parser=parser)
elif isinstance(document, (etree._Element, list)):
    self.document = document
self.mdtype = mdtype
self.othermdtype = othermdtype
self.data = data
Comment by @Hwesta: Rather than creating a new attribute, this could use document to store the child document - whether that's a string, ElementTree, or plugin class of the appropriate type.
metsrw/plugins/yapremisrw/event.py (outdated)
if event_type is None:
    raise ConstructError('event_type argument is required.')

if event_datetime is None:
Comment by @Hwesta: This feels unpythonic - these parameters are required by the function, and if they're None, it will raise an error later.
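One possible reading of this comment is to declare the arguments as required positional parameters and let Python itself reject a call that omits them. The sketch below illustrates that reading only; `EventSketch` is a hypothetical stand-in, not the plugin's event class.

```python
class EventSketch(object):
    """Hypothetical event class with required positional arguments."""

    def __init__(self, event_type, event_datetime):
        # No "is None" checks: a missing argument fails at call time.
        self.event_type = event_type
        self.event_datetime = event_datetime


try:
    EventSketch('ingestion')  # event_datetime omitted
except TypeError as err:
    missing_argument_error = err
```

Note that this does not stop a caller from passing `None` explicitly; whether that case still needs its own check is a design decision the comment leaves open.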
metsrw/plugins/yapremisrw/event.py (outdated)

event_type_el = etree.Element(utils.lxmlns('premis') + 'eventType')
event_type_el.text = self.event_type
root.append(event_type_el)
Comment by @Hwesta: Tip: These three lines can be etree.SubElement(root, utils.lxmlns('premis') + 'eventType').text = self.event_type
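The sketch below compares the three-line form with the `SubElement` one-liner from the tip. To stay self-contained it uses the standard library's `xml.etree.ElementTree` in place of lxml (both provide `Element` and `SubElement`), and `PREMIS_NS` is an assumed stand-in for whatever `utils.lxmlns('premis')` returns.

```python
import xml.etree.ElementTree as etree

# Assumption: a Clark-notation namespace prefix standing in for
# utils.lxmlns('premis').
PREMIS_NS = '{info:lc/xmlns/premis-v2}'


def build_with_append(event_type):
    root = etree.Element(PREMIS_NS + 'event')
    event_type_el = etree.Element(PREMIS_NS + 'eventType')
    event_type_el.text = event_type
    root.append(event_type_el)
    return root


def build_with_subelement(event_type):
    root = etree.Element(PREMIS_NS + 'event')
    # SubElement creates the child, appends it to root and returns it,
    # so the text can be set on the returned element directly.
    etree.SubElement(root, PREMIS_NS + 'eventType').text = event_type
    return root


# Both builders serialize to the same XML.
assert (etree.tostring(build_with_append('ingestion')) ==
        etree.tostring(build_with_subelement('ingestion')))
```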
Before this, calling `get_subsections_of_type` could raise `IndexError` if the `FSEntry` instance had no amdSecs.
Added yapremisrw, yet another PREMIS reader/writer plugin. This was based on previous work by mcantelon. It has been rebased against current master and modified minimally to make it a dependency (plugin) injectable into `metsrw.fsentry.FSEntry`.
This should supersede #20. The CR/comments from that PR can, in large part, be applied to this one.