Add yapremisrw2, yet another PREMIS reader/writer plugin #34

jrwdunham · 2017-11-09T05:32:07Z

Added yapremisrw, yet another PREMIS reader/writer plugin. This was based on previous work by mcantelon. It has been rebased against current master and modified minimally to make it a dependency (plugin) injectable into metsrw.fsentry.FSEntry.

This should supersede #20. The CR/comments from that PR can, in large part, be applied to this one.

jrwdunham · 2017-11-10T23:35:18Z

After this is merged, a 0.2.1 release should be created so that artefactual/archivematica-storage-service#262 can require it.

sevein

This looks good to me. I'm happy to see how your plugin solution is working. Do you think we should try to pick @mcantelon's brain to review it? It's been almost two years since his original work but hey I'm sure he remembers. @mcantelon, are you watching this?

sevein · 2017-11-11T15:53:44Z

setup.py


 here = path.abspath(path.dirname(__file__))

 # Get the long description from the relevant file
-with open(path.join(here, 'README.md'), encoding='utf-8') as f:
+with open(path.join(HERE, 'README.md'), encoding='utf-8') as f:


You haven't renamed the global yet, right? It still reads `here`.

sevein · 2017-11-11T15:55:47Z

setup.py

    long_description = f.read()


+def get_version():


This was also done in archivematica-fpr-admin. It looks a bit different there. Which method do you prefer?

jrwdunham · 2017-11-13

Yes, I think raising a RuntimeError when no version can be retrieved is probably a good idea, as is allowing for UTF-8-encoded README and `__init__.py` files. Also, the regular expression approach is shorter. I'll copy that code here.
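
For reference, a minimal sketch of the regular-expression approach described in this reply. It is not the code from archivematica-fpr-admin or from this PR; the function signature, the `init_path` parameter, and the exact pattern are assumptions chosen for illustration.

```python
import io
import re


def get_version(init_path):
    """Return the __version__ string declared in the file at init_path.

    Hypothetical helper: the parameter name and the regex are assumptions.
    """
    # io.open accepts encoding= on both Python 2 and Python 3, so a
    # UTF-8-encoded __init__.py is read correctly on either.
    with io.open(init_path, encoding='utf-8') as f:
        contents = f.read()
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", contents, re.MULTILINE)
    if match is None:
        # Fail loudly instead of packaging with an empty or default version.
        raise RuntimeError('No version found in {}'.format(init_path))
    return match.group(1)
```

In `setup.py` this would be called with the path to the package's `__init__.py`, for example `get_version(path.join(here, 'metsrw', '__init__.py'))`, assuming the version string lives there.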

sevein · 2017-11-11T16:06:34Z

tests/plugins/dc/test_dublincore.py

+import tests.helpers as h
+
+
+class TestDublinCoreXmlData(TestCase):


These tests live in tests/plugins/dc. Shouldn't it be tests/plugins/dcrw?

jrwdunham · 2017-11-13

Yes, yes it should.

sevein · 2017-11-11T16:16:46Z

tests/test_validate.py

@@ -0,0 +1,33 @@
+# -*- coding: utf8 -*-


With you, coverage only gets better! 📈

I realize coveralls can be annoying when it fails the build just because coverage decreased by 0.01% 😆 - but if this gets in your way again, you could also change the threshold settings. I'm not sure what the defaults are, but they must be very small!

jrwdunham · 2017-11-13

Good to know about the threshold settings. The test coverage actually decreased by about 0.5%, but yes, it was annoying to have to add more tests to get the CI checks to pass.

sevein · 2017-11-11T16:18:27Z

tests/test_validate.py

+    def mockisfile(path):
+        if path == bad_path:
+            return False
+        return True


Just curious, is this the Pythonic way of writing `return not path == bad_path`? I like explicit, just wondering!

jrwdunham · 2017-11-13

I think what you've written is equally Pythonic. Maybe I'll add some parens so nobody has to think about operator precedence: `return not (path == bad_path)`.
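
A small sketch of the equivalent forms discussed here. In Python, `not` binds more loosely than `==`, so `not path == bad_path` already parses as `not (path == bad_path)`; for strings, `path != bad_path` gives the same result. The factory function `make_mock_isfile` is a hypothetical wrapper added only to make the example self-contained.

```python
def make_mock_isfile(bad_path):
    """Build a stand-in for os.path.isfile that rejects only bad_path.

    Hypothetical helper; the test in this PR defines mockisfile inline.
    """
    def mockisfile(path):
        # Same result as the four-line if/return version for string paths.
        return path != bad_path
    return mockisfile
```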

jrwdunham

I have copied relevant comments from #20 to this PR and closed PR #20.

jrwdunham · 2017-11-14T00:08:24Z

metsrw/plugins/dcrw/dc.py

+    """
+    DC_ELEMENTS = ['title', 'creator', 'subject', 'description', 'publisher', 'contributor', 'date', 'format', 'identifier', 'source', 'relation', 'language', 'coverage', 'rights']
+
+    def __init__(self, title=None, creator=None, subject=None, description=None, publisher=None, contributor=None, date=None, format=None, identifier=None, source=None, relation=None, language=None, coverage=None, rights=None):


Comment by @Hwesta: This is a great use case for kwargs instead of accessing locals() directly. We could replace the list of parameters with **kwargs, which would allow us to accept an arbitrary number of inputs. kwargs is a dictionary. Then the loop could be setattr(self, element, kwargs[element])
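
A minimal sketch of the `**kwargs` suggestion, assuming omitted elements should default to `None` as in the original signature. It uses `kwargs.get(element)` rather than `kwargs[element]`, because the latter would raise `KeyError` for any element the caller leaves out. The class name `DublinCoreSketch` and the `TypeError` on unknown keys are illustrative assumptions, not the code in this PR.

```python
class DublinCoreSketch(object):
    """Hypothetical stand-in for the Dublin Core class in dc.py."""

    DC_ELEMENTS = ['title', 'creator', 'subject', 'description',
                   'publisher', 'contributor', 'date', 'format',
                   'identifier', 'source', 'relation', 'language',
                   'coverage', 'rights']

    def __init__(self, **kwargs):
        # Reject misspelled or unsupported element names early (assumption:
        # the real class may prefer to ignore them instead).
        unknown = set(kwargs) - set(self.DC_ELEMENTS)
        if unknown:
            raise TypeError('Unexpected Dublin Core elements: {}'.format(
                ', '.join(sorted(unknown))))
        # Omitted elements default to None, matching the keyword defaults
        # of the original explicit signature.
        for element in self.DC_ELEMENTS:
            setattr(self, element, kwargs.get(element))
```

One trade-off of this design is that the accepted keywords no longer appear in the function signature, so the explicit unknown-key check stands in for the error Python would otherwise raise on its own.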

jrwdunham · 2017-11-14T00:09:19Z

metsrw/metadata.py

        parser = etree.XMLParser(remove_blank_text=True)
        if isinstance(document, six.string_types):
            self.document = etree.fromstring(document, parser=parser)
        elif isinstance(document, (etree._Element, list)):
            self.document = document
        self.mdtype = mdtype
        self.othermdtype = othermdtype
+        self.data = data


Comment by @Hwesta: Rather than creating a new attribute, this could use document to store the child document - whether that's a string, ElementTree, or plugin class of the appropriate type.

jrwdunham · 2017-11-14T00:10:28Z

metsrw/plugins/yapremisrw/event.py

+        if event_type is None:
+            raise ConstructError('event_type argument is required.')
+
+        if event_datetime is None:


Comment by @Hwesta: This feels unpythonic - these parameters are required by the function, and if they're None, it will raise an error later.
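
One way to read this comment, sketched under assumptions: declare the required parameters without `None` defaults so that Python itself raises `TypeError` when they are omitted. The class name `EventSketch` and the optional `event_detail` parameter are hypothetical. Note that this does not reject an explicit `event_type=None`; a check like the one in the diff would still be needed for that case.

```python
class EventSketch(object):
    """Hypothetical stand-in for the event class in yapremisrw/event.py."""

    def __init__(self, event_type, event_datetime, event_detail=None):
        # event_type and event_datetime have no defaults, so omitting them
        # fails at call time with a TypeError naming the missing argument.
        self.event_type = event_type
        self.event_datetime = event_datetime
        self.event_detail = event_detail  # optional; name is an assumption
```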

jrwdunham · 2017-11-14T00:11:13Z

metsrw/plugins/yapremisrw/event.py

+
+        event_type_el = etree.Element(utils.lxmlns('premis') + 'eventType')
+        event_type_el.text = self.event_type
+        root.append(event_type_el)


Comment by @Hwesta: Tip: These three lines can be etree.SubElement(root, utils.lxmlns('premis') + 'eventType').text = self.event_type
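
A self-contained sketch of the `SubElement` tip. To keep it runnable without third-party packages it uses the standard library's `xml.etree.ElementTree`, which offers `SubElement` with the same call shape as `lxml.etree`; the PR itself uses lxml. The `PREMIS_NS` constant stands in for `utils.lxmlns('premis')` and its value is an assumption, as is the `build_event` helper.

```python
import xml.etree.ElementTree as etree

# Assumed stand-in for utils.lxmlns('premis'): a namespace in Clark notation.
PREMIS_NS = '{info:lc/xmlns/premis-v2}'


def build_event(event_type):
    """Return a premis:event element with a single eventType child."""
    root = etree.Element(PREMIS_NS + 'event')
    # SubElement creates the child, appends it to root, and returns it,
    # so the text can be assigned in the same statement.
    etree.SubElement(root, PREMIS_NS + 'eventType').text = event_type
    return root
```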

jrwdunham added the enhancement label Nov 9, 2017

jrwdunham self-assigned this Nov 9, 2017

jrwdunham mentioned this pull request Nov 9, 2017

Dev/issue 8894 premis parsing #20

Closed

jrwdunham force-pushed the dev/issue-11581-premis-parsing branch from 02d14c9 to cd8113c Compare November 10, 2017 03:13

jrwdunham mentioned this pull request Nov 10, 2017

Add API endpoints for search artefactual/archivematica-storage-service#262

Closed

jrwdunham force-pushed the dev/issue-11581-premis-parsing branch from cd8113c to d5a1645 Compare November 10, 2017 23:19

jrwdunham changed the title ~~Add premisrw2, a second PREMIS plugin~~ Add yapremisrw2, yet another PREMIS reader/writer plugin Nov 10, 2017

jrwdunham added 2 commits November 10, 2017 17:04

Improved version retrieval in setup.py

ea868f3

jrwdunham force-pushed the dev/issue-11581-premis-parsing branch from f1a1bea to ee177cf Compare November 11, 2017 01:05

Bump version to 0.2.1

804a295

sevein self-requested a review November 11, 2017 16:19

sevein approved these changes Nov 11, 2017

Improve version retrieval in setup.py again

afa67e2

Also moved tests/plugins/dc to tests/plugins/dcrw to be consistent with metsrw/plugins.

jrwdunham requested a review from mcantelon November 14, 2017 00:07

jrwdunham commented Nov 14, 2017

jrwdunham added 2 commits November 29, 2017 13:25

Allow fsentry instances with no amdSecs

13d78b1

Before this, calling `get_subsections_of_type` could raise `IndexError` if the `FSEntry` instance had no amdSecs.
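
A hypothetical illustration of the failure mode and the kind of guard this commit message describes. It is not the metsrw implementation: `FSEntrySketch`, the tuple-based amdSec model, and the return value for the empty case are all assumptions.

```python
class FSEntrySketch(object):
    """Hypothetical stand-in for metsrw.fsentry.FSEntry; not the real class."""

    def __init__(self, amdsecs=None):
        # Each amdSec is modelled as a list of (subsection_type, payload)
        # tuples purely for illustration.
        self.amdsecs = list(amdsecs) if amdsecs else []

    def get_subsections_of_type(self, subsection_type):
        # Without this guard, self.amdsecs[0] raises IndexError when the
        # entry has no amdSecs.
        if not self.amdsecs:
            return []
        return [payload for kind, payload in self.amdsecs[0]
                if kind == subsection_type]
```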

WIP: code review fixes

55a394e
