Merge marc_map and marc_spec #109

Open
nichtich opened this issue Sep 28, 2021 · 2 comments

Comments

@nichtich
Member

Having both marc_map and marc_spec is confusing. Why have both, with slightly different functionality? Can we make marc_map an alias for marc_spec instead?

@phochste
Member

The difference comes from the history of how these mappers were created. I wrote marc_map, and later @cKlee's research on MARC Spec resulted in his marc_spec implementation in Perl.

There is overlap between marc_map and marc_spec (Carsten worked on that), but there are some syntax differences. E.g.:

  • MARC Spec uses a dollar sign ($) to introduce subfields; marc_map doesn't have this (100$a vs 100a)
  • MARC Spec uses square brackets ([]) to specify the occurrence of a MARC tag; marc_map uses square brackets for indicator filtering (100[0] as the first 100 field vs 100[0] as the 100 field with indicator-1 == 0)
  • MARC Spec uses carets (^) to point to indicators

The marc_map syntax is also used in other fixes such as marc_replace_all, marc_remove, marc_cut, and marc_copy.

These things can be tackled by mapping the MARC paths themselves to the MARCSpec version and using the MARC Spec tools. This seems doable for all the fixes above, but requires time and effort.
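As a rough illustration of such a path mapping, here is a minimal Python sketch (the actual fixes are written in Perl). The function name `marc_map_to_spec` and the small path grammar it accepts are assumptions made for this example, not part of any existing module. It only covers the `$` difference from the list above and deliberately refuses paths with square brackets, because their meaning differs between the two syntaxes.

```python
import re

# Hypothetical, simplified marc_map path: a 3-character tag, an optional
# [...] part, and zero or more one-character subfield codes (e.g. "245ab").
MARC_MAP_PATH = re.compile(
    r"(?P<tag>[0-9A-Za-z]{3})(?P<bracket>\[[^\]]*\])?(?P<subfields>[a-z0-9]*)"
)


def marc_map_to_spec(path: str) -> str:
    """Translate a simple marc_map-style path to a MARC Spec-style path.

    Illustrative helper only: it prefixes each subfield code with "$" and
    rejects anything it does not understand.
    """
    match = MARC_MAP_PATH.fullmatch(path)
    if match is None:
        raise ValueError(f"unsupported path: {path!r}")
    if match.group("bracket"):
        # Brackets mean indicator filtering in marc_map but occurrence in
        # MARC Spec, so this sketch does not guess at a translation.
        raise ValueError(f"square brackets are not translated: {path!r}")
    subfields = "".join("$" + code for code in match.group("subfields"))
    return match.group("tag") + subfields


print(marc_map_to_spec("100a"))
print(marc_map_to_spec("245ab"))
```

A complete mapping would also need to rewrite indicator filters and any other syntax that differs, which is where most of the effort mentioned above would go.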

There is currently some overhead in using marc_spec: in my local tests, processing runs about 11% slower for every marc_spec used in place of a marc_map. For my own local applications, this overhead is a reason to prefer marc_map over marc_spec in some use cases (lots of data and only short time slots available for processing).

@nichtich
Member Author

Thanks for the clarification. To summarize:

  • make $ optional in marc_spec so subfield references are valid in both fixes
  • find out how to deal with square brackets
    • extend marc_spec to understand indicators in square brackets (at least when it includes a comma)?
    • raise a warning when just a number is given in square brackets
  • see whether more syntax clashes exist
  • improve performance of marc_spec
  • make all marc_map fixes aliases of the corresponding marc_spec fixes.
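One possible reading of the square-bracket items above, sketched in Python for illustration only (the real implementation would live in the Perl fixes). The function name `classify_bracket`, the return values, and the example path `100[1,0]a` are assumptions for this sketch: a comma inside the brackets is treated as an indicator filter, and a bare number is treated as an occurrence with a warning.

```python
import re
import warnings

# Hypothetical pattern: a 3-character tag followed directly by [...].
BRACKET = re.compile(r"(?P<tag>[0-9A-Za-z]{3})\[(?P<inner>[^\]]*)\]")


def classify_bracket(path: str):
    """Guess what the square brackets in a path are meant to express.

    Returns None when the path has no brackets after the tag,
    "indicators" when the bracket content contains a comma, and
    "occurrence" (with a warning) otherwise.
    """
    match = BRACKET.match(path)
    if match is None:
        return None
    if "," in match.group("inner"):
        return "indicators"
    # A bare number is ambiguous between the two syntaxes, so warn.
    warnings.warn(
        f"ambiguous square brackets in {path!r}: read as occurrence",
        UserWarning,
        stacklevel=2,
    )
    return "occurrence"


print(classify_bracket("100[1,0]a"))
print(classify_bracket("100a"))
```

Whether a bare number should default to the occurrence reading or to the indicator reading is exactly the open question in the list; the sketch picks one only to make the warning concrete.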
