Dual-surnames are not displayed correctly on catalog #33

Open
jfy133 opened this issue Aug 24, 2020 · 4 comments

Comments

@jfy133
Contributor

jfy133 commented Aug 24, 2020

I just noticed that the Manubot catalog page parses/displays surnames incorrectly (and with an Anglo-centric bias).

image

I have two surnames: 'Fellows Yates' (it is not hyphenated), but only Yates is displayed. While this is unusual in, e.g., the UK, another co-author from Spain is displayed as only Valtueña, whereas the actual surname is 'Andrades Valtueña'. This is problematic because dual surnames are extremely common in Spain and in Hispanic cultures in South America (and elsewhere). It also matters for specificity during indexing, because the individual components of such surnames are often relatively common.

It would be great if this could be corrected. One option is a heuristic: treat only the first 'word' as the first name (multi-word first names are less common than multi-word surnames), assume any initials in names from the metadata.yaml file are middle names, and treat everything after that as the surname. A better (although harder) option would be to update metadata.yaml to have explicit forename, middle name and surname fields.
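The heuristic proposed above could be sketched roughly as follows. This is only an illustration, not Manubot code: `split_name` is a hypothetical helper, and it assumes names are whitespace-separated and that an initial is a single letter with an optional trailing period.

```python
import re


def split_name(full_name):
    """Heuristic split of a full name (hypothetical helper, not Manubot's API).

    The first word is the given name, any initials that follow are treated
    as middle names, and everything after that is the surname.
    """
    words = full_name.split()
    given = words[:1]
    rest = words[1:]
    # Move leading initials (e.g. "A." or "A") onto the given name,
    # but always leave at least one word for the surname.
    while len(rest) > 1 and re.fullmatch(r"[^\W\d_]\.?", rest[0]):
        given.append(rest.pop(0))
    return {"given": " ".join(given), "family": " ".join(rest)}


for name in ["James A. Fellows Yates", "Aida Andrades Valtueña", "Maxime Borry"]:
    print(split_name(name))
```

The trade-off is that a multi-word first name without initials (e.g. "Mary Ann Smith") would be misparsed, which is why explicit fields would be more robust.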

@dhimmel
Member

dhimmel commented Aug 24, 2020

Thanks @jfy133 for the feedback.

Currently, Manubot's default citation/reference style displays full names:

manubot cite --format=markdown https://apeltzer.github.io/eager2-paper
  1. Reproducible, portable, and efficient ancient genome reconstruction with nf-core/eager
    James A. Fellows Yates, Thiseas C. Lamnidis, Maxime Borry, Aida Andrades Valtueña, Zandra Fagernäs, Stephen Clayton, Maxime U. Garcia, Judith Neukamm, Alexander Peltzer
    Manubot (2020-08-06) https://apeltzer.github.io/eager2-paper/

But for https://manubot.org/catalog we did not use our CSL style, and instead wrote something custom for the webpage.

If we look at the metadata produced by manubot for your manubot manuscript, it is extracted from the HTML <head>:

  <meta name="author" content="James A. Fellows Yates" />
  <meta name="author" content="Thiseas C. Lamnidis" />
  <meta name="author" content="Maxime Borry" />
  <meta name="author" content="Aida Andrades Valtueña" />

So the HTML contains full names, but the names are being split during manubot cite https://apeltzer.github.io/eager2-paper:

    "author": [
      {
        "family": "Yates",
        "given": "James A. Fellows"
      },
      ...
      {
        "family": "Valtueña",
        "given": "Aida Andrades"
      },
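The split shown above is consistent with a rule that treats the last whitespace-separated word as the family name. The sketch below reproduces that output under this assumption; `split_last_word` is a hypothetical function for illustration and is not taken from the Zotero translator or from Manubot.

```python
def split_last_word(full_name):
    """Naive split (assumed behavior): last word is the family name."""
    # rpartition splits on the final space; a single-word name ends up
    # entirely in "family" with an empty "given".
    given, _, family = full_name.rpartition(" ")
    return {"family": family, "given": given}


print(split_last_word("James A. Fellows Yates"))
print(split_last_word("Aida Andrades Valtueña"))
```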

So the names in the HTML metadata are getting parsed incorrectly. This is probably being done by a Zotero translator, so the following suggestion is something we could pass along there:

It would be great if this would be corrected, e.g. only the first 'word' being listed as a first name (whereby single first names is less common than multiple surnames), any initials in names of the metadata.yaml file being assumed as middle names, and everything after as surname

But the most direct solution, since we're creating the metadata here, is to allow forenames and surnames in the metadata, as you suggest:

or more ideally (although harder), the metadata.yaml to be updated to have explicit forename, middle name and surname fields.

I'll continue to comment below with more info as I continue to look into this.

@dhimmel
Member

dhimmel commented Aug 24, 2020

How to encode name parts in HTML metadata

The Highwire HTML meta tags can encode names as "surname, forename" according to https://scholar.google.com/intl/en/scholar/inclusion.html#indexing

<meta name="citation_author" content="Liu, Li">
<meta name="citation_author" content="Rannels, Stephen R.">
<meta name="citation_author" content="Falconieri, Mary">
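If name parts were available separately, tags in this "surname, forename" form could be generated along these lines. This is a minimal sketch: `citation_author_tag` and its `family`/`given` parameters are hypothetical and do not reflect an existing Manubot function or metadata.yaml schema.

```python
from html import escape


def citation_author_tag(family, given=""):
    """Render one citation_author meta tag as "surname, forename"."""
    # Fall back to the family name alone when no given name is supplied.
    content = f"{family}, {given}" if given else family
    # Escape quotes and angle brackets so the attribute value stays valid HTML.
    return f'<meta name="citation_author" content="{escape(content, quote=True)}">'


print(citation_author_tag("Fellows Yates", "James A."))
print(citation_author_tag("Andrades Valtueña", "Aida"))
```

Because the comma marks the boundary explicitly, a multi-word surname survives without any guessing by the consumer.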

It looks like Dublin Core also uses a "surname, forename" format for this:

Creators should be listed separately, preferably in the same order that they appear in the publication. Personal names should be listed surname or family name first, followed by forename or given name. When in doubt, give the name as it appears, and do not invert.
Examples:

  • Creator="Shakespeare, William"
  • Creator="Wen Lee"
  • Creator="Hubble Telescope"
  • Creator="Internal Revenue Service. Customer Complaints Unit"
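A consumer of such metadata could split on the first comma and otherwise keep the name exactly as written, in line with the "do not invert" guidance. The sketch below assumes that convention; `parse_creator` is a hypothetical helper, and the `literal` key follows the CSL JSON convention for names that should not be split.

```python
def parse_creator(creator):
    """Parse a "surname, forename" string; keep comma-free names whole."""
    if "," in creator:
        # Split only on the first comma so the surname may contain spaces.
        family, given = (part.strip() for part in creator.split(",", 1))
        return {"family": family, "given": given}
    # No comma: do not guess at name parts, keep the name as it appears.
    return {"literal": creator.strip()}


for creator in ["Shakespeare, William", "Wen Lee", "Fellows Yates, James A."]:
    print(parse_creator(creator))
```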

@dhimmel
Member

dhimmel commented Aug 24, 2020

@jfy133
Contributor Author

jfy133 commented Aug 24, 2020

Thanks for the detailed follow-up @dhimmel! Curious to see how this goes.
