Discussion about duplication of metadata in .napari-hub and python package metadata #627
Replies: 13 comments 23 replies
-
Moving this to a discussion does make it a bit easier for it to get lost as a non-actionable item. I'm still looking forward to your thoughts and really don't want this to get lost in the mix again.
-
Thanks for reaching out, @tlambert03, and for centralizing this conversation. I want to make sure that we distinguish between preferred sources for metadata and allowed sources for metadata. In my mind, the greatest risk of confusion for plugin developers is if the preferred sources aren’t clear, and the greatest risk of friction is if the preferred sources are too dispersed. I understand that you don’t want
Improving plugin developer support and documentation is a high priority for us over the coming months, and we have bandwidth to help support and drive this effort. This alone will get us far toward addressing concerns about plugin developer confusion and friction. However, part of what I read in your recommendations is that you believe the napari hub’s only allowed sources should be the napari community’s preferred sources.
Is it fair to say that your goal is to make napari plugins “self-describing”?
Agreed. Where there are relevant Python core fields available that meet the needs of napari users and plugin developers these should be the preferred fields. And for every piece of metadata that has a natural home in python’s core metadata, we support this (except perhaps the However, implied here is that you think any other fields for this metadata should not be allowed by the napari hub
This seems to be the heart of our disagreement. When considering which allowed sources of metadata the napari hub supports, we have two goals for our work on the napari hub that I want to emphasize as they have implications for what we choose to support (or not). We want to support as many plugins (and plugin developers) as reasonably possible. Part of what this means is that…
We want to enable napari users (of all levels of Python expertise) to find relevant and high quality plugins. Part of what this means is that…
To maintain these commitments to plugin developers and users, we want to allow multiple sources of some metadata.
Agreed. This sounds great. I’m excited about your efforts to consolidate metadata into the manifest file and However, on the napari hub, we still want to offer an alternate path for plugin developers to document their plugins that doesn’t require a major upgrade to their core build infrastructure just to add some metadata that is useful for napari plugin users. When napari is ready to deprecate npe1 and/or we’re confident that most actively maintained and used plugins are all on npe2, I’m happy to reconsider whether we continue to support The path forward that I would recommend:
I’m going to break out my thoughts on each of the fields you highlight in separate sections below so they can be discussed separately.
-
Summary
This falls under the same bucket as the DESCRIPTION for me: plugin developers should use a custom field. Currently, a large number of napari plugins feature summaries that go something like “A napari plugin that X” This is GREAT for PyPI.org, but rather redundant on the napari hub. We’ll use the PyPI summary as a fallback (since something is better than nothing). Ideally, this should probably be in the manifest Preferred:
Allowed:
-
Links
Agreed. Happy to start deprecating our support for links in
-
Authors
Authors is surprisingly complicated (due largely to poor community standards around authorship of open source scientific software) and probably warrants a full discussion. It's under-documented (which we are actively working to fix), but we also support CITATION.cff, and we recently began pulling author info from the developers’ preferred citation, if it exists. In our current implementation, this takes precedence over the Python core metadata if a CITATION.cff file is provided. The pitfall is that the citation author list can include more people than those who are considered “authors” of the plugin.
We’ve also done some work recently to improve our support for finding plugins by author (specifically filter-by-author), which has implications for where we source author info. Since the de facto standard for the napari community is to list authors in the Python core “author” field, we now parse this string to find individual authors, looking for commas, ampersands, and “and” as separators. This has come at the cost of falsely splitting some institutions and authors who added their affiliations. For these folks, we’re planning to open PRs to encourage them to use
Because of these two issues (the need to support institutional authors and allowing an override of our citation author logic) we will override both Python core and CITATION.cff info with author information defined in
Preferred:
Allowed:
I’m very curious if you have thoughts on the preferred locations to specify author information... your ideas around using the newer inline table in Python core look promising. Note: the ORCiD metadata is currently unused, but I can imagine tracking more author-level information (affiliation, GitHub handles, etc.). That will probably not happen until we create user accounts for authors, though.
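The separator-based author parsing described above, and the false splits it causes for institutional names, can be illustrated with a minimal sketch. This is not the napari hub's actual implementation; the function name `split_authors`, the regular expression, and the sample names are all assumptions made for illustration.

```python
import re

# Hypothetical separator rule (an assumption, not the hub's real code):
# split on commas, ampersands, and the standalone word "and".
_SEPARATORS = re.compile(r"\s*(?:,|&|\band\b)\s*")


def split_authors(author_field: str) -> list[str]:
    """Split a free-text core-metadata author string into individual names."""
    parts = _SEPARATORS.split(author_field)
    # Drop empty fragments left behind by doubled or trailing separators.
    return [part.strip() for part in parts if part.strip()]


# A plain list of people splits as intended.
print(split_authors("Jane Doe, John Smith & Ana Lima and Wei Chen"))

# An institutional name containing "and" is split incorrectly,
# which is the pitfall described in the comment above.
print(split_authors("Jane Doe, Institute of Cell and Tissue Imaging"))
```

The second call shows why an explicit override location for author information is useful: no separator rule can tell an institution name from a list of people.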
-
Conda
The primary intent with this field was to offer exactly what you’ve built with the npe2api service. We would (eventually) like to surface this info on the frontend (e.g. through a filter for plugins that are available in the desktop app), but we currently don’t use it. If the napari community wants to replace the napari hub’s API with npe2api, then I’m happy to deprecate this and see where the dust settles. We may still end up needing to support such a field in
-
Visibility
Depending on how that PR lands, we may be able to move most of our logic for granular visibility (public vs hidden, for example) to your proposed new However, I strongly suspect that we’ll still need to maintain some kind of “opt out” flag for plugin developers that don’t want to show up on the napari hub (even if they want to show up in napari). We can explore some other options for that, though.
-
Labels
I'm certainly open to our team sourcing this metadata from the manifest. However, our implementation here is much more constrained than a general-purpose "label", as we only support a closed vocabulary from a subset of a single bioimaging-focused ontology. Does the napari team want to support this? If so, we’d be happy to start pulling from the manifest and can facilitate PRs to help migrate plugins over and begin to deprecate this field.
-
my primary concern here is also for the plugin developer, and I don't want them to be confused about who to listen to when it comes to where they should put information. I strongly feel (and it is my general impression that the core developer team shares the opinion) that we want to adhere as much as possible to python ecosystem standards. So if something already exists in the python world, we don't go create another way to do it.
I can get behind this distinction. I wouldn't insist that you remove all code that parses the .napari-hub file for these duplicated keys... however, I think it's just not a good idea, (and haven't yet heard any good reasoning for it). so, now that it already exists, I think we should explicitly discourage it. so I'd say we should
my goal is to use python standards where they exist, and not needlessly create additional places to put the same information. emphasis on same here: these fields in
they don't have to ... they can use pyproject.toml
This is also not a problem. pyproject toml is not mutually exclusive with setup.cfg ... only the
these are lovely abstract goals. but practically speaking, when we go down the list of actual fields in your schema, there are only two (description and maybe authors) that actually provide anything beyond the core metadata. So again, I'd say let's be a bit more conservative about not just adding more engineering until a very clear use case arises where the existing places for this info are unquestionably limiting.
sounds good. I'd say preferred is definitely whatever the python ecosystem currently encourages, which is core metadata (wherever it may be specified depending on the package bundling tool used, eg setuptools or poetry or flit). For customizing your plugin listing. I'd request that you remove any documentation of support for these duplicated fields, until a compelling case arises where someone must enter something different in
yeah, I'm agnostic here... doesn't matter to me
this bit is slightly frustrating, since right now you have the opportunity to remove these fields like conda and visibility (not deprecate them), since no one has implemented them. That may not be that way forever, and then we'll be in a different situation.
the only fields that have anything to do with npe2 and the manifest are possibly labels and visibility.
see point above.
-
Just a general comment on all the points here. I understand where the usage of terms like "begin to deprecate" comes from. But we can do a code search on GitHub and see conclusively that, for example, only one or two plugins have actually implemented these things. Realistically speaking, an abstract deprecation cycle (as would be done when you have no idea who out there is depending on something) is likely overkill here. We could literally contact the 4-5 plugins that would be affected by the core metadata changes, give them a PR to fix it, and then just strip this stuff... no elongated deprecation process necessary.
-
Proposal
there's a lot of text on this page now, so I'd like to re-summarize/simplify my proposal, and invite specific concerns on these points:
-
Elsewhere, I've been complaining about the problems with data duplication in the docs between the README and description.md file the hub requires. Talley suggested this is a better place to put that thought, since it's more likely to produce a useful change if the hub maintainers know about the problems the requirement for a separate description file causes. Here's the full thread: jni/affinder#63 (comment)
-
Thanks for resurfacing this discussion @GenevieveBuckley! I am happy to help with making sure our documentation and utilities make it clear where plugin devs should put their metadata to avoid duplicating effort and confusion. It sounds like there has been some alignment about this in the thread about DESCRIPTION.md and the github README. We had taken some steps already to improve our documentation, but we missed the Plugin Preview Page utility and we should audit the rest of our documentation and utilities to make sure they are clear and consistent. I’ve made an issue to track this, and I’ll be leading this effort. Please let me know if you find any other areas that need updating.
-
The discussion of where plugin developers should enter specific metadata about their project has come up many times. While I know I've often requested that we try not to create new places to enter data that already has a "standard" place, I don't know that we have a running issue about it. So, I'd like to open this issue as a place to discuss it, and a somewhat more public request to remove some/all keys.
cc @neuromusic @potating-potato @jni @sofroniewn @DragaDoncila
My personal hope is that we can reduce the keys in .napari or .napari-hub to the bare minimum, and, preferably, to remove the file altogether. Below I discuss each key mentioned in the wiki at https://github.com/chanzuckerberg/napari-hub/wiki/Customizing-your-plugin's-listing#github
As you'll see below, only one or two plugins have actually used any of these keys in .napari or .napari-hub, so removing these from the spec and from the wiki should be quick and painless.
keys in .napari that are duplicated from core metadata that should be removed
I would propose that each of these be removed from .napari as soon as possible.
Summary
only a single package (napari-features) has added a summary. unfortunately, they didn't include it in their setup.cfg metadata. (a good argument for removing it from .napari, since it should minimally be in setup.cfg). One PR would solve that.
Links
Project Site
this should be taken only from core metadata home page. only two plugins have used this so far.
Each of the following should just be a key that is looked for in core metadata project urls. See also the urls field in PEP 621.
Documentation
(only two plugins are using this, they should add it to core metadata)
Support
(three plugins are using this, they should add it to core metadata)
Report Issues
(only one plugin is using this, they should add it to core metadata)
Twitter
(only one plugin is using this, they should add it to core metadata)
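To make the "look for a key in core metadata project urls" idea concrete, here is a minimal sketch. Core metadata stores each project URL as a `Project-URL` entry of the form `Label, URL`; for an installed package these can be read with `importlib.metadata.metadata(name).get_all("Project-URL")`. The helper name `parse_project_urls` and the sample URLs are assumptions for illustration, not anything the hub is known to use.

```python
def parse_project_urls(entries: list[str]) -> dict[str, str]:
    """Turn core-metadata 'Project-URL' values ('Label, URL') into a dict."""
    urls: dict[str, str] = {}
    for entry in entries:
        # Each entry is "Label, URL"; split on the first comma only.
        label, _, url = entry.partition(",")
        if url.strip():
            urls[label.strip()] = url.strip()
    return urls


# Sample values shaped like what
# importlib.metadata.metadata("some-plugin").get_all("Project-URL")
# returns for an installed package (placeholder URLs).
sample = [
    "Documentation, https://example.org/docs",
    "Report Issues, https://example.org/issues",
]

project_urls = parse_project_urls(sample)
print(project_urls.get("Documentation"))
print(project_urls.get("Twitter"))  # None: the plugin did not declare it
```

With this approach a label such as "Documentation" or "Report Issues" only needs to be entered once, in the package's own metadata.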
keys in core metadata that might need to be duplicated
Authors
As I understand it, the motivation for this was to accept an orcid field, which is a nice idea. PEP 621 has made it possible to declare an inline table for project.authors ... so it would be possible to declare something like this:
setuptools will ignore that key, so it wouldn't make it into the dist-info of a wheel, but it's still probably the correct place to put it. Alternatively see [tool.napari] below.
Description
This is the one field that makes the most sense to potentially duplicate, since it's certainly conceivable that someone would want to make their hub page look different than their readme and/or their pypi page... so, a single DESCRIPTION.md file remains the main thing I can't see a quick replacement for.
keys not in core metadata, that I don't think should go in .napari
Conda
see "use npe2api in plugin install dialog" napari/napari#4893 for some context. I'd prefer this not be in .napari at all, since it's not specific to the napari hub or display of a package on the hub. I would propose that it be moved either to a [tool.napari] section in pyproject toml (which is guaranteed to make it into the sdist) or to the manifest itself (which is guaranteed to make it into the wheel of a functioning plugin).
Visibility
see "Proposal: add visibility field to manifest schema, similar to original preview" napari/npe2#196 for some context. I'd prefer this not be in .napari at all, since it's not specific to the napari hub or display of a package on the hub. I would propose that it be moved either to a [tool.napari] section in pyproject toml (which is guaranteed to make it into the sdist) or to the manifest itself (which is guaranteed to make it into the wheel of a functioning plugin).
labels
Because of the PRs opened manually by the hub team, this is by far the most commonly used field, with about 20 plugins using it. This was originally in the npe2 manifest spec, but removed (the motivation for that removal is confusing to me, but in the past now). I'd propose this be re-added to the manifest, but this would require opening new PRs to all those plugins that were previously instructed to put it in .napari.
pyproject.toml [tool.napari] or napari.yaml
For any plugin metadata that has a natural home in python's core metadata, I think we should absolutely use that field. There shouldn't be two places to enter the same thing.
For metadata that doesn't have a natural home in python's core metadata, we should probably use either a [tool.napari] section in pyproject.toml or add a new field to the napari.yaml manifest.
Looking forward to your feedback. thanks