Sorting by citations #3
Comments
A good idea would be to add a randomize option.
Thanks @uecker for the suggestion - I spent a little time trying to work out how this could be implemented, but my web-coding knowledge wasn't sufficient to get it to work! If anyone wants to implement this, please go ahead :)
Hello @dgallichan et al. - this seems like a great initiative. I have some feedback. I don't have any expectations for how the feedback is used - feel free to use or ignore as you wish :).

In terms of citation sorting, I think it would be really good or even necessary to use something besides Semantic Scholar. The main reason is that Semantic Scholar is doing a poor job of indexing ISMRM abstracts from the main conference and workshops. One option would be for the ISMRM to work on getting its proceedings indexed, but if that doesn't happen I think it might be necessary to move away from Semantic Scholar, as many people are publishing their packages at ISMRM. The current status of MR Hub leaves a great community project like SigPy at the very bottom of the list.

In the meantime, as long as MR Hub is sticking with Semantic Scholar, I would change the default sorting mechanism. Citations are a pretty good metric, but at the moment many packages are linked to research papers rather than software papers, which just reinforces an author's scientific contributions rather than their software contributions. There are many ways to highlight scientific contributions: Google Scholar pages, research awards, prestigious positions, etc. MR Hub is unusual in that it can showcase software work that isn't naturally promoted as much.

I think a better option than the status quo would be to sort by most recent software update. This would have the added benefit of highlighting projects that are actively being updated and maintained. It would also help promote new projects that might benefit the most from promotion.

Disclaimer: I am looking to PR my project, torchkbnufft, for which the associated paper was at the 2020 ISMRM Sedona Workshop.
I agree, sorting by last update would also be good, but may be difficult to automate. I think we need somebody to implement it...
According to the README there is currently a mechanism for querying BitBucket and GitHub for the last update. A commit seems a reasonable surrogate. Were you thinking of using releases for the date? Or is the concern about software not kept on GitHub/BitBucket?
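To make the "latest commit as a surrogate" idea concrete, here is a minimal sketch, not MR Hub's actual update script. It assumes the GitHub REST endpoint that lists commits for a repository (most recent first) and only shows how to build the URL and read the date from an already-decoded JSON response. The helper names `commits_url` and `last_commit_date` are hypothetical.

```python
from datetime import datetime
from typing import Optional


def commits_url(owner: str, repo: str) -> str:
    """Build the GitHub REST URL that lists commits, limited to a single entry."""
    return f"https://api.github.com/repos/{owner}/{repo}/commits?per_page=1"


def last_commit_date(commits: list) -> Optional[datetime]:
    """Return the committer date of the first listed commit, or None if there is none.

    `commits` is the decoded JSON list returned by the endpoint above.
    """
    if not commits:
        return None
    stamp = commits[0]["commit"]["committer"]["date"]  # e.g. "2021-07-07T12:00:00Z"
    # Swap the trailing "Z" for an explicit offset so older Pythons can parse it.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))
```

Using release dates instead would only change which endpoint is queried and which field is read; the parsing step would stay the same.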
I was thinking about software not on GitHub/BitBucket, e.g. a repository maintained by some institution. But maybe those could also be polled automatically.
For reference, the default sorting option is defined here: Line 32 in 32838c3
If it is decided that the default sort should be by the most recent commit, then the
I am not sure what this does. For BART it says 2021-07-07, but the latest release was in March and the latest public commit a couple of days ago. Maybe this is the random number I asked for....
That might be because the update script was last run ~3 months ago, so the project info could be out of date.
My impression is that if a GitHub Action could be set to run, say, once a week, this would be a nice solution. I think the main problem in getting it working would be making sure you don't exceed the daily API queries without logging in (although I may already be out of date on this, as these kinds of limits have a habit of changing as well...)
Each instance of GitHub Actions should have 60 API queries per hour. If that becomes a bottleneck, then it should be possible to use the builtin
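As a rough illustration of staying within the hourly quota discussed above, the sketch below reads the `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers that GitHub attaches to API responses and works out how long a script would need to pause. The function names are hypothetical, and the headers are passed in as a plain dict so that the logic is independent of any particular HTTP library.

```python
import time
from typing import Optional, Tuple


def rate_limit_status(headers: dict) -> Tuple[int, int]:
    """Return (remaining requests, reset time as epoch seconds) from response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    remaining = int(lowered.get("x-ratelimit-remaining", 0))
    reset_epoch = int(lowered.get("x-ratelimit-reset", 0))
    return remaining, reset_epoch


def seconds_to_wait(headers: dict, needed: int, now: Optional[float] = None) -> float:
    """Seconds to pause before issuing `needed` further requests (0.0 if quota suffices)."""
    remaining, reset_epoch = rate_limit_status(headers)
    if remaining >= needed:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, float(reset_epoch - current))
```

With one request per package, a weekly run only needs to pause if the number of listed packages exceeds what is left in the current window.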
@dgallichan I will try to test a draft of an Action on my fork now that my PR is merged. I'll open a PR if everything works.
Sorry - I see it is already merged!
So the default sorting option is now 'last update' - I think there are only a few repositories that don't use GitHub or Bitbucket, so it's mostly pretty good (we could do with adding API querying for GitLab as well, but again, hardly any packages are affected at the moment). For those hosting their code themselves, I guess the onus is on them to submit a PR to the MRHub whenever they want to manually update the date for their package. Thanks so much notZaki for the GitHub Action - it ran successfully this lunchtime, and is definitely a good way to keep the MRHub 'fresh'! :)
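For illustration only, a 'last update' sort that also copes with self-hosted packages lacking a date could look like the sketch below. The MRHub site itself is web code, so this is not its implementation; the `last_update` field name and the project dicts are assumptions made up for the example.

```python
from datetime import date


def sort_by_last_update(projects: list) -> list:
    """Return projects newest first; entries without a known date go to the end."""
    def key(project: dict):
        updated = project.get("last_update")
        # The first tuple element ranks dated entries above undated ones.
        return (updated is not None, updated or date.min)

    return sorted(projects, key=key, reverse=True)


# Hypothetical entries; the third stands in for a self-hosted package with no date.
projects = [
    {"name": "a", "last_update": date(2021, 7, 7)},
    {"name": "b", "last_update": date(2021, 9, 1)},
    {"name": "c", "last_update": None},
]
ordered = [p["name"] for p in sort_by_last_update(projects)]
```

Placing undated entries last is one possible choice; falling back to a manually submitted date would work just as well.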
Currently we are using the number of citations that Semantic Scholar finds on the main paper associated with the software to allow sorting by a proxy metric related to 'impact' of the software. We realise that this is not a perfect solution, not least because not everyone who cites a paper uses the software - and not everyone who uses software has cited the paper.
For now this seems like a reasonable solution - but feel free to use this space to discuss any issues that arise due to this choice, along with any suggestions you might have for how to improve it.
Note that Semantic Scholar was chosen because of its free-to-use API. We probably can't use Google Scholar as they seem to regularly change their site to block scraping attempts, and Scopus and WoS both have APIs - but at a premium. It seems that CrossRef might be a viable alternative - but I haven't had the time to read more to see if it offers something beyond what now seems to work with the Semantic Scholar option.
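As a starting point for comparing the two free options, here is a hedged sketch of where each service reports a citation count for a DOI. It assumes the Semantic Scholar Graph API (`citationCount` field) and the CrossRef REST API (`is-referenced-by-count` inside `message`). The helper names are hypothetical, and the functions only build URLs and read already-decoded JSON, so no network access is implied.

```python
from typing import Optional
from urllib.parse import quote


def semantic_scholar_url(doi: str) -> str:
    """URL asking the Semantic Scholar Graph API for a paper's citation count."""
    return ("https://api.semanticscholar.org/graph/v1/paper/DOI:"
            f"{quote(doi, safe='/')}?fields=citationCount")


def crossref_url(doi: str) -> str:
    """URL for the CrossRef work record of a DOI."""
    return f"https://api.crossref.org/works/{quote(doi, safe='/')}"


def citation_count(payload: dict, source: str) -> Optional[int]:
    """Extract the citation count from a decoded response, or None if it is absent."""
    if source == "semantic_scholar":
        return payload.get("citationCount")
    if source == "crossref":
        return payload.get("message", {}).get("is-referenced-by-count")
    raise ValueError(f"unknown source: {source}")
```

Running both lookups for the same DOI would show whether CrossRef indexes papers that Semantic Scholar misses.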