[PRE REVIEW]: DisTRaX: Accelerating High Performance Compute Processing #6013

Closed
editorialbot opened this issue Nov 2, 2023 · 24 comments
Labels
pre-review Python query-scope Submissions of uncertain scope for JOSS rejected TeX Track: 7 (CSISM) Computer science, Information Science, and Mathematics

Comments

@editorialbot
Collaborator

Submitting author: @GMW99 (Gabryel Mason-Williams)
Repository: https://github.com/rosalindfranklininstitute/DisTRaX
Branch with paper.md (empty if default branch): paper
Version: v1.0.0
Editor: Pending
Reviewers: Pending
Managing EiC: Daniel S. Katz

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e"><img src="https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e/status.svg)](https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e)

Author instructions

Thanks for submitting your paper to JOSS @GMW99. Currently, there isn't a JOSS editor assigned to your paper.

@GMW99 if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands
editorialbot added the pre-review and Track: 7 (CSISM) Computer science, Information Science, and Mathematics labels on Nov 2, 2023
@editorialbot
Collaborator Author

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.05 s (2007.2 files/s, 106553.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          48            651           1003           1877
reStructuredText                38            393            485            271
TeX                              1              7              0             75
YAML                             2              6              1             69
Markdown                         2             27              0             67
DOS Batch                        1              8              1             26
make                             1              4              7              9
TOML                             1              0              0              3
-------------------------------------------------------------------------------
SUM:                            94           1096           1497           2397
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Collaborator Author

Wordcount for paper.md is 1483

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.14569/IJACSA.2016.070211 is OK
- 10.48550/arXiv.2212.03054 is OK
- 10.48550/arXiv.1610.08015 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

FRIEDA: Flexible Robust Intelligent Elastic Data Management Framework
Submitting author: @dghoshal-lbl
Handling editor: @acabunoc (Retired)
Reviewers: @krother
Similarity score: 0.8028

DARE Platform: a Developer-Friendly and Self-Optimising Workflows-as-a-Service Framework for e-Science on the Cloud
Submitting author: @iaklampanos
Handling editor: @danielskatz (Active)
Reviewers: @rafaelfsilva, @Himscipy
Similarity score: 0.7996

DataLad: distributed system for joint management of code, data, and their relationship
Submitting author: @yarikoptic
Handling editor: @arokem (Retired)
Reviewers: @szorowi1, @jkanche
Similarity score: 0.7996

Launcher: A simple tool for executing high throughput computing workloads
Submitting author: @lwilson
Handling editor: @danielskatz (Active)
Reviewers: @kc9qey
Similarity score: 0.7985

hotsub: A batch job engine for cloud services with ETL framework
Submitting author: @otiai10
Handling editor: @brainstorm (Retired)
Reviewers: @reisingerf
Similarity score: 0.7982

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@danielskatz

@GMW99 - thanks for your submission. Before we continue, I have some concerns:

  1. Please expand your README; as it stands, it will not pass the JOSS review criteria.
  2. Please make a more substantive case in the paper for this being research software, rather than infrastructure software. Do you expect that researchers would cite this software in their papers? If so, which types of researchers?
  3. In the paper, please provide some discussion about other solutions to this need and competing packages.
  4. I don't see the figures in the paper.

If you do make additions to the paper, you may want to remove some of the current text so that it doesn't get to be too long. Perhaps some text can be replaced by a pointer to documentation?

After you have made changes in the .md file, use the command @editorialbot generate pdf to make a new PDF. editorialbot commands need to be the first entry in a new comment. If you make changes in the references, please use the command @editorialbot check references to check them.

@danielskatz

👋 @GMW99 - Did you see my comments/requests above ☝️ ?

@danielskatz

@GMW99 - If I don't hear back from you in the next 2 weeks, I'll mark this paper as rejected, but you can certainly address these issues and then resubmit, if you choose to.

@GMW99

GMW99 commented Nov 21, 2023

Hi @danielskatz,

Sorry for the delayed response. I only work one day a week. I am currently on holiday and hope to get back to you with a complete response to your comments in the next three weeks.

Again, apologies for the delay in my response.

Kind regards

Gabryel

@danielskatz

@GMW99 - it's now been three weeks - do you have any update?

I think the right thing to do is to mark this as withdrawn, with the idea that you can address the issues I mentioned above, and resubmit at a later point. What do you think?

@GMW99

GMW99 commented Dec 13, 2023

@editorialbot generate pdf

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@editorialbot
Collaborator Author

Five most similar historical JOSS papers:

Launcher: A simple tool for executing high throughput computing workloads
Submitting author: @lwilson
Handling editor: @danielskatz (Active)
Reviewers: @kc9qey
Similarity score: 0.8151

FRIEDA: Flexible Robust Intelligent Elastic Data Management Framework
Submitting author: @dghoshal-lbl
Handling editor: @acabunoc (Retired)
Reviewers: @krother
Similarity score: 0.8134

DARE Platform: a Developer-Friendly and Self-Optimising Workflows-as-a-Service Framework for e-Science on the Cloud
Submitting author: @iaklampanos
Handling editor: @danielskatz (Active)
Reviewers: @rafaelfsilva, @Himscipy
Similarity score: 0.8124

DataLad: distributed system for joint management of code, data, and their relationship
Submitting author: @yarikoptic
Handling editor: @arokem (Retired)
Reviewers: @szorowi1, @jkanche
Similarity score: 0.8084

hotsub: A batch job engine for cloud services with ETL framework
Submitting author: @otiai10
Handling editor: @brainstorm (Retired)
Reviewers: @reisingerf
Similarity score: 0.8078

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

@GMW99

GMW99 commented Dec 13, 2023

Hi @danielskatz

First of all, thank you for your patience.

The following is our response to your concerns about the paper (#6013 (comment)):

  1. I have updated the README to be more expansive and to provide a more straightforward scientific use case (https://github.com/rosalindfranklininstitute/DisTRaX).
  2. We envisage researchers running this software themselves as part of assembling rapid software pipelines, since many researchers are developing workflow software across many fields (including cryo-EM). We therefore expect them to deploy it within their workflow to set up a temporary in-memory shared disk that speeds up I/O during the workflow run. The idea and software presented are particularly relevant to cloud (or cloud-like) clusters, where researchers have root access and define the complete "digital instrument" that runs the workflow as a combined piece of workflow software plus a software-defined infrastructure cluster. For these reasons, we argue that it is research software. We would expect it to be cited like other foundational software such as numpy or pytorch; that is, it would not normally be cited directly, as it would be invisible to the software's users, but it would appear in the software inventory of workflows that build upon it.
  3. Currently, we know of no software solutions that compete with DisTRaX. In the paper, we compare it to standard deployment tools such as Ansible and find that DisTRaX outperforms them in the HPC setting, because their sequential nature increases deployment time. The alternative is the hardware solution of adding a file system to your cluster. We state in the paper that this adds expense, security concerns, and complexity for both cloud-based clusters and traditional HPC, and that it creates a bottleneck for high-I/O processes. DisTRaX removes this need, making I/O on clusters simpler by using available RAM. We think the paper communicates this clearly, although the lack of figures may have hampered it.
  4. My apologies; you should be able to see them now.

Again, apologies for our delay in replying to you.

Kind regards

Gabryel

@danielskatz

@editorialbot check repository

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.07 s (1426.1 files/s, 76495.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          48            651           1003           1877
reStructuredText                38            393            485            271
Markdown                         2             47              0             99
TeX                              1              7              0             75
YAML                             2              6              1             69
DOS Batch                        1              8              1             26
make                             1              4              7              9
TOML                             1              0              0              3
-------------------------------------------------------------------------------
SUM:                            94           1116           1497           2429
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Collaborator Author

Wordcount for paper.md is 1483

@danielskatz

👋 @GMW99 - thanks for all the changes. I'm now going to ask the editors to confirm that this is research software as defined by JOSS. You should hear back in a week or two (or perhaps after the holidays).

@danielskatz

@editorialbot query scope

@editorialbot
Collaborator Author

Submission flagged for editorial review.

editorialbot added the query-scope (Submissions of uncertain scope for JOSS) label on Dec 13, 2023
@danielskatz

@GMW99 - I'm sorry to say (and also sorry for the holiday delay) that after discussion amongst the JOSS editors, we have decided that this submission is not research software as defined by JOSS. This does not mean that it is not software that is useful in research, but just that JOSS does not consider it in scope for review as research software. Please see https://joss.readthedocs.io/en/latest/submitting.html#other-venues-for-reviewing-and-publishing-software-packages for other suggestions for how you might receive credit for your work.

@danielskatz

@editorialbot reject

@editorialbot
Collaborator Author

Paper rejected.
