roadmap to 1.0 #321

Closed
mr-c opened this issue Mar 10, 2014 · 12 comments

@mr-c
Contributor

mr-c commented Mar 10, 2014

The big feature for the version 1.0 release is a commitment to following Semantic Versioning ( http://semver.org/ ) where the API is the command line interface exposed by the scripts.

Therefore the only must-haves for a 1.0 release are documentation of this API and any remaining changes needed to make it more consistent.

As the 1.0 release is also a time-based release, the remaining bug fixes and features can be incorporated if there is enough time. Otherwise they can wait for version 1.0.1 or version 1.1. We've done a great job at 'releas[ing] early'; I'd like to improve on 'releas[ing] often.' http://en.wikipedia.org/wiki/Release_early,_release_often
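
To make the Semantic Versioning commitment concrete, here is a minimal sketch of how version numbers would move once the command line interface is treated as the API. The function name and the change labels ("breaking", "feature", "fix") are made up for illustration and are not part of khmer.

```python
def next_version(current, change):
    """Return the version that follows `current` for a given kind of change.

    `change` is one of "breaking", "feature", or "fix" (illustrative labels).
    """
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "breaking":
        # An incompatible change to a script's command line interface.
        return "%d.0.0" % (major + 1)
    if change == "feature":
        # A backwards-compatible addition, e.g. a new option or script.
        return "%d.%d.0" % (major, minor + 1)
    if change == "fix":
        # A backwards-compatible bug fix.
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("unknown change type: %r" % change)
```

Under this scheme, bug fixes that miss 1.0 would land in 1.0.1 and new features in 1.1, while removing or renaming a script option after 1.0 would require 2.0.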

Here are the required issues to close for 1.0:

Document and lock down the 1.0 API #16
scan through scripts/ and remove unused/unsupported scripts #118
Add documentation on how to tie into a particular tagged version #29
Update README to point to publications & citation info #49
Provide an updated list of papers and maybe blog posts, somewhere. #162
add documentation for sample-reads-randomly #192
doc/examples.txt needs updating #201
doc/guide.txt needs editing #202
doc/blog-posts.txt hardlinks to ctb's github #203
document what happened to old scripts #206
all scripts need to output their version #236
Add "when you shouldn't use diginorm" to our documentation #275
~~environmental variables need documenting #303~~
git clone of the khmer repo reqs > 0.5 GiB #245
Add Torsten's rant into our docs re command line stuff #209
http://khmer.readthedocs.org/en/latest/ tracks the master branch, not releases #230
C++ code should be consistently formatted #261
Inconsistent and incomplete output filename handling #81

Bonus: Write a khmer citation handle paper => arXiv #72

Bugs that could be fixed in 1.0; any that are not will be added to doc/known-issues.txt:

https://github.com/ged-lab/khmer/issues?direction=asc&labels=bug&milestone=1&page=1&sort=created&state=open

@mr-c mr-c added this to the 1.0 release milestone Mar 10, 2014
@mr-c mr-c self-assigned this Mar 10, 2014
@mr-c
Contributor Author

mr-c commented Mar 13, 2014

Pursuant to the documentation-related issues on this list, I printed out the PDF generated by Sphinx and did a two-pass edit: first for global consistency and flow, then for details.

Here is a digital version of many of the changes I'd like to make:

  • spell check the entirety
  • add a subtitle
  • clarify document authorship: only CTB & Jason Pell are listed on the front cover
  • document our documentation conventions
  • use the sphinxcontrib.autoprogram extension to pull the argparse parsers from each script so that the documentation auto-syncs with changes in each script's options
  • move much of the text in the docs back into the scripts' argparse --help output
  • building on the above: use the Sphinx unix manual page builder to build and ship manual pages for each script
  • organize the bulk of the documentation into two labeled parts: user documentation and technical (dev) documentation.
  • for each part: list intended audiences and expectations
  • merge the following chapters elsewhere or drop them completely: "6. Blog posts and additional documentation", "8. Partitioning large data sets", "9. ktable, simple k-mer counting", "11. Miscellaneous implementation details", "16. Sequence access patterns in khmer"
  • remove the incomplete "22. Indices and tables" and "Python Module Index" (for now)
  • In "1.1 Introduction" explain what hashing & hash tables are for the non-C.S. folk.
  • in "1.2 Using khmer" make sure we mention the appropriate script name for every use case listed. They should be auto linked using sphinx magic
  • split the install instructions up for user & dev use
  • document how to install a specific version and do a system-wide install
  • add a section under "Chapter 3. A few examples" describing and linking to the khmer protocols
  • review the usefulness of examples/
  • determine which scripts can make use of stdin or stdout as input or output. Document this and file issues if it doesn't work and it would be useful if it did.
  • ensure that every script includes example usage (about half do already)
  • document exactly how filter-stoptags trims
  • elevate the following sandbox scripts to scripts/ or remove references to them: fasta-to-abundance-hist.py, abundance-hist-by-position.py, hi-lo-abundance-by-position.py, velvet-assemble.sh
  • filter-exact.py was deleted last year (ba3df5d). Investigate, remove reference in Section 6.1 if need be (yes, do remove it as per roadmap to 1.0 #321 (comment))
  • work with @qingpeng to update "Chapter 7: Choosing hash sizes for khmer"
  • Possibly remove the "Architecture and Design" chapter as it is out of date. If we keep it then many sections need removing and updating. All TODOs within should be removed and become GitHub issues if they aren't already.
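
Several of the items above (autoprogram, moving text into --help, example usage in every script, printing the version, stdin support) point at the same script layout. A rough sketch of what that could look like follows; the script name, module name, option, and version string are placeholders rather than an existing khmer script.

```python
import argparse
import textwrap

# Placeholder version string; a real script would take this from the package.
__version__ = "1.0"


def get_parser():
    """Build the parser in an importable function so Sphinx can reach it.

    With sphinxcontrib.autoprogram listed in the `extensions` of conf.py,
    a docs page could then contain something like:

        .. autoprogram:: example_script:get_parser()
           :prog: example-script.py
    """
    # Example usage lives in the epilog so that --help and the docs agree.
    epilog = textwrap.dedent("""\
        Example usage (hypothetical):

            example-script.py -k 20 reads.fa
            cat reads.fa | example-script.py -k 20 -
        """)
    parser = argparse.ArgumentParser(
        prog="example-script.py",
        description="Illustrative stand-in for a khmer script.",
        epilog=epilog,
        # Keep the line breaks in the epilog instead of re-wrapping them.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("--version", action="version",
                        version="%(prog)s " + __version__)
    parser.add_argument("-k", "--ksize", type=int, default=32,
                        help="k-mer size to use (default: %(default)s)")
    parser.add_argument("input_filename",
                        help="sequence file to read, or '-' for stdin")
    return parser
```

One caveat with this approach: autoprogram imports the parser by module name, so scripts whose filenames contain hyphens would need an importable wrapper or a rename.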

more to come in a bit...

@RamRS
Contributor

RamRS commented Mar 13, 2014

Awesome. May I work on the spellcheck, for starters?

@mr-c
Contributor Author

mr-c commented Mar 13, 2014

@RamRS I'd like to see you tackle something bigger next. Spell checking is an easy thing for me to do while I work on my checklist.

#86 needs a test case written + fix
#247 needs a test case (tricky!) + c++ fix
#152 the screencast still needs to be made regardless of the side conversation
#253 another c++ project

@RamRS
Contributor

RamRS commented Mar 13, 2014

The spell-check was just the salsa dip. I'd of course have something more interesting for the entree!

I'm going to try and tackle the python test case first. Also, the C++ ones look quite scary to me, might need quite a bit of hand-holding - would that be OK?

@RamRS
Contributor

RamRS commented Mar 13, 2014

Also, I have a branch addressing #328 . If you think the issue is relevant, I'll fire up a PR.

@mr-c
Contributor Author

mr-c commented Mar 13, 2014

More notes from my documentation review:

  • In "Chapter 12: Development miscellany": talk about pylint
  • build a Makefile target to do code style and correctness checks on all files modified in commits that are a part of the current branch which haven't been merged into master yet. Add it to the docs
  • Acknowledge the in-flux nature of the codebase; if a contributor modifies a file then they are responsible for making the entire file well formatted (using autopep8 / astyle / their hands)
  • link to https://github.com/pjotrp/bioinformatics & file issue for outstanding items
  • write up motivation for the coding standards, use Software Sustainability Institute articles as they are useful
  • move the git/github strategies before the code review section
  • talk about the two types of pull requests & workflow including requesting code review and who does the merge
  • mention specific git techniques including rebasing to clean up commits
  • doc how to calculate code coverage on one's own
  • in "13.5 Command line scripts": argparse, examples in documentation
  • configure pylint to respect our script naming convention + doc that if needed
  • specific exit codes can be 2-255 but their meaning needs to be documented. I don't know how much demand for fine-grained exit codes there is, though
  • add versioneer to "13.7 Upstream sources"
  • "Chapter 14: Deploying the khmer project tools on Galaxy": we also support filter-abund.py
  • fix numbered list in 14.2
  • "Chapter 17: How to make a khmer release": add step to scan for TODOs, BUGs, and FIXMEs that need to be moved to GitHub
  • automate checking for updates to ez_setup.py & versioneer
  • versioneer installs via pip now
  • document batlab
  • add version check to each stage of the release check; replace interrogative comments with statements
  • drop the redundant section 17.1 "Upstream sources"
  • document that the setuptools version requirement is enforced by setup.py
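
For the Makefile target that checks only the files touched on the current branch, the file selection could be done with a small helper along these lines. This is a sketch under assumptions: the extension-to-language mapping and both function names are made up here, and the actual style checkers (pylint, autopep8, astyle) would be run on the resulting lists by the Makefile.

```python
import os
import subprocess

# Assumed mapping from file extension to the kind of check to run.
CHECKED_EXTENSIONS = {".py": "python", ".cc": "c++", ".hh": "c++"}


def changed_files(base="master"):
    """List files added or modified on this branch since it diverged from base."""
    out = subprocess.check_output(
        # The three-dot form compares HEAD against its merge base with `base`,
        # so only commits not yet merged into `base` are considered.
        ["git", "diff", "--name-only", "--diff-filter=ACMR", base + "...HEAD"],
        universal_newlines=True,
    )
    return [line for line in out.splitlines() if line]


def group_by_language(paths):
    """Group paths by language, skipping files that have no checker."""
    groups = {}
    for path in paths:
        extension = os.path.splitext(path)[1]
        language = CHECKED_EXTENSIONS.get(extension)
        if language is not None:
            groups.setdefault(language, []).append(path)
    return groups
```

A Makefile target could call `group_by_language(changed_files())` and hand each group to the matching checker, which keeps the check fast and avoids flagging files the contributor never touched.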

@mr-c
Contributor Author

mr-c commented Mar 16, 2014

@mr-c
Contributor Author

mr-c commented Mar 29, 2014

@ctb or others: is there a suggested replacement for filter-exact.py? (deleted in ba3df5d as part of the ReadMask removal last February)

@ctb
Member

ctb commented Mar 29, 2014

On Sat, Mar 29, 2014 at 08:12:23AM -0700, Michael R. Crusoe wrote:

@ctb or others: is there a suggested replacement for filter-exact.py? (deleted in ba3df5d as part of the ReadMask removal last February)

filter-abund, I believe.

C. Titus Brown, [email protected]

@mr-c
Contributor Author

mr-c commented Mar 29, 2014

Okay. Since that is used in the khmer-protocols I'll drop it from https://github.com/ged-lab/khmer/blob/master/doc/blog-posts.txt#L18

@ctb
Member

ctb commented Jun 12, 2015

Remaining components of this issue should be moved to a new issue :)

@mr-c mr-c mentioned this issue Jun 13, 2015
32 tasks
@mr-c
Contributor Author

mr-c commented Jun 13, 2015

Remaining components moved to #1091.

@mr-c mr-c closed this as completed Jun 13, 2015