
name: inverse
layout: true
class: center, middle, inverse


Test-driven development

Radovan Bast

Licensed under CC BY 4.0. Code examples: OSI-approved MIT license.


layout: false

Tests are no guarantee


Motivation 1: code quality

  • More robust code
  • As projects grow, it becomes easier to break things without noticing immediately
  • In particular for non-modular, entangled projects (action at a distance)
  • Testing helps detect errors early: programming without constantly testing your changes introduces errors that are extremely time-consuming to detect and fix later
  • Interpreted, dynamically typed languages especially need to be tested
  • So do compiled languages, but there the compiler catches at least some errors
  • We publish scientific papers based on scientific code

Motivation 2: distributing code

  • Users and computing center personnel may not be able to identify an incorrect installation without automated testing
  • Users of your code publish papers based on results produced by your code

Motivation 3: developers

  • Satisfying instant visual feedback
  • Avoid coding constipation
  • Avoid "gold-plating" the code

Motivation 4: external contributors

  • Easier for external developers to contribute to the project
  • Tests can be good documentation; up to date by definition

Motivation 5: managing complexity

  • Help to modularize/encapsulate
  • Unit tests sharpen and document interfaces
  • Enable reuse and migration
  • Faster coding: tests let you make big changes quickly (refactoring with confidence)
  • You start with the design; this often leads to better design
  • Well structured code is easy to test

template: inverse

Imperfect tests run frequently are better than perfect tests that are never written


Important

  • Test frequently (each commit)
  • Test automatically (Travis CI)
  • Test with numerical tolerance
  • Think about code coverage (https://coveralls.io)
  • Write test, red, green, refactor

Unit tests

  • Test one unit: module or even single function
  • Sharpen interfaces
  • Speed up testing
  • Good documentation of the capability and dependencies of a module
  • In fact, unit tests are documentation that is up to date by definition

Test-driven development

  • Write tests before writing code
  • Very often we know the result that a function is supposed to produce
  • Development cycle:
    • Write the test
    • Write an empty function template
    • Test that the test fails
    • Program until the test passes
    • Perhaps refactor
    • Move on
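
A minimal sketch of one such cycle in Python with pytest (the function and file names are made up for illustration):

# test_temperature.py -- step 1: write the test before the implementation
from temperature import fahrenheit_to_celsius

def test_fahrenheit_to_celsius():
    assert abs(fahrenheit_to_celsius(32.0) - 0.0) < 1.0e-12
    assert abs(fahrenheit_to_celsius(212.0) - 100.0) < 1.0e-12

# temperature.py -- step 2: an empty template; running pytest now fails (red)
def fahrenheit_to_celsius(temp_f):
    pass

# temperature.py -- step 3: program until the test passes (green), then refactor
def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0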

Functionality regression tests

  • As on a car assembly line, we have to test all components individually and also whether the components work together
  • Typically run the entire code on a set of input examples and compare the outputs with reference outputs
  • Hopefully with numerical tolerance
  • Typically we need both unit tests and functionality regression tests
  • Neither guarantees that the code is "always" correct
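
A minimal sketch of such a test in Python, assuming a hypothetical executable my_code, hypothetical paths, and a stored reference output:

import subprocess

def test_against_reference():
    # run the full code on a known input (executable, flags, and paths are hypothetical)
    subprocess.run(["./my_code", "tests/input.inp", "-o", "tests/output.dat"],
                   check=True)
    with open("tests/output.dat") as f:
        computed = [float(line) for line in f]
    with open("tests/reference.dat") as f:
        reference = [float(line) for line in f]
    # compare with numerical tolerance, never digit by digit
    assert len(computed) == len(reference)
    for c, r in zip(computed, reference):
        assert abs(c - r) < 1.0e-8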

Performance regression tests

  • Real example

    • In one code module somebody forgot to remove debug code and committed it
    • This did not break any functionality
    • It slowed down that part of the code by a factor of 2
    • Nobody noticed for a year because we had no automatic detection of performance degradation
  • Good idea

    • Create benchmark calculations to cover various performance-critical modules
    • Run these benchmarks regularly (weekly?)
    • Monitor the timing
    • If timing changes significantly, alarm bells should be ringing
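
A minimal sketch of such a timing watchdog in Python (the kernel, its workload, and the baseline number are invented for illustration):

import time

def compute_kernel(n):
    # stand-in for a performance-critical routine
    return sum(i * i for i in range(n))

def best_timing(func, *args, repeat=3):
    # best of several runs reduces timing noise
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

def test_kernel_performance():
    # baseline recorded when the benchmark was created (hypothetical number)
    reference_seconds = 0.5
    elapsed = best_timing(compute_kernel, 1_000_000)
    # alarm bells should ring if we are much slower than the baseline
    assert elapsed < 1.5 * reference_seconds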

Code coverage

  • If I break the code and all tests pass, who is to blame?
  • Code that is not tested will break sooner or later
  • Code coverage measures and documents which lines of code have been traversed during a test run
  • It is possible to have line-by-line coverage (example later)
  • In any file that has zero coverage you can multiply all numbers by 10⁵ and all tests will still happily pass
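
For Python, one way to get line-by-line coverage is the pytest-cov plugin (my_package stands for your own package name):

$ pip install pytest-cov                      # ideally into a virtualenv
$ pytest --cov=my_package                     # per-file coverage summary
$ pytest --cov=my_package --cov-report=html   # line-by-line HTML report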

Total time to test matters

  • You should require developers to run the test set before each git push
  • Professional projects require running the test set before each commit
  • In extreme cases after each save
  • With a pre-commit hook you can make sure that commits which break tests are rejected (a sketch follows below)
  • Total time to test matters
  • If the test set takes 7 hours to run, it is likely that nobody will run it
  • Identify a fast, essential test set that has sufficient coverage and is short enough to run before each commit or push
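
A minimal sketch of such a hook, assuming the fast test set lives under a hypothetical tests/fast directory and runs with pytest; save it as .git/hooks/pre-commit and make it executable:

#!/bin/sh
# abort the commit if the fast test set fails
pytest tests/fast || {
    echo "fast test set failed; commit rejected"
    exit 1
}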

Continuous integration

  • Test each commit on every branch
  • Makes it possible for the mainline maintainer to see whether a modification breaks functionality before accepting the merge
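
With Travis CI (discussed later in this deck), a minimal configuration for a Python project is a .travis.yml file in the repository root; this is a sketch, adapt the versions and commands to your project:

language: python
python:
  - "3.6"
install:
  - pip install pytest
script:
  - pytest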

Good practices (1/2)

  • Test before committing (use the Git staging area)
  • Fix broken tests immediately (dirty dishes effect)
  • Do not deactivate tests "temporarily"
  • Think about coverage (physics and lines of code)
  • Go public with your testing dashboard and issues/tickets
  • Dogfooding: use the code/scripts that you ship

Good practices (2/2)

  • Think about input sanitization (a green testing dashboard does not prevent users from passing unexpected keyword combinations)
  • Test controlled errors: if something is expected to fail, test for that
  • Make testing easy to run (make test)
  • Make testing easy to analyze
    • Do not flood screen with pages of output in case everything runs OK
    • Test with numerical tolerance (extremely annoying to compare digits by eye)
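
In pytest, numerical tolerance can be expressed with pytest.approx instead of comparing digits by eye:

import pytest

def test_sum_with_tolerance():
    result = 0.1 + 0.2
    # an exact == comparison would fail due to floating-point rounding
    assert result == pytest.approx(0.3, rel=1.0e-12)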

Unit test frameworks


py.test

  • Very easy to set up
$ pip install pytest  # ideally into a virtualenv
  • Very easy to use
def get_word_lengths(s):
    """
    Returns a list of integers representing
    the word lengths in string s.
    """
    return [len(w) for w in s.split()]

def test_get_word_lengths():
    text = "Three tomatoes are walking down the street"
    assert get_word_lengths(text) == [5, 8, 3, 7, 4, 3, 6]

Google Test

#include "gtest/gtest.h"
#include "example.h"

TEST(example, add)
{
    double res;
    res = add_numbers(1.0, 2.0);
    ASSERT_NEAR(res, 3.0, 1.0e-11);
}

pFUnit

@test
subroutine test_add_numbers()

   use hello
   use pfunit_mod

   implicit none

   real(8) :: res

   call add_numbers(1.0d0, 2.0d0, res)
   @assertEqual(res, 3.0d0)

end subroutine

CTest and CDash

  • CTest runs a set of tests through bash/perl/python scripts and reports to CDash, which aggregates the results
  • The test scripts can be anything you like
include(CTest)
enable_testing()

add_test(my_test_name ${PROJECT_BINARY_DIR}/my_test_script --flags)
  • CTest looks at the return code: 0 means success, non-zero means failure
  • CTest is a powerful test runner; no need to implement your own
  • Reporting to CDash is easy
$ make Nightly

Fun: Jenkins Chuck Norris plugin

https://wiki.jenkins-ci.org/display/JENKINS/ChuckNorris+Plugin


Fun: Jenkins "Extreme Feedback" Contraption

https://github.com/codedance/Retaliation

"Retaliation is a Jenkins CI build monitor that automatically coordinates a foam missile counter-attack against the developer who breaks the build. It does this by playing a pre-programmed control sequence to a USB Foam Missile Launcher to target the offending code monkey."


Travis and Coveralls

  • GitHub plus Travis plus Coveralls is a killer combination: use it!
  • Free for public repositories

Travis CI

Coveralls: code coverage history and stats