-
-
Notifications
You must be signed in to change notification settings - Fork 373
Testing: Continuous Integration
http://d1m1s1b1.stat.columbia.edu:8080
The server is a desktop computer running Windows 7. It has an Intel Core i7 920 @ 2.67 GHz processor with 8 GB of RAM. It is located in Daniel, Bob, and Sophie's office at Columbia University.
We use Jenkins as our continuous integration server.
Daniel is currently maintaining the server.
We use our continuous integration server to assist our development process. Here are some of our goals for our continuous integration:
- Maintaining the state of the
develop
branch. We test thedevelop
branch on every merged pull request. If a merge ends up breaking the build, we are able to detect it rather quickly. - Verifying pull requests. Before pull requests are merged into the
develop
branch, Jenkins is used to verify that the tests pass. - Cross platform testing. Most of the developers are on Mac or Linux machines. Running the server in Windows catches some of the cross platform issues. Ideally, we would set up Jenkins slave processes on each of our target platforms, but we currently do not have that set up.
For additional information on continuous integration, see the Wikipedia entry.
Here are the current Jenkins projects:
Stan Develop Branch. These projects test the current state of the develop
branch.
-
This project verifies that the C++ API documentation can be built and tracks Doxygen warnings.
Build target make doxygen
Frequency daily, when changes are made to the develop
branchArtifacts current API doc -
This project verifies that the manual can be built.
Build target make manual
Frequency daily, when changes are made to the develop
branchArtifacts draft version of the manual -
This project executes and tracks the unit tests and tracks compiler warnings. On success, calls the Header Tests.
Build target make test-unit O=0 -j4
Frequency every push to develop
and dailyArtifacts none -
This project executes the header tests and tracks compiler warnings. On success, calls the Distribution Tests.
Build target make test-headers -j4 O=0
Frequency called when Unit Tests reports success Artifacts none -
[Distribution Tests](http://d1m1s1b1.stat.columbia.edu:8080/view/Stan Develop Branch/job/Stan Distribution Tests):
This project executes and tracks the distribution tests and tracks compiler warnings.
Build target make test-distributions -j4 O=0
Frequency called when Header Tests reports success Artifacts none -
This project executes and tracks the model tests and tracks compiler warnings. These tests are currently used as an integration test that will indicate whether Stan is completely broken.
Build target make test-models
Frequency daily Artifacts time to execute each model is recorded. The files can be found here: timing files
Stan Pull Requests. These projects test incoming pull requests before they are merged into develop
.
-
Stan GitHub Pull Requests Project:
This project tests pull requests. See the Testing Pull Requests through Jenkins section for more information. The workspace starts with a current copy of
develop
, merges the pull request, then runs the suite of tests. Compiler warnings are tracked.Build target make manual doxygen && make test-unit test-headers test-distributions test-models -j2 O=3
Frequency see Testing Pull Requests through Jenkins Artifacts none
rstan. These projects test the current state of the rstan
repository.
-
This project tests pull requests. See the Testing Pull Requests through Jenkins section for more information. The workspace starts with a current copy of
develop
, merges the pull request, then runs the suite of tests. Compiler warnings are tracked.Build target git submodule update
cd stan
git fetch
git checkout origin/develop
cd ..
git submodule status
R -q -e "library(Rcpp); sessionInfo()"
cd rstan
make clean
make check && (cd tests & R -q -f runRunitTests.R --args ../rstan.Rcheck)
Frequency daily Artifacts none
Miscellaneous projects. These are utility projects.
-
This project pushes the website from branch
master
to http://mc-stan.orgBuild target N/A Frequency on demand Artifacts none
- We use our continuous integration server to test pull requests prior to merging into our
develop
branch. The plugin documentation is sparse. We describe how we use the plugin. - The Stan GitHub Pull Requests project is controlled through comments on the pull request.
- We created a GitHub user called
stan-buildbot
. This is how Jenkins communicates through comments. - We maintain a list of admins and whitelisted GitHub users on Jenkins.
- A whitelisted contributor creates a pull request.
- Jenkins automatically places the pull request on the testing queue.
- After the pull request is tested,
stan-buildbot
posts the status of the build as a comment. - An admin can retest the pull request by posting a comment.
- A contributor creates a pull request.
- Jenkins recognizes the pull request.
stan-buildbot
posts a comment on the pull request requesting verification from an admin. - An admin places the pull request on the testing queue.
- After the pull request is tested,
stan-buildbot
posts the status of the build as a comment. - An admin can retest the pull request by posting a comment.
All contributors need to create a GitHub pull request from their feature branch to the stan-dev:stan:develop
branch. This is done through GitHub.
When Jenkins recognizes the pull request, the stan-buildbot
GitHub user will respond to the pull request with:
Can one of the admins verify this patch?
Jenkins is now waiting for an admin to approve the pull request for testing. An admin can do this by commenting on the GitHub pull request with:
Jenkins, ok to test.
Jenkins will then place the pull request in the queue for testing.
Jenkins will merge the pull request into a clean develop
branch. The pull request needs to be able to be merged automatically.
After running the tests, stan-buildbot
will comment on the pull request with either:
Test PASSed.
Refer to this link for build results: http://d1m1s1b1.stat.columbia.edu:8080/job/Stan%20Github%20Pull%20Requests/###/
or
Test FAILed.
Refer to this link for build results: http://d1m1s1b1.stat.columbia.edu:8080/job/Stan%20Github%20Pull%20Requests/###/
If the tests pass and the pull request passes a code review, one of the developers should merge the pull request. After the pull request is merged, this will kick off another round of tests.
An admin may want to retest the pull request. A couple common reasons would include:
- Updated pull request.
- Updated
develop
branch.
To retest the pull request, an admin can comment:
Jenkins, retest this please.
The pull request will be placed back on the testing queue. Jenkins will respond with a comment indicating pass or fail as above.
Every subsequent push to the originating feature branch will put the pull request on the testing queue. This happens for every push, tagged with the commit. If the feature branch is still active, this could result in the same pull request being tested multiple times holding up other pull requests from being tested.
- Admins:
- bob-carpenter
- syclik
- Whitelist:
- all admins are automatically whitelisted
- Default messages (these can be reconfigured):
- Request for testing phrase:
Can one of the admins verify this patch?
- Success:
Test PASSed
. - Failure:
Test FAILed
.
- Request for testing phrase:
If you would like to be added to either the whitelist or admin list, please get in touch with Daniel.
-
Accept to test phrase:
Jenkins will comment:
Can one of the admins verify this patch?
An admin can accept the pull request with:
Jenkins, ok to test.
The acceptance regex:
.*ok\W+to\W+test.*
-
Test phrase:
To retest the pull request, an admin can comment:
Jenkins, retest this please.
The test regex:
.*test\W+this\W+please.*
Although an admin can just type
test this please
, it could be misinterpreted as an imperative statement directed to the creator of the pull request. -
Add to whitelist:
To add the contributor to the whitelist, an admin can either manage the whitelisted users through Jenkins or respond with:
Jenkins, add to whitelist.
The add to whitelist regex:
.*add\W+to\W+whitelist.*
To restart a Travis job:
- Log into https://travis-ci.org with your GitHub credentials. It'll pull up all the repos you have access to on the left.
- Find the job that's broken. Here's an example: I clicked on the "stan-dev/stan" link (in green) on the left. Then I clicked on "Branches". Here, you can see that "feature/issue-2012-make" has an error. <Screen Shot 2016-08-17 at 10.59.27 AM.png>
- Click on the red "#1750 errored" link. That's the build that's broken. It'll bring up a view of all the build jobs that it kicked off. I've scrolled down a little here to show a couple of the broken build jobs: <image.png>
- Click on one of the red links. Here, I clicked on "1750.18". It'll pull up detail of the build. I added the green arrows. <image.png>
- Restart! Click on the the icon I've marked with green arrows. The hover-over tool-tip says "Restart job"
For a tutorial on interacting with the current Jenkins jobs, please see this discourse post: http://discourse.mc-stan.org/t/new-jenkins-jobs-tutorial/2383
The rest of this wiki is historical.
Our Jenkins configuration isn't the best. I, Daniel, am trying to take inventory of our current situation so we can make it better. We've evolved our Jenkins config over the years. I'm the only one that's keeping tabs on it, for now.
We are currently having problems with false positives (Jenkins failing in some way and this causing a slow-down in our development). If we could cut that down, that would be great.
My maintenance of Jenkins has been fairly minimal. Day-to-day, I tend not to look at Jenkins config. But when things break, I spend some time with it.
These are the repositories tested using Jenkins:
There is also a dependency between the repos managed by git submodules:
math <- stan <- cmdstan
Math is standalone.
Math is a Stan submodule. It's located at ./lib/stan_math
.
Stan is a CmdStan submodule. It's located at ./stan
.
We test each pull request for the three repositories. We also test every time a merge happens into the develop
branch. Why do we do both? Honestly -- weird shit happens sometimes and even a merge that seems safe ends up breaking the develop
branch. (It hasn't happened in a while, but it has happened.) I'd just rather know when that happens.
Note: we should really be testing more on Windows. Travis-ci tests Linux and Mac.
Link to Jenkins job: Math Pull Request
This is a MultiJob project (that's what Jenkins calls it).
- Phase: Documentation
-
Doxygen
- Requires: doxygen
- Build: clean, then
make doxygen
- Post-build: scan for compiler warnings (Doxygen).
-
Doxygen
- Phase: Warnings
-
cpplint
- Requires: python2.7. Cpplint doesn't run on Python 3
- Build:
make cpplint || echo '--- cpplint has errors ---'
(mask the error message so the post-build action happens) - Post-build: scan for compiler warnings (CppLint)
- math dependenices
- Build:
make test-headers
- Post-build: scan for compiler warnings (math-dependencies)
- The parser for
math-dependencies
is a custom one written in Jenkins. Here's what it looks like:-
Name:
math-dependencies
-
Link name:
mc-stan.org
-
Trend report name:
math-dependencies
-
Regular expression:
^(.*)\((.*)\):\s(.*\.)\s*\[(.*)\]\s*$
-
Mapping script:
``` import hudson.plugins.warnings.parser.Warning String fileName = matcher.group(1); String lineNumber = matcher.group(2); String category = matcher.group(4); String message = matcher.group(3); return new Warning(fileName, Integer.parseInt(lineNumber), "Dynamic Parser", category, message); ```
-
Example log message:
src/stan/math/prim/scal/meta/OperandsAndPartials.hpp(21): File includes a stan/math/rev header file. [prim]
-
- Build:
-
cpplint
- Phase: Quick tests
-
Unit tests
- Build:
./runTests.py -j${PARALLEL} test/unit
- Post-build:
- scan for compiler warnings (GNU C Compiler 4 (gcc))
- publish junit test report (test report xml:
test/**/*.xml
)
- There are ~6000 tests
- Build:
-
Header tests
- Build:
make -j${PARALLEL} test-headers
- Post-build: scan for compiler warnings (GNU C Compiler 4 (gcc))
- Build:
-
Unit tests
- Phase: Long test
-
Distribution tests
-
Builds only on gelman-group-mac. Compiled tests take up a lot of space... something like ~200 GB.
-
Also takes a lot of time. About 5.5 hours.
-
Build:
-
Conditional step. Execute shell:
## searches for changes to source in the Math Library stan/math git diff --name-only $sha1 $(git rev-parse origin/develop) | grep -c stan/math
-
run:
echo 'CC=clang++' > make/local echo 'O=0' >> make/local ./runTests.py -j${PARALLEL} test/prob
-
-
Not only are the executables and the output large, this generates a lot of test results. Something like 600k.
-
-
Distribution tests
- Phase: Upstream projects.
-
Stan
- This takes the branch specified in
stan_branch
, which defaults todevelop
, and the pull request's sha, inmath_sha
, and merges the math pull request with develop. Then it runs the Stan tests. - Build
-
prep:
make math-update/develop cd lib/stan_math git merge ${math_sha} --ff
-
run:
./runTests.py src/test
-
- Post-build: publish junit report.
- This takes the branch specified in
-
CmdStan
- This takes the branch specified in
cmdstan_branch
(defaultdevelop
),stan_branch
(defaultdevelop
), and the pull request's sha, inmath_sha
, and merges the math pull request with develop. Then it runs the CmdStan tests. - Build
-
prep:
``` make stan-update/${stan_branch} cd stan make math-update/develop cd lib/stan_math git merge ${math_sha} --ff cd .. ```
-
./runCmdStanTests.py src/test
-
- Post-build: publish junit report.
- This takes the branch specified in
-
Stan
Link to Math.
- Update Stan
-
This is my really poor attempt at updating Stan's submodule now that Math has been updated. It grabs a script I wrote from https://github.com/syclik/stan-scripts/blob/master/jenkins/create-stan-pull-request.sh and runs it.
-
- Documentation
- Warnings
- Tests
Note: no upstream tests.
This is also a MultiJob project. Link to Stan Pull Request.
-
Documentation
-
Doxygen
- build:
make doxygen
- build:
-
Reference Manual
- build:
make manual
- build:
-
Doxygen
-
Stan Tests
-
Header
- build:
make test-headers
- build:
-
Unit
- build:
./runTests.py -j${PARALLEL} src/test/unit
- post-build:
- scan for compiler warnings (GNU C Compiler 4 (gcc))
- Publish junit test report: ~1100
- build:
-
Integration
- build:
./runTests.py -j${PARALLEL} src/test/integration
- post-build:
- scan for compiler warnings (GNU C Compiler 4 (gcc))
- Publish junit test report: ~60
- build:
-
Windows Unit
- build:
.\runTests.py -j%PARALLEL% src/test/unit
- post-build:
- scan for compiler warnings (GNU C Compiler 4 (gcc))
- Publish test report: ~1100
- build:
-
Header
-
Warnings
-
Cpplint
- build:
make cpplint || echo '--- cpplint has errors ---'
- post-build: scan for compiler warnings (CppLint)
- build:
-
Cpplint
-
Performance
-
Performance
-
This is really more like a regression test. This checks a few things:
- the Stan program still compiles
- when repeated, results are the same given a seed
- results are the same given a seed compared to some reference value (only useful on a single machine with the same settings)
- shows timing. want to make sure we're not completely messing up.
-
build:
``` ./runTests.py src/test/performance cd test/performance RScript ../../src/test/performance/plot_performance.R ```
-
post-build:
- archive artifacts: test/performance/performance.csv, test/performance/performance.png
- publish junit report
-
-
Performance
-
Upstream projects (envision a few more in the future, but maybe not.)
-
CmdStan
- build:
-
update math library:
git clean -d -x -f git submodule foreach --recursive git reset --hard make stan-update/develop cd stan git pull --ff git merge ${stan_sha} --ff make math-revert
-
run:
./runCmdStanTests.py src/test
-
- post-build:
- publish junit report: ~50
- build:
-
CmdStan
Link to Stan.
-
Update CmdStan
-
Documentation
-
Tests
-
Warnings
-
Performance
Note: no upstream tests.
A simple project: CmdStan
- build:
-
update math:
git clean -d -x -f make clean-all git submodule update --init --recursive make stan-revert make stan-update make
-
run tests:
./runCmdStanTests.py src/test/interface
-
build manual:
make manual
-
- post-build:
- scan for compiler warnings (GNU Make + GNU C Compiler (gcc))
- publish junit test report: ~50
- publish performance test report. We never look at this.
These are jobs that don't pertain to any of the pull requests or builds when develop
changes.
-
This project runs the performance test on the Mac Pro when nothing else is running. It's run once a day and waits until the machine is free. I don't know how to programatically test performance. This is here so we can tell if something bad has happened with speed. It doesn't test nearly enough, but it's a start.
-
The backup of the Jenkins config is on a github repo. This is just a test job to make sure the repo is reachable.
Master (d1m1s1b1.stat.columbia.edu):
- iMac. OSX 10.9.5
- 2.93 GHz Intel Core i7 (4 core)
- 16 GB
- 250 GB
gelman-group-mac (gelman-group-mac.stat.columbia.edu):
- Mac Pro. OSX 10.11.1
- 3 GHz 8-Core Intel Xeon E5
- 64 GB
- 500 GB
Windows Box (gelman.stat.columbia.edu):
- Dell. Windows 7 Enterprise
- Intel Xeon CPU E5507 @ 2.27 GHz 2.26 GHz (2 processors) (shows as 8 cores)
- 12 GB
- 464 GB
gelman-group-win (gelman-group-win.stat.columbia.edu):
- HP Z640 Workstation. Windows 7 Professional
- Intel Xeon CPU E5-2630 @ 2.40 GHz (shows as 16 cores)
- 32 GB
- 918 GB
unamed-server:
- HP Z640 Workstation.
- Intel Xeon CPU E5-2630 @ 2.40 GHz
- 32 GB
- 918 GB
Andrew's broken laptop:
- Macbook Pro Retina
- 2.3 GHz Intel Core i7
- 16 GB
- 153 GB
Here are some of the features we rely on. And some of the features we wish to have.
- GitHub authentication
- Backup to git
- Better notification when build passes or fails. We used to have that through comments, but it's not working any more.
- Reduce failing builds due to GitHub being unreachable.
- Encrpytion across nodes?
- Better GitHub notification. Currently, all builds say: "Build finished. No test results found."
- Easier retest if something like a GitHub timeout caused the error.
- Make it easier to control the upstream project branch? For example, if we know a change to Stan will break CmdStan, I have to manually go into the project and add put in the branch name.
- Update the upstream branch script needs a lot of work. I want to update the existing issue and pull request if it's still open. Currently, it just adds a new issue and a new pull request.
- Test that the submodule is pointed to some place on develop and make sure to update to latest develop before merging.