Historically, continuous integration (CI) is the practice of merging the code that developers work on individually into a main stream, usually several times a day. Merging this often keeps conflicts within the codebase few and small. Nowadays, however, CI usually refers to automated builds of a git repository that check that everything runs, tests, and builds as expected in a clean, new environment (i.e. on a server). For this reason, I will only refer to CI in the context of git repos.
Automated CI services are available on GitHub through Travis CI.
Travis is a company independent of GitHub; however, it currently
only works with GitHub. There is the .org version of Travis for public GitHub
repos and a .com version for private repos. Other CI services are available for
GitLab and BitBucket (services similar to GitHub). Through Travis, you can get
your git repo and its associated code, files, and documents built, tested, or
whatever else you want done! This works for really anything that is command
based, including things such as pandoc, rmarkdown, or website build tools.
The main use of CI is for automated testing of your code in a separate environment, to ensure that any new additions to the code don’t cause any breaks or conflicts. Other common uses are to build websites that first require code or documents to be generated before pushing to the live website.
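To make the idea of "run my commands in a clean environment and tell me if anything broke" a bit more concrete, here is a minimal sketch in Python. It is not how Travis is implemented; it only assumes the usual convention that a command succeeded when it exits with code 0. The function name `run_steps` and the two example command lists are made up for illustration.

```python
import subprocess
import sys


def run_steps(commands):
    """Run every command; the 'build' passes only if all of them exit with code 0."""
    exit_codes = [
        subprocess.run(command, capture_output=True).returncode
        for command in commands
    ]
    return all(code == 0 for code in exit_codes)


# Hypothetical steps standing in for real test commands.
passing = [[sys.executable, "-c", "assert 2 + 2 == 4"]]
failing = [[sys.executable, "-c", "assert 2 + 2 == 5"]]

print("passing build ok?", run_steps(passing))
print("failing build ok?", run_steps(failing))
```

A CI service does roughly this with the commands you give it, on a freshly created machine, every time you push.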
There are many reasons to use CI, especially if your codebase is large and complex, if multiple people work on the code, if others rely on it, or if it is for production (industry or commercial) purposes. If your code is small, simple, or a one-time thing, CI is often not necessary. BUT! Using CI helps make your code more robust and more likely to be accurate and correct… which is especially important in most if not all scientific fields.
.travis.yml file
As I mentioned, you use Travis for GitHub. In order to use Travis, you need to sign up and follow the getting started steps. Travis has many supported languages, including R and Python.
The two main things you need for Travis are to include a .travis.yml file in
your GitHub repo and link Travis to it via the Travis website.
.travis.yml file
language: r
cache: packages # to reduce the build time for dependent packages
r:
  - release
  - devel
script:
  # If building packages
  - R CMD build .
  - R CMD check *.tar.gz
  # If running from an R or shell file
  - Rscript test.R
  # or:
  - sh ./deploy.sh
Package dependencies are by default searched for in the DESCRIPTION file. Otherwise,
specific packages can be installed by including this in the .travis.yml file:
r_packages:
  - "packagename"
If you are creating packages and want to use Travis to test them, use
the R command devtools::use_travis() to get you started (in more recent
releases this helper lives in the usethis package, as usethis::use_travis()).
Building websites via R Markdown is relatively straightforward. You first need
to create a deploy.sh file of some sort (see this file as an example) and
include Travis commands to push to a gh-pages branch (see this file as
an example). For a more detailed example and walkthrough, see this Gist.
Let’s work through a (very simple and silly) example. Create four files: .travis.yml,
function.R, test.R, and DESCRIPTION (this file is needed for
R with Travis). (As a side note, as above, I will not be covering formal
unit testing in this lesson.)
.travis.yml
language: r
cache: packages
# build in two different OS
r:
  - release
os:
  - linux
  - osx
script:
  - Rscript test.R
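Listing several values under keys such as `r:` and `os:` makes Travis run one job per combination of those values (its "build matrix"). As a rough illustration, assuming only the two keys used in the file above, the expansion can be sketched in Python like this; `expand_matrix` is a made-up helper, not part of Travis:

```python
from itertools import product

# Values copied from the .travis.yml above.
r_versions = ["release"]
operating_systems = ["linux", "osx"]


def expand_matrix(**axes):
    """Return one dict per combination of the given axis values."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


jobs = expand_matrix(r=r_versions, os=operating_systems)
for job in jobs:
    print(job)
```

With one R version and two operating systems this gives two jobs; adding `devel` to the R versions would give four, which is worth keeping in mind since every job takes build time.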
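function.R
# Add two numbers.
add_nums <- function(num1, num2) {
    num1 + num2
}
# Randomly keep half of the rows of a data frame.
random_half_split <- function(data) {
    size <- nrow(data) %/% 2 # integer division, so odd row counts still give a whole number
    dplyr::sample_n(data, size)
}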
test.R
library(testthat)
source("function.R")
expect_equal(add_nums(2, 2), 4)
actual_size <- nrow(random_half_split(iris))
expect_equal(actual_size, 75)
DESCRIPTION
Package: testing_ci_r
Type: Package
Title: Testing out
Version: 0.0.0.9000
Authors@R: person("Luke", "Johnston", email = "lwjohnst@gmail.com",
    role = c("aut", "cre"))
Description: Testing around
LazyData: TRUE
# packages to install
Imports:
    dplyr,
    testthat
RoxygenNote: 6.0.1
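Since the dependencies are looked up in the DESCRIPTION file, it can help to see which package names the Imports field actually contains. The sketch below is only an illustration in Python of reading that field from a shortened copy of the file above; it is not what Travis or R runs internally, `read_imports` is a made-up name, and it does not handle version constraints such as `dplyr (>= 0.7)`.

```python
# Shortened copy of the DESCRIPTION file above (continuation lines are indented).
DESCRIPTION_TEXT = """\
Package: testing_ci_r
Type: Package
Imports:
    dplyr,
    testthat
RoxygenNote: 6.0.1
"""


def read_imports(description_text):
    """Collect the package names listed under the Imports field."""
    packages = []
    in_imports = False
    for line in description_text.splitlines():
        if line.startswith("Imports:"):
            in_imports = True
            line = line[len("Imports:"):]
        elif in_imports and not line[:1].isspace():
            # A line that is not indented (or is empty) ends the Imports field.
            in_imports = False
        if in_imports:
            packages += [name.strip() for name in line.split(",") if name.strip()]
    return packages


print(read_imports(DESCRIPTION_TEXT))
```

If a package your test script needs is missing from this list, the build will most likely fail when the script tries to load it.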
.travis.yml file
language: python
python:
  - "3.6"
# install dependencies
install:
  - pip install -r requirements.txt # if you put your dependencies in a file
script: pytest
I am not nearly as familiar with Python; however, the same things that apply to R
also apply to Python. You (optionally) install dependencies and use the script
option to specify which commands or file to run the tests with.
As with R, let’s do a simple and silly example. Create three files: .travis.yml,
function.py, and test_function.py. (As a side note, as above, I will not be covering
formal unit testing in this lesson.)
.travis.yml
language: python
python:
  - "3.6"
script: pytest
function.py
def add_nums(num1, num2):
    added = num1 + num2
    return added
test_function.py
import function
def test_add_nums():
    added = function.add_nums(2, 2)
    assert added == 4
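The R example also tested a function that randomly keeps half of a dataset. A rough Python counterpart, using only the standard library, could look like the sketch below. The function and test are shown in one block so it runs on its own; in the example project they would presumably be split across function.py and test_function.py. The list of 150 integers is a stand-in for the 150 rows of iris.

```python
import random


def random_half_split(rows):
    """Return a random sample containing half of the rows (rounded down)."""
    size = len(rows) // 2
    return random.sample(rows, size)


def test_random_half_split():
    rows = list(range(150))  # stand-in for the 150 rows of iris
    half = random_half_split(rows)
    # The sample is random, so test its size and membership, not its contents.
    assert len(half) == 75
    assert set(half) <= set(rows)
```

Because pytest collects functions whose names start with `test_`, the existing `script: pytest` line would pick this test up without any change to .travis.yml.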
In the live version of the lesson, I’ll cover simple examples of using Travis
for R and Python. Since I am more familiar with R, I’ll be covering R examples
in more detail. In general, the lesson will go over setting up Travis, creating
the .travis.yml
file, getting Travis to run, and seeing the results for the
CI build.
There are so many things you can do with Travis (or any CI for that matter)! Simple uses are pretty straightforward, though there is a learning curve. More complex uses are, well, more complex to set up. In the end, setting up Travis/CI can save you time and reduce your workload, as it automates many of the repetitive tasks you may do when creating code or code-dependent documents.