Setting up your computer

Fleur Le Mire edited this page Jul 18, 2023 · 12 revisions

The following are the core tools we use in our projects.

We recommend that anyone who will be working on our projects take the time to become familiar with GitHub and Git, as well as any of the other tools they will be using actively. We expect RAs to be proficient in all of the core tools.

Individual GitHub repositories document their own requirements, including specific modules or libraries as well as software not listed here.

Communication

  • Please install Microsoft Teams (log in with your @tilburguniversity.edu email address)
  • Please install Zoom (log in with the domain tilburguniversity.zoom.us; you will then be redirected to the Tilburg University login page).

Collaboration and Coding

GitHub & Git

  • We store our project repositories on GitHub.
  • We use Git to track changes to our code, using the default "main" branch for production, and self-named development branches for new features.
  • We use "Issues" on GitHub to manage tasks and structure communication around projects.
  • We use "Projects" on GitHub to brainstorm and keep track of new ideas and tasks (the "backlog"), and to manage our current Scrum-like sprint in "to do", "in progress", and "done" columns.
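
The branch convention above can be sketched as follows. This is a minimal illustration, not our prescribed workflow: it runs in a throwaway repository, and the identity, file names, and branch name (jane-clean-data) are hypothetical.

```shell
# Throwaway repository so the example is safe to run; in a real project you
# would start from a `git clone` of one of our repositories instead.
demo_repo="$(mktemp -d)"
cd "$demo_repo" || exit 1
git init --quiet
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called "main"
git config user.name "Jane Doe"          # hypothetical identity, stored in this repo only
git config user.email "jane@example.com"

# A first commit on main, standing in for the existing production code.
echo "# Demo project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Start a development branch for the new feature (the name is illustrative).
git checkout --quiet -b jane-clean-data
echo "# cleaning steps go here" > clean_data.R
git add clean_data.R
git commit --quiet -m "Add data cleaning script"

# main stays untouched until the branch is merged, e.g. through a pull request.
git log --oneline --all

# With a real remote you would publish the branch with:
#   git push -u origin jane-clean-data
```

Keeping new work on its own branch means "main" always reflects code that is ready for production.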

Anyone who is new to Git / GitHub should start by reading the following GitHub guides:

DataCamp Courses:

For premium DataCamp subscription go to: Premium DataCamp access

Other valuable guides are:

The book Pro Git is a standard source of more detailed documentation. It provides a good practical reference for how to execute specific tasks. It is freely available online.

For a deeper understanding, we recommend chapters 4-9 of the book Version Control with Git. It provides a more detailed ground-up explanation of how Git works and why. Many aspects of Git are confusing to newcomers, and understanding the underlying structure will make your work more efficient (and less frustrating!).

Setup Steps:

Create a GitHub account and install the Git desktop and command line clients; follow these instructions if you run into trouble. Then give your GitHub username to a team member who can grant you access to the appropriate repositories.

Git Large File Storage (LFS)

Git LFS is a separate piece of software that allows Git to handle large files. We require everyone running one of our repositories to have Git LFS installed, because a large file committed directly by accident stays in the repository history, which slows down every clone, and GitHub rejects pushes containing files larger than 100 MB.

You can read more on how to work with large files on GitHub here.

Setup Steps:

Install Git LFS. Note that you only need to do step 1 under "Getting Started" at this point.
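
After installing, a quick check along these lines can confirm that Git can find the extension. It is a sketch only; the variable name lfs_status is our own, and git lfs install is the one-time, per-user setup command from step 1.

```shell
# Check that the Git LFS extension is available to Git.
if git lfs version >/dev/null 2>&1; then
    lfs_status="installed"
    # One-time, per-user setup (step 1 of "Getting Started"): registers the
    # LFS filters in your global Git configuration.
    git lfs install
else
    lfs_status="missing"
    echo "Git LFS not found: install it before cloning our repositories." >&2
fi
echo "Git LFS status: $lfs_status"
```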

Hugo

If you want to make changes to the website and see how these would look, you need to host the website locally with Hugo. You can read how to install Hugo on this site.

To see how to run the site locally, see our contribute page.

Some resources for learning more about building a website with Hugo are our tutorial on Hugo or this article.

Quarto

For some projects, Quarto may be useful for publishing articles, reports, presentations, websites, blogs, and books.

See this comprehensive guide to using Quarto and explore multiple tutorials.

Data

We use a combination of different services for storing data. Check with your colleagues on which data storage service they expect you to use.

Research Drive

Research Drive is a data storage service by SURF, to store files that are too big for GitHub (even with LFS), files that need to be shared across multiple projects, archives from pre-GitHub projects, and other kinds of files that don't have a natural home in repositories (e.g., raw data). Think of it as a sort of Dropbox.

Setup Steps:

If you are an employee at Tilburg University, you can request access to Research Drive by contacting the Research Data Office. Then ask a member of your team to give you access to the directories you will need for your project(s).

Accessing files:

Use rclone to selectively access, download, upload, move and delete data on Research Drive. This will save you disk space - you won't need to download the entire project directory on your local disk.

Setup Steps:

On your personal machine, run the command brew install rclone (macOS with Homebrew) or download rclone from rclone.org. On computing services and research clusters, make sure that module load rclone is in your ~/.bash_profile. In either case, follow these instructions to access Research Drive via rclone.
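
Once rclone is configured, selective access might look like the sketch below. The remote name RD and the helper functions rd_list and rd_fetch are hypothetical; use whatever remote name you chose during rclone config.

```shell
# Name of the rclone remote; "RD" is a hypothetical name chosen during `rclone config`.
RD_REMOTE="${RD_REMOTE:-RD}"

# List the directories directly under a Research Drive folder.
rd_list() {
    rclone lsd "${RD_REMOTE}:$1"
}

# Download one folder (not the whole project) into a local directory.
rd_fetch() {
    rclone copy "${RD_REMOTE}:$1" "$2" --progress
}
```

For example, rd_list "My Project" shows the subfolders of a (hypothetical) project folder, and rd_fetch "My Project/raw" ./raw copies only the raw data to your machine.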

Amazon Web Services / AWS S3

Another frequently used service is AWS S3, which can be considered a commercial counterpart to Research Drive. To access files, you need the AWS Command Line Interface (AWS CLI, packaged as awscli). Follow these steps to install it on your computer.

Does your project involve AWS S3? Then please approach a team member (typically your team leader) to obtain the necessary login credentials.
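
With credentials in hand, basic access could be sketched as below. The bucket name my-project-bucket and the helpers s3_list and s3_download are hypothetical placeholders, not names used in our projects.

```shell
# Hypothetical bucket name; ask your team leader for the real one.
S3_BUCKET="${S3_BUCKET:-my-project-bucket}"

# One-time setup: `aws configure` prompts for the access key, secret key,
# and default region you received from your team.

# List the objects under a prefix ("folder") in the bucket.
s3_list() {
    aws s3 ls "s3://${S3_BUCKET}/$1"
}

# Download a single file from the bucket to a local path.
s3_download() {
    aws s3 cp "s3://${S3_BUCKET}/$1" "$2"
}
```

For example, s3_list raw/ lists the files under the raw/ prefix, and s3_download raw/data.csv ./data.csv fetches one of them.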

Stats & Coding Software

R and RStudio

R is one of the main statistical packages we use. Here are some of our tips on how to learn R. Some other good resources for learning R are the Analysis and Programming tutorials from Software Carpentry. R4DS is a great introduction to all stages of the data analysis pipeline. Datacamp is a very useful set of interactive courses covering a wide range of topics; several introductory courses are free, while more advanced courses require a subscription.

Setup Steps: Install R and RStudio.

Stata

Stata is the other main statistical package we use. Some resources for learning Stata are the UNC Population Center Tutorial and Christopher Baum's lecture A Little Bit of Stata Programming Goes a Long Way.

Setup Steps: Install Stata.

Python

We use Python to control the running of code and for many other data building and analysis tasks. We compiled a list of resources to get you started learning Python. There are also many other excellent online introductions to Python, including this one from Software Carpentry. The book Learning Python by Mark Lutz is a definitive manual. Datacamp is a very useful set of interactive courses focusing specifically on data analysis; several introductory courses are free, while more advanced courses require a subscription.

Python exists in two major versions, Python 2 (versions numbered 2.X) and Python 3 (versions numbered 3.X). Python 2 reached end of life in January 2020 and is no longer maintained. We use Python 3 for our projects and you should make sure you have this version installed.

Setup Steps:

Most Mac / Linux machines and many Windows machines come with Python installed. You can check by opening a terminal / bash window and typing python (on some systems the Python 3 interpreter is only available as python3). If it is installed, you should see a welcome message indicating the version of Python (you can type quit() to exit). If it is not installed, or if you only have the 2.X version, you can follow these installation instructions.
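
The same check can be scripted without entering the interactive interpreter. This is a small sketch that assumes the interpreter is exposed as python3; the variable name python_major is our own.

```shell
# Report the installed Python 3 version, if any.
if command -v python3 >/dev/null 2>&1; then
    python3 --version
    # Extract just the major version number from the interpreter itself.
    python_major="$(python3 -c 'import sys; print(sys.version_info[0])')"
else
    python_major="none"
    echo "python3 not found: follow the installation instructions." >&2
fi
echo "Python major version: $python_major"
```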

LaTeX

Overleaf

Many research projects make use of an online LaTeX editor, such as Overleaf. Get a login for the service, and ask your team leader to share relevant documents with you.

LyX

Many researchers use LyX as an "offline" alternative to Overleaf. Documentation is on the main page here. Note that you will need to install a TeX system such as MiKTeX on Windows or MacTeX on macOS before installing the LyX software itself. Instructions for this are on the LyX download page.

Setup Steps:

Install LyX

Make and other optional tools

Some other tools may be needed depending on your project needs and level of automation. For instance, some workflows can be automated using Make. If you're interested in pipeline automation using Make, we suggest you read this tutorial.

Computation & Research Clusters

Tilburg University's Blade

Sometimes running an analysis on your own PC would take too long, or your PC may not be powerful enough to run it at all. In such cases, you should use one of the many research clusters or compute services available at Tilburg University, like Blade. In the Appendices, you can find a guide on how to set up Blade.

Research Cloud

For other projects, we make use of Research Cloud. Read here about what it is, and how to use it.