- Organize your work so that you have everything in a script
- The script reproduces all of your work when run from a clean workspace
- Outputs (figures, processed datasets) are disposable; your scripts can always reproduce them
17 April 2015
- Keep raw (original) data in a sub-folder, and never modify raw data
- Use projects in RStudio to manage files and your workspace
- Try to avoid very large projects; instead, split them into more manageable chunks
- As a rule of thumb, a 'project' is about the size of the analysis for a single manuscript
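The two ideas above (raw data are never modified, outputs are disposable) can be sketched in a few lines. This is only an illustration under stated assumptions: the project lives in a temporary folder, the data values are made up, and `run_analysis` and the output file name `allometry_processed.csv` are hypothetical names that do not come from a real project.

```r
# Hypothetical project in a temporary directory, so the sketch is self-contained.
proj <- file.path(tempdir(), "demo_project")
dir.create(file.path(proj, "rawdata"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(proj, "output"), showWarnings = FALSE)

# Stand-in for a raw data file (made-up numbers, for illustration only).
raw_file <- file.path(proj, "rawdata", "allometry.csv")
write.csv(data.frame(diameter = c(10, 20, 30), height = c(5, 9, 12)),
          raw_file, row.names = FALSE)
raw_checksum <- tools::md5sum(raw_file)

# The 'analysis': read raw data, derive a variable, write to output/ only.
run_analysis <- function(proj) {
  allom <- read.csv(file.path(proj, "rawdata", "allometry.csv"))
  allom$logheight <- log(allom$height)
  write.csv(allom, file.path(proj, "output", "allometry_processed.csv"),
            row.names = FALSE)
}

run_analysis(proj)

# Outputs are disposable: delete them all, then regenerate from the script.
unlink(file.path(proj, "output", "*"))
run_analysis(proj)

# The raw file is byte-for-byte unchanged after both runs.
stopifnot(tools::md5sum(raw_file) == raw_checksum)
```

Because the script only ever reads from `rawdata` and only ever writes to `output`, deleting the whole output folder costs nothing but the time to re-run.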
To keep raw data separate from scripts, functions, and outputs, a good folder structure is important. Below is an example, but this is of course flexible, and depends on the type of project.
In this simple example, we keep functions and scripts in the `R` folder, the raw data files (normally as CSV) in `rawdata`, and output is sent to the `output` folder.
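```r
# Read the raw data (never written back to rawdata/)
allom <- read.csv("rawdata/allometry.csv")

# Send the figure to the output folder
pdf("output/Figure1.pdf")
plot(height ~ diameter, data = allom)
dev.off()
```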
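If the folders do not exist yet, they can be created from R itself. The following is a minimal sketch, assuming the three folder names used in this example (`R`, `rawdata`, `output`); `make_project_skeleton` is a hypothetical helper written for this illustration, not a function from any package, and the demonstration runs in a temporary directory.

```r
# Create the project folders under a given root directory.
# 'make_project_skeleton' is a hypothetical helper, not part of any package.
make_project_skeleton <- function(root, folders = c("R", "rawdata", "output")) {
  paths <- file.path(root, folders)
  for (p in paths) {
    # recursive = TRUE also creates 'root' itself if it is missing
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demonstrate in a temporary directory (assumed project name: "myproject").
root <- file.path(tempdir(), "myproject")
make_project_skeleton(root)

# List what was created
list.files(root)
```

The folder names are passed as an argument, so the same helper works for a different layout.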
I like to have a single 'master script' that loads other scripts, each of which does one particular job, such as reading and cleaning the raw data, making figures, or fitting linear models. This script may look like the following.
```r
# Load packages
library(gplots)
library(car)

# Load functions
source("R/functions.R")

# Read raw data and add new variables
source("R/readdata.R")

# Make figures
source("R/figures.R")

# Do stats
source("R/linearmodels.R")
```
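One way to check that such a script really does run from a clean workspace is to execute it in a fresh R process with `Rscript --vanilla`, which starts without loading a saved workspace or user profile. The sketch below is an illustration only: it builds a tiny stand-in project in a temporary folder, and the file name `master.R` and the contents of the sourced script are assumptions made for the example.

```r
# Hypothetical mini-project in a temporary directory.
proj <- file.path(tempdir(), "clean_run_demo")
dir.create(file.path(proj, "R"), recursive = TRUE, showWarnings = FALSE)

# A sourced script that defines one object (stand-in for R/readdata.R).
writeLines("x <- 1:10", file.path(proj, "R", "readdata.R"))

# A master script (assumed name: master.R) that sources it and reports a result.
writeLines(c('source("R/readdata.R")',
             'cat("mean of x:", mean(x), "\\n")'),
           file.path(proj, "master.R"))

# Run the master script in a fresh R process, with the project folder as the
# working directory so that the relative path "R/readdata.R" resolves.
rscript <- file.path(R.home("bin"), "Rscript")
old_wd <- setwd(proj)
result <- system2(rscript, c("--vanilla", "master.R"), stdout = TRUE)
setwd(old_wd)

# 'result' holds the lines the script printed
print(result)
```

If the script depends on an object that exists only in your interactive session, the fresh process will stop with an error, which is exactly the problem this check is meant to reveal.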
Finally, I strongly recommend using projects in RStudio.
rm(list=ls()) in your scripts