2 Importing and exporting data
For the remainder of this course, we will be working with a single data set to answer a number of simple biological questions about the effects of maternal cigarrette use during pregnancy. This data is a table of parental and newborn birth weights adapted from a data set provided by Mathematics and Statistics Help at the University of Sheffield. It has been slightly modified to allow us to demonstrate a few more functions, but the data is largely unchanged.
Begin by setting up a working directory for this course.
We’re going to do all the work for this course in the directory we created in the last section. To make things easier, we’ll set R’s working directory to the directory where we saved our notebook. To do this, select the “Session” menu in the toolbar, then navigate to “Set Working Directory,” and click on “Choose Directory.” Choose the directory in which you saved your notebook.
When you do so, you should see that a line of code ran in your console:
This is the GUI at work! When you clicked on a directory in the file browser, R was actually running the code displayed in your console. In the future, you can set your working directory that way instead.
Download the data.
download.file("https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2022_February_Introduction_to_R_for_Bioinformatics/main/birthweight.csv", "birthweight.csv")
2.1 Import data using read.csv()
Manual data entry is time-consuming and leads to errors. R has a number of functions for reading data in a variety of formats. Let’s use the read.csv()
function to read in a spreadsheet containing data from an experiment.
CSV stands for “comma separated value,” and the CSV file is simply a text file where each row in the file represents a row in the data table, and the columns are separated by commas. The contents of the CSV file are now stored in the variable “birthweight.”
2.2 Export data using write.csv()
In the course of our analysis, we will add metrics to this data set. When we’re finished, we will want to be able to save our analyses. To write the contents of the birthweight object to a new CSV, we can use the write.csv()
function.
The similar read.delim()
and write.delim()
can be used to read and write tab-delimited files, where columns are separated by tab characters rather than commas.