Exercises: Basics
1. Data Types and Basic Operations
For introductory concepts, please see this course.
You are studying Drosophila (fruit flies) in a genetics lab. Each of four vials contains a different number of flies after a week of growth.
You’ve created a vector recording all the counts.
# Please run the following code:
flies <- c(23, 17, 28, 21)
Now that the flies are counted, you are ready to analyse the counts.
Check what type of data flies contains using typeof(flies).
Calculate the total number of flies across all vials. Hint: use help.search("sum") to find helpful functions (commands).
Compute the average (mean) number of flies.
Find how many vials have fewer than the average number of flies.
Suppose each vial can hold up to 30 flies. Create a new vector called remaining_space showing how many flies could still fit in each vial.
# help.search() lets you look for R functions or topics by keyword when you don’t know the exact function name.
# Your code here:
Just for fun: call summary(flies) and look at the output.
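The tasks above lean on R's vectorised arithmetic and comparisons. As a minimal sketch on made-up data (the vector eggs and the value capacity are assumptions for illustration, not part of the exercise), the same building blocks look like this:

```r
# Hypothetical example data: eggs counted on three plates (not the exercise data)
eggs <- c(4, 9, 6)
capacity <- 10                 # assumed maximum number of eggs per plate

typeof(eggs)                   # "double": numbers typed without an L suffix are doubles
total <- sum(eggs)             # 19
avg <- mean(eggs)              # about 6.33
below_avg <- eggs < avg        # element-wise comparison: TRUE FALSE TRUE
n_below <- sum(below_avg)      # each TRUE counts as 1, so this is 2
free_slots <- capacity - eggs  # the single value is recycled: 6 1 4
```

Summing a logical vector is a common way to count how many elements satisfy a condition.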
2. Data Frames and Subsetting
For data frames and subsetting, please see this course.
You measured expression levels for three genes in four Drosophila samples. The data frame looks like this:
# Please run the code below
expression <- data.frame(
  Gene = c("CEP290", "GATA2", "IFT190"),
  Sample1 = c(12.4, 7.8, 15.3),
  Sample2 = c(13.1, 8.6, 14.9),
  Sample3 = c(11.8, 9.0, 15.7),
  Sample4 = c(12.9, 8.3, 16.1)
)
Tasks:
Use head() and str() to inspect the data frame.
Use names() or colnames() to see the column names.
Extract all expression values for the gene IFT190.
Extract expression values from Sample2 only.
Extract the expression value of CEP290 in Sample3 (hint: you can use row and column indexing, e.g. expression[row, column]).
Create a new data frame containing only the GATA2 and IFT190 rows.
Add a new column called Average showing the mean expression for each gene across all samples.
Which gene has the highest average expression? (Use which.max().)
Use subset() to extract all rows where the average expression is greater than 12.
# Your code here
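The subsetting tasks above combine logical conditions, [row, column] indexing and row-wise summaries. The sketch below shows those pieces on a made-up data frame; toy, its columns and the cutoff of 2 are assumptions for illustration, not the exercise data.

```r
# Hypothetical toy data frame (names and values are made up for illustration)
toy <- data.frame(
  Name = c("a", "b", "c"),
  X1 = c(1, 5, 3),
  X2 = c(3, 7, 1)
)

row_b  <- toy[toy$Name == "b", ]               # one row, picked by a logical condition
col_x2 <- toy$X2                               # one column, returned as a vector
cell   <- toy[toy$Name == "a", "X2"]           # single value via [row, column]: 3

# Average over the numeric columns only; the Name column is left out
toy$RowMean <- rowMeans(toy[, c("X1", "X2")])  # 2 6 2

top <- toy$Name[which.max(toy$RowMean)]        # name of the row with the largest mean
big <- subset(toy, RowMean > 2)                # rows whose mean exceeds 2
```

Selecting the numeric columns by name before calling rowMeans() avoids an error from the character column.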
3. Basic Visualisation
For basic visualisation plots, please see this course.
You measured the wing lengths (in millimetres) of 20 fruit flies from a single population and recorded them:
# Please run the code below
# Wing length measurements (mm)
wing_lengths <- c(2.3, 2.5, 2.4, 2.6, 2.2, 2.8, 2.4, 2.5, 2.7, 2.6, 2.3, 2.9, 2.8, 2.7, 2.6, 2.4, 2.5, 2.3, 2.6, 2.7)
Create a histogram of wing_lengths (hint: use help(hist) for more info).
Create a boxplot of wing_lengths (hint: use help(boxplot) for more info).
# Your code here
Generate a new set of measurements for a second population and store it in wing_lengths_2.
Create side-by-side boxplots comparing the two populations.
# Your code here
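One possible way to approach the plots above is sketched here on simulated data. Using rnorm() to generate values is an assumption (the exercise does not say how the second population should be created), and group_a, group_b, the means and the standard deviation are all made up for illustration.

```r
set.seed(1)  # makes the simulated values reproducible

# Hypothetical measurements for two groups (simulated, not real data)
group_a <- round(rnorm(20, mean = 2.5, sd = 0.2), 1)
group_b <- round(rnorm(20, mean = 2.8, sd = 0.2), 1)

# Histogram of a single group
hist(group_a, main = "Group A", xlab = "Length (mm)")

# A named list gives one box per element, drawn side by side
boxplot(list(A = group_a, B = group_b), ylab = "Length (mm)")
```

Passing a named list to boxplot() labels each box with the element name, which keeps the two groups easy to tell apart.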
4. Control Structures and Functions
For control structure syntax, please see this course.
For basic function syntax, please see this course.
You are studying the growth of plants under different light conditions. After one week, you measured the growth (in cm) of 10 plants.
# Please run the code below
# Plant growth in cm after one week
growth <- c(3.5, 5.2, 2.8, 4.6, 5.9, 6.3, 3.1, 4.9, 2.7, 5.5)
Loop through each plant’s growth value and print a message in the format “Plant X grew Y cm”, where X is the plant number and Y is its growth. There are two options: (1) loop directly over the values; (2) loop over an index, using for (i in 1:length(growth)).
# Run the code below and observe what happens:
# Create a vector
numbers <- c(10, 20, 30)
# Option 1
for (i in numbers) {
  print(i)
}
# Option 2
for (i in 1:length(numbers)) {
  print(i)
}
# Your code here
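The two looping styles differ in what the loop variable holds: an element of the vector, or a position in it. The sketch below uses a made-up vector temps (an assumption, not the exercise data) and seq_along(), which generates the positions 1, 2, 3 and so on, and yields no iterations for an empty vector, whereas 1:length(x) would count from 1 down to 0.

```r
# Hypothetical vector, not the exercise data
temps <- c(18.5, 21.0, 19.2)

# Looping over values: `value` is each element in turn
for (value in temps) {
  print(value)
}

# Looping over positions: `i` is the index and temps[i] is the element
msgs <- character(length(temps))
for (i in seq_along(temps)) {
  msgs[i] <- paste("Reading", i, "was", temps[i], "degrees")
  print(msgs[i])
}
```

The index-based form is the one to use when the message needs both the position and the value.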
Loop through each plant’s growth value and print the message “Plant X: grew well” if growth is >= 4.5 cm and “Plant X: poor growth” otherwise.
# Your code here
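A condition inside a loop follows the pattern if (...) { ... } else { ... }. The following is a minimal sketch on made-up data; scores, cutoff and the pass/fail wording are assumptions chosen for illustration.

```r
# Hypothetical scores and cutoff (made up for illustration)
scores <- c(55, 72, 64)
cutoff <- 60

verdicts <- character(length(scores))
for (i in seq_along(scores)) {
  if (scores[i] >= cutoff) {
    verdicts[i] <- paste0("Sample ", i, ": pass")
  } else {
    verdicts[i] <- paste0("Sample ", i, ": fail")
  }
  print(verdicts[i])
}
```

paste0() joins its arguments without inserting spaces, which gives full control over punctuation in the message.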
Write a function called good_growth() that:
Takes a vector of growth measurements and a threshold value as input.
Returns the number of plants that grew above the threshold.
Example: good_growth(growth, 4.5) should return 6.
# Your code here
Bonus challenge:
Modify your function so it also prints a short summary:
6 out of 10 plants grew above 4.5 cm.
# Your code here
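For reference, the general shape of an R function that returns a count and optionally prints a summary can be sketched as follows. The function count_above(), its verbose argument and the heights vector are hypothetical and deliberately different from the exercise; this is one possible layout, not the expected solution.

```r
# count_above() is a hypothetical helper, named differently from the exercise function
count_above <- function(x, cutoff, verbose = FALSE) {
  n <- sum(x > cutoff)  # the comparison gives TRUE/FALSE; sum() counts the TRUEs
  if (verbose) {
    cat(n, "out of", length(x), "values were above", cutoff, "\n")
  }
  n  # the last evaluated expression is the return value
}

heights <- c(1.2, 3.4, 2.8, 0.9)  # made-up data
count_above(heights, 2)           # 2
count_above(heights, 1, verbose = TRUE)
```

cat() writes plain text to the console, while the returned value can still be stored or used in further calculations.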