To submit this assignment, upload the full document on Blackboard, including the original questions, your code, and the output. Submit your assignment as a knitted .pdf (preferred) or .html file.

  1. Variable assignment (1 mark)

    1. Assign the value 5 to the variable/object a. Display a. (0.25 marks)

    2. Assign the result of 10/3 to the variable b. Display b. (0.25 marks)

    3. Assign the product of a and b to product. Display product. (0.25 marks)

    4. Write a function that adds two numbers and returns their sum. Use it to assign the sum of a and b to result. Display result. (In practice, there is already a more sophisticated built-in function for this: result <- sum(a, b)) (0.25 marks)

  2. Vectors (1 mark)

    1. Create a vector v with all integers 0-30, and a vector w with every third integer in the same range. (0.25 marks)

    2. What is the difference in lengths of the vectors v and w? (0.25 marks)

    3. Create a new vector, v_square, with the square of elements at indices 3, 6, 7, 10, 15, 22, 23, 24, and 30 from the variable v. Hint: Use indexing rather than a for loop. (0.25 marks)

    4. Calculate the mean and median of the first five values from v_square. (0.25 marks)

  3. Boolean indexing (1 mark)

    1. Create a boolean vector v_bool indicating which elements of v are greater than 20. How many values are over 20? Hint: In R, TRUE is treated as 1 and FALSE as 0 in arithmetic, so you can use simple arithmetic to find this out. (0.5 marks)

    2. Use the variable v_bool as an index to extract the elements from v that are bigger than 20. What are the min and max values of this new vector? (0.5 marks)

  4. Data frames (2 marks)

    1. There are many built-in data frames in R, which you can find more details about online. What are the column names of the built-in data frame beaver1? How many observations (rows) and variables (columns) are there? (0.5 marks)

    2. How can you view the first 5 rows of this data frame? How can you view the last 5 rows? (0.5 marks)

    3. What are the min, mean, and max body temperatures in this data set? Hint: Remember that each column in a data frame is a vector, so you can use the same functions as in the previous question on vectors. (0.5 marks)

    4. Use the summary function to display an overview of the temp column. (0.5 marks)

  5. Data frames with dplyr (3 marks)

    1. Use dplyr to calculate the mean temperature in the dataset. (0.25 marks)

    2. Calculate the mean temperature for day 346. (0.5 marks)

    3. Rather than using filter() to calculate the mean for each day separately, you can use the more convenient group_by() to aggregate measurements by a categorical variable (such as the day column in beaver1). Use this approach to calculate the mean temperature and mean activity level for each of the days in the dataset. (0.5 marks)

    4. Express in writing what the average activity level from the above calculation means. Hint: Remember that you can read a description of the columns online. (0.25 marks)

    5. How many observations are there per day in this dataset? (0.25 marks)

    6. How many observations are there per day when the beaver is active outside the retreat? (0.25 marks)

    7. Was the body temperature higher when the beaver was active or inactive? Is this what you would expect? (0.5 marks)

    8. Group the observations by both activity level and the day of the observation, and calculate the mean body temperature for each group. Which variable seems to be more related to high body temperature: activity level or day of measurement? (0.5 marks)
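
The hints in questions 3 and 5 rely on two general techniques: counting and indexing with a logical vector, and summarising by group with dplyr. The sketch below illustrates both on data that is not part of the assignment, so it is an example of the approach rather than a solution. The vector scores and every variable name are made up for illustration; mtcars is a built-in R data frame, and the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, not part of the assignment
scores <- c(12, 25, 7, 31, 18)

# Logical vector: TRUE where the condition holds
is_high <- scores > 15

# TRUE counts as 1 and FALSE as 0 in arithmetic, so sum() counts the matches
n_high <- sum(is_high)

# A logical vector used as an index keeps only the TRUE positions
high_scores <- scores[is_high]

# Grouped summary on the built-in mtcars data frame:
# one row per value of cyl, with the mean mpg and the number of rows
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

print(n_high)
print(high_scores)
print(mpg_by_cyl)
```

The same pattern should carry over to other vectors and data frames by swapping in the relevant condition, grouping column, and summary functions.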


This work is licensed under a Creative Commons Attribution 4.0 International License. See the licensing page for more details about copyright information.