*To submit this assignment, upload the full document to Blackboard, including the original questions, your code, and the output. Submit your assignment as a knitted .pdf (preferred) or .html file.*

Variable assignment (1 mark)

a. Assign the value `5` to the variable/object `a`. Display `a`. (0.25 marks)
b. Assign the result of `10/3` to the variable `b`. Display `b`. (0.25 marks)
c. Write a function that adds two numbers and returns their sum. Use it to assign the sum of `a` and `b` to `result`. Display `result`. (In practice, there is already a more sophisticated built-in function for this: `result <- sum(a, b)`) (0.25 marks)
d. Write a function that multiplies two numbers and returns their product. Use it to assign the product of `a` and `b` to `product`. Display `product`. (In practice, there is already a more sophisticated built-in function for this: `product <- prod(a, b)`) (0.25 marks)

Vectors (1 mark)

a. Create a vector `v` with all integers 0-30, and a vector `w` with every third integer in the same range. (0.25 marks)
b. What is the difference in lengths of the vectors `v` and `w`? (0.25 marks)
c. Create a new vector, `v_square`, with the square of elements at indices 3, 6, 7, 10, 15, 22, 23, 24, and 30 from the variable `v`. *Hint: Use indexing rather than a for loop.* (0.25 marks)
d. Calculate the mean and median of the first five values from `v_square`. (0.25 marks)
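
The hint about indexing can be illustrated on an unrelated vector. The sketch below uses a hypothetical vector `x` (not part of the assignment) to show that a vector of positions inside `[ ]` selects several elements at once, and that arithmetic is then applied element-wise, so no loop is needed.

```r
# Hypothetical example vector, unrelated to the assignment data
x <- c(10, 20, 30, 40, 50)

# Select the elements at positions 2 and 4 in one step (R indexing starts at 1)
picked <- x[c(2, 4)]

# Arithmetic is vectorized: each selected element is doubled
doubled <- picked * 2
print(doubled)
```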

Boolean indexing (1 mark)

a. Create a boolean vector `v_bool`, indicating which vector `v` elements are bigger than 20. How many values are over 20? *Hint: In R, TRUE = 1, and FALSE = 0, so you can use simple arithmetic to find this out.* (0.5 marks)
b. Display the output of `v[TRUE]`. Explain why you think R outputs this. (0.25 marks) *(Note: this is not really something you would ever need to do in practice!)*
c. Use the variable `v_bool` as an index to extract the elements from `v` that are bigger than 20. What are the min and max values of this new vector? (0.25 marks)
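
As a minimal sketch of the hint that `TRUE` counts as 1 and `FALSE` as 0, the example below uses hypothetical vectors `flags` and `y` (neither is part of the assignment). It shows arithmetic on a logical vector and a logical vector of the same length used as an index.

```r
# Hypothetical logical vector, unrelated to the assignment data
flags <- c(TRUE, FALSE, TRUE, TRUE)

# sum() coerces TRUE to 1 and FALSE to 0, so it counts the TRUE values
n_true <- sum(flags)
print(n_true)

# A logical index of matching length keeps the positions where it is TRUE
y <- c(5, 6, 7, 8)
kept <- y[flags]
print(kept)
```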

Data frames (2 marks)

a. There are many built-in data frames in R, which you can find more details about online. What are the column names of the built-in data frame `beaver1`? How many observations (rows) and variables (columns) are there? (0.5 marks)
b. Display both the first 6 and last 6 rows of this data frame. Show how to do so with both indexing and specialized functions. (0.5 marks)
c. What are the minimum, mean, and maximum body temperatures in this data set? *Hint: Remember that each column in a data frame is a vector, so you can use the same functions as in the previous question on vectors.* (0.5 marks)
d. Use the `summary` function to display an overview of the `temp` column. (0.25 marks)
e. Use a single instance of the `summary` function to display an overview of the `time` and `temp` columns. (0.25 marks)
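
The hint that each data frame column is a vector can be sketched as follows. The data frame `df` and its columns are hypothetical and unrelated to `beaver1`; the point is only that `$` returns an ordinary vector that vector functions accept.

```r
# Hypothetical data frame, unrelated to beaver1
df <- data.frame(id = 1:4, weight = c(2.5, 3.0, 3.5, 4.0))

# `$` extracts a single column as a plain numeric vector
w <- df$weight
print(is.vector(w))

# Any function that works on vectors can then be applied to it
print(mean(w))
print(range(w))
```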

Data frames with dplyr (3 marks)

a. Say we're attempting to calculate mean temperature in the `beaver1` dataset. What is wrong with the following chain of dplyr commands? (0.5 marks)
   `beaver1 %>% filter(is.na(temp)) %>% summarise(mean_temp = mean(temp))`
b. Use dplyr to randomly sample 20 rows from `beaver1`. Calculate mean temperature from this subset. (0.5 marks) *Hint: you may want to refer to the dplyr cheatsheet for this.*
c. Using the full `beaver1` dataset, calculate the mean temperature for day 346. (0.25 marks) *Note: use the full dataset for parts c-f below as well.*
d. Rather than using `filter()` to calculate the mean for each day separately, the more convenient `group_by()` can be used to aggregate measurements by a categorical value (such as the `day` column in `beaver1`). Use this approach to calculate the mean temperature and activity level for each of the days in the dataset. (0.5 marks)
e. Express in writing what the average activity level from the above calculation means. *Hint: Remember that you can read a description of the columns online.* (0.25 marks)
f. How many observations are there per day in this dataset? (0.25 marks)
g. How many observations are there per day when the beaver is active outside the retreat? (0.25 marks)
h. Group the observations by activity level *and* by the day of the observation. Which variable seems to be more related to high body temperature: activity level or day of measurement? (0.5 marks)
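
The `group_by()` pattern mentioned in this section can be sketched on unrelated data. The data frame `scores` and its columns are hypothetical, and the example assumes the dplyr package is installed; it only illustrates how grouping followed by `summarise()` yields one row per group.

```r
library(dplyr)

# Hypothetical data frame, unrelated to beaver1
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Aggregate by the categorical column: one row and one mean per group
by_group <- scores %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

print(by_group)
```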

This work is licensed under a Creative Commons Attribution 4.0 International License. See the licensing page for more details about copyright information.