Lesson objectives:
- Learn about the scientific method and how to apply it
- Learn the basics of group dynamics in a team setting
- Get into your groups and start your projects
  - Identify which dataset to use
  - Brainstorm possible research questions
Lesson outline:
- Fill out survey for modelling section (5 min)
- What is the scientific method (20 min)
- Team dynamics and assigning roles and tasks (15-20 min)
- Getting into groups (10 min)
- Exercise (25-30 min)
- Start project work (20-25 min)
- Datasets available for use (10-20 min)
The final assignment is a group project, and to succeed you need to understand team dynamics.
Continuous Plankton Recorder Dataset
Data on Northern Hemisphere plankton species with latitude, longitude, and date, going back to 1946. About 935 MB, almost 2.75 million rows.
Data on insect species in Costa Rica with latitude, longitude, elevation, and date. About 1.4 GB, almost 3.25 million rows.
Marine predator and prey body sizes
Data from 27 locations around the world on species, body measurements, latitude, longitude, and date. About 21 MB, almost 35,000 rows (in long format).
Data on general mammalian life history with species, body size, lifespan, litter size, and other reproductive variables. About 150 KB, 1,440 rows.
North American Bird Breeding Survey
Data on the number of birds counted at multiple stops across North America. It comes as many files with varying numbers of rows that need to be linked together: about 5.7 GB (can be reduced) across more than 50 files.
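Linking several files together usually means joining them on a shared key column. The sketch below uses two tiny made-up tables to show the idea with base R's `merge()`. The table names (`routes`, `counts`) and column names (`route_id`, `region`, `species`, `n_birds`) are hypothetical and are not the real column names of the bird survey files.

```r
# Hypothetical lookup table: one row per survey route (made-up values)
routes <- data.frame(
  route_id = c(1, 2, 3),
  region = c("north", "south", "east"),
  stringsAsFactors = FALSE
)

# Hypothetical count table: several rows per route (made-up values)
counts <- data.frame(
  route_id = c(1, 1, 2, 3),
  species = c("robin", "crow", "robin", "crow"),
  n_birds = c(12, 4, 7, 9),
  stringsAsFactors = FALSE
)

# Link the two tables on the shared key column
linked <- merge(counts, routes, by = "route_id")
print(linked)
```

If you prefer to stay within the tidyverse, `left_join()` from dplyr does the same kind of linking.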
Simple hypothesis: Bigger predators eat bigger prey.
The independent and dependent variables are easy to identify, visualize with plots, and test with linear regression.
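As a minimal sketch of what testing the simple hypothesis with linear regression could look like, the example below fits `lm()` to simulated toy data. The numbers are generated on the spot for illustration and do not come from the real dataset. Predator mass is treated as the independent variable and prey mass as the dependent variable.

```r
# Simulated toy data (NOT the real dataset): 50 made-up predators
set.seed(42)
predator_mass <- runif(50, min = 10, max = 1000)
# Prey mass is simulated to increase with predator mass, plus noise
prey_mass <- 0.05 * predator_mass + rnorm(50, mean = 0, sd = 5)
toy <- data.frame(predator_mass, prey_mass)

# Dependent variable on the left of ~, independent variable on the right
fit <- lm(prey_mass ~ predator_mass, data = toy)

# Coefficient estimates, standard errors, and p-values
print(summary(fit))

# Estimated change in prey mass per unit of predator mass
slope <- coef(fit)[["predator_mass"]]
```

A positive and statistically significant slope would be consistent with "bigger predators eat bigger prey"; with real data you would also check the residuals before trusting the fit.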
Advanced hypothesis: Which characteristics determine prey size among marine predators?
Many possible groupings are available. Start by visualizing some good candidate variables, such as predator weight and length. Once it becomes clear that the relationship is more complex than that, divide the dataset by species, water temperature, weather, etc. Use a combination of visualization and regression analyses, and fit models to the data to determine which types of regression are appropriate.
# Load the packages used below (readr, dplyr, ggplot2)
library(tidyverse)
# Read data online and fix colnames
pred_prey <- read_tsv('http://www.esapubs.org/archive/ecol/E089/051/Predator_and_prey_body_sizes_in_marine_food_webs_vsn4.txt')
## Parsed with column specification:
## cols(
## .default = col_character(),
## `Record number` = col_integer(),
## `Individual ID` = col_integer(),
## `Predator length` = col_double(),
## `Predator standard length` = col_double(),
## `Predator fork length` = col_double(),
## `Predator total length` = col_double(),
## `Standardised predator length` = col_double(),
## `Predator quality of length-mass conversion` = col_integer(),
## `Predator mass` = col_double(),
## `Predator mass check` = col_double(),
## `Predator mass check diff` = col_double(),
## `Predator ratio mass/mass` = col_double(),
## `SI predator mass` = col_double(),
## `Prey length` = col_double(),
## `Prey quality of conversion to length` = col_integer(),
## `SI prey length` = col_double(),
## `Prey mass` = col_double(),
## `Prey mass check` = col_double(),
## `Prey mass check diff` = col_double(),
## `Prey ratio mass/mass` = col_double()
## # ... with 7 more columns
## )
## See spec(...) for full column specifications.
colnames(pred_prey) <- tolower(gsub(' ', '_', colnames(pred_prey)))
# Only adults and some columns of interest
adult_pred_food <- pred_prey %>%
    mutate(predator_lifestage = tolower(predator_lifestage)) %>% # Mix of upper and lower case...
    filter(predator_lifestage == 'adult') %>%
    select(predator_common_name, predator_length, prey_mass, predator_mass)
# Summarize data to plot
plot_data <- adult_pred_food %>%
    group_by(predator_common_name) %>%
    summarize(
        mean_pred_mass = mean(predator_mass),
        mean_prey_mass = mean(prey_mass),
        mean_pred_length = mean(predator_length)) %>%
    filter(mean_pred_mass < 8000) # "outliers"
# Plot pred weight vs prey weight
ggplot(plot_data, aes(x = mean_pred_mass, y = mean_prey_mass)) +
    geom_point() +
    geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
# Plot pred length vs prey weight
ggplot(plot_data, aes(x = mean_pred_length, y = mean_prey_mass)) +
    geom_point() +
    geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
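For the advanced hypothesis, one possible next step after plotting is to check whether adding a grouping variable improves a regression. The sketch below compares a model with predator mass alone against one that lets the slope differ by species. It uses simulated toy data with two hypothetical species labels (`species_a`, `species_b`), not the real dataset, and the comparison via `anova()` and `AIC()` is just one reasonable approach.

```r
# Simulated toy data (NOT the real dataset): two made-up species
set.seed(1)
n <- 60
toy <- data.frame(
  predator_mass = runif(n, min = 10, max = 1000),
  species = rep(c("species_a", "species_b"), each = n / 2)
)
# In this simulation the slope differs between the two species
true_slope <- ifelse(toy$species == "species_a", 0.02, 0.08)
toy$prey_mass <- true_slope * toy$predator_mass + rnorm(n, mean = 0, sd = 5)

# Model 1: predator mass only
fit_simple <- lm(prey_mass ~ predator_mass, data = toy)
# Model 2: slope and intercept allowed to differ by species
fit_group <- lm(prey_mass ~ predator_mass * species, data = toy)

# Compare the nested models (F-test) and their AIC (lower is better)
print(anova(fit_simple, fit_group))
print(AIC(fit_simple, fit_group))
```

The same pattern would apply to other grouping variables such as water temperature, as long as the column exists in the data you are working with.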
In your group, rotate roles (you need at least a facilitator and a recorder). Before the end of class, finish these:
Make sure to write everything down! Many of these tasks will also be part of assignment 7.
Before we finish, we need to prepare a bit for the next class: everyone needs to create a GitHub account!
See Khan Academy on the Scientific Method for a brief overview and explanation of the scientific process.
This work is licensed under a Creative Commons Attribution 4.0 International License. See the licensing page for more details about copyright information.