Advanced Statistics and Experimental Design (PLSCI 7201)

Short Course, Cornell University, School of Integrative Plant Science, 2020

Instructors: Kelly Robbins (Bradfield 310), Malachy Campbell

Office Hours: by appointment

Meets: MWF (10:10am - 12:05pm; Aug 28 - Sept 25)

Grading: Letter Grade, 2 credit hours

I. Rationale:

Advances in various ‘omics technologies (genomics, phenomics, metabolomics) have allowed plant scientists to generate large, high-dimensional datasets at a relatively low cost. The effective generation and use of these data requires experimental designs that balance statistical power and logistical constraints (cost, labor, space, etc.), statistical frameworks that can accommodate high-dimensional data, and knowledge of computational approaches to facilitate such analyses. As part of the Digital Plant Science initiative, this course seeks to train participants in effective experimental designs and statistical analyses of ‘omics data.

II. Course Aims and Outcomes:

Aims

This course will provide participants with a comprehensive introduction to the experimental designs that are commonly used in plant science and plant breeding, along with the knowledge and practical coding skills necessary to analyze data from such designs. This basic knowledge will be extended to accommodate high-dimensional data generated by modern ‘omics techniques. Upon completion of this course, students will have a foundational understanding of experimental designs and statistical analyses that will help guide their independent research and help them avoid common mistakes made by new scientists. While this course will cover a wide range of topics, it is by no means an exhaustive treatment of experimental design and statistics. Students are strongly encouraged to complement the foundational knowledge learned in this course with classes dealing with advanced statistical methods and/or experimental design.

Specific Learning Outcomes

As a result of participating in the course:

  • Participants will be able to interpret experimental designs that are commonly used in plant science and plant breeding.
  • Participants will be able to apply and interpret linear models to account for systematic effects in commonly used experimental designs.
  • Participants will be able to extend classical experimental designs and statistical frameworks to the challenges and limitations associated with high-dimensional ‘omics data.
  • Participants will be able to apply these concepts to meet independent research objectives.

III. Format and Procedures:

This course will use a combination of lectures and hands-on exercises to introduce students to concepts in experimental design and statistics. Each two-hour class session will begin with a one-hour lecture and end with exercises in R and/or Python. Grades will be based on completion of three take-home problem sets and a final project. Students will be given the opportunity to correct errors in their problem sets to earn additional points.
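
The sketch below is an illustration only, not taken from the course materials. It shows the flavor of a hands-on exercise for a design named in the schedule: randomizing a randomized complete block design and computing the treatment F statistic from the additive linear model. The function names (`randomize_rcbd`, `rcbd_anova`), the treatment labels, and the simulated effect sizes are all hypothetical, and the analysis assumes a balanced design with each treatment appearing exactly once per block.

```python
import numpy as np


def randomize_rcbd(treatments, n_blocks, rng):
    """Assign every treatment once per block, in an independent random order."""
    layout = []
    for block in range(n_blocks):
        order = rng.permutation(treatments)
        for plot, trt in enumerate(order):
            layout.append((block, plot, str(trt)))
    return layout


def rcbd_anova(y, blocks, treatments):
    """Sums of squares and treatment F statistic for a balanced RCBD.

    Model assumed: y = mean + block effect + treatment effect + error.
    """
    y = np.asarray(y, dtype=float)
    blocks = np.asarray(blocks)
    treatments = np.asarray(treatments)

    grand_mean = y.mean()
    trt_levels = np.unique(treatments)
    blk_levels = np.unique(blocks)
    t, b = len(trt_levels), len(blk_levels)

    ss_total = ((y - grand_mean) ** 2).sum()
    # Each treatment mean is based on b plots; each block mean on t plots.
    ss_trt = b * sum((y[treatments == k].mean() - grand_mean) ** 2 for k in trt_levels)
    ss_blk = t * sum((y[blocks == k].mean() - grand_mean) ** 2 for k in blk_levels)
    ss_err = ss_total - ss_trt - ss_blk

    df_trt = t - 1
    df_err = (t - 1) * (b - 1)
    f_stat = (ss_trt / df_trt) / (ss_err / df_err)
    return {
        "ss_total": ss_total, "ss_trt": ss_trt, "ss_blk": ss_blk, "ss_err": ss_err,
        "df_trt": df_trt, "df_err": df_err, "f_stat": f_stat,
    }


# Simulated example: 4 hypothetical treatments in 3 blocks.
rng = np.random.default_rng(7201)
treatment_names = ["A", "B", "C", "D"]
layout = randomize_rcbd(treatment_names, n_blocks=3, rng=rng)

# Made-up effects, used only to generate illustrative data.
trt_effect = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}
blk_effect = {0: -1.0, 1: 0.0, 2: 1.0}

blocks = [blk for blk, _, _ in layout]
trts = [trt for _, _, trt in layout]
y = [10.0 + blk_effect[blk] + trt_effect[trt] + rng.normal(scale=0.5)
     for blk, trt in zip(blocks, trts)]

result = rcbd_anova(y, blocks, trts)
print(f"F({result['df_trt']}, {result['df_err']}) = {result['f_stat']:.2f}")
```

Removing the block sum of squares from the error term is what separates this analysis from a completely randomized design: systematic differences between blocks no longer inflate the error mean square used to test treatments.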

IV. My Assumptions

Students entering this course should have a basic knowledge of statistical concepts, a strong foundation in mathematics, and some experience programming in R or a similar environment.

V. Grading Procedures

  • Problem sets – 300 pts (100 pts each)
  • Final project – 100 pts

VI. Academic Integrity

Each student in this course is expected to abide by the Cornell University Code of Academic Integrity. Any work submitted by a student in this course for academic credit will be the student’s own work.

VII. Accommodations for Students with Disabilities

In compliance with Cornell University policy and equal access laws, I am available to discuss appropriate academic accommodations that may be required for students with disabilities. Requests for academic accommodations are to be made during the first three weeks of the semester, except for unusual circumstances, so arrangements can be made. Students are encouraged to register with Student Disability Services to verify their eligibility for appropriate accommodations.

VIII. Inclusivity Statement

We understand that our members represent a rich variety of backgrounds and perspectives. The School of Integrative Plant Science is committed to providing an atmosphere for learning that respects diversity. While working together to build this community we ask all members to:

  • share their unique experiences, values, and beliefs.
  • be open to the views of others.
  • honor the uniqueness of their colleagues.
  • appreciate the opportunity that we have to learn from each other in this community.
  • value each other’s opinions and communicate in a respectful manner.
  • keep confidential discussions that the community has of a personal (or professional) nature.
  • use this opportunity together to discuss ways in which we can create an inclusive environment in this course and across the Cornell community.

There are no required textbooks; however, participants may find the following texts helpful.

  • Montgomery, Douglas C. Design and Analysis of Experiments. John Wiley & Sons, 2017.
  • Bailey, Rosemary A. Design of Comparative Experiments. Vol. 25. Cambridge University Press, 2008.

X. Tentative Course Schedule

  • Module 1: Introduction; design and analysis of unstructured experiments (Bailey Ch. 1-3; Montgomery Ch. 2)
  • Module 2: Introducing the linear model and hypothesis testing (Bailey Ch. 3; Montgomery Ch. 3)
  • Module 3: Regression and prediction (Montgomery Ch. 10)
  • Module 4: Blocked designs (Bailey Ch. 4, 6, 9, 11; Montgomery Ch. 4)
  • Module 5: Factorial designs (Bailey Ch. 5, 12; Montgomery Ch. 5-7)
  • Module 6: Unreplicated designs
  • Module 7: Experiments with correlated observations - spatial and longitudinal analysis
  • Module 8: Multivariate approaches
  • Module 9: Coping with high-dimensional responses
  • Module 10: Advanced topics - Bayesian statistics