1 About the course

1.1 Overview

Machine learning gives computers the ability to learn without being explicitly programmed. It encompasses a broad range of approaches to data analysis with applicability across the biological sciences. Lectures will introduce commonly used algorithms and provide insight into their theoretical underpinnings. In the practicals, students will apply these algorithms to real biological datasets using the R language and environment.

During this course you will learn about:

  • Some of the core mathematical concepts underpinning machine learning algorithms: matrices and linear algebra; Bayes’ theorem.
  • Classification (supervised learning): partitioning data into training and test sets; feature selection; logistic regression; support vector machines; artificial neural networks; decision trees; nearest neighbours; cross-validation.
  • Exploratory data analysis (unsupervised learning): dimensionality reduction; anomaly detection; clustering.

After this course you should be able to:

  • Understand the concepts of machine learning.
  • Understand the strengths and limitations of the various machine learning algorithms presented in this course.
  • Select appropriate machine learning methods for your data.
  • Perform machine learning in R.

1.3 Prerequisites

1.5 License

GPL-3

1.6 Contact

If you have any comments, questions or suggestions about the material, please contact the authors: Sudhakaran Prabakaran, Matt Wayland and Chris Penfold.

1.7 Colophon

This book was produced using the bookdown package (Xie 2017), which is built on top of R Markdown and knitr (Xie 2015).

References

Xie, Yihui. 2017. Bookdown: Authoring Books and Technical Documents with R Markdown. https://github.com/rstudio/bookdown.

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. http://yihui.name/knitr/.