OA4108 Introduction to Statistical and Machine Learning

The art and science of finding real patterns in (usually very large) data sets, as seen from a statistical perspective. Introduction to some of the techniques used in machine learning, with discussion of their implementation, their strengths and weaknesses, and both common and technique-specific pitfalls. Supervised algorithms for classification and regression include trees, neural networks, and ensembles. Some unsupervised techniques for clustering, dimension reduction, and visualization are presented. Data acquisition, including web scraping, SQL, and regular expressions for handling the disparate data types needed as inputs to machine learning algorithms, will also be covered. Most computation will be done in R, but other software will be introduced as needed.

Prerequisite

OA3103, OA4106

Lecture Hours

2

Lab Hours

2