OA4118 Statistical and Machine Learning

This course introduces the art and science of statistical and machine learning to find patterns in large and "Big" data. The focus is on the strengths and weaknesses of learning techniques and their implementation. We cover the fundamental ideas common to learning methods and introduce supervised and unsupervised techniques, including re-sampling methods, advanced clustering and visualization, tree-based ensembles, stochastic gradient boosting, deep neural networks, auto-encoding and other dimension reduction techniques, and applications to natural language processing. The software package R and high-performance parallel or distributed computing will be used to demonstrate these methods. May not be taken for credit with OA4108.

Prerequisite

OA4106 or consent of instructor

Lecture Hours

4

Lab Hours

0

Course Learning Outcomes

Upon successful completion, the student will be able to:

  • understand the key concepts in data science modeling and in deep learning,
  • apply these concepts in a practical way to a carefully curated set of problems, and
  • apply machine learning and deep learning models in a practical way, via R or RStudio, to real-world problems encountered in the Department of Defense.