CS3315 Introduction to Machine Learning and Big Data

A survey of methods for processing large amounts of data and for classifying and analyzing the data using machine-learning methods. Big-data topics examine the obstacles to processing, including managerial obstacles, problems of data consistency, and problems of data accuracy, as well as data-reduction methods and distributed big-data processing methods. Topics on machine learning include concept learning, decision trees, Bayesian models, linear models, neural networks, case-based reasoning, genetic algorithms, sequence learning, and assessment techniques. Students will do projects with software tools on military data.

Prerequisite

CS3310 or consent of instructor.

Lecture Hours

Lab Hours

Course Learning Outcomes

Upon completion of this course, the student is expected to:

Be able to recommend the most appropriate machine-learning method for an application and the data at hand (e.g., regression vs. classification, supervised vs. unsupervised learning).
Explain the value of large amounts of data and the key high-level concepts behind modern big-data architectures (e.g., contrast data analytics, data science, machine learning, and artificial intelligence; data lakes, data warehouses, Hadoop, and MapReduce).
Understand basic types of learning methods, including:

Caching, case-based reasoning, decision trees
Concept learning of logical expressions
Regression (linear, polynomial, etc.)
Classification using probabilistic reasoning
Heuristic search
Gradient-based learning algorithms (e.g., support-vector machines) and the concept of regularization
Neural-network basics
Ensemble-based algorithms (e.g., random forests)

Be able to explain learning methods with paper and pencil.
Be able to implement learning methods using a software tool.
Be able to produce a written report that includes data analysis, feature engineering, modeling of hypotheses, evaluation of model performance relative to the identified task, hyperparameter selection, etc.
Identify the major difficulties in implementing and testing learning systems, including explaining a system's reasoning and handling model bias and variance.