CS3315 Introduction to Machine Learning and Big Data

A survey of methods for processing large amounts of data and for classifying and analyzing it using machine-learning methods.  Big-data topics include obstacles to processing (managerial obstacles and problems of data consistency and accuracy), data-reduction methods, and distributed big-data processing methods.  Topics on machine learning include concept learning, decision trees, Bayesian models, linear models, neural networks, case-based reasoning, genetic algorithms, sequence learning, and assessment techniques.  Students will do projects with software tools on military data.

Prerequisite

CS3310 or consent of instructor.

Lecture Hours

3

Lab Hours

1

Course Learning Outcomes

Upon completion of this course, the student is expected to:

  • Be able to recommend the most appropriate machine-learning method for an application and the data at hand (e.g., regression vs. classification, supervised vs. unsupervised learning).
  • Explain the value of large amounts of data and the high-level key concepts behind modern big-data architectures (e.g., the contrast among data analytics, data science, machine learning, and artificial intelligence; data lakes; data warehouses; Hadoop; and MapReduce).
  • Understand basic types of learning methods including:
    • Caching, case-based reasoning, decision trees
    • Concept learning of logical expressions
    • Regression (linear, polynomial, etc.)
    • Classification using probabilistic reasoning
    • Heuristic search
    • Gradient-based learning algorithms (e.g., support-vector machines) and the concept of regularization
    • Neural-network basics
    • Ensemble-based algorithms (e.g., random forests)
  • Be able to explain learning methods with paper and pencil.
  • Be able to implement learning methods using a software tool.
  • Be able to produce a written report that includes data analysis, feature engineering, modeling of hypotheses, evaluation of model performance relative to the identified task, hyperparameter selection, etc.
  • Identify the major difficulties in implementing and testing learning systems, including explaining a model's reasoning and handling model bias and variance.