Duration
3 Days
Audience
Employees of federal, state, and local governments, and of businesses working with the government.
Course Overview
Introduction to Artificial Intelligence (AI) & Machine Learning (AI & ML JumpStart) is a three-day, foundation-level, hands-on course that explores the fast-changing field of artificial intelligence (AI), including programming, logic, search, machine learning, and natural language understanding. Students will learn current AI / ML methods, tools, and techniques, their application to computational problems, and their contribution to understanding intelligence.
In this course, we will cut through the math and learn exactly how machine learning algorithms work. Although students do need an aptitude for math, the course focuses on the algorithms used to create machine learning models. Using clear explanations, simple pure Python code (no libraries!), and step-by-step labs, you will discover how to load and prepare data, evaluate model skill, and implement a suite of linear, nonlinear, and ensemble machine learning algorithms from scratch.
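As a flavor of the "from scratch" style described above, the following minimal sketch fits a simple linear regression in pure Python and scores it with root mean squared error. The function names and the toy data are illustrative assumptions, not actual course material.

```python
# Illustrative sketch only: simple linear regression without any libraries.
# Function names and data are hypothetical, not taken from the course labs.

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) estimated by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(model, x):
    """Predict y for a single x from an (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x

def rmse(actual, predicted):
    """Root mean squared error between two equal-length lists."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return mean(squared_errors) ** 0.5

# Made-up toy data, used only to exercise the functions.
train_x = [1, 2, 4, 3, 5]
train_y = [1, 3, 3, 2, 5]

model = fit_simple_linear_regression(train_x, train_y)
predictions = [predict(model, x) for x in train_x]
error = rmse(train_y, predictions)
print("intercept, slope:", model)
print("training RMSE:", error)
```

Scoring on the training data, as done here for brevity, overstates model skill; evaluating on a held-out test set is the more honest measure.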
This course presents a wide variety of related technologies, concepts, and skills in a fast-paced, hands-on format, giving students a solid foundation for understanding AI and machine learning and a jumpstart into working with them. Each topic area presents a specific challenge, current progress, and approaches to the problem. Attendees will leave the course with a practical understanding of the core skills, methods, and algorithms, and be prepared for continued learning in more advanced follow-on courses that dive deeper into specific skillsets or tools.
Learning Objectives
This “skills-centric” course is about 50% hands-on lab and 50% lecture, with extensive practical exercises designed to reinforce the fundamental skills, concepts, and best practices taught throughout the course. Students will be led through a series of progressively advanced topics, each consisting of lecture, group discussion, comprehensive hands-on lab exercises, and lab review. Throughout the course, students will explore popular machine learning algorithms, their applicability and limitations, and the practical application of these methods and their use cases in a machine learning environment.
Working in a hands-on learning environment led by our expert practitioner, attendees will explore:
- Getting Started with Python & Jupyter
- Statistics and Probability Refresher, and Python Practice
- Matplotlib and Advanced Probability Concepts
- Algorithm Overview
- Predictive Models
- Applied Machine Learning
- Recommender Systems
- Dealing with Data in the Real World
- Machine Learning on Big Data (with Apache Spark)
- Testing and Experimental Design
- GUIs and REST: Build a UI & REST API for your Models
Course Outline
1. Getting Started
- Installing a Python Data Science Environment
- Using and understanding IPython (Jupyter) Notebooks
- Python basics – Part 1
- Understanding Python code
- Importing modules
- Python basics – Part 2
- Running Python scripts
2. Statistics and Probability Refresher, and Python Practice
- Types of data
- Mean, median, and mode
- Using mean, median, and mode in Python
- Standard deviation and variance
- Probability density function and probability mass function
- Types of data distributions
- Percentiles and moments
3. Matplotlib and Advanced Probability Concepts
- A crash course in Matplotlib
- Covariance and correlation
- Conditional probability
- Bayes’ theorem
4. Algorithm Overview
- Data Prep
- Linear Algorithms
  - Simple Linear Regression
  - Multivariate Linear Regression
  - Logistic Regression
  - Perceptrons
- Non-Linear Algorithms
  - Classification and Regression Trees (CART)
  - Naive Bayes
  - k-Nearest Neighbors
- Ensembles
  - Bootstrap Aggregation
  - Random Forest
5. Predictive Models
- Linear regression
- Polynomial regression
- Multivariate regression and predicting car prices
- Multi-level models
6. Applied Machine Learning with Python
- Machine learning and train/test
- Using train/test to prevent overfitting of a polynomial regression
- Bayesian methods – Concepts
- Implementing a spam classifier with Naive Bayes
- K-Means clustering
- Clustering people based on income and age
- Measuring entropy
- Decision trees – Concepts
- Decision trees – Predicting hiring decisions using Python
- Ensemble learning
- Support vector machine overview
- Using SVM to classify people with scikit-learn
7. Recommender Systems
- What are recommender systems?
- Item-based collaborative filtering
- How item-based collaborative filtering works
- Finding movie similarities
- Improving the results of movie similarities
- Making movie recommendations to people
- Improving the recommendation results
8. More Applied Machine Learning Techniques
- K-nearest neighbors – Concepts
- Using KNN to predict a rating for a movie
- Dimensionality reduction and principal component analysis
- A PCA example with the Iris dataset
- Data warehousing overview
- Reinforcement learning
9. Dealing with Data in the Real World
- Bias/variance trade-off
- K-fold cross-validation to avoid overfitting
- Data cleaning and normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
10. Apache Spark Basics | Machine Learning on Big Data
- Installing Spark
- Spark introduction
- Spark and Resilient Distributed Datasets (RDD)
- Introducing MLlib
- Decision Trees in Spark with MLlib
- K-Means Clustering in Spark
- TF-IDF
- Searching Wikipedia with Spark MLlib
- Using the Spark 2.0 DataFrame API for MLlib
11. Testing and Experimental Design
- A/B testing concepts
- T-test and p-value
- Measuring t-statistics and p-values using Python
- Determining how long to run an experiment
- A/B test gotchas
12. GUIs and REST
- Build a UI for your Models
- Build a REST API for your Models
13. What the Future Holds