**Bootcamp developers and instructors:**

**Saeed Aghabozorgi**, PhD, Chief Data Scientist at Cognitive Class, IBM

**Polong Lin**, MSc, Data Scientist at Cognitive Class, IBM

**Bootcamp length:**

**Option 1:** 4 days, Aug 12 to 15.

**Option 2:** 4 days, Aug 16 to 19 (repeat of Option 1).

**Day 1 Morning:** **Introduction to Data Science**

What is Data Science? Learn about the importance of data, machine learning, and big data. Find out about Cognitive Class, IBM's free online resource for data science education, and get a feel for popular open data science tools through IBM's Cognitive Class Labs platform, which includes Jupyter (IPython) Notebooks, RStudio IDE, and Apache Spark.

Topics covered:

- Explore definitions of data science, paths to data science, R vs Python, data science tools, skills and technology, definition of cloud, Big Data, etc.
- Learn about data science in a business context
- Discover some business applications and use cases for data science

**Day 1 Afternoon:** **Data Analysis with Python**

Learn how to analyze data using Python. This section will take you from the basics of Python to exploring many different types of data. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!

Topics covered:

- Importing and cleaning data sets
- Intro to the pandas, NumPy, and SciPy libraries
- Data frame manipulation
- Summarizing the data

**Day 2 Morning:** **Statistics for Data Science with Python**

Through a lecture, labs, and an assignment, learn the basics of statistics for data science. First, the lecture covers the main concepts, including the Central Limit Theorem, the normal distribution, and descriptive statistics. Then you will practice them in the lab.

Topics covered:

- Descriptive Statistics notebook: mean, median, and standard deviation
- Histograms and Probability Mass Functions notebook: calculate and display data
- Normal distribution and probability density functions
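As a rough preview of the quantities listed above, the sketch below computes them with Python's standard `statistics` module. The sample values are made up for illustration, and the lab notebooks may use different libraries and data.

```python
import statistics
from collections import Counter

# Hypothetical sample (illustration only, not course data).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# Descriptive statistics.
mean = statistics.mean(sample)
median = statistics.median(sample)
std_dev = statistics.pstdev(sample)  # population standard deviation

# Probability mass function: relative frequency of each distinct value.
counts = Counter(sample)
pmf = {value: count / len(sample) for value, count in sorted(counts.items())}

# Normal distribution using the sample's mean and standard deviation.
normal = statistics.NormalDist(mu=mean, sigma=std_dev)
density_at_mean = normal.pdf(mean)  # probability density function at the mean
share_within_one_sd = normal.cdf(mean + std_dev) - normal.cdf(mean - std_dev)

print(f"mean={mean}, median={median}, std_dev={std_dev}")
print(f"pmf={pmf}")
print(f"density at mean={density_at_mean:.4f}")
print(f"share within one standard deviation={share_within_one_sd:.4f}")
```

The last figure comes out close to 0.68 for any normal distribution, which is the familiar "68% within one standard deviation" rule.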

**Day 2 Afternoon:** **Data Visualization with Python**

A picture is worth a thousand words - or should we say data points? In this section, we will go through how to create the most common types of plots in Python. Learn how to plot bar graphs, line graphs, histograms, and more. Finally, learn how to create an interactive visualization of data using Plotly.

Topics covered:

- Intro to the matplotlib and Plotly libraries
- Histograms, Bar graphs, Line graphs and Scatter plots
- Maps (creating maps using latitude and longitude data)

**Day 3 Morning:** **Big Data with Python**

You will learn how to work with Big Data using Apache Spark. Spark is a distributed computing engine that processes large datasets in parallel across a cluster, and you will use it from Python. You will read data from a big dataset and apply preprocessing operations to it.

Topics covered:

- Intro to Apache Spark
- Reading data from a big dataset
- Selecting data, filtering, and aggregating big data

**Day 3 Afternoon:** **Machine Learning with Python**

How can we get machines to learn from the data on their own? In this part you will get an overview of machine learning algorithms. To get hands-on practice with machine learning, you will work with real data sets and practice data mining techniques for prediction and classification. You will also learn how to choose the best algorithm for different problems in various domains and industries.

Topics covered:

- Overview of Machine Learning
- Which ML algorithm is right for my problem?
- Regression
- Classification (Decision trees and KNN)
- Clustering (Hierarchical and k-means)
- Machine learning libraries, e.g
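To give a feel for the classification topic, here is a minimal sketch of k-nearest neighbors (KNN) written with the standard library only. The function name `knn_predict` and the toy data are hypothetical, chosen for illustration. In practice a machine learning library would normally be used instead of hand-written code like this.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (height_cm, weight_kg) pairs with a size label.
points = [(150, 50), (155, 55), (160, 58), (175, 80), (180, 85), (185, 90)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (158, 56)))
print(knn_predict(points, labels, (178, 82)))
```

KNN has no training step: all the work happens at prediction time, which makes it easy to understand but slow on large data sets.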

**Day 4 Morning:** **Intro to Deep Learning with TensorFlow**

Deep learning is a subset of machine learning that uses neural networks to model high-level abstractions in data, which enables data scientists to create models on complex, unstructured data like images and videos. In this session you will work with a specific type of deep learning model, the convolutional neural network, and use the TensorFlow library to work with these networks.

Topics covered:

- Intro to TensorFlow
- Logistic Regression with TensorFlow
- Convolutional Neural Networks
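The core operation of a convolutional layer can be sketched in plain Python, without TensorFlow. The example below is an illustration only: the function name `conv2d_valid`, the tiny image, and the hand-picked kernel are all assumptions, whereas a real network learns its kernel values during training. Like most deep learning libraries, it slides the kernel without flipping it (strictly speaking, a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The resulting feature map is non-zero only in the middle column, where the kernel straddles the boundary between the dark and bright halves of the image.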

**Day 4 Afternoon:** **Final exam (optional).**

A half day is set aside for an optional exam to obtain a Verified IBM Badge.