Data Science Bootcamp (Intensive In-Class Hands-On training)
Bootcamp length: 4 days
Day 1: Data Analysis with Python.
On the first day, you will learn about the importance of data, machine learning, and big data. You will also find out about free online resources, labs, and tools for data science education, including Jupyter (IPython) Notebooks, the RStudio IDE, and Apache Spark.
Learn how to analyze data using Python. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!
- Learn about data science in a business context
- Discover some business applications and use cases for data science
- Importing and cleaning data sets
- pandas, NumPy, and SciPy libraries
- Data frame manipulation
- Histograms and probability mass functions (notebook: calculate and display data)
- Matplotlib and Plotly libraries
- Maps (creating maps using latitude and longitude data)
- Hands-on Project
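As a taste of the histogram and probability mass function topic, here is a minimal sketch using only the Python standard library. The `orders` sample and the helper names `pmf` and `text_histogram` are made up for illustration; in class the same idea would more likely be worked through with pandas and Matplotlib.

```python
from collections import Counter


def pmf(values):
    """Return a dict mapping each distinct value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


def text_histogram(values):
    """Return one line per distinct value, with one '#' per occurrence."""
    counts = Counter(values)
    return [f"{value}: {'#' * count}" for value, count in sorted(counts.items())]


# Hypothetical sample data: number of items per order.
orders = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]

for line in text_histogram(orders):
    print(line)
print(pmf(orders))
```

The histogram shows raw counts, while the probability mass function divides each count by the sample size so that the values sum to 1.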
Day 2: Machine Learning with Python.
How can we get machines to learn from data on their own? In this part, you will get an overview of machine learning algorithms. For hands-on practice, you will work with real data sets and apply data mining techniques to make predictions and classify data. You will also learn how to choose the best algorithm for different problems in various domains and industries.
- Overview of Machine Learning
- Which ML algorithm is right for my problem?
- Classification (Decision trees and KNN)
- Clustering (Hierarchical and k-means)
- Recommender Systems
- Machine learning libraries, e.g., scikit-learn
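To illustrate the KNN classification topic, the sketch below implements k-nearest neighbours in plain Python so that it runs without extra installs. The function name `knn_predict` and the toy height/weight data are assumptions made for this example; a library such as scikit-learn provides a ready-made equivalent.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query point.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (height_cm, weight_kg) -> label
points = [(150, 50), (152, 53), (155, 55), (180, 85), (183, 90), (178, 82)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (153, 52)))
print(knn_predict(points, labels, (181, 88)))
```

The choice of `k` matters: a small `k` follows local noise, while a large `k` smooths over genuine class boundaries.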
Day 3: Big Data with Python.
You will learn how to work with big data using Apache Spark. Spark is an open-source engine for distributed processing of large datasets. You will read data from a big dataset, then select, filter, aggregate, and otherwise preprocess it.
- Intro to Apache Spark
- Reading data from a big dataset
- Selecting data, filtering, and aggregating big data
- Spark SQL
- Machine Learning with Spark
Day 4 Morning: Intro to Deep Learning with TensorFlow
Deep learning is a subset of machine learning that uses neural networks to model high-level abstractions in data, which enables data scientists to build models on complex, unstructured data such as images and videos. In this session you will work with a specific type of deep learning model, the convolutional neural network, and use the TensorFlow library to build and train these networks.
- Deep Learning libraries
- Intro to TensorFlow
- Neural Networks
- Logistic Regression with TensorFlow
- Convolutional Neural Networks
- Recurrent Neural Networks
Day 4 Afternoon: Final exam (optional).
The optional exam lasts 2 hours.
CERTIFICATION & IBM BADGE
Participants with a passing grade will receive:
- an IBM course completion certificate
- an IBM validated badge
Both the completion certificate and the badge will be stored and verifiable by Documentorum, an academic credentials blockchain.