About the Bootcamp

Welcome to the four-day Cognitive Class Data Science Bootcamp!

This is an intensive hands-on bootcamp, where you can learn the fundamentals of data science from data scientists at IBM.

You will learn how to use the popular Python programming language for data analysis and data visualization, and we'll introduce you to machine learning, AI, big data, and Apache Spark. You'll have a chance to apply your new skills to data science projects.

What you will achieve at the end of the bootcamp:

  • An understanding of data science, machine learning, and deep learning techniques
  • Data cleaning and pre-processing
  • Data visualization
  • Python, Apache Spark, TensorFlow

Prerequisite course requirements:

You are either already comfortable programming in Python, or you have successfully completed the following online course (course duration is 3 hours):

Python 101 (https://cognitiveclass.ai/courses/introduction-to-python/)

This free Python course provides a beginner-friendly introduction to Python. Practice through lab exercises, and you'll be ready to start data analysis in the bootcamp!

Participants must supply their own laptops.

Bootcamp Agenda

Bootcamp developers and instructors:

Saeed Aghabozorgi, PhD, Chief Data Scientist at Cognitive Class, IBM
Polong Lin, MSc, Data Scientist at Cognitive Class, IBM  

Bootcamp length:

Option 1: 4 days, Aug 12 to 15.
Option 2: 4 days, Aug 16 to 19 (repeat of Option 1).

Day 1 Morning: Introduction to Data Science
What is data science? Learn about the importance of data, machine learning, and big data. Find out about Cognitive Class, IBM's free online resource for data science education, and get a feel for popular open data science tools through IBM's Cognitive Class Labs platform, which includes Jupyter (IPython) Notebooks, RStudio IDE, and Apache Spark.

Topics covered:

  • Explore definitions of data science, paths to data science, R vs Python, data science tools, skills and technology, definition of cloud, Big Data, etc.
  • Learn about data science in a business context
  • Discover some business applications and use cases for data science

Day 1 Afternoon: Data Analysis with Python
Learn how to analyze data using Python. This section will take you from the basics of Python to exploring many different types of data. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!

Topics covered:

  • Importing and cleaning data sets
  • Intro to the pandas, NumPy, and SciPy libraries
  • Data frame manipulation
  • Summarizing the data

Day 2 Morning: Statistics for Data Science with Python
Learn the basics of statistics for data science through a lecture, labs, and an assignment. First, the lecture covers the main concepts of statistics, including the Central Limit Theorem, the normal distribution, and descriptive statistics. Then you will practice them in the lab.

Topics covered:

  • Descriptive statistics notebook: mean, median, and standard deviation
  • Histograms and probability mass functions notebook: calculate and display data
  • Normal distribution and probability density functions
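To give a flavor of these topics, here is a minimal sketch using only Python's standard library. The sample values and variable names are made up for illustration, and the actual bootcamp notebooks may use different libraries and data.

```python
from collections import Counter
from statistics import NormalDist, mean, median, pstdev

# Hypothetical sample: lab exercises completed by ten participants.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 4]

# Descriptive statistics.
sample_mean = mean(data)
sample_median = median(data)
sample_std = pstdev(data)  # population standard deviation

# Probability mass function: relative frequency of each distinct value.
counts = Counter(data)
pmf = {value: count / len(data) for value, count in sorted(counts.items())}

# Normal distribution with the same mean and standard deviation,
# evaluated with its probability density function at the mean.
normal = NormalDist(mu=sample_mean, sigma=sample_std)
density_at_mean = normal.pdf(sample_mean)

print("mean:", sample_mean, "median:", sample_median, "std:", sample_std)
print("pmf:", pmf)
print("normal density at the mean:", density_at_mean)
```

The probability mass function here is just a table of relative frequencies, so its values add up to 1; a histogram displays the same counts as bars.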

Day 2 Afternoon: Data Visualization with Python
A picture is worth a thousand words - or should we say data points? In this section, we will go through how to plot the most common types of graphs in Python. Learn how to plot bar graphs, line graphs, histograms, and more. Finally, learn how to create an interactive visualization of data using Plotly.

Topics covered:

  • Intro to the matplotlib and Plotly libraries
  • Histograms, Bar graphs, Line graphs and Scatter plots
  • Maps (creating maps using latitude, longitude data)

Day 3 Morning: Big Data with Python
You will learn how to work with big data using Apache Spark. Spark is an open-source engine for distributed processing of large datasets. You will read data from a big dataset and apply preprocessing operations such as selecting, filtering, and aggregating.

Topics covered:

  • Intro to Apache Spark
  • Reading data from a big dataset
  • Selecting data, filtering, and aggregating big data

Day 3 Afternoon: Machine Learning with Python
How can we get machines to learn from the data on their own? In this part you will get an overview of machine learning algorithms. To get hands-on practice with machine learning, you will work with real data sets and practice data mining techniques to predict or classify different datasets. You will also learn how to choose the best algorithm for different problems in various domains and industries.

Topics covered:

  • Overview of Machine Learning
  • Which ML algorithm is right for my problem?
  • Regression
  • Classification (Decision trees and KNN)
  • Clustering (Hierarchical and k-means)
  • Machine learning libraries, e.g
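As an illustration of the classification topic, here is a minimal sketch of k-nearest neighbors (KNN) written with only Python's standard library. The function name `knn_predict` and the toy data are hypothetical; in practice you would typically use a machine learning library rather than writing this by hand.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: dist(train_points[i], query),
    )[:k]
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (height_cm, weight_kg) pairs with a label each.
points = [(150, 50), (152, 53), (155, 55), (180, 80), (182, 85), (185, 90)]
labels = ["small", "small", "small", "large", "large", "large"]

prediction_a = knn_predict(points, labels, (153, 52))
prediction_b = knn_predict(points, labels, (181, 83))
print(prediction_a, prediction_b)
```

The choice of k is a trade-off: a small k follows the training data closely, while a larger k smooths out noisy neighbors.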

Day 4 Morning: Intro to Deep Learning with TensorFlow
Deep learning is a subset of machine learning that uses neural networks to model high-level abstractions in data, which enables data scientists to create models on complex, unstructured data like images and videos. In this session you will work with a specific type of deep learning model, called a convolutional neural network, and use the TensorFlow library to work with these networks.

Topics covered:

  • Intro to TensorFlow
  • Logistic Regression with TensorFlow
  • Convolutional Neural Networks

Day 4 Afternoon: Final exam (optional)

A half day is reserved for the optional exam to obtain a Verified IBM Badge.

Schedule & Location

INTEGRATED TECHNOLOGIES LABORATORY LTD.
Website: www.intela-edu.com

The event will take place at:

National Technical University of Ukraine 'Igor Sikorsky Kyiv Polytechnic Institute'