Bootcamp length: 9 days (evenings)
Introduction to Cognitive Class Data Science Bootcamp:
This is an intensive, hands-on bootcamp where you can learn the fundamentals of data science from data scientists at IBM. You will learn how to use the popular Python programming language for data analysis and data visualization, and we'll introduce you to machine learning, AI, big data, and Apache Spark. You'll have a chance to apply your new skills to data science projects.
What you will achieve at the end of the bootcamp: An understanding of data science, data cleaning and pre-processing, data visualization, and Apache Spark.
Day 1: Introduction to Data Science. What is Data Science? Learn about the importance of data, machine learning, and big data. Find out about Cognitive Class, IBM's free online resource for data science education, and get a feel for popular open data science tools through IBM's Cognitive Class Labs platform, which includes Jupyter (IPython) Notebooks, RStudio IDE, and Apache Spark.
- Explore definitions of data science, paths to data science, R vs Python, data science tools, skills and technology, definition of cloud, Big Data, etc.
- Learn about data science in a business context
- Discover some business applications and use cases for data science
Day 2: Data Analysis with Python. Learn how to analyze data using Python. This section will take you from the basics of Python to exploring many different types of data. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!
- Importing and cleaning data sets
- Intro to the pandas, NumPy, and SciPy libraries
- Data frame manipulation
- Summarizing the data
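As a rough illustration of the topics above, the sketch below imports a small data set with pandas, fills in a missing value, filters rows, and summarizes the result. The inline CSV, its column names (`city`, `price`), the price threshold, and the choice to fill gaps with the column median are all assumptions made for this example; they are not taken from the bootcamp labs.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a real CSV file.
RAW_CSV = """city,price
Toronto,10.0
Toronto,
Ottawa,30.0
Ottawa,50.0
"""

# Import: read the CSV text into a data frame.
df = pd.read_csv(io.StringIO(RAW_CSV))

# Clean: replace the missing price with the column median
# (one possible strategy among several).
median_price = df["price"].median()
df["price"] = df["price"].fillna(median_price)

# Manipulate: keep only the rows above a hypothetical price threshold.
expensive = df[df["price"] > 25]

# Summarize: average price per city, plus overall descriptive statistics.
mean_by_city = df.groupby("city")["price"].mean()
summary = df["price"].describe()

print(mean_by_city)
print(summary)
```

Filling with the median is only one way to handle missing values; dropping the incomplete rows with `dropna()` is another common choice.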
Day 3-4: Machine Learning with Python. How can we get machines to learn from the data on their own? In this part you will get an overview of machine learning algorithms. To get hands-on practice with machine learning, you will work with real data sets and practice data mining techniques to predict or classify different datasets. You will also learn how to choose the best algorithm for different problems in various domains and industries.
- Overview of Machine Learning
- Which ML algorithm is right for my problem?
- Classification (decision trees and KNN)
- Clustering (Hierarchical and k-means)
- Recommender systems
- Machine learning libraries
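To give a feel for the KNN classification topic listed above, here is a minimal from-scratch sketch of k-nearest neighbours using only the Python standard library. The function name `knn_predict`, the toy points, and the labels are hypothetical and chosen for illustration; in practice a machine learning library would normally be used instead of hand-written code.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (1.2, 1.4)))  # small
print(knn_predict(points, labels, (8.7, 8.2)))  # large
```

The choice of `k` matters: a small `k` follows the training data closely, while a larger `k` smooths the decision at the cost of blurring small groups.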
Day 5-6: Big Data with Python. You will learn how to work with Big Data using Apache Spark. Spark is an engine for distributed processing of big data, and you will work with it from Python. You will read data from a big dataset, then preprocess it by selecting, filtering, and aggregating.
- Intro to Apache Spark
- Reading data from a big dataset
- Selecting data, filtering, and aggregating big data
Day 7: Statistics for Data Science with Python. Through lectures, labs, and an assignment, learn the basics of statistics for data science. First, the main concepts of statistics, including the Central Limit Theorem, the normal distribution, and descriptive statistics, are taught through lecture; then you will practice them in the lab.
- Descriptive Statistics Notebook: mean, median, and standard deviation
- Histograms and Probability Mass Functions Notebook: calculate and display data
- Normal Distribution and probability density functions
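The statistics topics above can be sketched with Python's built-in `statistics` module. The sample values are made up for illustration, and the example uses the population standard deviation (`pstdev`); a lab working with samples might use `stdev` instead.

```python
import statistics
from collections import Counter
from statistics import NormalDist

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Descriptive statistics.
mean = statistics.mean(data)        # 5
median = statistics.median(data)    # 4.5
std_dev = statistics.pstdev(data)   # 2.0 (population standard deviation)

# Probability mass function: the share of observations taking each value.
counts = Counter(data)
pmf = {value: count / len(data) for value, count in sorted(counts.items())}

# Normal distribution with the same mean and standard deviation.
normal = NormalDist(mu=mean, sigma=std_dev)
density_at_mean = normal.pdf(mean)
share_within_one_sd = normal.cdf(mean + std_dev) - normal.cdf(mean - std_dev)

print(mean, median, std_dev)
print(pmf)
print(round(share_within_one_sd, 4))  # 0.6827
```

The last line shows the familiar rule that roughly 68% of a normal distribution lies within one standard deviation of its mean.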
Day 8: Data Visualization with Python. A picture is worth a thousand words - or should we say data points? In this section, we will go through how to plot the major graph types in Python. Learn how to plot bar graphs, line graphs, histograms, and more. Finally, learn how to create an interactive visualization of data using Plotly.
- Intro to the matplotlib and Plotly libraries
- Histograms, bar graphs, line graphs, and scatter plots
- Maps (creating maps using latitude and longitude data)
Day 9: Final exam (optional).
The optional exam lasts 2 hours. Participants with a passing grade will receive:
- an IBM course completion certificate
- an IBM badge
CERTIFICATION & IBM BADGE
Data Science Bootcamp (validated badge)
Both the completion certificate and the badge will be stored by Documentorum, an academic credentials blockchain, where they can be verified.