Big Data for Quants Boot Camp
Overview
Note: Registration is now closed. For questions, or to receive notification of future courses, please contact
Brief Description
This four-day course/bootcamp will take place at the Fields Institute. Practical data science techniques will be covered, along with hands-on experience in Python. The big data tools Hadoop, Apache Hive and Apache Pig will be introduced.
Course Objective
- Reframing a business challenge as an analytics challenge.
- Becoming familiar with advanced statistical concepts and machine learning.
- Demonstrating the fundamental concepts and techniques of data analytics.
- Defining basic concepts related to big data analytics.
- Recognizing categories of tools used in big data analytics.
- Discussing the challenges related to big data analytics.
Instructors
Dr. Ceni Babaoglu is a senior research fellow and Data Science instructor at Ryerson University. Since receiving her PhD in Applied Mathematics from Istanbul Technical University, she has worked as a researcher and Mathematics instructor in Istanbul, Sweden, and Toronto. Three years ago, a visit to the Data Science Laboratory at Ryerson, where she learned about the practical applications of data analytics and how they draw on her specialty, mathematics, inspired Ceni to shift her focus to data. Her current research focuses on numerical analysis, data mining, and machine learning programming.
Dr. Sebnem Kuzulugil is a senior research fellow and Data Science instructor at Ryerson University. She has degrees in Computer Science and Business Administration from Istanbul Bogazici University. She has worked in both fields as an instructor and as a consultant in Istanbul, Turkey. Her interest in statistical methods grew into a focus on data analytics after she visited Toronto and Ryerson University three years ago. Since joining the Data Science Laboratory, she has focused her studies on linear models, prediction, and machine learning, as well as the analysis of big data on distributed computing platforms.
Course Schedule
Day 1
Session 1 (9:00-10:30)
- Introduction
- Data science overview
- The role of the data scientist and skills required
Session 2 (11:00-12:30)
- Business problem framing
- Data analytics life cycle
Session 3 (13:30-15:00)
- Preparing data for analysis
- Statistical tools for data analysis
Session 4 (15:15-16:45)
- Hands-on practice: preprocessing and extraction of financial data using Python
Day 2
Session 1 (9:00-10:30)
- Exploratory data analysis
- Dimensionality reduction, cross validation
Session 2 (11:00-12:30)
- Hands-on practice: exploratory data analysis of financial data using Python
Session 3 (13:30-15:00)
- Machine learning methods (supervised)
- Statistical prediction: regression
- Methods: multiple linear regression, logistic regression
- Statistical prediction: classification
- Methods: k-nearest neighbors, decision trees, Naïve Bayes
Session 4 (15:15-16:45)
- Hands-on practice: regression and classification on financial data using Python
Day 3
Session 1 (9:00-10:30)
- Machine learning methods (unsupervised)
- Statistical prediction: clustering
- Method: k-means clustering
Session 2 (11:00-12:30)
- Hands-on practice: clustering of financial data using Python
Session 3 (13:30-15:00)
- Model evaluation, confusion matrices, ROC curves
Session 4 (15:15-16:45)
- Hands-on practice: confusion matrices in Python
Day 4
Session 1 (9:00-10:30)
- Introduction to big data
- Hadoop ecosystem
Session 2 (11:00-12:30)
- Hands-on practice: moving data in and around HDFS
Session 3 (13:30-15:00)
- Tools for big data analytics: Apache Pig and Spark
- Hands-on practice: Apache Pig and Spark
Session 4 (15:15-16:45)
- Tools for big data analytics: Apache Pig and Spark
- Hands-on practice: Apache Pig and Spark
Registration Fees
- Students: $150
- Other academics: $350
- Fields Corporate Sponsors: $750
- General Public/Industry: $1500
Spaces are limited, so early registration is advised.