Big Data Science and Machine Learning_23
Big Data Science and Machine Learning
Academic year 2023/2024
- Teacher
- Federica Legger (Lecturer)
- Degree course
- PhD in Physics
- Year
- 1st year, 2nd year, 3rd year
- Teaching period
- First semester
- Type
- Elective
- Credits/Recognition
- 4
- Course disciplinary sector (SSD)
- FIS/01 - experimental physics
- Delivery
- Traditional
- Language
- English
- Attendance
- Obligatory
- Type of examination
- Practical test
- Prerequisites
- Basic knowledge of Python is required.
In particular, I suggest getting familiar with Jupyter notebooks, NumPy, and pandas before the course starts. No expert knowledge is required, but working through a couple of tutorials on these topics (easily found on the web) is highly recommended.
Course summary
Course objectives
Data science is one of the fastest-growing fields of information technology, with wide applications in key sectors such as research, industry, and public administration. The course covers the definition of big data and the basic techniques to store, handle, and process it. Machine Learning (ML) and Deep Learning (DL) algorithms are briefly introduced. The emphasis is on the technical implementation of different ML algorithms, in particular their parallelisation and their deployment on distributed resources and on different architectures (CPUs, GPUs, FPGAs). A basic introduction to current computer architectures is also given, with attention to the parallel computing paradigms needed to exploit the full potential of parallel hardware.
Expected learning outcomes
KNOWLEDGE AND UNDERSTANDING
Fundamental concepts of:
- big data science
- Machine Learning and Deep Learning
- computer architectures and distributed systems
APPLYING KNOWLEDGE AND UNDERSTANDING
Ability to:
- implement various machine learning model architectures and metrics
- use a set of machine learning libraries
- speed up code execution by parallelising tasks while avoiding race conditions
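As an illustration of the last point, here is a minimal Python sketch of two common ways to parallelise a task without a race condition. It is not taken from the course material: the function names, the sum-of-squares workload, and the default of four workers are all assumptions chosen for the example. The first variant gives each worker a private chunk and combines the partial results in the caller; the second lets workers update a shared total, protected by a lock.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def partial_sum_of_squares(chunk):
    """Process one private chunk; no shared state is read or written."""
    return sum(x * x for x in chunk)


def split(data, n_chunks):
    """Cut data into at most n_chunks contiguous pieces."""
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, n_workers=4):
    """Map-reduce style: workers return partial results, the caller reduces."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, split(data, n_workers)))
    return sum(partials)


def locked_sum_of_squares(data, n_workers=4):
    """Shared-state style: one total, updated only while holding a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        local = partial_sum_of_squares(chunk)  # heavy work outside the lock
        with lock:  # the read-modify-write on `total` must not interleave
            total += local

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(worker, split(data, n_workers)))  # wait, surface errors
    return total


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))  # 285
    print(locked_sum_of_squares(list(range(10))))    # 285
```

Threads are used here only to keep the sketch short and portable. For CPU-bound pure-Python work the interpreter lock typically prevents a real speed-up with threads, so one would normally swap in `ProcessPoolExecutor` for the first variant, which works unchanged because the workers share no state.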
Program
- Introduction to big data science
- The big data pipeline: state-of-the-art tools and technologies
- ML and DL methods: supervised and unsupervised training, neural network models
- Introduction to computer architecture and parallel computing patterns
- Parallelisation of ML algorithms on distributed resources
- Beyond CPUs: ML applications on distributed architectures, GPUs, FPGAs
Course delivery
The course will be held in person, and it will not be possible to attend remotely. To pass the course you need to attend at least 80% of the lessons and pass the final test.
Learning assessment methods
Practical test
Suggested readings and bibliography
Chen, M., Mao, S. & Liu, Y. (2014). Big data: a survey. Mobile Networks and Applications, 19, 171. https://doi.org/10.1007/s11036-013-0489-0
Yao, Y., Xiao, Z., Wang, B., Viswanath, B., Zheng, H. & Zhao, B. Y. (2017). Complexity vs. performance: empirical analysis of machine learning as a service. 384-397. https://doi.org/10.1145/3131365.3131372
Notes
Students wishing to take this course must register!
Schedule:
- Nov 18th, 10:00-12:00, Aula Wataghin
- Nov 19th, 14:00-16:00, Aula Fubini
- Nov 20th, 10:00-12:00, Aula Wataghin
- Nov 21st, 10:00-12:00, Aula Wataghin
- Nov 22nd, 11:00-13:00, Aula Fubini
Class schedule
Notes: 2023-2024 schedule is still being finalised
- Enroll
- Closed
- Enrollment opening date
- 02/11/2022 at 00:00
- Enrollment closing date
- 20/11/2024 at 00:00
- Maximum number of students
- 15 (Once this number of students is reached, enrollment will no longer be permitted!)