PDAI - EMbeDS-education/ComputingDataAnalysisModeling20232024 GitHub Wiki
This is the home page of the courses Programming and Data Analytics and AI:
- Module 1
- Module 2
The right sidebar can be used to navigate the pages related to the course, e.g., to consult the calendar, access example datasets, reserve a classroom seat and mark attendance, and retrieve slides, code, and materials for our lectures.
SYLLABUS INFORMATION
Lecturer: Andrea Vandin ([email protected])
Former co-lecturer: Daniele Licari.
- Daniele co-designed these courses and was co-lecturer until leaving Sant'Anna
Teaching Assistant: Sima Sarv Ahrabi ([email protected])
Language: English
Duration: Module 1: 20h, Fall 2023; Module 2: 20h, Spring 2024.
Description
This course provides a well-structured introduction to (object-oriented) programming with applications to data processing and Artificial Intelligence. The content is structured in two modules that students can attend in different years. The course uses Python as its reference language.
- Module 1 introduces students to the fundamental principles of structured programming, with basic applications to data processing. It starts from basic notions of programming (variables, data types, collections, control & repetition structures, functions & modules), and progresses to basic data processing functionalities (loading, manipulation, and visualization of CSV data).
- Module 2 introduces students to the components of typical data analysis processes and machine learning pipelines. It first builds the necessary toolset by introducing popular Python libraries for data manipulation/visualization (NumPy, Pandas, Seaborn, scikit-learn) with simple applications. The toolset is then applied to a more complex case study on the classification of benign and malignant breast cancer, including aspects of data preprocessing, dimensionality reduction, clustering, and classification. The course will conclude with a research-driven topic such as process-oriented data science (Process Mining).
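As a rough illustration of the kind of CSV processing Module 1 builds up to, the sketch below loads a small table, converts its columns to proper types, filters it, and computes a per-group average using only the Python standard library. The sample data, column names, and the helper functions `load_rows` and `mean_by` are made up for this example and are not taken from the course material; visualization is left out to keep the sketch dependency-free.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample data (an assumption for illustration only). With a real
# file, one would use open(path, newline="") instead of io.StringIO.
RAW_CSV = """city,year,temperature
Pisa,2022,16.1
Pisa,2023,16.8
Florence,2022,15.4
Florence,2023,16.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "city": record["city"],
            "year": int(record["year"]),
            "temperature": float(record["temperature"]),
        }
        for record in reader
    ]


def mean_by(rows, key, value):
    """Group rows by `key` and average the `value` column of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: mean(values) for group, values in groups.items()}


rows = load_rows(RAW_CSV)

# Manipulation step 1: keep only the most recent year.
recent = [row for row in rows if row["year"] >= 2023]

# Manipulation step 2: average temperature per city over all years.
averages = mean_by(rows, "city", "temperature")

for city, avg in sorted(averages.items()):
    print(f"{city}: {avg:.2f}")
```

The same steps map closely onto the Pandas operations introduced in Module 2 (reading a CSV, boolean filtering, and group-by aggregation).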
A student who has met the objectives of the course will have acquired an understanding of the issues and tasks involved in structured computer programming, data analysis, and machine learning, so as to be able to make informed decisions. The student will be able to write Python programs of various kinds, with a focus on complex data analysis tasks.
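To give a feel for the machine learning pipelines targeted by Module 2, here is a minimal sketch that chains preprocessing, dimensionality reduction, and classification with scikit-learn. It assumes that scikit-learn's bundled Wisconsin breast cancer dataset can stand in for the case-study data, which this page does not name; the choice of two principal components and of logistic regression is likewise only illustrative, and the clustering step is omitted.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the dataset shipped with scikit-learn is used as a stand-in
# for the course's breast cancer case study.
X, y = load_breast_cancer(return_X_y=True)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing -> dimensionality reduction -> classification.
model = make_pipeline(
    StandardScaler(),        # rescale each feature to zero mean, unit variance
    PCA(n_components=2),     # project onto the first two principal components
    LogisticRegression(),    # linear classifier on the reduced features
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Wrapping the steps in a single pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking information from the test set.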
Prerequisites: Module 1 has no prerequisites. Module 2 requires the programming knowledge acquired by attending Module 1.
Materials
The course makes extensive use of online repositories and game-based e-learning platforms:
- GitHub Wiki: collect and distribute slides, coding examples, datasets, and further course material
- Colab: distribute coding assignments and automatically provide feedback on them
- Kahoot: perform online quizzes to monitor the learning process
Where possible, we will also coordinate some content and Practicum activities with the courses Applied Statistics (run in parallel with M1) and Statistical Learning for Large Data Topics in Statistical Learning (whose M1 is run in parallel with our M2).
Suggested books are
- Learning Python, M. Lutz
- Python for Data Analysis, W. McKinney
- Statistics and Machine Learning in Python, E.Duchesnay, T.Löfstedt, F.Younes
We will employ Python as the programming language and statistical software of choice for the course.
- Please visit the "setup your machine" entry in the right sidebar
Evaluation
Students can attend single modules; therefore, each module has its own evaluation.
- Module 1: Evaluation will be based on individual oral examinations on the topics covered in the course, starting from the students' solutions to the weekly assignments.
- Module 2: Evaluation will be based on oral examinations, starting from group project work and written reports to be held/handed in at the end of the course. Each group will use a given dataset (or propose one of interest) and apply to it techniques described during the course. The project report consists of a JupyterLab notebook like those used by the lecturers.
Attendance: The course will be given in blended mode:
- in person, in the rooms specified in the general calendar
- remotely on WebEx. The recurrent meeting link is: https://santannapisa.webex.com/meet/a.vandin
Allievi Ordinari of Scuola Superiore Sant'Anna must attend the classes in person unless explicitly justified (e.g., Allievi abroad participating in the ERASMUS project).
- All other attendees are allowed to attend in person only if enough seats are available (information on this will be provided via email). Otherwise, they will have to attend remotely on WebEx (this should not affect the learning process: some previous editions of the course were run mostly online).