Advanced Data Mining
News:
End of the Semester:
I propose some additional time slots for discussing assignments/projects (online by default). Please book them in my Google Calendar. If you prefer an in-person meeting, I'll be available on Thursday, June 22, 10-12, room 203 (booking in advance by email would be appreciated). You may also contact me by email for additional time slots, if needed.
Organizational Issues:
Due to the atypical organization of this semester, I made some assignments non-obligatory: the list of assignments 5 contains the last regular assignments and the list of assignments 6 contains bonus assignments. Therefore, instead of 80 points in total, there are only 76 points for the obligatory assignments, so I will rescale the points by multiplying by 80/76.
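As an illustration of the rescaling described above, here is a minimal Python sketch. The function and constant names are hypothetical, and the official computation (for example, how bonus points or rounding are handled) may differ; it only shows the multiplication by 80/76.

```python
# Hypothetical names; only the factor 80/76 comes from the announcement above.
NOMINAL_MAX = 80      # points originally planned for the assignments
OBLIGATORY_MAX = 76   # points actually available for obligatory assignments


def rescale_points(points: float) -> float:
    """Rescale points earned on obligatory assignments by the factor 80/76."""
    return points * NOMINAL_MAX / OBLIGATORY_MAX


# Example: 57 points out of 76 correspond to 60 points out of 80.
print(rescale_points(57))  # 60.0
```

With this factor, a full score of 76 obligatory points maps to the original 80 points.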
End of the Semester - labs:
The absolutely final deadline for presenting the assignments as well as the projects is July 2, but I appreciate earlier presentations.
End of the Semester - exam:
I propose two exam dates, and each student may choose one: Thursday, June 22, 2023 at 12:00, room 119 and Wednesday, June 28, 2023 at 12:00, room 119. Please contact me by email if neither date suits you. Students who scored at least 50% of the points for their projects may be exempted from the exam, with a grade equal to the grade for labs.
Scores:
Scores are published in SKOS (login required).
Assignments:
List of assignments 1 PDF (deadline March 17, but each assignment presented by March 10 gives 1 bonus point)
- Jupyter Python notebook with Introduction to Time Series Prediction on Airline Passengers HTML IPYNB
- Jupyter Python notebook with Bid and Ask Reconstruction HTML IPYNB
- Data for the assignments: ZIP (the password is the title of our lecture written in lowercase without spaces - for copyright reasons, please do not publish the data and do not use them for purposes unrelated to our lecture)
List of assignments 2 PDF (deadline April 14, but each assignment presented by March 31 gives 1 bonus point)
- Jupyter Python notebook with Introduction to Time Series Clustering HTML IPYNB
- Data for the assignments: ZIP (the password is the title of our lecture written in lowercase without spaces - for copyright reasons, please do not publish the data and do not use them for purposes unrelated to our lecture)
List of assignments 3 (by Mikołaj Słupiński) IPYNB (deadline May 12, but each assignment presented by May 5 gives 1 bonus point)
List of assignments 4a - on Kalman Filters (by Mikołaj Słupiński) IPYNB (deadline June 2, but each assignment presented by May 26 gives 1 bonus point)
List of assignments 4b - on SLDS (by Mikołaj Słupiński) IPYNB NPY (deadline June 2, but each assignment presented by May 26 gives 1 bonus point)
List of assignments 5 PDF (deadline - end of the semester)
List of assignments 6 (optional - bonus assignments) PDF (deadline - end of the semester)
Project on Advanced Data Mining:
Project on Advanced Data Mining PDF
Lecture presentations:
Predicting Time Series Data PDF
Jupyter Python notebook with Introduction to Time Series Prediction on Airline Passengers HTML IPYNB
Predicting Sequential Data PDF
Session-based Recommendation with Graph Neural Networks PDF BLOG
Target Attentive Graph Neural Networks for Session-based Recommendation PDF
Jupyter Python notebook with Introduction to Time Series Clustering HTML IPYNB
Time Series Classification PDF
Time Series Classification with Shapelets PDF
Introduction to Recommender Systems PDF
Basic Collaborative Filtering Algorithms PDF
Matrix Factorization in Recommender Systems PDF
Neural Collaborative Filtering PDF
Self-supervised Graph Learning for Recommendation PDF
Self-Supervised Graph Co-Training for Session-based Recommendation PDF
Hidden Markov Models (by Mikołaj Słupiński) PDF
Probabilistic Graphical Models (by Mikołaj Słupiński) PDF
Exponential Families, Conjugate Priors and Kalman Filters (by Mikołaj Słupiński) PDF
Switching Dynamical Systems (by Mikołaj Słupiński) PDF
Mini-talks:
I propose the following topics for mini-talks (the list will be updated during the semester). Please contact me if you are interested in preparing such a mini-talk. It should last about 15 minutes and present the topic in more or less detail (depending on the particular topic and the possibility of summarizing it in 15 minutes).
1. Soft-DTW distance - based on M. Cuturi, M. Blondel, "Soft-DTW: a Differentiable Loss Function for Time-Series". ICML, 2017, pp.894-903. PDF
2. Evolutionary Algorithms approach for discovering time series shapelets - based on G. Vandewiele, F. Ongenae, F. De Turck, "GENDIS: Genetic Discovery of Shapelets". Sensors, 21(4), 2021. LINK [booked for J.K.]
3. Bayesian probabilistic matrix factorization using Markov Chain Monte Carlo.
4. Matrix Factorization in ranking tasks.
5. Bayesian Factorization Machines.
6. Fast Context-aware Recommendations with Factorization Machines.
7. Factorial HMM.
8. Oops I Took A Gradient: Scalable Sampling for Discrete Distributions LINK
9. Hamiltonian Monte Carlo LINK
10. Changepoint Detection LINK
11. Time2Graph LINK
12. TraClu - the trajectory clustering algorithm.