Advanced Data Mining - CS724
Objective | The objective of this course is to develop state-of-the-art technical and research skills in Big Data Mining and Management. |
Instructor | Tarique Anwar; Office: Room 205, CSE Block, IIT Ropar; Email: tarique@iitrpr.ac.in |
Teaching Assistants | TBA |
Class Schedule | Monday, 09:00am-09:50am, Lecture; Tuesday, 10:00am-10:50am, Lecture; Wednesday, 11:00am-11:50am, Lecture |
Venue | M2, Lecture Hall Complex |
Credits | 4 (3 Lectures + 0 Tutorials + 2 Labs + 7 Self-study hours, weekly) |
Who can take this course | Pre-requisites: 1. CS356 (ADA) / CS506 (DSA), 2. CS524 (DM) / CS503 (ML) For UG students, Grade must be |
Syllabus ♣
Serial No. | Topics | Reference papers | Lecture Notes / Slides |
---|---|---|---|
1. | Managing and Mining Streaming and Time-Series Data | Download | Download |
2. | Managing and Mining Discrete Sequence Data | Download | Download |
3. | Managing and Mining Graph and Multirelational Data | Download | Download |
4. | Managing and Mining Spatial, Spatio-temporal, Object, Multimedia, Text, and Web Data | Download | Download |
5. | Managing and Mining Dynamic and Evolving Networks | Download | Download |
6. | Managing and Mining Urban Data | Download | Download |
7. | Recent Advancements in Big Data Mining and Management, Discussion on related research papers published in the last 5 years | Download | Download |
Reference Papers
Serial No. | Title | Venue | Download |
---|---|---|---|
1. | gSparsify: Graph Motif Based Sparsification for Graph Clustering | CIKM 2015 | [Paper] [Presentation] |
2. | Automatic Discovery of Tactics in Spatio-Temporal Soccer Match Data | SIGKDD 2018 | [Paper] [Presentation] |
3. | The Flexible Socio Spatial Group Queries | VLDB 2018 | [Paper] |
4. | Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection | ACM TKDD 2015 | [Paper] [Presentation] |
5. | Efficient Computation of Multiple Density-Based Clustering Hierarchies | ICDM 2017 | [Paper] |
6. | MustaCHE: A Multiple Clustering Hierarchies Explorer | VLDB 2018 | [Paper] [Presentation] |
7. | Gotcha - Sly Malware! Scorpion: A Metagraph2vec Based Malware Detection System | SIGKDD 2018 | [Paper] |
8. | A Framework for Clustering Evolving Data Streams | VLDB 2003 | [Paper] |
9. | Clustering Stream Data by Exploring the Evolution of Density Mountain | VLDB 2017 | [Paper] |
10. | Real-time Constrained Cycle Detection in Large Dynamic Graphs | VLDB 2018 | [Paper] |
11. | Multiple Infection Sources Identification with Provable Guarantees | CIKM 2016 | [Paper] |
12. | Fast and Scalable Big Data Trajectory Clustering for Understanding Urban Mobility | TITS 2018 | [Paper] [Presentation] |
13. | Affective Neural Response Generation | ECIR 2018 | [Paper] [Presentation] |
Assessment Policy ♣
|
Weightage: 10% |
Research Project | Weightage: |
Mid-Semester Examination ♦ | Weightage: Syllabus: Topics covered till the last class. |
End-Semester Examination ♦ | Weightage: Syllabus: Entire Syllabus |
Grading Policy | A combination of absolute and relative grading will be followed. |
♣ Tentative
♦ Some Quizzes and Exams will be open-book/notes. The exact format will be announced one day before the scheduled date. Keep checking the announcements at the bottom of this page.
Textbooks
1. Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2011
2. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison-Wesley, 2005
3. Mohammad J Zaki and Wagner Meira Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014
4. Charu C. Aggarwal, Data Mining: The Textbook, Springer, 2015
Apart from the above books, the following journals and conferences may also be referenced.
1. Journals: IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Big Data, ACM Transactions on Database Systems, ACM Transactions on Knowledge Discovery from Data, VLDB Journal, Data Mining and Knowledge Discovery, and Information Systems
2. Conferences: VLDB, SIGMOD, SIGKDD, ICDE, ICDM, EDBT, CIKM, WWW, and WSDM
Announcements
20/01/2019: Those who are taking the course but have not yet submitted the ADD request on CRP need to submit it at the earliest (CRP has listed the course code as CS7XX).
Also, please join the Advanced Data Mining course on Moodle, with enrolment key cs724_201820192.
18/01/2019: A list of papers has been uploaded to the "Reference Papers" section above. Groups of 1, 2, or 3 students can be formed. Each group needs to select papers of its own choice, 1-2 multiplied by the group size, and inform me of the selection. The group is then expected to read the papers thoroughly, and to present and discuss them in class.
15/01/2019: Tomorrow, we will discuss the paper "gSparsify: Graph Motif Based Sparsification for Graph Clustering", CIKM 2015. Download a copy from here.
13/01/2019: I will be on leave tomorrow, so there will be no Advanced Data Mining class tomorrow (Monday, 14th January 2019). Any urgent communication can be sent by email.
09/01/2019: In response to some requests, the minimum requirements for CS524 (DM) and CS724 (ADM) have been changed. Students who have already completed the CS503 (ML) course with a grade of B or lower can now enrol in CS524 (DM). UG students can enrol in CS724 (ADM) only if their grade in CS524 is at least A-.
08/01/2019: The confusion about the timing of this course has been resolved, and the schedule is confirmed. Please see the details above. Classes start tomorrow. The first class will take place at 11am tomorrow in M2, Lecture Hall Complex.
08/01/2019: Welcome to the course CS724 - Advanced Data Mining. It will be taught by Dr Tarique Anwar (myself). The timing, venue and other details will be announced here shortly.
Note: This page will be updated regularly with helpful information and announcements. Students are advised to check it regularly for updates.