• Course code: 63536
  • Credits: 6
  • Semester: summer
  • Contents

Mining Massive Datasets

Term of implementation: 5 January to 20 March 2024

Schedule: The course starts at the beginning of January and follows a weekly schedule, which means that you will also have homework assignments during the exam break.

The course covers data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis is on MapReduce as a tool for creating parallel algorithms that can process such data.

Topics include: Frequent Itemsets and Association Rules, Near Neighbor Search in High Dimensional Data, Locality Sensitive Hashing (LSH), Dimensionality Reduction, Recommendation Systems, Clustering, Link Analysis (PageRank), Large-Scale Supervised Machine Learning, Data Streams, Mining the Web for Structured Data, Relation Extraction, and Web Advertising.

 
  • Study programmes
  • Distribution of hours per semester
45 hours of lectures
30 hours of laboratory work
  • Professor
Teaching Assistant
Room: R3.69 - Kabinet
Course Organiser
Room: R2.58 - Kabinet