Description
This course explores the basic concepts of web data mining, including ranking, clustering, similarity, and classification, as well as basic methods to gather the contents and structure of web data and the behavior of web users from web-based systems. It also covers methods for analyzing the gathered web data. To develop students' ability to critically evaluate recent research on web data mining, the course includes extensive paper-reading assignments and team projects.
Instructor
Kyungbaek Kim
Office : Engineering Building #6, 715
Tel : +82-62-530-3438
Email : kyungbaekkim@chonnam.ac.kr
Office Hours : Thursday 09:30 ~ 10:00
Time and Location
Tue 09:00-11:45, Engineering Building #7, 352
Main Text
Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, by Zdravko Markov and Daniel T. Larose
Reference Texts
- Mining the Social Web, by Matthew A. Russell
Grading Policy
- Attendance : 10%
- Exercises and Homework : 50%
- Reading assignment : 15%
- Presentation (2 times) : 15%
- Project Preparation : 20%
- Mid Exam : 20%
- Final Project : 20%
Lecture Notes
- 0.Syllabus
- 1.Introduction
- 2.Information Retrieval and Web Search
Lecture notes are available through eClass on the JNU portal.
Reading Assignment
- Data Mining the Web
- Mining the Social Web, 2nd Edition
Presentation Schedule
- 2016-March-22
  - Hyperlink-based ranking : Gde
  - Clustering : Flavio
- 2016-March-29
  - Evaluating Clustering : Tung
  - Classification : Lam
- 2016-April-05
  - Introduction to Web Usage Mining : Ngoc
  - Preprocessing for Web Usage Mining : Quyet
- 2016-April-12
  - Exploratory Data Analysis for Web Usage Mining : Ha
  - Modeling for Web Usage Mining : Gemoh
- 2016-April-26
  - Mining Twitter 1 : Gde
  - Mining Twitter 2 : Quyet
- 2016-May-03
  - Mining Facebook 1 : Flavio
  - Mining Facebook 2 : Tiep
- 2016-May-10
  - Mining LinkedIn 1 : Tiep
  - Mining LinkedIn 2 : Misun
- 2016-May-17
  - Mining Google+ 1 : Gemoh
  - Mining Google+ 2 : Tung
- 2016-May-24
  - Mining Web Pages 1 : Lam
  - Mining Web Pages 2 : Misun
- 2016-May-31
  - Mining GitHub 1 : Ngoc
  - Mining GitHub 2 : Ha