Description
This course explores the basic concepts of web data mining, including ranking, clustering, similarity, and classification, along with basic methods to gather the content of web data, the structure of web data, and the behavior of web users from web-based systems. It also covers methods for analyzing the gathered web data. To build students' ability to critically evaluate recent research on web data mining, the course includes extensive paper-reading assignments and team projects.
Instructor
Kyungbaek Kim
Office : Engineering Building #6, 715
Tel : +82-62-530-3438
Email : kyungbaekkim@chonnam.ac.kr
Office Hours : Tuesday 19:30 ~ 20:30
Time and Location
Tue 16:30-19:30, Engineering Building #6, 102
Main Text
DATA MINING THE WEB, Uncovering Patterns in Web Content, Structure, and Usage, by ZDRAVKO MARKOV and DANIEL T. LAROSE
Reference Texts
- Mining the Social Web, by Matthew A. Russell
Grading Policy
- Attendance : 10%
- Reading Assignments and Projects : 50%
- Tentatively two papers per week : around 26 papers.
- Project.
- Midterm Exam : 20%
- Final Exam : 20%
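The weights above can be combined into a single course score. The following is a minimal sketch, not an official grading formula: it assumes each component is scored on a 0-100 scale and that the final score is a simple weighted sum. The names `WEIGHTS` and `final_score` and the example scores are hypothetical, and how the 50% is split between reading assignments and the project is not stated in the policy.

```python
# Weights (in percent) taken from the grading policy.
# Assumption: each component is scored on a 0-100 scale.
WEIGHTS = {
    "attendance": 10,
    "reading_and_projects": 50,
    "midterm": 20,
    "final": 20,
}


def final_score(scores):
    """Return the weighted course score (0-100) for a dict of component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student, for illustration only.
example = {
    "attendance": 100,
    "reading_and_projects": 80,
    "midterm": 70,
    "final": 90,
}
print(final_score(example))  # 82.0
```

Because reading assignments and projects carry half of the total weight, a 10-point change in that component moves the final score by 5 points, compared with 2 points for either exam.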
Lecture Notes
- 0.Syllabus
- 1.Introduction
- 2.IR and Web Search
- 3.Hyperlink-based ranking
- 4.Clustering
- 5.Evaluating Clustering
Lecture notes are accessible through the eClass of JNU portal.
Homework, Quizzes, Midterm/Final Exam
All materials related to homework, quizzes, the midterm exam, and the final exam, including solutions, are accessible through the eClass of JNU portal.
Reading Assignment
Submit a summary of the assigned papers by the due date. Here is a template of the summary.
- Social Networks : Due on March 12
- Collaboration in social networks
- w03_Online team formation in social networks
- w03_Understanding task-driven information flow in collaborative networks
- w03_New objective functions for social collaborative filtering
- Information diffusion in social networks
- w04_Information transfer in social media
- w04_The role of social networks in information diffusion
- w04_Recommendations to boost content spread in social networks
- Social interactions and the web
- Community detection in social networks
- w06_Using content and interactions for discovering communities in social networks
- w06_Community detection in incomplete information networks
- w06_QUBE: a quick algorithm for updating betweenness centrality
- Security and fraud in social networks
- Fraud and bias in user ratings
- w09_Semi-supervised correction of biased comment ratings
- w09_Spotting fake reviewer groups in consumer reviews
- w09_Estimating the prevalence of deception in online review communities
- Crowdsourcing
- w10_Max algorithms in crowdsourcing environments
- w10_Crowdsourcing with endogenous entry
- w10_Answering search queries with CrowdSearcher
- Behavioral analysis and content characterization in social media
- Obtaining and leveraging user comments
- w12_Multi-objective ranking of comments on web
- w12_Care to comment?: recommendations for commenting on news stories
- w12_Leveraging user comments for aesthetic aware image search reranking
- Web mining
- w13_Trains of thought: generating information maps
- w13_Learning causality for news events prediction
- w13_Your two weeks of fame and your grandmother's
- Recommender Systems
- w14_Build your own music recommender by modeling internet radio streams
- w14_Using control theory for stable and efficient recommender systems
- w14_An exploration of improving collaborative recommender systems via user-item subgroups
- Advertising on the web
- Extra Topics
Presentation Schedule
Date | Presenter | Paper |
2013.March.19 | Thang | w03_New objective functions for social collaborative filtering |
2013.March.19 | 정회윤 | w03_Understanding task-driven information flow in collaborative networks |
2013.March.26 | Do | w04_The Role of Social Networks in Information Diffusion |
2013.March.26 | Perdana | w07_Understanding and combating link farming in the twitter social network |
2013.April.02 | Chuyen | w05_Echoes of power: language effects and power differences in social interaction |
2013.April.02 | Abhijeet | w05_Actions speak as loud as words: predicting relationships from social behavior data |
2013.April.09 | Hong Quy | w06_Community detection in incomplete information networks |
2013.April.09 | Tan | w05_Bimodal Invitation-Navigation Fair Bets Model for Authority Identification in a Social Network |
2013.April.16 | Ardiansyah | Who Killed My Battery: Analyzing Mobile Browser Energy Consumption |
2013.April.16 | 한세진 | w07_Branded with a scarlet "C": cheaters in a gaming social network |
2013.May.07 | Thang | w09_Spotting fake reviewer groups in consumer reviews |
2013.May.07 | 정회윤 | w09_Semi-supervised correction of biased comment ratings |
2013.May.14 | Do | w10_Max Algorithms in Crowdsourcing Environments |
2013.May.14 | Perdana | w07_Analyzing spammers' social networks for fun and profit: a case study of cyber criminal ecosystem on twitter |
2013.May.21 | Chuyen | w11_YouTube around the world: geographic popularity of videos |
2013.May.21 | Abhijeet | w11_Dynamical classes of collective attention in twitter |
2013.May.28 | Hong Quy | w12_Multi-objective ranking of comments on web |
2013.May.28 | Tan | w11_We Know What @You #Tag: Does the Dual Role Affect Hashtag Adoption? |
2013.June.04 | Ardiansyah | w13_Your two weeks of fame and your grandmother's |
2013.June.04 | 한세진 | w13_Learning causality for news events prediction |
Team Project
Team Members | Subject of Team Project | Initial Report | Mid-Term Presentation/Report | Final-Term Presentation/Report |
Nguyen Hong Quy, Tran Huu Tan | Community detection in social networks | TBA | TBA | TBA |
Thang Hoang, Ardiansyah Musa Efendi | A Secured Remote Web Authentication System Based on Behavioral Biometric Using Mobile Phone | TBA | TBA | TBA |
한세진, Boragule Abhijeet | Music recommender by modeling internet radio streams | TBA | TBA | TBA |
Do Van Son, Luong Thi Chuyen | Place Recommendation Based on Users' shared photos | TBA | TBA | TBA |
정회윤, Perdana Adhitama | Classification of Trending Topic in Twitter | TBA | TBA | TBA |