Description
This graduate-level course aims to build an understanding of recent research trends in distributed systems, including overlay network systems, cloud computing systems, distributed processing systems, and data center issues. The course involves extensive reading of recent publications and programming projects on distributed systems.
This semester, the main topic is Map/Reduce distributed computation. The course covers Hadoop Map/Reduce (with exercises), Hadoop-related projects, and topics from recent Map/Reduce papers. Students will carry out projects that implement Map/Reduce modules.
Instructor
Kyungbaek Kim
Office : Engineering Building #6, 715
Tel : +82-62-530-3438
Email : kyungbaekkim@chonnam.ac.kr
Office Hours : Mon 5pm ~ 6pm
Time and Location
Mon 9am-12pm, Engineering Building #6, 103
Reference Text
- Hadoop: The Definitive Guide, by Tom White
- Do it! 직접 해보는 하둡 프로그래밍, by 한기용
- Distributed Systems: Principles and Paradigms, 2nd edition, by Andrew S. Tanenbaum and Maarten van Steen
Grading Policy
- Attendance : 10%
- Reading Assignments and exercises : 30%
  - Tentatively two papers per week : 13 papers
  - Hadoop exercises
- Projects : 40%
  - Personal Project : research on Hadoop-related projects
  - Team Project
- Final Exam : 20%
Lecture Notes
Lecture notes are available through eClass on the JNU portal.
- 0.Syllabus
- 1.BigData
- 2.Hadoop MapReduce
- 3.Setting up Hadoop MapReduce
- 4.Hadoop MapReduce Programming
Homework, Quizzes, Midterm/Final Exam
All materials related to homework, quizzes, the midterm exam, and the final exam, including solutions, are available through eClass on the JNU portal.
Exercise Homework
- Exercise Homework 01
- Exercise Homework 02
Reading Assignment
Submit a summary of each assigned paper by its due date. Here is a template for the summary.
- Due on 21st September
  - [2004 OSDI] MapReduce: Simplified Data Processing on Large Clusters
- Due on 27th September
  - [2010 ICDE] Hive: A Petabyte Scale Data Warehouse Using Hadoop
- Due on 4th October
  - [2010 HPDC] Twister: A Runtime for Iterative MapReduce
- Due on 18th October
  - [2012 HotCloud] Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
- Due on 25th October
  - [2011 MIDDLEWARE] Resource Provisioning Framework for MapReduce Jobs with Performance Goals
- Due on 1st November
  - [2013 MIDDLEWARE] FlowFlex: Malleable Scheduling for Flows of MapReduce Jobs
- Due on 8th November
  - [2011 VLDB Endowment] CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop
- Due on 15th November
  - [2012 VLDB Endowment] M3R: Increased Performance for In-Memory Hadoop Jobs
- Due on 22nd November
  - [2012 VLDB] Muppet: MapReduce-Style Processing of Fast Data
- Due on 29th November
  - [2013 BigData] Scalable Distributed Event Detection for Twitter
- Due on 6th December
  - [2013 ICDCS] Efficient Geo-Distributed Data Processing with Rout
- Due on 13th December
  - [2013 ICDCS] HybridMR: A Hierarchical MapReduce Scheduler for Hybrid Data Centers
Personal Projects
- Ngoc
  - ZooKeeper - 2016/10/19
  - Spark - 2016/11/09
- Quyet
  - Sqoop - 2016/10/19
  - Kafka - 2016/11/09
- Quan
  - Cassandra - 2016/11/02
  - Flume - 2016/11/23
- 주종민
  - Hive - 2016/11/02
  - Storm - 2016/11/23
- 권민구
  - HBase - 2016/11/02
  - Mahout - 2016/11/23
- Dong
  - Pig - 2016/11/09
  - Zeppelin - 2016/11/30