Logged in as: guest Log in


  • December 7: Problem Set #3 has been graded and Problem Set #1 re-graded. I will do my best finish grading Problem Set #2 tonight.
  • December 4: I will hold a final study session for Comp 521 on Saturday from 4pm to 6pm on Saturday 12/6. It will be held in SN011.
  • November 18: As I announced in class today, the second midterm has been delayed until 11/25 due to popular demand. If that time will not work for you make sure that you send me an email to arrange another time to take the test. I will also hold a study session to prepare for the 2nd midterm on Wednesday night, 11/19. During the study session I will go over a midterm from a previous semester.
  • November 11: Problem Set 3 is now on-line. It is due in only one week (11/18). I will hold extra office hours on Monday 11/17 from 3pm-5pm. I will also hold a study session in SN-014 from 6pm-8pm that night.
  • November 3: I have postponed the second mid-term until 11/20. More news to come soon.
  • October 8: The due date of the second problem set has been moved to Oct 14. Make good use of the extra weekend, I suggest that you read the online manual for the sqlite3 Python package (and the URL is fixed now... thanks for the bug reports). Leonard McMillan will hold extended office hours from 5pm-7pm on Monday, Oct 13.
  • September 30: The first midterm will be given in class on 10/2. It will be open notes, but closed books, closed computer, mobile phone, and tablet or any other device with internet access. The format of the test will be multiple choice (~18-20 questions) and short answer (~2-4 questions). There will be a midterm study session held tomorrow night in SN014 from 6pm-8pm.   
  • September 26: The second problem set is now posted, and its new due date is October 9, 2014. Stay tuned for an announcement concerning the first mid-term.
  • September 10: Due to the number of bugs in the problem set found during office hours, I have decided to extend the deadline until noon on Monday, September 15.
  • September 6: A minor error was found in the "Gene" schema of Problem 3 in the first problem set. It has been fixed. I also clarified the queries in question 4, by replacing the phrases "Superman" movie, "Batman" movie, and "Spider-Man" movie with the movie entitled "Superman", etc. Please download it again.
  • September 5: The sailing club database (right click or command-click for menu, then "save link as...") for sqlite3 is available for exploration.
  • September 4: All problems are now posted for the first Problem Set. It is due one week from today in class.
  • August 26: Create a course login at this link. Use Comp521 as your "Institute" and either your Onyen or CS login as "Email".
  • August 19: First day of class and the course syllabus.

Course Description

Databases are an indispensable tool for managing information, and a course on the principles and practice of database systems is now an integral part of any computer science curricula. This course covers the fundamentals of modern database management systems, in particular relational database systems.

The material covered in Comp 521 can be broken into three areas of emphasis. The first area includes database foundation material such as the relational model, relational algebra, relational calculus, and normal forms. The second area of emphasis is database application programming and includes among others the topics of Structured Query Language (SQL), eXtensible Markup Language (XML), integrating databases into programs, web-based database usage, and triggers and active databases. The third area of emphasis is the systems side of databases, which includes database indexing, efficient query evaluation, the transaction-based model, concurrency, and security.

This course is suitable for computer science majors at both undergraduate and graduate levels. Students who wish to take this course should have some programming experience in a modern language and knowledge of data structures. There will be five homeworks, two mid-terms, and a final exam.

Book, Course Information, and Prerequisites

Database Textbook

Textbook (optional): Database Management Systems, Third edition
by Raghu Ramakrishnan and Johannes Gehrke
McGraw-Hill Higher Education © 2003, ISBN: 0072465638.
Credit Hours: 3
Location: FB007
Time: TTh 9:30-10:45PM
URL: http://www.cs.unc.edu/Courses/comp521-f14
Prerequisite: COMP 401, Comp 411 or equivalents
Some background in data structures (Comp 410) is also helpful.

Course Instructors

Leonard Instructor: Leonard McMillan    Co-Instructor: Erik Scott

Office: SN311    Office: SN127
email: mcmillan@unc.edu    email: escott@renci.org

Office Hours:

Leonard McMillan: Wednesdays 3pm-5pm
Erik Scott: Thursdays 11:15am-1:15pm
Chad Armstrong: SN364 Mondays 3:30pm-5:00pm


Week 1:
Aug 19 Introduction and Overview (pdf) LM Chapter 1 Survey
Aug 21 Entities and Relations (pdf) LM Chapter 2  
Week 2:
Aug 26 The Relational Model (pdf) ES Chapter 3  
Aug 28 Relational Algebra and Calculus (pdf) LM Chapter 4 PSet #1 out
Week 3:
Sept 2 SQL: Basic Queries (pdf) ES Chapter 5.1-5.4  
Sept 4 SQL: Advanced Queries (pdf) ES Chapter 5.5-5.6  
Week 4:
Sept 9 SQL Constraints & Triggers (pdf) LM Chapter 5.7-5.9  
Sept 11 Database Applications; Part 1 (pdf) LM Chapter 6.0-6.3 PSet #1 due
Week 5:
Sept 16 Database Applications; Part 2 (pdf) ES Chapter 6.4-6.7  
Sept 18 Databases and Internet Apps; Part 1 (pdf) ES Chapter 7.0-7.5  
Week 6:
Sept 23 Databases and Internet Apps; Part 2 (pdf) ES Chapter 7.5-7.9  
Sept 25 Overview of Storage and Indexing (pdf) ES Chapter 8 PSet #2 out
Week 7:
Sept 30 Storing Data: Disks and Files (pdf) ES Chapter 9  
Oct 2 Mid-Term #1 (covers Chaps 1-7) LM    
Week 8:
Oct 7 Return and Discuss Mid-term ES    
Oct 9 Tree-Structured Indexing (pdf) ES Chapter 10


Week 9:
Oct 14 Hash-based Indexing (pdf) LM Chapter 11 PSet #2 due
Oct 16 Fall Break (no class)      
Week 10:
Oct 21 Overview of Query Evaluation (pdf) LM Chapter 12  
Oct 23 External Sorting (pdf) LM Chapter 13  
Week 11:
Oct 28 Evaluation of Relational Operators (pdf) ES Chapter 14  
Oct 30 Schema Refinement and Normal Forms (pdf) ES Chapter 19
Week 12:
Nov 4 Schema Refinement and Normal Forms (pdf) LM Chapter 19

Nov 6 Mid-term #2 (covers Chaps 8-14)
Database Design and Tuning (pdf)
LM  Chapter 20
Week 13:
Nov 11 Transaction Managenment (pdf) ES Chapter 16 PSet #3 out
Nov 13 Concurrency Control (pdf) ES Chapter 17  
Week 14:
Nov 18 Crash Recovery (pdf)
LM Chapter 18
Pset #3 Due 
Nov 20 Midterm #2 (covers Chaps 8-14) LM    
Week 15:
Nov 25
Return Midterm and Catch up
Nov 27
Thanksgiving (no class)
Week 16:
Dec 2 Bleeding-edge database topics (pdf) ES
Final Exam:
Dec 9        8:00am-11:50am


SQLlite is a open-source light-weight serverless database management system that uses ordinary disk files as its repository. Despite its simplicity, sqlite3 is one of the most widely used databases in the world. For example, nearly every smartphone uses sqllite to track the settings and configurations of their local apps. It is widely supported via libraries by nearly every modern programming language (Java, C#, Python, etc). Therefore, Sqllite is a DBMS well worth learning.

If you are using either OSX or Linux, sqlite is already installed on your system. You can access it by typing "sqlite3 " at a shell command prompt. If you use Windows you will need to install sqlite3 from http://www.sqlite.org/download.html. I recommend installing the precompiled windows binary.

IMDb Movie database
IMDb provides one of the most extensive and, according to their website, most authoritative movie databases on the planet. They also provide dumps as a means for taking snapshots. Each fall when Comp 521 is taught, I take a snapshot in early August of IMDb with the following simplified sqlite schema:
    first TEXT DEFAULT '',
    last TEXT DEFAULT '',
    dob DATE DEFAULT '',
    gender TEXT DEFAULT '');

    title TEXT DEFAULT '',
    rating TEXT DEFAULT '');

    aid INTEGER REFERENCES Actor(aid),
    mid INTEGER REFERENCES Movie(mid),
    role TEXT DEFAULT '',
    billing INTEGER DEFAULT 0);
which I like to use as an example database for querying, indexing, and examining query execution plans. You can download a "zip" compressed copy here (149 Mb). (You might find it useful for various homework problems.)

Site built using pyWeb version 1.10
© 2010 Leonard McMillan, Alex Jackson and UNC Computational Genetics