Syllabus for ICOM 6005 – Database Management Systems Design

Fall 2001

 

Professor:

            Dr. Manuel Rodríguez Martínez

            Office: T-212

            Phone: (787)-832-4040, x-3023

            E-mail: manuelr@acm.org

            Office hours: TBA, or by appointment.

 

Course Description:

Study of the techniques for building traditional, relational Database Management Systems (DBMS). This course focuses on design, implementation, performance and reliability considerations and highlights the interdependencies among the choices facing the system engineer. Topics include reviews of the ER model, the Relational Model, Relational Algebra, and Structured Query Language (SQL). Major emphasis is placed on Database Engine Architecture, Disk Storage Organization, Buffer Management, B+-tree indexing, Hash-based indexing, Traditional Join Algorithms, Two-Phase Locking and Concurrency, Write-Ahead Logging, Query Optimization, Database Benchmarking, Object-Oriented Databases, Data Warehousing and Data Mining. A semester-long project involves constructing modules of a small relational database system that incorporates many of the techniques studied in class.

 

Prerequisite:               ICOM 4017 or its equivalent. Proficiency with C and UNIX.

 

Time and Place:         Tuesdays and Thursdays, 4:30 PM – 5:50 PM, S-227

 

Credits:                       3 credits

 

Web Page:                  http://www.ece.uprm.edu/~manuel/class/fall01/icom6005/index.html

Texts:

            Required:

                        Database Management Systems, 2nd edition

                        Raghu Ramakrishnan and Johannes Gehrke

                        McGraw-Hill Higher Education, 2000

                        ISBN: 0072322063

            Recommended:

                        The C Programming Language, 2nd edition

                        Brian W. Kernighan and Dennis M. Ritchie

                        Prentice Hall, 1988

                        ISBN: 0-13-110362-8

 

                        Essential C++

                        Stanley B. Lippman

                        Addison-Wesley, 2000

                        ISBN: 0-201-48518-4

 

Grading:

Your grade will be based exclusively on the scores that you obtain in the class projects, exams and class participation. The curve used to assign a grade to your score will be as follows:

            Score (%)        Grade

            100-90           A

            89.9-80          B

            79.9-70          C

            69.9-65          D

            64.9-0           F
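
As an illustration only, the curve above can be written as a small C function. The function name letter_grade is hypothetical, and the sketch assumes that the lower bound of each band is inclusive (so a score of exactly 65 earns a D, and a score such as 89.95 earns a B). The rounding actually applied when grades are assigned may differ.

```c
/* Illustrative sketch, not an official grading tool.
 * Maps a total score in [0, 100] to a letter grade.
 * Assumption: each band's lower bound is inclusive. */
char letter_grade(double score)
{
    if (score >= 90.0) return 'A';
    if (score >= 80.0) return 'B';
    if (score >= 70.0) return 'C';
    if (score >= 65.0) return 'D';
    return 'F';
}
```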

 

Your total score will be calculated from your individual scores in the projects, exams and class participation. The weights assigned to each of these categories are as follows:

 

            Programming Projects (5)                                 50%

            Class Participation                                      10%

            Midterm Exams (2)                                        20%

            Take-Home Final Exam (Comprehensive)                     20%
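
The weights above amount to a simple weighted sum. The following C sketch is illustrative only: the struct and function names are hypothetical, and it assumes that each category score has already been reduced to a single percentage between 0 and 100 (how the individual projects or midterms are combined within a category is not specified in this syllabus).

```c
/* Illustrative sketch, not an official grading tool.
 * Each field is a category score expressed as a percentage in [0, 100]. */
struct scores {
    double projects;       /* assumed: combined score for the 5 projects */
    double participation;  /* class participation */
    double midterms;       /* assumed: combined score for the 2 midterms */
    double final_exam;     /* take-home final exam */
};

/* Returns the total score using the 50/10/20/20 weights listed above. */
double total_score(const struct scores *s)
{
    return (50.0 * s->projects
          + 10.0 * s->participation
          + 20.0 * s->midterms
          + 20.0 * s->final_exam) / 100.0;
}
```

For example, category scores of 80 (projects), 100 (participation), 70 (midterms) and 90 (final exam) give a total of 40 + 10 + 14 + 18 = 82.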

 

There will be no special project, no special homework, no special exam, nor any other kind of “special work” to improve grades. However, each project or exam might have an extra credit problem that you can use to help improve your score in that corresponding category.

 

Exams:

In this course, there will be two midterm exams and a comprehensive take-home final exam. Unless otherwise indicated, all midterm exams will be taken with closed books and closed notes. The midterm exams will be administered during the regular class time. The date for each midterm exam will be announced later on. The final exam will be administered in accordance with the schedule specified by the Registrar of the University of Puerto Rico, Mayagüez Campus.

 

Each question included in each exam (midterm or final) will fall in one of the following categories:

  1. Explanation of a technical concept.
  2. Proof of a mathematical proposition.
  3. Solution to a problem using the concepts discussed in class.
  4. Tracing of either C code segments or algorithms.
  5. Implementation of C structures and code segments.

 

Exam Reposition Policy:

A student who misses a midterm exam will be given a reposition exam only under the following conditions:

  1. The student must inform the professor about the absence before the scheduled exam time. If the student cannot reach the professor, then the student must call one of the secretaries of the Electrical and Computer Engineering Department and inform them. Again, this must be done before the scheduled exam time.
  2. The student must present a valid excuse for missing the exam. Such excuse must be one of the following:
    1. Medical certificate indicating illness.
    2. Legal certificate indicating an appointment to attend a Court of Law.
    3. Certificate from a hospital or a physician indicating the death of a parent, child, husband, wife or sibling.
  3. The professor will specify the time and date for the reposition exam.
  4. The reposition exam will be more difficult than the exam administered at the regular class time. (NOTE: This is done to prevent abuse of this policy.)
  5. There will be no reposition of a reposition exam.
  6. There will be no reposition of the final exam.

Incomplete Grade Policy:

A student will receive an incomplete grade if and only if the student cannot submit the final exam and has a valid excuse. Such excuse must be one of the following:

  1. Medical certificate indicating illness.
  2. Legal certificate indicating an appointment to attend a Court of Law.
  3. Certificate from a hospital or a physician indicating the death of a parent, child, husband, wife or sibling.

 

Programming Projects:

In this course, you are expected to complete five programming projects that are designed with the following objectives:

1)      Test your knowledge of the data structures, algorithms and techniques for implementing DBMS that are presented in class.

2)      Test your skill at engineering a programming solution to a particular problem related to a DBMS.

3)      Provide experience in the design and implementation of complex software modules using advanced database system techniques.

You will be given two weeks to complete each programming project. You will work in “self-administered” teams of two students. Teams of three students will only be allowed if either: 1) a student cannot find a match to form a 2-student team, or 2) a student is left in a 1-student team as a result of his/her teammate dropping the class. A student might choose to remain in a 1-student team. However, no special consideration will be given to a 1-student team at the moment of grading. You must implement your project using the C programming language, and you must work within your team. You may discuss general aspects of the project and/or programming environment with your peers. However, you cannot share your code with any student from a different team, nor use code written by someone who is not a member of your team. Failure to comply with this requirement will be considered an act of academic dishonesty and you will receive a grade of F in the class (read the section below titled Academic Integrity).

 

You must submit your project electronically following a procedure that will be discussed in class. For each project, you will be given a tar file containing a directory, called the project directory, with the following items:

1)      A document explaining the tasks for the programming project.

2)      A document indicating a minimal set of operations that your program must execute to be considered a running program. If your program does not compile, or does not perform this minimal set of operations, it will receive a score of 0.

3)      A make file with the commands needed to compile the modules in the project.

4)      A set of .h files containing the declarations of the structures and methods to be implemented in the project.

5)      A set of .c files containing empty implementations of the methods, algorithms and other tasks related to the particular programming project. It is your job to implement the C code that executes the tasks these methods are designed to perform.

6)      A set of test input files and their corresponding test output files. You should use these to check whether your program is working correctly, by comparing the output your program produces for each test input file against the corresponding test output file. NOTE: This set of input files will not be the only one used to grade your project. Hence, a program might pass all the tests in these test files, and yet fail some of the extra tests used for grading. However, if your program executes correctly on the test input files you will receive at least 70% of the total score for a given project.

 

Your project directory should contain all the files associated with your project. Once you have completed your project, you will create a new tar file that must contain everything that you have created in your project directory. You will submit this tar file for us to grade your project. Again, you will receive further instructions on how to submit your project electronically.

You are expected to work on the UNIX server of the Applied Database And Software Engineering Lab (ADASEL) provided by the University of Puerto Rico, Mayagüez Campus. You are free to use your own UNIX computer if you prefer. However, you must ensure that the programs that you submit for grading compile and execute correctly on the UNIX machine available in the ADASEL, since the projects will be graded there. Failure to comply with this requirement will result in a score of 0 for the project being evaluated. NOTE: This policy will be strictly enforced.

 

Late Project Policy:

Each project will have a due date composed of an hour, month and day (e.g., 4:00 PM-September 12). A project will be considered late if it is submitted for grading one minute or more after its due date. For example, if the due date for a project is 3:00 PM-October 31, then a project submitted at 3:01 PM-October 31 is considered one day late. Any late project will receive the following penalty:

            1 day past due date                  -15%

            2 days past due date                 -30%

 

No project will be accepted for grading if submitted 3 or more days after its due date, and any such project will receive a score of 0. Any project that is not submitted for grading will automatically receive a score of 0.
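
The late policy above can be sketched in C as follows. This is an illustration only: the function name late_penalty_percent and the return value -1 are hypothetical, and the sketch assumes that every started 24-hour period after the due date counts as a full day late, following the 3:01 PM example. It does not decide whether the penalty is subtracted as percentage points or as a fraction of the score earned, since the syllabus does not say.

```c
/* Illustrative sketch, not the official grading procedure.
 * Returns the penalty, in percent, for a project submitted
 * minutes_late minutes after its due date, or -1 if the project
 * is no longer accepted and receives a score of 0.
 * Assumption: any started 24-hour period counts as a full day late. */
int late_penalty_percent(long minutes_late)
{
    const long minutes_per_day = 24L * 60L;
    long days_late;

    if (minutes_late <= 0)
        return 0;                 /* submitted on time */

    /* Round up to whole days: 1 minute late is already 1 day late. */
    days_late = (minutes_late + minutes_per_day - 1) / minutes_per_day;

    if (days_late == 1)
        return 15;
    if (days_late == 2)
        return 30;
    return -1;                    /* 3 or more days late: not accepted */
}
```

Under these assumptions, a project submitted 1 minute late loses 15%, one submitted 24 hours and 1 minute late loses 30%, and one submitted 48 hours and 1 minute late is not accepted.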

 

NOTE: I will not debug your code via e-mail. I shall only look at your program source code listings, or log in to see your code files, during the allotted office hours.

 

Academic Integrity:

Each student is expected to work individually on all exams. Students can collaborate on their projects only with the other members of their team. You may not use code from another student who is not a member of your team, or use code that you find on the Internet or any similar resources. You may not share your code with another student who is not a member of your team. Failure to comply with these requirements will result in a grade of F in the course for the student(s) breaking these rules. Unauthorized group efforts, particularly during exams, will be considered academic dishonesty and the students involved will receive an F in the course. You should read Article 10 of the “Reglamento General de Estudiantes de la Universidad de Puerto Rico” to learn more about the possible sanctions that you might face if caught in an act of academic dishonesty.

 

List of Topics:

The following is a list of the course topics in the order in which they will be presented. This list is subject to change and will vary depending on the pace of the lectures.

 

TOPICS:

1.      Discussion of the Class Syllabus

2.      Review Database Fundamentals

a.       Database System Architecture

b.      The Relational Model

c.       Relational Algebra

d.      Structured Query Language

e.       ER-Model

3.      Implementation of selections, projections and aggregates

4.      Classic Database Prototypes

a.       System-R

b.      INGRES

5.      Database Architectural Foundations

a.       Operating System Support

6.      Storage Management

a.       Memory Manager

b.      Record Formats

c.       Page Formats

d.      File Formats

e.       RAID

 

7.      File Organization

a.       Heap Files

b.      Sorted Files

c.       Hashed Files

8.      Indexing

a.       Clustered Indexes

b.      Unclustered Indexes

9.      Indexed Sequential Access Method (ISAM)

10.  B+-Trees

a.       Search

b.      Insert

c.       Delete

d.      Bulk-Loading

11.  Hash-Based Indexing

12.  Implementation of Join Methods

a.       Nested Loops Join

b.      Merge-Sort Join

c.       Hash Join

d.      Hybrid-Hash Join

13.  Query Optimization

a.       Query Evaluation Plans

b.      Pipelined vs. Materialized Execution

c.       Cost of a Query Plan

d.      Optimization Heuristics

e.       Query Plan Search by Dynamic Programming

14.  Concurrency Control

a.       Consistency and Serialization

b.      Lock-Based Concurrency Control

c.       Lock Management

15.  Crash Recovery

a.       ARIES

        i.      Write-Ahead Logging

b.      Recovery from a System Crash

        i.      Analysis Phase

        ii.     Redo Phase

        iii.    Undo Phase

16.  Database Benchmarking

17.  Object-Oriented Databases

18.  Data-Warehousing and Materialized Views

19. Data-Mining

 

 

 

© 2001 University of Puerto Rico-Mayagüez. All rights reserved.


Last update was on August 28th, 2001
manuelr@acm.org