Lecture: 3 hours
Laboratory: 1 hour
Prerequisite: Programming skills at the level of course 40; MAT 21C
Grading: Letter; projects (60%), midterm/final (40%)
Catalog Description:
Relational databases, SQL, non-standard databases, XML, scientific
workflows, interoperability, data analysis tools, metadata.
Goals:
This is an interdisciplinary course in data management. It aims to
facilitate research and application development using open-source
DBMS packages and large-scale scientific data sets.
Expanded Course Description:
Topics to be covered and approximate time spent on each (sequence
may vary):
Textbook:
Several papers and tutorials will be made available.
Computer Usage:
Students work on projects in a Linux environment, using standard Linux/UNIX
tools as well as major database software packages and associated development
tools.
Programming Projects:
There will be several individual and group projects. In individual
projects, students use an existing scientific database (such as a
protein database, an image database, or a spatial database of satellite
data), query it, and build simple tools on top of it. In group projects,
students install a DBMS package, populate the database with scientific
data, and design and implement a complete scientific workflow on top of
that database.
Instructors: B. Ludaescher, M. Gertz
Prepared By: B. Ludaescher, M. Gertz (January 2005)
Overlap Statement:
This course offers only a very basic introduction to relational databases,
a topic that is covered in detail in ECS 165A. A much shorter introduction
to XML is taught in ECS 165B.
2/05