ECS 166 SCIENTIFIC DATA MANAGEMENT (4) I

Lecture: 3 hours

Laboratory: 1 hour

Prerequisite: Programming skills at the level of course 40; Math 21C

Grading: Letter; projects (60%), midterm (17%), final (23%)

Catalog Description:
Relational databases, SQL, non-standard databases, XML, scientific workflows, interoperability, data analysis tools, metadata.

Expanded Course Description:
Topics to be covered (sequence may vary):

  1. Introduction
    1. Requirements and properties of scientific databases
    2. Issues related to database support for scientific data management
    3. Types of scientific data: structured, unstructured, temporal, spatial, image, text
  2. Introduction to Relational Databases
    1. The relational data model
    2. Structured Query Language (SQL)
    3. Open source and commercial DBMS packages (Postgres and MySQL)
  3. Extensible Markup Language (XML)
    1. Role of XML in Scientific Data Management
    2. XML data model and query languages (XPath, XSLT)
    3. Standards, tools, and systems
  4. Ontologies and Metadata
    1. The role of ontologies and metadata
    2. Metadata standards (e.g., RDF)
    3. Standards, tools, and systems
  5. Scientific Workflows
    1. Principles of scientific workflows
    2. From data preprocessing to data integration to data analysis
    3. The Kepler Scientific Workflow System
    4. Web Services
    5. Examples of scientific workflows

Textbook:
Several papers and tutorials will be made available.

Computer Usage:
Students work on projects in a Linux environment, using standard Linux/UNIX tools as well as major database software packages and associated development tools.

Programming Projects:
There will be several individual and group projects. In individual projects, students use an existing scientific database (such as a protein DB, an image DB, or a spatial DB holding satellite data), query the database, and build simple tools on top of it. In group projects, students install a DBMS package, populate the database with scientific data, and design and implement a complete scientific workflow on top of that database.
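As a rough illustration of the populate-then-query pattern described for the projects, the following minimal sketch uses Python's built-in sqlite3 module as a stand-in for a full DBMS package. The table name `protein`, its columns, and the sample rows are hypothetical and are not taken from any course material or real dataset.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database and populate a hypothetical 'protein' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE protein ("
        "id TEXT PRIMARY KEY, organism TEXT NOT NULL, length INTEGER NOT NULL)"
    )
    # Made-up sample rows, standing in for real scientific data.
    rows = [
        ("P1", "E. coli", 120),
        ("P2", "E. coli", 340),
        ("P3", "H. sapiens", 560),
    ]
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def mean_length_by_organism(conn):
    """A simple 'tool' on top of the database: average sequence length per organism."""
    cur = conn.execute(
        "SELECT organism, AVG(length) FROM protein "
        "GROUP BY organism ORDER BY organism"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    for organism, mean_len in mean_length_by_organism(conn).items():
        print(organism, mean_len)
    conn.close()
```

The same SQL statements would carry over to a server-based DBMS such as Postgres with only the connection code changed, which is one reason the sketch keeps the query logic in its own function.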

Engineering Design Statement:
The projects involve design, implementation, and verification of scientific database applications using a variety of public domain and commercial database systems, including Postgres, Oracle, GRASS, and Kepler. To the extent possible, the systems and tools used for these projects resemble those found in industry, including the standard database query language SQL and technologies such as XML, RDF, and ontology description languages. Projects are graded on design, performance, and correctness, including documentation. Examination questions are based on the scientific (meta)data models and database design techniques discussed in the lectures and used in the projects.

ABET Category Content:
Engineering Science: 2 units
Engineering Design: 2 units

Goals:
Students will:

Student Outcomes:

Instructors: B. Ludaescher

Prepared By: B. Ludaescher, M. Gertz (April 2005)

Overlap Statement:
This course offers only a very basic introduction to relational databases, a topic that is covered in detail in ECS 165A. A much shorter introduction to XML is taught in ECS 165B.
