ECS 289F: Graduate Course

ECS 289F TOPICS IN SCIENTIFIC DATA MANAGEMENT (4) II

Lecture: 3 hours

Discussion: 1 hour

Prerequisite: Course 165A recommended

Grading: Letter; project (50%), presentation (30%), homework (20%)

Catalog Description:
Selected topics in scientific data management, data integration, ontologies, scientific workflows.

Goals:
The goal of the course is to give a broad overview of data modeling and integration, knowledge representation, and scientific workflow challenges in scientific data management. Specific topics will be investigated in more detail, based on project and/or reading assignments.

Expanded Course Description:

  1. Introduction
    A. Overview of scientific data management
  2. Data Integration
    A. XML model and query/transformation languages
    B. Database mediation, query rewriting
    C. Knowledge-based extensions
  3. Scientific Workflows
    A. Dataflow process networks
    B. Web services and complex workflows
    C. The Kepler scientific workflow system
  4. Student Projects and Presentations


Textbook:
A selection of technical papers addressing specific topics will be used. No textbook is required.

Projects:
There are two kinds of projects: Implementation Projects (IPs) and Research Projects (RPs).
For IPs, students will work with Java-based open source systems such as the Kepler workflow system (www.kepler-project.org) to design and implement example workflows, e.g., a bioinformatics workflow that connects several "bio web services". Thus, in IPs students work with existing software systems, but they will typically also implement project-specific extensions to that software.

For RPs, students will read one to three research papers from a list of offered research topics (e.g., scientific data integration, ontologies and knowledge representation in scientific data management, scientific workflows). Students will then apply the results of the research papers to a specific problem (e.g., applying a certain query rewriting algorithm to a given integration scenario and set of queries). In general, the deliverable of an RP is a technical report that summarizes and compares the results of the studied papers and describes their application to the given problem. Depending on the RP, the presented algorithms might have to be implemented and applied to the given problem instance.

IPs and RPs are typically conducted individually or in groups of two.

Computer Usage:
For the Implementation Projects (IPs), students will primarily use and extend the Java-based Kepler workflow system, which is available for Linux, Windows, and Mac OS.

Computer usage is not required for homework assignments.

Instructor: B. Ludaescher

Prepared by: B. Ludaescher (December 2004)

Overlap Statement:
There is no significant overlap with any other course.
