| Table of Contents |
The Problem - Accessing and managing data from several existing independent databases
The Architecture - Architecture of Federated Databases
The Methodology - Reverse engineering, schema integration and mapping construction
The Case-tool - Support for the methodology and for generating the architecture components
| The Problem |
Current technologies such as de facto standards (e.g. ODBC and JDBC) or proposals from formal bodies (e.g. CORBA) now ensure a high level of platform independence at a reasonable cost, so that this level can be ignored from now on. DMS level independence is effective for some families of DBMS (e.g. through ODBC or JDBC for RDB), but the general problem is still unsolved when several DMS models are to cooperate. Location independence is addressed either by specific DBMS (e.g. distributed RDBMS) or through distributed object managers such as CORBA middleware products. Despite much effort spent by the scientific community, semantic independence is still an open and largely unsolved problem.
The InterDB project proposes a general architecture, a methodology and a CASE environment intended to address the problem of providing users and programmers with an abstract interface to independent, heterogeneous and distributed databases.
| The Architecture |
Location and semantic independence are ensured by a global server. This module processes the global queries, that is, queries addressing the data independently of their distribution across the different sites. The module is based on a repository that describes the conceptual schema of each local server, its location, and the relationships between their data structures. Information such as data replication, semantic conflicts and data heterogeneity allows the server to interpret and distribute the global queries, and to collect and integrate the result sets sent back by the local servers.
Finally, platform independence is ensured by both the local servers and ad hoc middleware such as commercial ORBs.
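The role of the global server can be illustrated with a minimal sketch. It is not the InterDB implementation: the class names (LocalServer, Mapping, GlobalServer), the sample data and the merge-by-key strategy are all assumptions made for illustration. The sketch only shows the general idea of a repository that records where each global entity is stored and how local attribute names map to global ones, so that a global query can be distributed to the local servers and the result sets integrated.

```python
from dataclasses import dataclass


@dataclass
class LocalServer:
    """Hypothetical stand-in for a local server wrapping one database."""
    name: str
    location: str
    rows: dict  # local entity name -> list of records (dicts)

    def query(self, entity):
        # Return every record of the requested local entity.
        return list(self.rows.get(entity, []))


@dataclass
class Mapping:
    """Hypothetical repository entry: where a global entity is stored
    and how local attribute names translate to global ones."""
    server: str
    local_entity: str
    rename: dict  # local attribute -> global attribute


class GlobalServer:
    """Illustrative global server: distributes a global query and
    integrates the result sets sent back by the local servers."""

    def __init__(self, servers, repository, keys):
        self.servers = {s.name: s for s in servers}
        self.repository = repository  # global entity -> list of Mapping
        self.keys = keys              # global entity -> identifying attribute

    def query(self, entity, predicate=lambda record: True):
        merged = {}
        key = self.keys[entity]
        for mapping in self.repository[entity]:
            local_rows = self.servers[mapping.server].query(mapping.local_entity)
            for row in local_rows:
                # Resolve naming heterogeneity through the recorded mapping.
                record = {mapping.rename.get(k, k): v for k, v in row.items()}
                # Records sharing a key (replicated or complementary data)
                # are merged into a single global record.
                merged.setdefault(record[key], {}).update(record)
        return [r for r in merged.values() if predicate(r)]


# Two independent databases describing customers under different names.
sales = LocalServer("sales", "site-a", {
    "CUSTOMER": [{"CID": 1, "NAME": "Ann"}, {"CID": 2, "NAME": "Bob"}],
})
billing = LocalServer("billing", "site-b", {
    "CLIENT": [{"NUM": 2, "CITY": "Namur"}, {"NUM": 3, "CITY": "Liege"}],
})
repository = {
    "Customer": [
        Mapping("sales", "CUSTOMER", {"CID": "id", "NAME": "name"}),
        Mapping("billing", "CLIENT", {"NUM": "id", "CITY": "city"}),
    ],
}
global_server = GlobalServer([sales, billing], repository, {"Customer": "id"})
result = global_server.query("Customer")
```

In this sketch the caller names only the global entity "Customer"; the site on which each record is stored and the local attribute names never appear in the query. A real global server would also push selection predicates down to the local servers and resolve value-level conflicts, which this example deliberately leaves out.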
| The Methodology |
Recovering the logical and conceptual schemas of an existing database
is the main goal of database reverse engineering, an important software
engineering discipline that can now be considered mature.
Solving the syntactic and semantic conflicts of independent schemas
has long been studied in the database realm. However, coping with conceptual
schemas from populated databases brings new problems. A complete methodology,
encompassing schema recovery and database integration, is provided to practitioners.
| The CASE Support |
All these processes are supported by the DB-Main
CASE tool. This graphical, repository-based, software engineering environment
includes, among others, a sophisticated reverse engineering toolkit, schema
mapping specification facilities and a generator development environment.
The generation of local servers (logical and conceptual) is automated.
It relies on the results of reverse engineering activities through which
the exact logical and conceptual structures of each database have been
elicited. The global server exploits the same repository, in which the
inter-database mappings have been made explicit.