2004-02-19
Making Of SMACS

SMACS is a compiler system that creates schema mapping applications. In its initial form it reads standard SQL syntax and transforms it into SQL/J or JDBC source code. Extensions are added on that basis.

The frontend needs to parse SQL into an internal AST (abstract syntax tree). We need walker classes to look up information in the tree, and synthesis passes to build relation information about the concrete SQL represented in the tree. Later extensions will mostly start with syntax extensions (like the SchemaSQL operator) and the corresponding rule inspections (where the SchemaSQL ranges are defined and how they are used).

The materialization framework rebuilds the AST. For integration we need to extract data and possibly load it into temporary tables. We need to distinguish original (remote) sources, materialized (temporary) sources, and reference (generated) sources from the target side; the last kind matters especially for incremental update processing. We need to record which mapping process requires which sources. In its initial form only original sources exist, which could be expressed as views. Extensions are the ordering of materializations and optimizations for common evaluations.

The transformation framework is related to extraction and integration, but it is less about data than about algorithms: it checks the conditions for data fusion (and possibly data cleansing). For that we need to extract this information from the rule base and create lists of UDFs (user-defined functions) and FFI (foreign function interface) references that will be triggered with records from the materializations. In its initial form only equality fusions are supported, which could be expressed in plain SQL. In a second stage functions are imported as FFI references. In a later step functions are generated and possibly pushed as UDFs to the target systems. A mixture of FFI and UDF is likely.
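
The initial form described above, original sources plus an equality fusion expressed in plain SQL, can be illustrated with a minimal sketch. It is not the SMACS implementation: the class `EqualityFusionViewSketch`, its `generateView` method, and all table and column names are hypothetical and chosen only for this example. It assumes exactly two sources joined on a list of column equalities.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: emits a plain SQL view for an equality fusion of two sources. */
public class EqualityFusionViewSketch {

    /** One equality condition between a column of the left and of the right source. */
    static final class Equality {
        final String leftColumn;
        final String rightColumn;

        Equality(String leftColumn, String rightColumn) {
            this.leftColumn = leftColumn;
            this.rightColumn = rightColumn;
        }
    }

    /** Builds the CREATE VIEW text; the sources get the fixed aliases "l" and "r". */
    static String generateView(String viewName, String leftSource, String rightSource,
                               List<String> selectList, List<Equality> equalities) {
        if (equalities.isEmpty()) {
            throw new IllegalArgumentException("at least one equality condition is required");
        }
        List<String> conditions = new ArrayList<>();
        for (Equality e : equalities) {
            conditions.add("l." + e.leftColumn + " = r." + e.rightColumn);
        }
        return "CREATE VIEW " + viewName
                + " AS SELECT " + String.join(", ", selectList)
                + " FROM " + leftSource + " l JOIN " + rightSource + " r ON "
                + String.join(" AND ", conditions);
    }

    public static void main(String[] args) {
        List<Equality> equalities = new ArrayList<>();
        equalities.add(new Equality("cust_id", "customer_no"));
        List<String> selectList = new ArrayList<>();
        selectList.add("l.cust_id");
        selectList.add("l.name");
        selectList.add("r.city");
        // Prints the generated view definition; nothing is sent to a database.
        System.out.println(generateView("fused_customer", "crm_customer",
                "shop_customer", selectList, equalities));
    }
}
```

The generator only returns text, in line with the compiler approach: the view definition is an artifact to be deployed later, not a query executed at generation time.
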
The targeting framework needs to correctly interleave the various materializations and transformations and bind them to the (probably) multiple target databases or logging systems. In its initial form these are plain record inserts, taken from the return value of the transformations, into the target tables listed in the rule base. Extensions will add conditions on the returned records to select a target, and formatting of the created records. That is especially needed for heterogeneous schemas.

The generation step needs to bind the steps for materialization (extraction), transformation, and targeting (load). In its simplest form this can be expressed as plain SQL view definitions. Simple extensions enrich it with SQL function imports and split it into multiple subexpressions in the course of transforming higher-order constructs (as in SchemaSQL) into first-order constructs representable in an SQL view. The non-"trivial" case, however, needs to build Java loops around a Java object for a database cursor running over a given materialization. For each record the transformations are expressed with imported or generated functions. After target selection the reformatted record is pushed to a database for the actual insertion.

Unlike some other frameworks we are a compiler, so we do not execute the steps immediately; we generate source code for them. The source code needs to contain all local variable instantiations (such as database connections, cursors, and function imports). It needs to show all the transformations that may be executed, as defined by conditions over the records. It needs to show the target selection, record creation and formatting, plus the database insertion call. Finally, we need to create the glue that handles invalid steps, e.g. invalid transformations, which might amount to just logging the record somewhere.
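
The non-trivial case can be sketched as a small generator that writes out Java source for a JDBC cursor loop instead of running it. This is only an illustration under assumptions, not SMACS code: the class `CursorLoopGeneratorSketch`, the emitted class layout, and the transformation function name passed in (for example a hypothetical `Transforms.normalize` that takes a `ResultSet` and returns an `Object[]`) are all invented here. The emitted code uses only standard `java.sql` calls.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: generates (does not execute) a JDBC cursor loop as Java source. */
public class CursorLoopGeneratorSketch {

    /** Renders s as a Java string literal, escaping backslashes and double quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Appends one line of generated source at the given indentation level. */
    private static void line(StringBuilder out, int indent, String text) {
        for (int i = 0; i < indent; i++) {
            out.append("    ");
        }
        out.append(text).append('\n');
    }

    static String generate(String className, String sourceQuery, String transformFunction,
                           String targetTable, List<String> targetColumns) {
        StringBuilder marks = new StringBuilder();
        for (int i = 0; i < targetColumns.size(); i++) {
            marks.append(i == 0 ? "?" : ", ?");
        }
        String insertSql = "INSERT INTO " + targetTable + " ("
                + String.join(", ", targetColumns) + ") VALUES (" + marks + ")";

        StringBuilder out = new StringBuilder();
        line(out, 0, "import java.sql.*;");
        line(out, 0, "");
        line(out, 0, "public class " + className + " {");
        line(out, 1, "public static void run(Connection source, Connection target)"
                + " throws SQLException {");
        // Local variable instantiations: statement, cursor, prepared insert.
        line(out, 2, "try (Statement stmt = source.createStatement();");
        line(out, 2, "     ResultSet cursor = stmt.executeQuery(" + quote(sourceQuery) + ");");
        line(out, 2, "     PreparedStatement insert = target.prepareStatement("
                + quote(insertSql) + ")) {");
        line(out, 3, "while (cursor.next()) {");
        line(out, 4, "try {");
        // Transformation step: an imported or generated function (name is an assumption).
        line(out, 5, "Object[] record = " + transformFunction + "(cursor);");
        // Load step: bind the reformatted record and insert it into the target.
        line(out, 5, "for (int i = 0; i < record.length; i++) {");
        line(out, 6, "insert.setObject(i + 1, record[i]);");
        line(out, 5, "}");
        line(out, 5, "insert.executeUpdate();");
        // Glue for invalid steps: here the failing record is only logged.
        line(out, 4, "} catch (Exception e) {");
        line(out, 5, "System.err.println(\"rejected record: \" + e.getMessage());");
        line(out, 4, "}");
        line(out, 3, "}");
        line(out, 2, "}");
        line(out, 1, "}");
        line(out, 0, "}");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate("LoadCustomers",
                "SELECT id, name FROM fused_customer", "Transforms.normalize",
                "dw_customer", Arrays.asList("id", "name")));
    }
}
```

Target selection by record conditions and per-target formatting are left out; in a fuller generator they would presumably become additional emitted branches between the transformation call and the insert.
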