2004-02-05
schema-mapping - operations

Schema mapping in databases (here we look at RDBMS) can be a complex matter. Many different things may be involved, and many of them get called "schema mapping". We therefore look at some variants on a larger scale. The reason behind it: a schema-mapping compiler framework needs to provide the infrastructure to combine more than one of the possible schema-mapping operation sequences.

-- integration --
Database integration (or the creation of federated databases) has been a hot topic at least since the internet revolution. Network transfer costs have dropped, which allows us to draw pieces of information from remote sites and integrate them for some purpose, such as OLAP operations. In the easiest case, remote instances are replicated to a central place and the multitude of tables is integrated under a global schema, so that they can be accessed with conventional SQL queries.

-- conversion --
Integration might be easy if the local databases have been designed according to the needs of the global schema. But the local databases might then not be optimal. At the very least they will want to distribute fields across different tables, and most often the field entries have a given type and format. To allow these fields to be compared in the global schema we need to convert them into a common format to which comparison operators can be applied. That involves not only scaling but also the mapping of enumerations and key IDs.

-- heterogeneity --
Finding a common format for converting local field values into common global field values is one side. Yet there may also be semantic heterogeneity, in that some enumeration is not given as field values but as a separation in the structure (for instance, one table per category instead of a category column). To get a decent mapping of these we need to instantiate the structure information as a table and allow it to be combined in queries. And there we need an engine that maps the enumerated values onto the different channels leading to different tables and databases.

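The three operations described so far can be sketched together in a few lines. The following is only a minimal illustration using Python's `sqlite3` module. All table and column names (`site_a_orders`, `site_b_open`, `site_b_closed`, `global_orders`) are hypothetical, and the two "sites" live in one in-memory database where a real setup would use replicated remote instances. Site A stores amounts in cents and a one-letter status code (conversion), while site B encodes the status in the table name (structural heterogeneity). The global view scales the amount, maps the enumeration, and instantiates the structure information as a column value.

```python
import sqlite3


def build_global_view(conn):
    """Create two hypothetical local schemas and one global view over them."""
    conn.executescript("""
        -- Site A: amount in cents, status as a one-letter code.
        CREATE TABLE site_a_orders (id INTEGER, amount_cents INTEGER, status TEXT);
        -- Site B: the status is implied by which table a row lives in.
        CREATE TABLE site_b_open   (id INTEGER, amount REAL);
        CREATE TABLE site_b_closed (id INTEGER, amount REAL);

        INSERT INTO site_a_orders VALUES (1, 1250, 'O'), (2, 990, 'C');
        INSERT INTO site_b_open   VALUES (7, 3.5);
        INSERT INTO site_b_closed VALUES (8, 20.0);

        CREATE VIEW global_orders AS
            -- conversion: scale cents and map the enumeration codes
            SELECT 'a' AS site, id, amount_cents / 100.0 AS amount,
                   CASE status WHEN 'O' THEN 'open'
                               WHEN 'C' THEN 'closed' END AS status
            FROM site_a_orders
            UNION ALL
            -- heterogeneity: turn the table name into a field value
            SELECT 'b', id, amount, 'open'   FROM site_b_open
            UNION ALL
            SELECT 'b', id, amount, 'closed' FROM site_b_closed;
    """)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_global_view(conn)
    # A conventional SQL query against the global schema.
    for row in conn.execute(
        "SELECT site, id, amount FROM global_orders "
        "WHERE status = 'open' ORDER BY site, id"
    ):
        print(row)
```

A view is the simplest possible stand-in here. It only covers the read direction, and it does not address the engine that routes values back to the individual tables and databases.
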
-- refraction --
Beyond the atomic validity of the data we find problems of data cleansing, such as the recognition of duplicate records. We also find data that has been combined into fields, creating a one-to-many relation of field values, and sometimes these overlap on the other side, giving many-to-many mappings of field values. Essentially, we need to split up and merge data, and check its validity in the target, possibly rejecting entries.

-- subdivision --
Part of the enumeration refraction may be due to class relations and inheritance relations. These can also span schematic heterogeneity and hierarchies of tables, possibly distributed over different places from which they need to be integrated. Some of the databases at the end might be able to interpret the meaning and aggregate over hierarchies, but in other instances we need to cut queries and aggregate the results in an intermediate place. What we have here are subdivision and aggregation operations in the form of tree mappings.

-- evolution --
And last but not least, we need to look at the automatic generation of schema mappings. There are, however, quite a few areas where automatic generation is impossible or rather inefficient. Here we can build a library of schema mappings, or a library of templates and rules from which good variants can be made up. The actual compiler should choose from the given schema operations, expand templates with concrete parts, and unroll where parts match only via further schema-mapping operations. That allows a compiler to regenerate a schema-mapping application even if some partial schemas have changed slightly.

-- summary --
Looking over the operations involved, we see the parts that would need to be provided for in an application framework supporting the generation of schema-mapping applications. We need at least the loading of SQL snippets (with additional extensions and annotations) and foreign function interfaces (for the generator and the generated applications).
And we had better not forget a set of tree operations, the handling of intermediate tables and cursors, plus support for secondary parts like logging services. Some interactivity would be nice, even if it would not be Potter's Wheel.
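
The idea of a template library that is expanded with concrete parts, as mentioned under evolution, might look roughly like the sketch below. It is an assumption-laden illustration, not a design for the actual compiler: the `TEMPLATES` dictionary, the `expand` and `enum_cases` helpers, and the placeholder names are all hypothetical, and plain `string.Template` substitution stands in for whatever annotated SQL snippet format a real framework would load.

```python
from string import Template

# Hypothetical template library: one SQL snippet per mapping operation.
TEMPLATES = {
    "scale": Template(
        "SELECT $key AS id, $field / $divisor AS $target FROM $table"
    ),
    "enum": Template(
        "SELECT $key AS id, CASE $field $cases END AS $target FROM $table"
    ),
}


def enum_cases(mapping):
    """Render a value mapping as WHEN/THEN clauses.

    The values are pasted in as literals, so they must come from a
    trusted mapping definition, never from user input.
    """
    return " ".join(
        f"WHEN '{src}' THEN '{dst}'" for src, dst in mapping.items()
    )


def expand(operation, **parts):
    """Fill the template for one operation with concrete schema parts."""
    return TEMPLATES[operation].substitute(**parts)


if __name__ == "__main__":
    # First generation of the mapping for a local table.
    print(expand("scale", key="id", field="amount_cents",
                 divisor="100.0", target="amount", table="site_a_orders"))
    # The local schema changed slightly (column renamed): regenerate.
    print(expand("scale", key="id", field="price_cents",
                 divisor="100.0", target="amount", table="site_a_orders"))
    print(expand("enum", key="id", field="status",
                 cases=enum_cases({"O": "open", "C": "closed"}),
                 target="status", table="site_a_orders"))
```

Because the concrete parts are kept separate from the templates, a slight change in a partial schema only changes the arguments passed to `expand`, and the generated SQL can be rebuilt without touching the library.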