As a shot note - in the following discussion ,we consider a mapping of class structure to a traditional relation database (RDBMS). It is possible to use object-oriented extensions of some RDBMS as Oracle Database version 8 and above, but it is beyond the scope of our discussion because all such extensions are implementation-specific and in many cases have strict limitations.
3.1 Mapping strategies
Classes can be mapped to tables and views. Sometimes it is necessary to map a single class to two or more tables, joined or un-joined together. DbUML doesn't support such mapping directly; you have to create a view where the table connection has to be done behind the scene.
3.1.1 Class attributes
Attributes can be mapped to table or view columns. Usually, attributes are declared of simple types like numbers and strings. If classes have to have complex attributes like object of some class types, associations have to be used. On a database level, attributes are represented as table columns. The mapping is a simple association of the column name with a class attribute. Sometimes columns contain values different from what is expected on the object level. For example, if a class has an attribute of the Boolean type, on a database level the value can be stored as a single character with possible values "Y" and "N". In such case we have to provide "value mapping" to associate "N" with Boolean "False" and "Y" with Boolean "True". But in most cases, a value mapping is not required - integer attributes are represented "as is" in columns of numerical types and string attributes are also represented in columns of "as is" character types. Some attributes have a very significant role in object-to-database mapping. These attributes uniquely identify objects and on a database level are mapped to table primary keys. UML has no special way to identify such attributes, so the most convenient way to do so is to use stereotypes - UML property can be associated with any UML element. We recommend the "PK" stereotype to identify such attributes.
3.1.2 Associations and roles
While mapping of classes and attribute is a straightforward task, association mapping is more complex task. First of all, association is represented as a line between two classes. Association ends are called "roles". UML roles describe a role of classes involved in the association. Class role is depicted on the opposite end of the association line, not on the adjacent one. A role has two important properties - name and cardinality (multiplicity).
3.1.2.1 Object references
Typically, UML classes represent regular classes in traditional OOP sense. Hence, the association line between two such classes is usually treated either as an object reference or as a collection, depending on the cardinality of the class role. If the cardinality is "0..1" or "1" it is a representation of the object reference. On a database level, such roles are usually implemented as foreign keys - one or more columns in source table to keep the value(s) of the primary key in the target table. To provide mapping for object references, the source class has to have special attributes - holders for foreign key values. There are two ways - the first one is just to manually declare additional attributes for each association the class is involved in. But there are two problems - the designer must be very careful to keep such attributes in synch with associations every time he or she adds or deletes an association. Another way is to allow the mapping tool to create such attributes automatically, map them and use such a mapping in a source code generation process. Usually, there is no value mapping for object reference mapping. The example of an exception would be a composite foreign key with one attribute of the Boolean type.
3.1.2.1.1 Aggregations
There is one special type of association - aggregation. Such an association is depicted as a line with a solid diamond on the source class end. Generally it doesn't means a reference to an object, but an enclosing of an object. Usually such constructions are used to avoid annoying redundancy, if the same set of attributes has to be used in different classes. It minimizes the UML class diagram and in some cases allows to optimize class implementations. On a database level, aggregations can be represented in two ways - as a simple inclusion of columns for the target class into source class table, or as a foreign key to a different table with a "one to zero-or-one" association between source and target tables. The latter approach can be easily transformed to the former one - just use a database view which joins the source and target tables.
3.1.2.2 Object collections
Object collections are represented as associations with plural roles - roles where cardinality "*", "1..*" or "0..*'. On a database level, collection is represented as a set of records in a target table. As a result, there is no "name mapping" or "value mapping" associated with collections. Instead, collection mapping assumes the usage of SQL statements to retrieve and modify set of records. More specifically:
- Load query. SQL select statement returning the collection data set;
- Add query. SQL insert or update statement adding a record to the collection data set;
- Delete query. SQL update or delete statement removing a record from the collection data set.
A particular implementation of collections on a database level depends on the business logic behind the collection and database design. There are two general collection types:
- Aggregated collections represent object associations where the target object cannot exist outside of the collection. In other words, it is owned by the collection in a business sense. For example, consider class Continent which has a collection of Countries. Country must to belong to some Continent, i.e. it must to be listed in some Continent.Countries collection. The only possibility for Country to be removed from the collection would be the elimination of Country as a political entity. In other words, Country is aggregated within Continent. On the other hand, if a new Country is created it must be included in some Continent. Add and Delete queries are optional for such a type of collection because deletion from a collection can be done automatically when the entity is deleted from the system and adding to a collection can be done as simple assignment of target object property (like France.Continent = Europe). Aggregated collections are always represented by 1-to-many associations
- Associated collection covers cases where target objects can exist even outside of the collection. Such objects are owned by the system. For example, class Department has a collection of Employees. If a Person is deleted from Department.Employees collection it is not necessarily deleted from the system - it can be assigned to another Department or stay assigned for some period of time. It is also possible for an object to exist in several different associated collections. Association collections can be of two types:
-- One-to-many. Usually, such collections come together with object references from a target object back to source one. On a database level, the most natural way to implement such a collection is to use a foreign key in the target table. For example, Order class has a collection of OrderItems, table ORDER has primary key ID and ORDER_ITEM table has a foreight key ORDER_ID. The load query would look like
select * from ORDER_ITEM where ORDER_ID=:orderId
To delete the item from the collection, the foreign key has to be set to null using an update statement.
update ORDER_ITEM set ORDER_ID=null where ITEM_ID=:itemId
Assignment is done using a similar update statement setting the foreign key to a main table primary key value.
update ORDER_ITEM set ORDER_ID=:orderId where ITEM_ID=:itemId
-- Many-to-many. Such types of collections require usage of an intermediate, cross-referencing table (sometimes just called the "X-table"). Such a table contains two foreign keys - for source and target tables. To retrieve a collection of Items in ShoppingCart we should use query like
select ITEM.* from ITEM, CART_X_ITEM X where X.CART_ID=:cartId and X.ITEM_ID=ITEM.ID
Adding to and deleting from a collection are achieved using insert and delete statement correspondingly -
insert into CART_X_ITEM (CART_ID, ITEM_ID) values (:cartId, :itemId)
delete from CART_X_ITEM where CART_ID=:cartId and ITEM_ID=:itemId
3.1.2.3 Enumerators
In programming languages like C++ and C#, enumeration types include any type with a limited number of possible values, even if this number is very big, for example, integer type. In this document we use a different (close to Pascal) definition of enumerators - enumerator type is one which defines a set of values by simply listing the values. The classic example of such a type is a RGB Color enumerator - set of possible values Red, Green and Blue. While it is possible to imagine the declaration of the enumerator using a listing of, let's say, one thousand values, in practice enumerators usually have "human countable" number of available values. On the database side, the most logical implementation would be a single column containing a value of char or number type. In addition, value decoding must be provided to achieve the conversion of table value back and forth to the enumerator value. Hence, enumerators need both a name and value mappings. In the UML diagram, attributes of the enumerator type can be listed among another attributes, but in such a case it would be impossible to get information about all possible values and hence impossible to provide a value mapping. So, the better way is to describe a class for enumeration type and list all possible values as an attributes (attribute types can be ignored). To distinguish such classes in the UML model, it is recommended to mark them with a dedicated stereotype, like "Enumeration". Now the attribute of the numeration type can be represented as an association between your and "enumeration" classes. Such an approach allows the mapping tool to get an access to all possible values and makes value mapping possible.
3.1.2.4 Boolean type
Boolean type, generally speaking, is an extreme case of enumerator types. It only has two possible values - "True" and "False". A UML developer can, of course, create the class "Boolean", mark it with the "Enumeration" stereotype and draw association lines every time he or she needs a Boolean attribute. But Boolean type is so commonly used, it would be a good idea to treat it as simple type - i.e. list Boolean attributes among another attributes. The mapping tool can explicitly support value mapping for Boolean types.
3.1.2.5 Enumerator sets
Unlike enumerator types allowing to store only one value at time, a set is a collection of values. For example, ColorSet type would allow to store pure Red, Green and Blue values as well as a mixture of them - like "Red and Blue" or "Red and Green and Blue". In the UML diagram, sets can be represented as associations with enumerator classes, similar to an enumerator attribute representation with only one difference - they have to have plural roles (cardinality 0..*). IN database table set can be represented in two ways:
- a set of columns, one per each possible value. Each column contains only two possible values identifying the presence or absence of a value in the set (flag). The resulting value is identified by a set of columns with flagged values.
- a single column keeping a bit map value (integer type) or char map (string type). The resulting value is identified by a set of bits (characters) presented in the column value.
Sets require both name and value mappings.
3.2 Strategy-support templates
As we can see, the mapping strategy for different attribute types can vary depends on the UML modeling and the database design style. If possible, it is always recommended to develop some standard patterns in database design and follow them every time new table is created. Fixed patterns eliminate variety of choices when mapping is done and hence simplify the whole process.
By default, DbUML configured to support different styles of mapping.
To make it possible to adjust the mapping process to specific requirements, a mapping list for class attributes and associations is not created by DbUML. Instead, it is generated by a special template called "strategy template". The template can be modified by the user to accommodate the mapping process to company standards. The template must define two functions with names "captions" and "mapping". "Captions" function defines a list of columns in the mapping list and "Mapping" function generates a list or items shown in the list.
3.2.1 "Captions" function format
The function has to return a comma delimited list of columns to be shown on a mapping list. Each column definition defines the column name, default size and optional column ID used by the mapping process. The following column ID must be presented -
- MAP column contains mapped name (table column name)
- VALUE column contains mapped value
- MAPNAME column contains a value used to identify mapped name and value in repository
- INFO column contains additional information associated with the mapping. DbUML uses this column to show the target table column data type
- USER column contains the name of the user who made the mapping
- ATTRS column contains service information about the mapping list row. In particular, it defines if a row can be name mapped and value mapped. Usually this column is non-visible.
If some column has to be defined, but doesn't need to be shown on the list, it has to have "-" as a column name.
3.2.2 "Mapping" function format
[TBD]
3.3 Mapping inheritance
If on the UML model class is inherited from another class, it can be mapped in two different ways:
- inherit mapping of the parent class
- always use its own mapping
The second approach is more generic. In most cases, inherited classes are mapped to different tables with different structures. So DbUML always keeps the mapping information for each class even if it repeats the mapping of the parent one. But sometimes it is desirable to avoid tedious work of repetitive mapping of the same attributes to the same columns if the mapping has to be the same. To simplify the mapping in such cases, DbUML has a helpful option "Enable inherited mapping" which automatically maps all subclasses when the parent class is mapped. Each subclass keeps its own copy of mapping information, though. An additional option "Enable mapping override" controls if DbUML can override the subclass mapping if its attribute is already been mapped. The default value for both options is "False".
3.4 Default vs In-place value mapping of enumerators
As it was discussed before, enumerator mapping has to be done for both name and values. If the enumerator type values used in different places can be decoded using the same value mapping, the value mapping can be done in one place - in the enumeration class itself. However, sometimes the same enumerator used in different classes is mapped to table columns with different values. For example, Color enumerator can be represented as a string column with values "Red", "Green", "Blue" or as an integer column with values "1", "2", "3". It means that in each particular case, the default mapping provided in the enumerator class must be replaced by the appropriate one. The strategy template is responsible for allowing or not to provide default or in-place mapping of enumerator values. The default strategy template coming with DbUML allows both approaches.
3.5 Mapping-specific properties of the DbUML project
To make mapping possible, DbUML needs to have a special repository where mapping data are stored and an access to the database structure used to map to UML class model. DbUML can work with databases using the ODBC driver, and with the Oracle database directly using OCI.
There are two possibilities for mapping repository
- Plain text file. The simplest and convenient option is for only one user to use DbUML. In the case of a multi-user environment, the file can be located on a network drive. If only one user makes changes in themapping information, it is still Ok. But if several users make changes the next option has to be used.
- Database repository. The operation synchronization issue is delegated to the database. To prepare the database for usage, an initialization script has to be executed. DbUML comes with the Oracle version of initialization script which can be easily modified to work with different RDBMS.
3.6 Mapping process
All mapping operations are executed on a mapping list. DbUML shows the mapping list every time the UML class is selected from the class tree view. Each mappable UML element can have one or more lines in the mapping view. Each line has a status icon which defines what kind of mapping, if any, is allowed for this line. Also, each line has an optional entity icon, defined by the strategy template.
- Attributes of simple types like strings, floats and integers are usually represented by a single line.
- Attributes of boolean type are shown as single lines for name mapping and two optional lines for name mapping of "True" and "False"
- Enumerators usually are represented by several lines - one for attribute name mapping and several additional lines for each possible value.
- Sets are represented similarly with only one exception - additional lines are used not for value but for name mapping.
- Object references are shown by a single non-mappable line for the association itself and a number of additional lines for each foreign key value.
- Object collections are shown by sa ingle non-mappable line for the association itself and three additional lines for Load, Add and Del queries.
Name mapping is executed by a simple drag-and-drop operation. A column name from the table column list can be dragged over the mapping list.
Value mapping is executed by double-clicking on the mapping line which enables value mapping. Appropriate dialog will appear to complete the value mapping procedure.