Authoring XML Schemas for use with EMF
This article focuses on authoring and using XML Schemas for the purpose of generating an EMF model.
Introduction
Luckily, both Ecore models and XML Schemas have support for annotations. Annotations can be almost anything, but for this discussion, annotations serve two purposes:
- Provide additional information when transforming one metamodel to another
- Provide additional information on how to serialize/deserialize your model objects
EMF supports transformations in either direction (to or from XSDs). Currently, if you start with an Ecore model, round-tripping to XSD and back to Ecore should be lossless in most cases. However, Ecore supports some structures that have no XSD equivalent, such as multiple inheritance, and information is lost in such cases. Also, if you start with an XSD, not all of the information is captured in the Ecore model. Therefore, "round-tripping" XSD->Ecore->XSD is not a recommended practice as of the Europa release. Information might get lost, and almost certainly will get shuffled around.
This article focuses only on authoring (and annotating) XSDs and importing them into Ecore models. However, it can be a useful learning exercise to model certain structures first in Ecore and then export them to see the annotated XSD equivalent.
Reasons to use XSDs with EMF
- The XSD already exists
- You must ship an XSD
- Using an XSD is one way to customize how EMF models are persisted
Important Differences between XSDs and Ecore Models
Multiple Inheritance
Ecore Models | XML Schemas |
EMF's EClass supports multiple inheritance, which allows you to mix in structural features like attributes at multiple places in your generated class hierarchy. However, the Java language does not support multiple class inheritance, so the actual implementation behind the shared Java interface is generated multiple times. One benefit of using multiple inheritance with EMF is that the metamodel is shared. Clients that use the reflective API or receive model events will see the same shared feature even though the implementation has been duplicated. | XML Schemas do not support multiple inheritance. A Complex Type can only extend one other Complex Type. As an approximation, XSDs allow attribute groups to be defined and reused multiple times. However, each time an attribute group is reused, it is treated as a copy of those attributes, rather than a shared reference. |
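To make the attribute group approximation concrete, here is a minimal sketch. The namespace, type names, and attribute names are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:tns="http://www.example.org/shapes"
            targetNamespace="http://www.example.org/shapes">

  <!-- A reusable set of attributes: the closest XSD gets to a mix-in. -->
  <xsd:attributeGroup name="Named">
    <xsd:attribute name="name" type="xsd:string"/>
    <xsd:attribute name="description" type="xsd:string"/>
  </xsd:attributeGroup>

  <!-- Each use of the group copies its attributes into the type. -->
  <xsd:complexType name="Circle">
    <xsd:attribute name="radius" type="xsd:double"/>
    <xsd:attributeGroup ref="tns:Named"/>
  </xsd:complexType>

  <xsd:complexType name="Square">
    <xsd:attribute name="side" type="xsd:double"/>
    <xsd:attributeGroup ref="tns:Named"/>
  </xsd:complexType>
</xsd:schema>
```

Because the group is copied rather than shared, Circle and Square each end up with their own "name" and "description" attributes and have no common supertype that declares them.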
Global Elements and Attributes
Ecore Models | XML Schemas |
Ecore's Attributes and References always occur in the context of a containing EClass. They cannot be declared on their own and later pulled into one or more EClasses. | XSDs allow global elements and attributes to be defined and then referenced multiple times. This type of "referencing" actually results in a copy. |
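A short sketch of global declarations being referenced from two types follows. All names and the namespace are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:tns="http://www.example.org/library"
            targetNamespace="http://www.example.org/library">

  <!-- Global declarations: they exist outside any Complex Type. -->
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:attribute name="lang" type="xsd:language"/>

  <!-- Both types pull in the same global element and attribute by reference. -->
  <xsd:complexType name="Book">
    <xsd:sequence>
      <xsd:element ref="tns:comment" minOccurs="0"/>
    </xsd:sequence>
    <xsd:attribute ref="tns:lang"/>
  </xsd:complexType>

  <xsd:complexType name="Chapter">
    <xsd:sequence>
      <xsd:element ref="tns:comment" minOccurs="0"/>
    </xsd:sequence>
    <xsd:attribute ref="tns:lang"/>
  </xsd:complexType>
</xsd:schema>
```

Since Ecore has no free-standing features, Book and Chapter would each be expected to receive their own "comment" and "lang" features after import.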
The Transformation Process
- An EMF Model is created and associated with one or more XSDs
- When the Wizard finishes, the conversion process creates *.ecore files
- At this point, EMF will not read your XSDs again unless you reload them. All information needed by EMF has been captured in the Ecore models and in your EMF Model (the .genmodel file).
- If you modify your schemas, the transformation can be performed again from the .genmodel Editor. Affected Ecore models are completely replaced.
Which Parts of an XML Schema are Processed?
What is the Output of the Process?
- Ecore Models
  - Schemas become Packages; Simple/Complex Types become EClasses, EDataTypes, and EEnums; Elements and Attributes become EStructuralFeatures
- Annotations in your Ecore Models
  - Ecore annotations are added in several places to capture the way your model should be serialized/deserialized. These annotations are what enable EMF to read and write XML documents that don't conform to EMF's default serialization format.
- A Helper EClass named "DocumentRoot"
  - In certain cases, EMF must create a DocumentRoot class to store additional information.
- Helper EStructuralFeatures, such as feature maps with the suffix "group"
  - To handle XSD features like substitution groups, EMF must know which substitution to use when writing. In these cases, your class's EReferences don't hold the actual content; they are derived from these "group" feature maps, which contain keys in addition to the actual referenced objects.
- Your EMF Model
  - References to the schemas you imported are kept so that you can invoke a reload later. Generation options are configured the first time with different defaults, under the assumption that you'll be reading and writing XML files.
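As an illustration of the kind of schema that leads to helper features, here is a minimal sketch of a substitution group. The namespace and all names are hypothetical, and the exact names EMF generates for the resulting helper features are not asserted here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:tns="http://www.example.org/drawing"
            targetNamespace="http://www.example.org/drawing"
            elementFormDefault="qualified">

  <xsd:complexType name="Shape" abstract="true"/>

  <xsd:complexType name="Circle">
    <xsd:complexContent>
      <xsd:extension base="tns:Shape">
        <xsd:attribute name="radius" type="xsd:double"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Head of the substitution group. -->
  <xsd:element name="shape" type="tns:Shape" abstract="true"/>

  <!-- May appear in a document wherever tns:shape is referenced. -->
  <xsd:element name="circle" type="tns:Circle" substitutionGroup="tns:shape"/>

  <xsd:complexType name="Drawing">
    <xsd:sequence>
      <!-- On save, the writer must know which element name was used for each entry. -->
      <xsd:element ref="tns:shape" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="drawing" type="tns:Drawing"/>
</xsd:schema>
```

The global elements are the sort of information a DocumentRoot class would hold, and the reference to tns:shape inside Drawing is the sort of feature that needs a "group" feature map to remember the element name of each entry.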
Mapping XSD to Ecore
More information on this subject can also be found in the draft document XML Schema to Ecore Mapping [1].
Mapping Schema to EPackage
Each XML Schema is mapped to an EPackage.
(in progress...)
Mapping Types
Mapping SimpleType to EDataType/EEnum
An EDataType will be created if there are no enumerations:
<xs:simpleType | EDataType | |
[Default Mappings] | ||
name="RGB" | -> Name (unless overridden) | |
[Override Default Mapping] | ||
ecore:ignore="true" | Ignores this type | |
ecore:name="overrideName" | Determines the name | |
ecore:instanceClass="package.JavaType" | Determines the fully qualified Java class or interface | |
... > | ||
<xs:restriction base="xs:string"/> | Determines the Instance Class Name in many cases | |
</xs:simpleType> |
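Put together, an annotated Simple Type might look like the following sketch. The target namespace and the Java class org.example.colors.RGB are hypothetical; the ecore prefix is bound to the Ecore namespace URI.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
           targetNamespace="http://www.example.org/colors">

  <!-- No enumerations, so this would map to an EDataType.
       ecore:name overrides the default name "RGB";
       ecore:instanceClass names a (hypothetical) Java class for the values. -->
  <xs:simpleType name="RGB"
                 ecore:name="Color"
                 ecore:instanceClass="org.example.colors.RGB">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>
```

Without the two ecore attributes, the EDataType would simply take its name from the name attribute and its instance class from the restriction base.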
If the restriction includes enumerations an EEnum is created:
<xs:simpleType | EEnum | |
name="Align" | -> Name (unless overridden) | |
[Overriding Default Mappings] | ||
ecore:ignore="(true)" | Ignores this Simple Type | |
ecore:name="$name|override" | Overrides the Name, normally determined by name | |
... > | ||
<xs:restriction> | ||
<xs:enumeration | EEnum Literal (for each xs:enumeration) | |
value="Left" | ||
ecore:value="auto|int" | Overrides the Literal's Value | |
ecore:name="auto|override" | Overrides the Literal's Name, normally determined by value | |
... /> | ||
<xs:enumeration value="Center"/> | ||
<xs:enumeration value="Right"/> | ||
</xs:restriction> | ||
</xs:simpleType> ... |
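A complete version of the enumeration above could be written as in this sketch. The target namespace and the override values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
           targetNamespace="http://www.example.org/layout">

  <!-- The restriction contains enumerations, so this would map to an EEnum. -->
  <xs:simpleType name="Align">
    <xs:restriction base="xs:string">
      <!-- Literal name and integer value overridden explicitly. -->
      <xs:enumeration value="Left" ecore:name="LEFT" ecore:value="10"/>
      <!-- Literal names and values left to the default mapping. -->
      <xs:enumeration value="Center"/>
      <xs:enumeration value="Right"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

The value attribute still controls what appears in XML documents; the ecore attributes only affect the generated EEnum literal.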
Complex Type to EClass
<xsd:complexType | EClass |
name="Container" | -> Name (unless overridden) |
abstract="false|true" | -> Abstract (qualifier on the impl class) |
ecore:name="$name$|override" | Overrides name |
ecore:implements="(tns:MixinInterface)" | Adds additional supertypes to the EClass. See the section on multiple inheritance. |
ecore:interface="false|true"> | If true, only the Java interface is generated |
<xsd:complexContent> | |
<xsd:extension base="tns:Child"> | Determines the first Super Type of the EClass |
... (Structural Features) ... | Elements and Attributes which appear here are described elsewhere |
</xsd:extension> | |
</xsd:complexContent> | |
</xsd:complexType> |
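The following sketch shows a Complex Type hierarchy using these options. The namespace, the type names other than Container and Child, and the overridden name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
            xmlns:tns="http://www.example.org/tree"
            targetNamespace="http://www.example.org/tree">

  <!-- Would map to an abstract EClass named "Child". -->
  <xsd:complexType name="Child" abstract="true">
    <xsd:attribute name="id" type="xsd:string"/>
  </xsd:complexType>

  <!-- Would map to an EClass whose first super type is Child.
       ecore:name overrides the default name "Container". -->
  <xsd:complexType name="Container" ecore:name="Folder">
    <xsd:complexContent>
      <xsd:extension base="tns:Child">
        <xsd:sequence>
          <xsd:element name="child" type="tns:Child"
                       minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

The structural features declared inside the extension are mapped separately, as described in the sections on structural features.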
Mapping Structural Features
Common Mapping Options
Whether you're mapping an xsd:attribute or xsd:element to an EAttribute or EReference, most of the mapping options for EStructuralFeatures are shared.
XML Schema | Description | |
---|---|---|
<xsd:complexType name="Book"> | ||
... | ||
<xsd:feature | EStructuralFeature (EReference or EAttribute) | |
[Default Mappings] | ||
name="author" | -> name (The feature's name) | |
type=type | -> attributeType for EAttributes -> referenceType for EReferences | |
default="value" | -> Default Value Literal (needs review) | |
[Overriding the default mappings] | ||
ecore:ignore="false|true" | Instructs EMF not to perform any mapping | |
ecore:name="$auto$|name" | Overrides the name | |
ecore:lowerBound="$auto$|n" | Overrides the lower bound, which is normally calculated from other information to be 0 or 1 | |
ecore:upperBound="$auto$|n" | Overrides the upper bound, which is normally calculated from other information to be 1 or "many" | |
ecore:required="$auto$|boolean" | Default value is calculated by EMF, perhaps when lower bound > 0 (review needed) | |
[Ecore Specific] | ||
ecore:unsettable="boolean" | Overrides unsettable, default value is true if lower bound is 0, OR upper bound is 1 (review needed) | |
ecore:suppressedGetVisibility="false|true" | Removes the get method from the generated interface only | |
ecore:suppressedSetVisibility="false|true" | Removes the set method from the generated interface only | |
ecore:suppressedIsSetVisibility="false|true" | Removes the "is set" method from the generated interface only | |
ecore:suppressedUnsetVisibility="false|true" | Removes the unset method from the generated interface only | |
ecore:transient="$auto$|boolean" | False by default, unless a feature map is involved. Transient features are not persisted | |
ecore:derived="$auto$|boolean" | False by default, unless a feature map is involved | |
ecore:volatile="false|true" | False is assumed | |
... /> | ||
</xsd:sequence> | ||
</xsd:complexType> |
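A few of the shared options applied to one Complex Type are sketched below. The namespace, the feature names other than "author", and the chosen override values are hypothetical, and the options marked "review needed" in the table are used here without any claim about their exact effect.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
            targetNamespace="http://www.example.org/books">

  <xsd:complexType name="Book">
    <xsd:sequence>
      <!-- Written as repeated <author> elements, but named "authors" in the model. -->
      <xsd:element name="author" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"
                   ecore:name="authors"/>
    </xsd:sequence>

    <!-- Default value literal taken from the schema; unsettable requested explicitly. -->
    <xsd:attribute name="pages" type="xsd:int" default="100"
                   ecore:unsettable="true"/>

    <!-- The set method is left out of the generated interface only. -->
    <xsd:attribute name="isbn" type="xsd:string"
                   ecore:suppressedSetVisibility="true"/>

    <!-- Part of the schema, but not mapped into the model at all. -->
    <xsd:attribute name="legacyCode" type="xsd:string"
                   ecore:ignore="true"/>
  </xsd:complexType>
</xsd:schema>
```

The same ecore attributes can be placed on either xsd:element or xsd:attribute declarations, which is why they are listed once for both.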
Mapping to EAttributes
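XSD Elements and Attributes whose types are SimpleTypes map to EAttributes. The only exceptions are "HREF" types such as "xs:anyURI" or "xs:IDREF", which are used for EReferences (see Mapping to uncontained EReference). In a document, the value of an EAttribute can appear as an XML attribute or as one or more nested elements: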
<Event message="Something happened"/> |
(Or) |
<Event> |
<message>Something happened</message> |
</Event> |
(Or) |
<Event> |
<message>First Message</message> |
<message>Second Message</message> |
</Event> |
Note that elements must be used for attributes with multiplicity >1.
XML Schema | Description | |
---|---|---|
<xsd:complexType name="Event"> ... | ||
<xsd:(attribute|element) | EAttribute | |
[Default Mappings] | ||
name="message" | -> name | |
type="xsd:string" | -> attributeType (must be SimpleType -> EDataType) | |
default="Hello world" | -> Default Value Literal | |
[xsd:attributes only] | ||
use="optional|required|prohibited" | Affects upper and/or lower bounds | |
[xsd:elements only] | ||
minOccurs="value" maxOccurs="value" | Affects upper and/or lower bounds | |
... /> | ||
... | ||
</xsd:complexType> |
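A schema for the Event documents shown earlier could look like this sketch. The namespace and the "severity" and "category" attributes are hypothetical additions for illustration; the bounds mentioned in the comments are what the default mapping would be expected to produce, not verified output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/events">

  <xsd:complexType name="Event">
    <xsd:sequence>
      <!-- Element form: the only way to allow more than one value.
           Expected bounds: lower 0, upper "many". -->
      <xsd:element name="message" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>

    <!-- Attribute form: at most one value. use="required" raises the lower bound. -->
    <xsd:attribute name="severity" type="xsd:string" use="required"/>

    <!-- Optional attribute with a default value literal. -->
    <xsd:attribute name="category" type="xsd:string" default="general"/>
  </xsd:complexType>
</xsd:schema>
```

All three declarations use Simple Types, so all three would map to EAttributes rather than EReferences.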
Mapping Element to Contained EReference
Elements with Complex Types can only be mapped to EReferences. Since Elements represent nested data in a document, it's no surprise that the EReference must be contained. EReferences which are not contained are saved as URIs to some other location, which means that a Simple Type is sufficient.
In a document, contained content could appear as:
<List> |
<Item text="This is item 1"/> |
<Item text="This is item 2"/> |
</List> |
XML Schema | Description | |
---|---|---|
<xsd:complexType name="Container"> | ||
<xsd:sequence> | ||
<xsd:element | EReference | |
name="child" | ||
type="tns:Child" | -> EClass (type must be Complex) | |
minOccurs="0|1|n" | -> Lower Bound | |
maxOccurs="0|1|n|unbounded" | -> Upper Bound | |
[Overriding default mappings] | ||
ecore:lowerBound = "$minOccurs$|n" | ||
ecore:upperBound = "$maxOccurs$|n" | ||
[Ecore Specific] | ||
ecore:opposite="null|opposite" | Identifies the inverse EReference using the XSD element/attribute name from the ComplexType | |
ecore:changeable="false|true*" | True by default * may be false when featureMaps are involved (review needed) | |
... /> | ||
</xsd:sequence> | ||
</xsd:complexType> |
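For the List and Item document shown earlier, a matching schema could be sketched as follows. The namespace is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:tns="http://www.example.org/lists"
            targetNamespace="http://www.example.org/lists">

  <!-- The contained type must be a Complex Type, which maps to an EClass. -->
  <xsd:complexType name="Item">
    <xsd:attribute name="text" type="xsd:string"/>
  </xsd:complexType>

  <xsd:complexType name="List">
    <xsd:sequence>
      <!-- Nested elements of a Complex Type map to a contained EReference.
           minOccurs and maxOccurs supply the lower and upper bounds. -->
      <xsd:element name="Item" type="tns:Item"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

Each Item is serialized inside its List, which is what makes the reference a contained one.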
Mapping to uncontained EReference
Both Attributes and Elements can be mapped to uncontained EReferences. As with EAttribute, only XSD elements will allow for multiplicity >1.
<Event origin="_DHejd3ks8dh3je8shre"/> |
(Or) |
<Event> |
<origin>_DHejd3ks8dh3je8shre</origin> |
</Event> |
(Or) |
<Event> |
<origin>_DHejd3ks8dh3je8shre</origin> |
<origin>_ks8dh3je8sXGejd3hre</origin> |
</Event> |
Note that elements must be used for references with multiplicity > 1.
XML Schema | Description | |
---|---|---|
<xsd:complexType name="Event"> ... | ||
<xsd:(attribute|element) | EReference | |
[Default Mappings] | ||
name="source" | -> name | |
type="xsd:anyURI" | resolveProxies and other settings are sensitive to the ref type (review needed) | |
*type="ecore:type" | (*required) EMF must be told the type of the reference (review needed) | |
default="???" | Does a default value make any sense (review needed) | |
[xsd:attributes only] | ||
use="optional|required|prohibited" | Affects upper and/or lower bounds | |
[xsd:elements only] | ||
minOccurs="value" maxOccurs="value" | Affects upper and/or lower bounds | |
... /> | ||
... | ||
</xsd:complexType> |
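The sketch below shows one way an uncontained reference might be declared. It assumes the ecore:reference annotation, which is how EMF's own library example schema names the target type of an xsd:anyURI feature; the table above marks this part of the mapping as needing review, so treat the annotation name as an assumption. The namespace and the type and feature names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
            xmlns:tns="http://www.example.org/events"
            targetNamespace="http://www.example.org/events">

  <!-- The type being pointed at. -->
  <xsd:complexType name="Component">
    <xsd:attribute name="id" type="xsd:ID"/>
  </xsd:complexType>

  <xsd:complexType name="Event">
    <xsd:sequence>
      <!-- Element form: allows several references (multiplicity > 1). -->
      <xsd:element name="relatedTo" type="xsd:anyURI"
                   ecore:reference="tns:Component"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>

    <!-- Attribute form: a single reference.
         The XSD type is only a URI, so the annotation tells EMF
         which EClass the reference points to (assumed annotation name). -->
    <xsd:attribute name="origin" type="xsd:anyURI"
                   ecore:reference="tns:Component"/>
  </xsd:complexType>
</xsd:schema>
```

In a document, both features hold only an identifier or URI for the Component, never the Component's content.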
Tips and Techniques
Almost Multiple Inheritance
If the assumptions made by the architects of Java are correct, then the only useful form of multiple inheritance is to "overlay" additional method signatures onto a class already extending some other class. (Or in the simplest case, just for "tagging", e.g. java.lang.Cloneable). Let's hope they're right, because these are basically the same limitations we are dealing with here.
It may be useful to think of the "single inheritance" permitted for ComplexTypes as being similar to Java class extension. Java only allows you to inherit one implementation class. This "free", inherited implementation is analogous to the free, inherited attributes and elements from your base ComplexType. Let's say that we mix in an additional ComplexType using ecore:implements="tns:SecondSuper". When you try to save an xsd:attribute or xsd:element inherited this way, you can't. Your schema has no idea what you're talking about.
So, let's think of mix-in types as interfaces that can never be "implemented". Implementation here refers to being persisted using something actually defined in your XSD.
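The limitation can be seen in a small sketch. The namespace and the names Base, Derived, and the attributes are hypothetical; SecondSuper is the mix-in from the discussion above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
            xmlns:tns="http://www.example.org/mixin"
            targetNamespace="http://www.example.org/mixin">

  <!-- The one "real" base type: its features are inherited by extension. -->
  <xsd:complexType name="Base">
    <xsd:attribute name="id" type="xsd:string"/>
  </xsd:complexType>

  <!-- The mix-in: only the Java interface is generated for it. -->
  <xsd:complexType name="SecondSuper" abstract="true" ecore:interface="true">
    <xsd:attribute name="label" type="xsd:string"/>
  </xsd:complexType>

  <!-- Derived extends Base in XSD terms and mixes in SecondSuper for EMF only. -->
  <xsd:complexType name="Derived" ecore:implements="tns:SecondSuper">
    <xsd:complexContent>
      <xsd:extension base="tns:Base">
        <xsd:attribute name="size" type="xsd:int"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

As far as the schema is concerned, a Derived element may carry "id" and "size" but not "label", because nothing in the XSD declares "label" for Derived. That is the sense in which the mix-in's features cannot be persisted.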
(in progress..)