COSMOS Design 238000
Support locid attribute in SML Validation
Change History
Name: | Date: | Revised Sections: |
---|---|---|
David Whiteman | 06/24/2008 | |
Hubert Leung | 07/22/2008 | |
Workload Estimation
Process | Sizing | Names of people doing the work |
---|---|---|
Design | .5 | Hubert Leung |
Code | 2 | Hubert Leung |
Test | 1 | Hubert Leung |
Documentation | 0 | |
Build and infrastructure | 0 | |
Code review, etc. | 0 | |
TOTAL | 3.5 | |
Terminologies/Acronyms
The terminologies/acronyms below are commonly used throughout this document.
Term | Definition |
---|---|
SML | Service Modeling Language |
SML-IF | Service Modeling Language - Interchange Format |
Purpose
This document is associated with bugzilla 238000.
We need to implement the optional sml:locid attribute in our validator.
The locid attribute is introduced in SML 1.1 to provide the capability to retrieve localized strings for text elements within an SML document. The specification provides an example of using locid in schematron expressions to produce localized error messages. This enhancement implements support for locid in schematron expressions in the SML validator.
Requirements
The locid attribute provides the information necessary to retrieve localized text. The specification is not technology dependent. Our implementation will use Java resource bundles to retrieve localized strings.
The enhancement implements the example provided in Appendix F of the SML specification: http://www.w3.org/TR/sml/#LocalizationSample
The implementation will only handle the sml:locid attribute value that is defined in an element with the schematron namespace to provide localized validation error messages. The locid attribute defined in other contexts will not be handled.
Design
Locating the resource bundle
The sml:locid attribute will have two parts: a prefix and a key to a string.
e.g. sml:locid="lang:StudentIDErrorMsg"
There is a namespace URI associated with the prefix, defined in one of the parent elements. The format of the URI, and how the URI is used to locate the translated resource, is outside the scope of the SML specification. How to use the URI to locate translated resources and retrieve the appropriate value is therefore an application-specific design decision. In this implementation, we require the URI to have the following structure:
sml:<bundle name>[:<locale>]
- the first segment "sml" is the scheme of the URI. It is a dummy value to make the URI a well formed absolute URI.
- <bundle name> is the fully qualified name of a Java resource bundle
- <locale> is the intended locale of the message. It is an optional field.
Notes:
- URIs used in namespaces have to be in absolute form. Relative URIs are not allowed.
- The URI format above complies with the URI syntax defined here: http://www.ietf.org/rfc/rfc2396.txt
Examples:
- sml:org.eclipse.cosmos.rm.internal.messages.Message
- sml:org.eclipse.cosmos.rm.internal.messages.Message:fr
- sml:org.eclipse.cosmos.rm.internal.messages.Message:pt_BR
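The URI format above could be split into its parts roughly as follows. This is a minimal sketch, not the COSMOS code: the class name LocidNamespace and its fields are hypothetical, and converting the locale segment with Locale.forLanguageTag (after replacing "_" with "-") is an assumption about how values such as pt_BR would be interpreted.

```java
import java.util.Locale;

/** Hypothetical helper that splits a namespace URI of the form sml:<bundle name>[:<locale>]. */
final class LocidNamespace {
    final String bundleName;
    final Locale locale; // null when the URI has no locale segment

    private LocidNamespace(String bundleName, Locale locale) {
        this.bundleName = bundleName;
        this.locale = locale;
    }

    /** Returns the parsed parts, or null if the URI does not follow the expected format. */
    static LocidNamespace parse(String uri) {
        if (uri == null || !uri.startsWith("sml:")) {
            return null; // the dummy "sml" scheme is required
        }
        String rest = uri.substring("sml:".length());
        int sep = rest.indexOf(':');
        String bundle = sep < 0 ? rest : rest.substring(0, sep);
        if (bundle.isEmpty()) {
            return null;
        }
        Locale locale = null;
        if (sep >= 0) {
            String tag = rest.substring(sep + 1);
            if (tag.isEmpty()) {
                return null;
            }
            // Assumption: "pt_BR" style values map onto language tags such as "pt-BR".
            locale = Locale.forLanguageTag(tag.replace('_', '-'));
        }
        return new LocidNamespace(bundle, locale);
    }
}
```

For example, LocidNamespace.parse("sml:org.eclipse.cosmos.rm.internal.messages.Message:pt_BR") would yield the bundle name org.eclipse.cosmos.rm.internal.messages.Message with a Portuguese (Brazil) locale, while a URI without the sml scheme yields null so the caller can ignore the locid.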
The string retrieved from the resource bundle will replace the text content of the element, if present. The following two schematron rules are equivalent, assuming the string retrieval from the resource bundle is successful:

<sch:rule context="u:Students/u:Student">
  <sch:assert test="smlfn:deref(.)[starts-with(u:ID,'99')]" sml:locid="lang:StudentIDErrorMsg">
    The specified ID <sch:value-of select="string(u:ID)"/> does not begin with 99.
  </sch:assert>
</sch:rule>

<sch:rule context="u:Students/u:Student">
  <sch:assert test="smlfn:deref(.)[starts-with(u:ID,'99')]" sml:locid="lang:StudentIDErrorMsg">
  </sch:assert>
</sch:rule>
String substitution
Section 7.1 and Appendix F of the SML specification discuss the use case of string substitution in localized strings. However, the sml:locid attribute does not provide information on string substitution. The SML specification only suggests ways to do string substitution, and those suggestions are not a normative part of the specification.
The example in Appendix F of the specification embeds the schematron "value-of" element in the message to do string substitution. This implementation will follow the example closely.
Algorithm
The enhancement will change the ElementSchematronCacheBuilder data builder to replace text elements with a translated version before passing the schematron expression to the XSLT transformer.
- In the startElement method of ElementSchematronCacheBuilder.java, check for the presence of the sml:locid attribute if the element has the schematron namespace.
- If the sml:locid attribute is present, attempt to retrieve the value indicated in the locid attribute from a resource bundle:
  - get the prefix and message key from the attribute value
  - look up the namespace associated with the prefix (some new data structures are required to do this. SAX parsers do not provide prefix lookup directly.)
  - parse the namespace URI for the bundle name and the optional locale value
  - load the resource bundle and retrieve the string by message key
- If the retrieval fails, the sml:locid value will be ignored.
- If the retrieval is successful, then
  - append the string from the resource bundle to the rule fragment, right after the opening element tag of the current element.
  - set a flag to suppress the text element and <sch:value-of> elements from being appended to the rule fragment.
  - unset the flag in the endElement event of the element with the sml:locid attribute defined.
- Strings in resource bundles must embed the variables used for string substitution in a syntax that the schematron XSLT transformer can consume.
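The steps above can be sketched as a small SAX handler. This is an illustrative stand-in under stated assumptions, not the actual ElementSchematronCacheBuilder: the class LocidSketchHandler, its fields, and the helper methods resolve, lookup, and rewrite are hypothetical; the SML namespace URI constant is assumed; attributes are not copied to the fragment and character data is not re-escaped, to keep the sketch short.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.ResourceBundle;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch of the locid handling; not the real COSMOS builder. */
class LocidSketchHandler extends DefaultHandler {
    static final String SCH_NS = "http://purl.oclc.org/dsdl/schematron";
    static final String SML_NS = "http://www.w3.org/ns/sml"; // assumed SML namespace URI

    final StringBuilder fragment = new StringBuilder();
    // SAX offers no prefix lookup, so in-scope prefix bindings are tracked here.
    private final Map<String, Deque<String>> prefixes = new HashMap<>();
    private int suppressDepth = 0; // > 0 while inside an element whose content was replaced

    @Override public void startPrefixMapping(String prefix, String uri) {
        prefixes.computeIfAbsent(prefix, p -> new ArrayDeque<>()).push(uri);
    }

    @Override public void endPrefixMapping(String prefix) {
        prefixes.get(prefix).pop();
    }

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        if (suppressDepth > 0) { suppressDepth++; return; } // child of a replaced element
        fragment.append('<').append(qName).append('>');     // attributes omitted for brevity
        String locid = SCH_NS.equals(uri) ? atts.getValue(SML_NS, "locid") : null;
        String message = locid == null ? null : resolve(locid);
        if (message != null) {
            fragment.append(message); // right after the opening tag
            suppressDepth = 1;        // suppress original text and value-of children
        }
    }

    @Override public void characters(char[] ch, int start, int length) {
        if (suppressDepth == 0) fragment.append(ch, start, length);
    }

    @Override public void endElement(String uri, String local, String qName) {
        if (suppressDepth > 1) { suppressDepth--; return; } // end of a suppressed child
        suppressDepth = 0; // end of the element that carried sml:locid, if any
        fragment.append("</").append(qName).append('>');
    }

    /** Splits "prefix:key", resolves the prefix, and parses sml:<bundle name>[:<locale>]. */
    String resolve(String locid) {
        int colon = locid.indexOf(':');
        if (colon <= 0) return null;
        Deque<String> stack = prefixes.get(locid.substring(0, colon));
        if (stack == null || stack.isEmpty()) return null;
        String[] parts = stack.peek().split(":", 3);
        if (parts.length < 2 || !"sml".equals(parts[0])) return null;
        Locale locale = parts.length == 3
                ? Locale.forLanguageTag(parts[2].replace('_', '-'))
                : Locale.getDefault();
        return lookup(parts[1], locale, locid.substring(colon + 1));
    }

    /** Returns null when the bundle or key is missing, so the locid is ignored. */
    String lookup(String bundleName, Locale locale, String key) {
        try {
            return ResourceBundle.getBundle(bundleName, locale).getString(key);
        } catch (MissingResourceException e) {
            return null;
        }
    }

    /** Parses the XML with a namespace-aware SAX parser and returns the rebuilt fragment. */
    static String rewrite(String xml, LocidSketchHandler handler) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.fragment.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The sketch uses a depth counter rather than a boolean flag so that nested child elements such as sch:value-of are skipped cleanly and the flag is cleared only at the endElement event of the element that carried sml:locid. When the bundle or key cannot be found, lookup returns null and the original element content passes through unchanged.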
Open Issues/Questions
- String substitution is an important part of string localization, but it is not supported by the sml:locid attribute. So SML can only claim partial support for localization.
  - I see the two as orthogonal. Ordinarily localized strings are translated, and substituted text is not translated (it comes from some user, and has a fixed language implicitly). E.g. if I give you a server system name, glyph issues notwithstanding the name should be identical regardless of the application's locale.
- Since the mechanism for retrieving localized strings is not standardized, and we have to use implementation-specific ways to handle string substitution, SML documents with localized strings are not interoperable between different implementations of validators and applications that handle the SML documents.
  - Continuing my tradition of treating them separately,
    - Localization: guilty as charged. Until there is a uniform interface across platforms for localization, I see little opportunity to do better than this. Allowing localization on certain platforms, e.g. Java using resource bundles, is a far better situation for users than no localization at all IMO.
    - Variable substitution: in some contexts, e.g. Schematron rules, there appear to be mechanisms that are consistent across platforms... the Schematron spec requires support for an XSLT-based engine in all implementations. In other contexts, e.g. an SML-IF model's displayName (i.e. in a pure XML context) it is less clear that consistent mechanisms exist, granted. As with localization, the spec authors chose a partial solution over no solution.
- The specification suggests using xsl:variable to do string substitution (Appendix F). However, this mechanism does not work with the schematron XSL translator from http://xml.ascc.net/schematron/1.5/. The translator reports the following error:
Line #0; Column #0; org.apache.xml.utils.WrappedRuntimeException:
Could not find variable with the name of var
Note that it doesn't work even without the sml:locid attribute.
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:lang="http://www.university.example.org/translation/">
  <sch:ns prefix="u" uri="http://www.university.example.org/ns" />
  <sch:ns prefix="smlfn" uri="http://www.w3.org/2008/03/sml-function"/>
  <sch:pattern id="StudentPattern">
    <sch:rule context="u:Students/u:Student">
      <sch:assert test="smlfn:deref(.)[starts-with(u:ID,'99')]">
        <xsl:variable name="var" select="u:ID" />
        The specified ID <sch:value-of select="string($var)"/> does not begin with 99.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
In my implementation, I expect resource bundles to have the following content:
StudentIDErrorMsg = L'identifieur specifie <sch:value-of select="string(u:ID)"/> ne commence pas par 99.
StudentIDErrorMsg = Das angegebene Attributkennzeichen ID <sch:value-of select="string(u:ID)"/> beginnt nicht mit 99.
- We need to assess whether the error is a consequence of a flawed implementation, a down-level implementation (1.5 is not the ISO version IIRC), or a limitation of Schematron as currently specified.
In my opinion, hard coding the variable name "var" is not much better than assuming the prefix to be "u". So the recommendation to use an xsl variable for its portability is arguable. Java resource bundles use integers to indicate the indexes of the list of parameters, which is a more portable solution. For example:
StudentIDErrorMsg = L'identifieur specifie {0} ne commence pas par 99.
StudentIDErrorMsg = Das angegebene Attributkennzeichen ID {0} beginnt nicht mit 99.
- I don't see "var" versus "0" as a meaningful difference. I would recommend named variables because they are named (the names presumably convey semantic meaning) and because they are insensitive to the insertion of additional preceding variables in the string (not true of positional parameters), i.e. for maintainability in both cases, not because of any supposed portability differences.
- In either case, I would never recommend Schematron-specific markup inside a resource bundle string, so there should be no issue with the binding of the namespace prefix "u".
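For reference, the positional-parameter style discussed in this section could be exercised with java.text.MessageFormat roughly as follows. This is only a sketch: the class PositionalMessageSketch and its format helper are hypothetical, and the patterns are held inline rather than loaded from a resource bundle. One detail worth noting is that MessageFormat treats a single apostrophe as a quoting character, so the French pattern has to double it (L''identifieur) for {0} to be substituted.

```java
import java.text.MessageFormat;
import java.util.Locale;

/** Hypothetical illustration of {0}-style substitution; not part of the COSMOS validator. */
final class PositionalMessageSketch {
    // Apostrophe doubled because MessageFormat uses ' to quote literal text.
    static final String FRENCH_PATTERN =
            "L''identifieur specifie {0} ne commence pas par 99.";
    static final String GERMAN_PATTERN =
            "Das angegebene Attributkennzeichen ID {0} beginnt nicht mit 99.";

    /** Substitutes the positional arguments into the pattern for the given locale. */
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }
}
```

Calling PositionalMessageSketch.format(PositionalMessageSketch.FRENCH_PATTERN, Locale.FRENCH, "12345") substitutes the student ID for {0}. How the argument value would be obtained from the document being validated is outside the scope of this sketch.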