xsd:schema
http://www.xml-cml.org/schema/stmml qualified
- created by hand
2001-11-20
- First draft 2001-11-20
STMML supports domain-independent STM information components
sch:title
Schematron validation
sch:ns
http://www.xml-cml.org/schema/stmml
Data Types and Data Structure
Overview
STMML defines a number of data types suited to STM. It also defines
a number of complex data structures such as arrays, matrices and tables.
The constraints are sometimes created through elements and sometimes
through attributes. We classify the general components as follows:
Abstract Data Structures
- scalar. A scalar quantity, expressible as a string, but
with many optional facets such as errors, units, ranges, etc. Most elements
may have a countType attribute to indicate more than
one instance.
- array. An array of homogeneous scalars
whose size is described by sizeType.
Delimiters in string representations can be varied.
- matrix. A rectangular (often square) matrix of
homogeneous scalars. Many matrices have special functions
(see matrixType) such as geometric
transformations.
- table. A table where the columns are homogeneous
arrays.
- list. A list of heterogeneous components from any
namespace.
- sizeType. Size of arrays
- delimiterType. A lexical delimiter
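As an illustration of how the array and matrix structures above might be read, here is a minimal Python sketch. The namespace URI comes from the schema header; the attribute names `size`, `delimiter`, `rows` and `columns`, the choice of `|` as delimiter, and the row-major ordering of matrix values are all assumptions made for this example and should be checked against the schema itself.

```python
import xml.etree.ElementTree as ET

# Namespace URI as given in the schema header.
STM_NS = "http://www.xml-cml.org/schema/stmml"

# Hypothetical instance: attribute names (size, delimiter, rows, columns)
# are assumptions for illustration, not confirmed by the text above.
DOC = f"""
<list xmlns="{STM_NS}">
  <array size="3" delimiter="|">1.0|2.5|4.0</array>
  <matrix rows="2" columns="3">1 0 0 0 1 0</matrix>
</list>
""".strip()


def read_array(elem):
    """Split an array's text on its delimiter (whitespace if none is given)."""
    delimiter = elem.get("delimiter")
    text = (elem.text or "").strip()
    tokens = text.split(delimiter) if delimiter else text.split()
    declared = elem.get("size")
    # If a size is declared, it must agree with the number of items found.
    if declared is not None and int(declared) != len(tokens):
        raise ValueError(f"expected {declared} items, found {len(tokens)}")
    return [float(t) for t in tokens]


def read_matrix(elem):
    """Reshape whitespace-separated values into rows (row-major is assumed)."""
    rows, cols = int(elem.get("rows")), int(elem.get("columns"))
    values = [float(t) for t in (elem.text or "").split()]
    if len(values) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, found {len(values)}")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


root = ET.fromstring(DOC)
array_values = read_array(root.find(f"{{{STM_NS}}}array"))
matrix_values = read_matrix(root.find(f"{{{STM_NS}}}matrix"))
```

Both readers treat every item as a float to keep the sketch short; a fuller reader would consult the declared data type before converting.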
Links and References
- link. Support for simple hyperlinks and link structures
- refType. A reference to an element
- namespaceRefType. A reference to an element, including
namespace-like prefixes
Data-based simpleTypes
Common attribute types
- idType. Specifies lexical patterns for IDs
- idGroup. ID attribute (highly encouraged)
- titleGroup. Title attribute (highly encouraged)
- convGroup. Convention attribute
General information components
General components
STMML provides a very small number of abstract elements to capture
frequently encountered concepts in STM documents. There are no predetermined
semantics or ontology; it is expected that descriptive metadata
will be added through dictionaries.
All elements can contain any element children and can carry the common
STM attributes. Currently there are the following:
- object. Almost anything - concrete, abstract, representable by
a noun. Objects can have properties added through scalar, etc.
- action. Represents an action performed during a scientific narrative.
It has attributes describing a time-line and conditions so that a procedure could be replayed.
It has a container actionList which shares these attributes
and which can describe sets of actions.
- observation. Contains narrative or other elements describing
an observation, planned or unplanned
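To make the idea of replaying a procedure concrete, the following Python sketch orders the actions in an actionList by a time-line attribute. The attribute name `start` and its plain numeric values are assumptions standing in for whatever time-line attributes the schema actually defines; only the title attribute and the element names are taken from the text.

```python
import xml.etree.ElementTree as ET

# Namespace URI as given in the schema header.
STM_NS = "http://www.xml-cml.org/schema/stmml"

# Hypothetical instance: "start" (a bare number) is an assumed
# time-line attribute used only for this illustration.
DOC = f"""
<actionList xmlns="{STM_NS}" title="preparation">
  <action title="filter" start="30"/>
  <action title="dissolve" start="0"/>
  <action title="heat" start="10"/>
</actionList>
""".strip()


def replay_order(action_list):
    """Return action titles sorted by their (assumed) numeric start value."""
    actions = action_list.findall(f"{{{STM_NS}}}action")
    ordered = sorted(actions, key=lambda a: float(a.get("start")))
    return [a.get("title") for a in ordered]


steps = replay_order(ET.fromstring(DOC))
```

Sorting on an explicit attribute rather than on document order means the narrative can list actions in any sequence and still be replayed consistently.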
Dictionary components
Dictionaries are a major part of STMML and are supported as follows:
The dictionary itself:
- dictionary. This element defines a dictionary
and is often the root element (though a data instance might also be combined
with a dictionary). The dictionary plays a similar role to a simple schema, by defining
data types and other constraints (such as enumerations). By transforming a dictionary to
schema format, schema-based tools can be used for validation. A dictionary is
normally composed of entry elements.
- entry. An entry contains information which describes
or constrains elements in a data instance. The link is made through a
dictRef attribute on the data element.
Descriptive information can apply to any type of element (not necessarily
part of or derived from the STM Schema). Constraints are similar to those in XML Schemas
and use the same vocabulary (dataTypes, value ranges, enumerations, patterns, etc.). They normally
apply to elements from the STM Schema or derived from it.
In addition, entry elements can constrain data elements to have the
higher-level structures and constraints defined by STM Schema. Thus an entry can require a
data element to be a matrix, of a given type, with a fixed number of rows and columns. These
constraints are usually attributes on the entry element, which therefore maps
directly onto the instance. Every entry has a mandatory term attribute
which is the formal text string representing the concept. This string can contain any allowed
XML characters (e.g. Greek characters) but not markup (e.g. MathML or CML).
- definition. An almost mandatory child element of entry, giving
a formal definition of the term
- description. Additional descriptive information for an entry.
This can contain any content, often HTML, but also MathML, CML for description of
equations, chemical formulae, etc.
- alternative. Alternative strings for describing the
concept. These can be any of the standard lexical and terminological data categories
such as synonyms, abbreviations, homonyms, etc. (see ISO 12620 for a full range).
- enumeration. A list of allowed values for the data
element (or elements in arrays, matrices).
- relatedEntry. A related entry. Sometimes this is
descriptive (e.g. "seeAlso" provides additional information on related concepts).
It can also be used for constraints, and there is a small controlled vocabulary
of relationships, but no universal syntax. We
support parentage (e.g. through "partitiveParent" = "partOf"). In principle this can
be used with appinfo to provide algorithmically constructed
relationships.
- attributes. A wide range of constraints is provided through attributes,
several being similar to facets on XML Schema datatypes:
- rows and columns, the structure of the data element.
- recommendedUnits, units and unitType, the units of the data element.
- minExclusive, minInclusive, maxExclusive and maxInclusive,
the value of the data element.
- totalDigits, fractionDigits, length, maxLength, minLength
and pattern. The lexical form of the data element.
- annotation. Similar to XML Schema, this has children
documentation for information about the entry (normally
curatorial) and appinfo to describe entries and constraints
in machine-processable fashion.
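The link between a data element and its dictionary entry, and the use of entry attributes as constraints, can be sketched in Python as follows. This is an illustration under stated assumptions, not the schema's own resolution mechanism: it assumes that the part of dictRef after the prefix matches the entry's id, that the prefix `ex` and the id `mpt` are hypothetical, and that only minInclusive and maxInclusive are checked.

```python
import xml.etree.ElementTree as ET

# Namespace URI as given in the schema header.
STM_NS = "http://www.xml-cml.org/schema/stmml"

# Hypothetical dictionary with one entry; term, minInclusive and
# maxInclusive are attributes named in the text, the values are made up.
DICTIONARY = f"""
<dictionary xmlns="{STM_NS}">
  <entry id="mpt" term="melting point" minInclusive="0" maxInclusive="5000">
    <definition>Temperature at which a solid becomes a liquid.</definition>
  </entry>
</dictionary>
""".strip()

# Hypothetical data instance; the "ex" prefix is an assumption.
DATA = f"""
<list xmlns="{STM_NS}">
  <scalar dictRef="ex:mpt">1234.5</scalar>
  <scalar dictRef="ex:mpt">-20</scalar>
</list>
""".strip()


def load_entries(dictionary_xml):
    """Index the entry elements of a dictionary by their id attribute."""
    root = ET.fromstring(dictionary_xml)
    return {e.get("id"): e for e in root.iter(f"{{{STM_NS}}}entry")}


def check_scalar(scalar, entries):
    """Return a list of constraint violations for one scalar element."""
    # Assumption: the local part after the prefix is the entry id.
    local_id = scalar.get("dictRef").split(":")[-1]
    entry = entries.get(local_id)
    if entry is None:
        return [f"no entry for {scalar.get('dictRef')}"]
    value = float(scalar.text)
    term = entry.get("term")
    problems = []
    low, high = entry.get("minInclusive"), entry.get("maxInclusive")
    if low is not None and value < float(low):
        problems.append(f"{term}: {value} < {low}")
    if high is not None and value > float(high):
        problems.append(f"{term}: {value} > {high}")
    return problems


entries = load_entries(DICTIONARY)
data_root = ET.fromstring(DATA)
results = [check_scalar(s, entries) for s in data_root.iter(f"{{{STM_NS}}}scalar")]
```

Here the first scalar passes and the second is reported as falling below minInclusive. The remaining attributes (rows, columns, totalDigits, pattern and so on) could be checked in the same way.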
Metadata
STMML supports metadata through the element
metadata. If necessary
several of these can be contained in
a metadataList element.
Groups (for schema maintenance and re-use)
(list|scalar)