DocBook is a general purpose [XML] schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications).
The Version 5.0 release is a complete rewrite of DocBook in RELAX NG. The intent of this rewrite is to produce a schema that is true to the spirit of DocBook while simultaneously removing inconsistencies that have arisen as a natural consequence of DocBook's long, slow evolution. The Technical Committee has taken this opportunity to simplify a number of content models and tighten constraints where RELAX NG makes that possible.
The Technical Committee provides the DocBook 5.0 schema in other schema languages, including W3C XML Schema and an XML DTD, but the RELAX NG Schema is now the normative schema.
This Committee Draft was approved for publication by the OASIS DocBook Technical Committee. It represents the consensus of the committee.
Please send comments on this specification to the <docbook@lists.oasis-open.org> list. To subscribe, please use the OASIS Subscription Manager.
The errata page for this specification is at http://docs.oasis-open.org/docbook/specs/docbook5-errata.html.
Copyright © OASIS® 2008. All Rights Reserved.
All capitalized terms in the following text have the meanings assigned to them in the OASIS Intellectual Property Rights Policy (the "OASIS IPR Policy"). The full Policy may be found at the OASIS website.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to OASIS, except as needed for the purpose of developing any document or deliverable produced by an OASIS Technical Committee (in which case the rules applicable to copyrights, as set forth in the OASIS IPR Policy, must be followed) or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
OASIS requests that any OASIS Party or any other party that believes it has patent claims that would necessarily be infringed by implementations of this OASIS Committee Specification or OASIS Standard, to notify OASIS TC Administrator and provide an indication of its willingness to grant patent licenses to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification.
OASIS invites any party to contact the OASIS TC Administrator if it is aware of a claim of ownership of any patent claims that would necessarily be infringed by implementations of this specification by a patent holder that is not willing to provide a license to such patent claims in a manner consistent with the IPR Mode of the OASIS Technical Committee that produced this specification. OASIS may include such claims on its website, but disclaims any obligation to do so.
OASIS takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on OASIS' procedures with respect to rights in any document or deliverable produced by an OASIS Technical Committee can be found on the OASIS website. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this OASIS Committee Specification or OASIS Standard, can be obtained from the OASIS TC Administrator. OASIS makes no representation that any information or list of intellectual property rights will at any time be complete, or that any claims in such list are, in fact, Essential Claims.
The name "OASIS" is a trademark of OASIS, the owner and developer of this specification, and should be used only to refer to the organization and its official outputs. OASIS welcomes reference to, and implementation and use of, specifications, while reserving the right to enforce its marks against misleading uses. Please see https://www.oasis-open.org/who/trademark.php for above guidance.
DocBook is a general purpose XML schema particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications).
The DocBook Technical Committee maintains the DocBook schema. Starting with V5.0, DocBook is normatively available as a [RELAX NG] Schema (with some additional Schematron assertions). W3C XML Schema and Document Type Definition (DTD) versions are also available.
The Version 5.0 release is a complete rewrite. In programming-language terms, think of it as a code refactoring.
This rewrite introduces a large number of backwards-incompatible changes. Essentially all DocBook V4.x documents will have to be modified to validate against DocBook V5.0. An XSLT 1.0 stylesheet is provided to ease this transition.
The DocBook Technical Committee welcomes bug reports and requests for enhancement (RFEs) from the user community. The current list of outstanding requests is available through the SourceForge tracker interface. This is also the preferred mechanism for submitting new requests. Old RFEs, from a previous legacy tracking system, are archived for reference.
The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this Committee Draft are to be interpreted as described in [RFC 2119]. Note that for reasons of style, these words are not capitalized in this document.
[RELAX NG] James Clark, editor. RELAX NG Specification (Committee Specification). OASIS. 2001.
[XML] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, et al., editors. Extensible Markup Language (XML) 1.0 (Fourth Edition). World Wide Web Consortium, 16 August 2006.
[XLink11] Steven DeRose, Eve Maler, David Orchard, Norman Walsh, editors. XML Linking Language (XLink) Version 1.1. World Wide Web Consortium, 2005.
[RFC 2119] IETF (Internet Engineering Task Force). RFC 2119: Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. 1997.
[RFC 3023] IETF (Internet Engineering Task Force). RFC 3023: XML Media Types. M. Murata, S. St. Laurent, D. Kohn. 2001.
[DocBook: TDG5] Norman Walsh and Leonard Muellner. DocBook 5.0: The Definitive Guide.
[SGML] JTC 1, SC 34. ISO 8879:1986 Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML). 1986.
[W3C XML Schema] Henry S. Thompson, David Beech, Murray Maloney, et al., editors. XML Schema Part 1: Structures. World Wide Web Consortium, 2000.
[W3C XML Datatypes] Paul V. Biron and Ashok Malhotra, editors. XML Schema Part 2: Datatypes. World Wide Web Consortium, 2000.
[Schematron] Rick Jelliffe, editor. The Schematron Assertion Language 1.5. Rick Jelliffe and Academia Sinica Computing Centre. 2001.
The DocBook RELAX NG Schema is distributed from the DocBook site at OASIS. DocBook is also available from the mirror on https://docbook.org/.
In V5.0, DocBook has been rewritten as a native RELAX NG grammar. The goals of this redesign were to produce a schema that:
“feels like” DocBook. Most existing documents should still be valid or it should be possible to transform them in simple, mechanical ways into valid documents.
enforces as many constraints as possible in the schema. Some additional constraints are expressed with Schematron rules.
cleans up the content models.
gives users the flexibility to extend or subset the schema in an easy and straightforward way.
can be used to generate XML DTD and W3C XML Schema versions of DocBook.
Under the ordinary operating rules of DocBook evolution, the only backwards incompatible changes that could be made in DocBook V5.0 were those announced in DocBook V4.0. In light of the fact that this is a complete rewrite, the Technical Committee gave itself the freedom to make “unannounced” backwards-incompatible changes for this one release.
A number of elements have been removed from DocBook. Many of these have been replaced by simpler, more versatile alternatives. Others have simply been removed because they are not believed to be widely used:
Replaced by info, see Section 2.3, “Uniform Info Elements”.
Replaced by personblurb. This more general name better reflects the fact that it is available in elements other than author (e.g., editor).
Replaced by orgname and the updated content models of author, editor, and othercredit.
Removed in favor of mediaobject and inlinemediaobject.
Replaced by biblioid.
Replaced by simpler tocdiv element.
Replaced by ubiquitous linking, see Section 2.9, “Universal Linking”.
Replaced by tag.
Removed.
The content models of many inlines have been reduced, sometimes drastically. The parameter entity customization of DocBook V4.x and previous versions resulted in very broad content models for some inlines.
Consider, for example, command in DocBook V4.4:
command ::= (#PCDATA|link|olink|ulink|action|application|classname|methodname| interfacename|exceptionname|ooclass|oointerface|ooexception| command|computeroutput|database|email|envar|errorcode|errorname| errortype|errortext|filename|function|guibutton|guiicon|guilabel| guimenu|guimenuitem|guisubmenu|hardware|interface|keycap|keycode| keycombo|keysym|literal|code|constant|markup|medialabel| menuchoice|mousebutton|option|optional|parameter|prompt|property| replaceable|returnvalue|sgmltag|structfield|structname|symbol| systemitem|uri|token|type|userinput|varname|nonterminal|anchor| remark|subscript|superscript|inlinegraphic|inlinemediaobject| indexterm|beginpage)*
In DocBook V5.0, command has a much smaller, more rational content model:
command ::=
  * Zero or more of:
      o text
      o alt
      o anchor
      o annotation
      o biblioref
      o indexterm
      o inlinemediaobject
      o link
      o phrase
      o remark
      o replaceable
      o subscript
      o superscript
      o xref
DocBook V5.0 may be overzealous in its simplification of content models. The Technical Committee expects to adjust these simplifications during user testing. Users are encouraged to report places where formerly valid documents can no longer be made valid because content models have been reduced.
DocBook V4.x has setinfo, bookinfo, chapterinfo, appendixinfo, sectioninfo, etc. DocBook would be smaller and simpler if it had a single info element in all these places.
There’s an historical reason for the large number of unique names: customizers might very well want to adjust the content models of info elements at different levels. For example, a copyright statement might be required at the book level, or an author forbidden at the sub-section level. In DTDs, there’s only one content model allowed per element name, so in order to support independent customization, each info element must have a different name.
In RELAX NG, no such limitation exists. We can use patterns to provide a single info element while still allowing customizers to change its content model in different contexts. In light of this functionality, we've replaced all the various flavors of info with a single element name.
DocBook V5.0 enforces the constraint that titles are required on articles and other large structures where they are effectively optional in DocBook V4.x. (They are optional only in the sense that DTDs are unable to enforce the constraint that they be present; the documentation has always made it clear that titles were required.)
In DocBook V4.x and earlier, the presence of a document type declaration served as a mechanism for identifying the DocBook version of a document. Although the declaration was not actually required, it was present in the vast majority of DocBook documents.
In RELAX NG, no similar declaration exists. Although a document type declaration might still be present, it seems likely that this will not usually be the case.
Nevertheless, downstream processors may benefit from some indication of the version of DocBook being used. As a result, DocBook V5.0 adds a new version attribute which must be present on the document element of a DocBook document.
Mixing versions is explicitly allowed and the version attribute may be used on other elements as well. This might be the case, for example, in a compound document constructed from multiple documents each with its own version.
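As an illustration of how a downstream processor might use this attribute, the following minimal sketch reads the version from the document element using Python's standard library. The function name `root_version` and the sample document are hypothetical; the sketch assumes the document is in the DocBook V5.0 namespace, `http://docbook.org/ns/docbook`, and it is not a substitute for validating against the schema.

```python
import xml.etree.ElementTree as ET

# Assumption: DocBook V5.0 elements are in this namespace.
DOCBOOK_NS = "http://docbook.org/ns/docbook"


def root_version(xml_text):
    """Return the version attribute of the document element, or None if absent."""
    root = ET.fromstring(xml_text)
    # The version attribute itself is not namespace-qualified.
    return root.get("version")


# Hypothetical sample document, for illustration only.
sample = (
    '<article xmlns="%s" version="5.0">'
    "<title>Example</title><para>Some text.</para>"
    "</article>" % DOCBOOK_NS
)

if root_version(sample) is None:
    print("document element has no version attribute")
else:
    print("DocBook version:", root_version(sample))
```

A processor handling compound documents could apply the same lookup to descendant elements, since the attribute may appear on them as well.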
DocBook V5.0 enforces attribute co-constraints such as the class/otherclass attributes on biblioid.
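To make the idea of a co-constraint concrete, here is a small sketch that checks one such rule outside the schema. It assumes the rule is that a biblioid with class set to "other" must also carry an otherclass attribute; the helper name `biblioid_violations` and the message text are hypothetical, and in practice the schema performs this check itself.

```python
import xml.etree.ElementTree as ET

# Assumption: DocBook V5.0 elements are in this namespace.
DB = "{http://docbook.org/ns/docbook}"


def biblioid_violations(xml_text):
    """List biblioid elements that use class="other" without otherclass."""
    root = ET.fromstring(xml_text)
    problems = []
    # iter() visits the document element and all of its descendants.
    for el in root.iter(DB + "biblioid"):
        if el.get("class") == "other" and el.get("otherclass") is None:
            problems.append('biblioid with class="other" lacks otherclass')
    return problems


# Hypothetical sample fragment, for illustration only.
bad = (
    '<bibliography xmlns="http://docbook.org/ns/docbook">'
    '<biblioentry><biblioid class="other">X-1</biblioid></biblioentry>'
    "</bibliography>"
)

for message in biblioid_violations(bad):
    print(message)
```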
In DocBook V5.0, HTML tables and CALS tables are independently specified. Where the DTD of DocBook V4.x allows for incoherent mixing of the two models, DocBook V5.0 forbids such mixtures.
DocBook V5.0 adds a few simple data types. For example, the cols attribute on tgroup must be a positive integer.
Some of these constraints, such as the requirement that elements like pubdate include a proper date-time type, may prove controversial. Users are encouraged to report places where formerly valid documents can no longer be made valid because data types have been introduced.
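Authors who want to find affected documents before validating might run a check along the following lines. This is a minimal sketch of the cols constraint only, assuming the lexical form of a positive integer is an optional plus sign followed by ASCII digits; the helper names `is_positive_integer` and `invalid_cols` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: DocBook V5.0 elements are in this namespace.
DB = "{http://docbook.org/ns/docbook}"


def is_positive_integer(value):
    """True if value looks like a positive integer (optional '+', ASCII digits, > 0)."""
    if value is None:
        return False
    text = value.strip(" \t\r\n")
    return re.fullmatch(r"\+?[0-9]+", text) is not None and int(text) > 0


def invalid_cols(xml_text):
    """Return the cols values of tgroup elements that are not positive integers."""
    root = ET.fromstring(xml_text)
    return [
        el.get("cols")
        for el in root.iter(DB + "tgroup")
        if not is_positive_integer(el.get("cols"))
    ]


# Hypothetical sample fragment, for illustration only.
table = (
    '<informaltable xmlns="http://docbook.org/ns/docbook">'
    '<tgroup cols="0"><tbody><row><entry>x</entry></row></tbody></tgroup>'
    "</informaltable>"
)
print(invalid_cols(table))
```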
Starting with DocBook V5.0, the linkend and xlink:href attributes are available on almost all elements.
The linkend attribute provides an ID/IDREF link within the document. The xlink:href attribute provides a URI-based link.
The ulink element has been removed from DocBook as URI-based links can now be achieved directly from the appropriate inline (such as productname or command). For instances where no specific semantic inline is needed, link is still available. Where link used to be limited to ID/IDREF linking, it now sports an xlink:href attribute as well.
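The change from ulink to link can be sketched as a small tree rewrite. The example below is illustrative only and is not the Technical Committee's conversion stylesheet: it assumes un-namespaced V4.x input, the DocBook V5.0 namespace `http://docbook.org/ns/docbook`, and the XLink namespace `http://www.w3.org/1999/xlink`. The function name `upgrade_ulinks` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumptions: the DocBook V5.0 and XLink namespace URIs.
DB_NS = "http://docbook.org/ns/docbook"
XLINK_NS = "http://www.w3.org/1999/xlink"


def upgrade_ulinks(v4_fragment):
    """Rewrite V4.x ulink elements as V5.0 link elements with xlink:href.

    Assumes the input elements have no namespace, as in DTD-based V4.x
    documents. Returns the rewritten element tree.
    """
    root = ET.fromstring(v4_fragment)
    for el in root.iter():
        if el.tag == "ulink":
            el.tag = "link"
            # Move the URI from the old url attribute to xlink:href.
            url = el.attrib.pop("url", None)
            if url is not None:
                el.set("{%s}href" % XLINK_NS, url)
        # Place every element in the DocBook namespace.
        el.tag = "{%s}%s" % (DB_NS, el.tag)
    return root


# Hypothetical sample fragment, for illustration only.
v4 = '<para>See <ulink url="http://docbook.org/">the DocBook site</ulink>.</para>'
converted = upgrade_ulinks(v4)
print(ET.tostring(converted, encoding="unicode"))
```

The same xlink:href attribute could instead be placed directly on a semantic inline such as productname or command, which is the case the removal of ulink is meant to encourage.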
Support for extended links is provided through the extendedlink, arc, and locator elements.
Accessibility is improved by allowing both inline and block annotations in most contexts. The alt element is now allowed in most places for inline annotations; the new annotation element supports block annotations.
The DocBook V4.x markup for Tables of Contents, or more generally for Lists of Titles, was complex and had not evolved quite in step with the rest of DocBook. In DocBook V5.0, it has all been replaced by a quite simple, recursive toc/tocdiv/tocentry structure.
While most Tables of Contents and Lists of Titles are generated automatically and authors never have to produce markup for them by hand, this simplified content model should make it easier for authors to generate them when necessary. One possible application of hand-authored toc markup is to generate custom hierarchies which can be assembled on-the-fly from a library of topics marked up in DocBook.
Grammar based validation technologies (like RELAX NG) and rule based validation technologies (like Schematron) are naturally complementary. Mixing them allows us to play to the strengths of each without stretching either to enforce constraints that they aren’t readily designed to enforce.
For example, DocBook V5.0 requires that the root element of a document have an explicit version attribute. Because there are a great many elements that can be root elements in DocBook, and because they can almost all appear as descendants of a root element as well, it would be tedious to express this constraint in RELAX NG. But it is easy in a rule-based schema language.
DocBook V5.0 uses Schematron where appropriate.
From the very beginning, one of the goals of DocBook has been that users should be able to produce customizations that are either subsets or extensions of DocBook.
Customization is possible in DocBook V4.x, but because of the intricacies of XML DTD syntax and the complex and highly stylized patterns of parameter entity usage in DocBook, it's not as easy as we would like it to be.
In DocBook V5.0, we hope to take advantage of RELAX NG's more robust design (and its lack of pernicious determinism rules) to make customization easier.
Three schema design patterns get us most of the way there.
DocBook elements, particularly the inlines, can be divided into broad classes: general purpose, technical, error-related, operating-system related, bibliographic, publishing, etc. In DocBook V5.0, these are collected together in named patterns.
To add a new inline, endpoint for example, to the list of technical inlines, one need only extend the appropriate pattern. If an element should appear in several classes, they can all be extended in the same way:
db.technical.inlines |= endpoint
db.programming.inlines |= endpoint
db.os.inlines |= endpoint
Much the same concept was used in DocBook V4.x, where instead of patterns we had parameter entities. However, the constraints of DTD validation severely limit the circumstances under which an element can appear twice in a content model. That meant that adding an element to one parameter entity might make it an error to add it to another. Such constraints do not exist in RELAX NG, which greatly simplifies customization.
Each element in DocBook V5.0 is defined by its own pattern. To change the content model of an element, only that pattern need be redefined. To remove an element from DocBook, that pattern can be redefined as “notAllowed”.
Each attribute list in DocBook V5.0 is defined by its own pattern. To change the list of attributes available on an element, only that pattern need be redefined. To remove all the attributes, that pattern can be redefined as “empty”.
There’s an XSLT 1.0 stylesheet for performing conversion from DocBook V4.x to DocBook V5.0. Presented with a valid DocBook V4.x document, it attempts to produce a valid DocBook V5.0 document.
It succeeds entirely automatically for the most part, though human intervention is suggested for constructs that might have multiple interpretations (and therefore multiple possible transformations).
Users are encouraged to report documents that are not successfully transformed by the stylesheet, especially those which do have valid DocBook V5.0 representations.
See http://www.relaxng.org/ for a list of tools that can validate an XML document using RELAX NG. Note that not all products are capable of evaluating the Schematron assertions in the schema.
This appendix registers a new MIME media type, “application/docbook+xml”.
MIME media type name: application
MIME subtype name: docbook+xml
Required parameters: None.
Optional parameters: charset. This parameter has identical semantics to the charset parameter of the application/xml media type as specified in [RFC 3023] or its successors.
Encoding considerations: By virtue of DocBook XML content being XML, it has the same considerations when sent as “application/docbook+xml” as does XML. See [RFC 3023], Section 3.2.
Security considerations: Several DocBook elements may refer to arbitrary URIs. In this case, the security issues of RFC 2396, section 7, should be considered.
Interoperability considerations: None.
Published specification: This media type registration is for DocBook documents as described by [DocBook: TDG5].
Applications which use this media type: There is no experimental, vendor specific, or personal tree predecessor to “application/docbook+xml”, reflecting the fact that no applications currently recognize it. This new type is being registered in order to allow for the deployment of DocBook on the World Wide Web, as a first class XML application.
Additional information:
Magic number(s): There is no single initial octet sequence that is always present in DocBook documents.
File extension(s): DocBook documents are most often identified with the extension “.xml”.
Macintosh File Type Code(s): TEXT
Person & email address to contact for further information: Norman Walsh, <ndw@nwalsh.com>.
Intended usage: COMMON
Author/Change controller: The DocBook specification is a work product of the DocBook Technical Committee at OASIS.
Fragment identifiers: For documents labeled as “application/docbook+xml”, the fragment identifier notation is exactly that for “application/xml”, as specified in [RFC 3023] or its successors.
The following individuals have participated in the creation of this specification and are gratefully acknowledged:
There are no user-visible changes in 5.0 (Public Review Draft 1).
This version of DocBook V5.0 will become the official Committee Specification version of DocBook V5.0 as soon as the Technical Committee balloting process is finished.
There are no user-visible changes in 5.0CR7. Some of the sources were reorganized to make future customization easier.
If no bug reports are received before the November 7, 2007 DocBook TC meeting, this version will become the official DocBook V5.0 release.
This release contains a few bug fixes and improvements over V5.0CR5.
Fixed RFE 1759782: Allow uri anywhere email occurs.
Fixed RFE 1784312: Allow book to be completely empty; allow personblurb and titleabbrev in bibliographic contexts.
Fixed RFE 1795884: Allow MathML in inlineequation.
Fixed RFE 1800916: Allow keycap (and friends) in userinput.
There are no user-visible changes in DocBook V5.0CR5.
This release contains a few improvements over V5.0CR3.
Fixed RFE 1708032: Fixed pattern naming inconsistency; changed db.href.attribute to db.href.attributes.
Fixed RFE 1700154: Added sortas to termdef.
Fixed RFE 1686919: Added an NVDL rules file.
Fixed RFE 1705596: Added db.programming.inlines (classname, exceptionname, function, initializer, interfacename, methodname, modifier, ooclass, ooexception, oointerface, parameter, returnvalue, type, and varname) to the content model of code.
Fixed RFE 1689228: Fixed typo in Schematron assertion.
This release contains a few improvements over V5.0CR2.
Fixed RFE 1679775: Changed semantics of termdef. A firstterm is now required (instead of a glossterm as in previous releases).
Fixed RFE 1673820: Adopted “http://docbook.org/xlink/role/olink” as an XLink role value (xlink:role) to identify OLinks expressed using XLink attributes.
Allow info in HTML tables.
Fixed RFE 1682917: Added pgwide attribute to example.
Fixed RFE 1644553: Added label attribute to CALS and HTML tables.
Fixed RFE 1588693: Added an acknowledgements element, peer to dedication, replacing ackno which had only been available at the end of article.
This release contains a few improvements over V5.0CR1 and a few bug fixes.
Fixed RFE 1630203: Allow empty glossary.
Fixed RFE 1627845: Allow optional caption on CALS table and informaltable.
Fixed RFE 1589139 (and RFE 1621178): Allow title and titleabbrev on qandaentry.
Fixed RFE 1675932: Restore localname, prefix and namespace as class attribute values on tag.
Fixed RFE 1669465: Schematron rules should refer to @xml:id, not @id.
This release contains a few improvements over V5.0b9 and a few bug fixes.
Made the content model of blockquote broader. It was restricted too far in the transition to 5.0.
Fixed RFE 1575537: Allow markup from other namespaces in info.
Fix the content model of ackno so that it's the same as DocBook 4.x.
Fix bug where caption was accidentally allowed in CALS tables.
This release contains several improvements over V5.0b8.
Fixed RFE 1537424: Allow jobtitle inline.
Fixed typo; titles are now required on task, consistent with DocBook V4.x.
Fixed RFE 1554914: Make targetdoc attribute on olink optional.
Fixed RFE 1568417: Don't generate duplicate Schematron rules.
Fixed RFE 1568419: Inverted Schematron assertion for termdef.
This release contains several improvements over V5.0b7.
Fixed RFE 1535166: Improve the data types of attributes in DocBook.
Fixed RFE 1549632: The inlineequation element should use inlinemediaobject not mediaobject.
A number of small documentation improvements in the area of attribute and attribute enumerations.
This release contains several improvements over V5.0b6.
Fixed RFE 1520074: Define separate patterns for all the effectivity attributes to make customization easier.
Attempted to address RFE 1512505: Added an audience effectivity attribute.
Renamed audience, origin, and level on simplemsgentry to msgaud, msgorig, and msglevel, respectively. This is a better parallel with the descendant elements of msgentry and avoids a conflict with the newly introduced audience effectivity attribute.
Added startinglinenumber attribute to orderedlist.
Fixed bug where one of fileref or entityref was required on imagedata even when the content was inline MathML or SVG.
This release contains several improvements over V5.0b5.
Fixed RFE 1434294: Allow MathML and SVG in imagedata. Note: SVG is no longer allowed as an alternative to imagedata. The alignment, scaling, and other presentational attributes are on imagedata so it seems more reasonable to allow SVG and MathML inside it.
Fixed RFE 1468921: Add person element. Added person and org.
Fixed RFE 1306027: Support for aspect-oriented programming. Allow modifier to appear in more places, and allow xml:space on modifier.
Added db.publishing.inlines to db.bibliographic.elements so that, for example, foreignphrase can be used in bibliomixed.
This release contains several improvements over V5.0b4.
Restored the class attribute on refmiscinfo (removing the type attribute introduced in V5.0b4). The class attribute is now an enumerated list with the standard otherclass extension point.
Added parameter to db.technical.inlines. This allows parameter to occur in places like userinput and computeroutput.
Allow XInclude elements in info elements (in the docbookxi schemas).
Fixed bugs in the build process that resulted in broken DTD versions of beta 4 and earlier betas.
This release contains several improvements over V5.0b3.
Fixed RFE 1416903: Added a cover element to hold additional material for document covers. Updated reference documentation.
Corrected a typo in the list of values allowed on the class attribute of biblioid: changed “pubnumber” to “pubsnumber” (note the “s”). This is consistent with its use as a replacement for the pubsnumber tag that has been removed in DocBook V5.0.
Fixed a bug in the content model of the various “info” elements. In previous beta releases, the title-related elements (title, titleabbrev, and subtitle) were erroneously required to appear first. The requirement is only that they appear exactly or at most once, depending on the context.
Renamed the “sgmlcomment” attribute value of the class attribute of tag. There's no significant difference between XML and SGML comments and the “SGML” name implies that there ought to be an “xmlcomment” value, which there is not. The new value is simply “comment”.
Renamed the “class” attribute of refmiscinfo. The DocBook semantics of class attributes is that they have enumerated values. This attribute should always have been called “type” as it is now.
Updated renderas on bridgehead and class on othercredit to have “attribute/otherattribute” co-constraints. (In other words, if you select “other” for renderas on bridgehead or class on othercredit, you also have to provide a value for otherrenderas or otherclass, respectively.)
Changed width attribute in media objects to be “text” instead of “xs:integer”.
Fixed bug in the build process that resulted in unusable XML Schema versions of beta 2 and beta 3.
Improved reference documentation for attributes on many elements.
This release contains several small improvements over V5.0b2.
Fixed RFE 1358844: allow multiple imageobjects inside an imageobjectco. Updated reference documentation.
Restored default values to the type attribute on simplelist and the choice and rep attributes on methodparam, arg, and group. Fixed a bug in paramdef where plain was accidentally allowed as a choice. These defaults are reflected in the generated XML DTD as well.
Reduced the content model of blockquote which seemed way too broad.
Improved reference documentation for attributes on many elements.
This release addresses several bugs identified in V5.0b1.
When SVG or MathML are used, allow more than one element from the respective namespace to be used in the appropriate location.
Fixed RFE 1356238: the xrefstyle attribute on olink is now “text” rather than “xsd:IDREF”.
Fixed RFE 1380477: Make xml:id optional on areas within areaset; allow linking attributes on areaset; establish the semantics that an area inside an areaset inherits its linking attributes from the areaset if it doesn't have linking attributes of its own.
Allow alt inside equation, informalequation, and inlineequation.
Fixed RFE 1356254: dbforms.rnc schema now supports the HTML form elements.