Functionality
This gem processes Asciidoctor documents following a template for generating ISO International Standards. The following outputs are generated.
-
(Optional) An HTML preview generated directly from the Asciidoctor document, using native Asciidoctor formatting.
-
AsciiMathML is to be used for mathematical formatting. The gem uses the Ruby AsciiMath parser, which is syntactically stricter than the common MathJax processor; if you do not get the expected results, try bracketing terms in your AsciiMathML expressions.
-
an XML representation of the document, intended as a document model for ISO International Standards.
-
The XML representation is processed in turn to generate the following outputs as end deliverable ISO standard drafts.
-
Microsoft Word output (.doc), following the style conventions of the ISO Standard Microsoft Word template.
-
PDF (forthcoming)
-
HTML (in development)
-
This AsciiDoc syntax for writing ISO standards is hereby named "AsciiISO".
This README provides an overview of the functionality of the gem; see also Guidance for authoring ISO standards using the gem.
Usage
$ asciidoctor a.adoc # HTML output of Asciidoc file
$ asciidoctor -b iso -r 'asciidoctor-iso' a.adoc # ISO XML output
The initial step is optional, and can be used as a preview of what the gem will do; it generates a {filename}.html file.
When invoked within Asciidoctor, the gem translates the document into ISO XML format, and then validates its output against the ISO XML document model; errors are reported to console against the XML, and are intended for users to check that they have provided all necessary components of the document.
The gem then converts the XML to Microsoft Word ({filename}.doc), HTML ({filename}.html), and PDF (forthcoming).
In the process of generating the Microsoft Word document, the associated IsoDoc gem generates temporary files: {filename}.htm contains the Microsoft Word-specific HTML for inclusion in that file, and the {filename}_files folder contains any images and headers for the Word document.
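The file naming described above can be summarised in a short Ruby sketch. The helper name `derived_outputs` is hypothetical and not part of the gem; it only restates the naming pattern from this section, and makes no claim about which directory the files are written to.

```ruby
# Hypothetical helper (not part of asciidoctor-iso): lists the file names this
# section says are produced alongside an input such as "a.adoc".
def derived_outputs(adoc_path)
  stem = File.basename(adoc_path, ".*") # "a.adoc" -> "a"
  {
    word: "#{stem}.doc",          # Microsoft Word deliverable
    html: "#{stem}.html",         # HTML deliverable
    word_html: "#{stem}.htm",     # temporary Word-specific HTML
    word_assets: "#{stem}_files"  # temporary folder for images and headers
  }
end

p derived_outputs("a.adoc")
```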
Content Warnings
The gem also performs several format checks prescribed in ISO/IEC DIR 2, and warns the user about violations in the console:
-
Numbers with what looks like dots instead of commas for decimal points.
-
Groups of numbers without spacing for every three digits. (The gem attempts to ignore ISO references.)
-
No space before percent sign.
-
No bracketing of tolerance in percentage (e.g. 15 ± 7 %).
-
No recommendations, permissions or requirements (detected by keyword) in: foreword, scope, introduction, term examples and examples, notes, footnotes.
-
No subclauses that are the only child of a clause. (In clauses, annexes, or scopes.)
-
5 levels of subclause nesting. (Never triggered, since AsciiDoc only permits 4 levels of subsections.)
-
Non-ISO/IEC reference turning up as normative.
-
Term definition starts with an article, or ends with a period.
-
Title intro or title part appears in only one of French or English.
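To give a feel for the number-related checks listed above, here is a minimal Ruby sketch. The regular expressions and the `style_warnings` name are assumptions made for illustration; the gem's actual rules (for instance, how it ignores ISO references) are not reproduced here.

```ruby
# Illustrative patterns only; the gem's real checks may differ.
DECIMAL_DOT   = /\b\d+\.\d+\b/ # e.g. "1.5" where a decimal comma is expected
UNGROUPED     = /\b\d{5,}\b/   # e.g. "10000" (assumed threshold: 5+ digits)
PERCENT_SPACE = /\d%/          # e.g. "15%" with no space before the sign

# Returns a list of human-readable warnings for one piece of text.
def style_warnings(text)
  warnings = []
  warnings << "possible decimal point: #{text[DECIMAL_DOT]}" if text =~ DECIMAL_DOT
  warnings << "digits not grouped in threes: #{text[UNGROUPED]}" if text =~ UNGROUPED
  warnings << "no space before percent sign" if text =~ PERCENT_SPACE
  warnings
end

puts style_warnings("A mass of 10000 g with 1.5% moisture")
```

Text that already follows the conventions, such as "A mass of 10 000 g with 1,5 % moisture", produces no warnings in this sketch.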
Approach
Document model
The document model ("IsoDoc") used in document generation intends to introduce rigour into the ISO standards authoring process; the existing Microsoft Word template from ISO does not support such rigour down to the element level. It also introduces flexibility by decoupling the document structure from its presentation.
The ISO International Standard format is prescribed in ISO/IEC DIR 2 "Principles and rules for the structure and drafting of ISO and IEC documents", to a level amenable to an explicit document model. A formal document model would allow checking for consistency in format and content, and expedite authoring and quality control of ISO standards. Authoring standards through a more abstract formal model also permits enhanced functionality such as cross-reference link checking and auto-numbering of sections, figures, tables and formulas. Outputting a document in different languages also becomes straightforward.
The document model for ISO Standards specifically is derived from a more general StandardDocument model. Other ISO-like standards can also be derived from this more general model; CSD (https://github.com/riboseinc/csd, https://github.com/riboseinc/asciidoctor-csd) is one such instance.
The document model for ISO Standards contains all the structures described in ISO/IEC DIR 2. It is expressed as a Relax NG Compact schema; actual validation occurs against its full Relax NG counterpart.
Asciidoctor
Asciidoctor has been selected as the authoring tool to generate the document model representation of ISO standards. It is a document formatting tool like Markdown and DocBook, which combines the relative ease of use of the former (using relatively lightweight markup) with the rigour and expressiveness of the latter (it has a well-defined syntax, and was in fact initially developed as a DocBook document authoring tool). Asciidoctor has built-in capability to output text and HTML, so it can be used to preview the file as it is being authored. However, the gem itself generates HTML and Word output, so there should not be much need for this.
In order to generate HTML preview output close to what is intended in the ISO standard, the Asciidoc document includes a fair amount of formatting instructions (e.g. disabling section numbering where appropriate, the titling of Appendixes as Annexes), as well as ISO boilerplate text, and predefined section headers (sections are recognised by fixed titles such as Normative References). Authoring ISO standards in this fashion assumes that users will be populating an Asciidoc template, and not removing needed formatting instructions.
Asciidoctor has some formatting constraints because of its own document model, which users need to be aware of. For example, Asciidoc has a strict division between inline and block elements, which disallows certain kinds of nesting; so a list cannot be embedded within a paragraph, it can only constitute its own paragraph (though lists themselves can be nested within each other). Asciidoctor also disallows multiple paragraphs in footnotes, by design. (The document model does not impose this constraint, so you could edit the generated XML to break up paragraphs within a footnote.)
Asciidoctor model additions
Section titles
ISO has special section types: "Scope", "Normative References", "Terms and Definitions", "Symbols and Abbreviated Terms", "Bibliography". By default, these are identified in Asciidoc by using those titles. The gem allows you to override the title by using a heading attribute on the node, so that the actual title in your Asciidoc can be something different; that is useful, for example, if you are translating the document into different languages. So:
[heading=scope]
== 范围
Note that both the XML population and the isodoc gem will overwrite any supplied title. If you are translating ISO documents into other languages, you will still need access to versions of the asciidoctor-iso and isodoc gems in those languages.
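The lookup described in this subsection can be pictured with the following Ruby sketch. The function name `iso_section_type` and the case-insensitive comparison are assumptions made for illustration; they are not taken from the gem's source.

```ruby
# Section types named in this README, lowercased for comparison.
ISO_SECTION_TYPES = [
  "scope", "normative references", "terms and definitions",
  "symbols and abbreviated terms", "bibliography"
].freeze

# Hypothetical helper: a heading attribute, when present, takes precedence
# over the literal section title. Returns nil for ordinary clauses.
def iso_section_type(title, attrs = {})
  candidate = attrs.fetch("heading", title).downcase
  ISO_SECTION_TYPES.include?(candidate) ? candidate : nil
end

p iso_section_type("范围", { "heading" => "scope" })
p iso_section_type("Normative References")
p iso_section_type("Sampling")
```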
Obligation
The obligation of sections (whether they are normative or informative) is indicated with the attribute "obligation". For most sections, this is fixed; for annexes and clauses, the default value of the obligation is "normative", and users need to set the obligation to "informative" as a section attribute.
[[AnnexA]]
[appendix,obligation=informative]
== Determination of defects
Term markup
To ensure the structure of Terms and Definitions is captured accurately, the following macros are defined, and must be used to mark up their respective content:
alt:[TERM]
-
for alternative terms
deprecated:[TERM]
-
for deprecated terms
domain:[TERM]
-
for term domains
The macro contents can contain their own markup.
=== paddy
alt:[_paddy_ rice]
deprecated:[#[smallcap]#cargo# rice]
domain:[rice]
_paddy_ (<<paddy>>) from which the husk only has been removed
Terms and Definitions markup
If the Terms and Definitions of a standard are partly or fully sourced from another standard, that standard is cited in a source attribute to the section, which is set to the reference anchor of the standard (given under the Normative References). The boilerplate of the Terms and Definitions section is adjusted accordingly.
[source=ISO712]
== Terms and Definitions
Multiple sources are allowed, and need to be quoted and comma-delimited:
[source="ISO712,ISO24333"]
== Terms and Definitions
Paragraph alignment
Alignment is defined as an attribute for paragraphs:
[align=left]
This paragraph is aligned left
[align=right]
This paragraph is aligned right
[align=center]
This paragraph is aligned center
[align=justified]
This paragraph is justified, which is the default
Reviewer notes
Reviewer notes are encoded as sidebars, and can be separated at a distance from the text they are annotating; the text they are annotating is indicated through anchors. Reviewer notes are only rendered if the document has a :draft: attribute.
The following attributes on reviewer notes are mandatory:
-
reviewer attribute (naming the reviewer)
-
the starting target anchor of the note (from attribute)
The following attributes are optional:
-
date attribute, optionally including the time (as xs:date or xs:datetime)
-
the ending target anchor of the note (to attribute)
The span of text covered by the reviewer note runs from the start of the text encompassed by the from element to the end of the text encompassed by the to element. If only the from element is supplied, the reviewer note covers the from element. The from and to elements can be bookmarks, which cover no space.
[[clause_address_profile_definition]]
=== Address Profile Definition (AddressProfileDescription)
[[para1]]
This is a clause address [[A]]profile[[B]] definition
[reviewer="Nick Nicholas",date=20180125T0121,from=clause_address_profile_definition,to=para1]
****
I do not agree with this statement.
****
[reviewer="Nick Nicholas",date=20180125T0121,from=A,to=B]
****
Profile?!
****
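The attribute rules for reviewer notes can be checked mechanically. The following Ruby sketch is hypothetical (the names `check_reviewer_note`, `NOTE_REQUIRED` and `NOTE_OPTIONAL` are not from the gem); it only encodes the mandatory and optional attributes listed above, and the rule that a note without a to anchor covers just the from element.

```ruby
NOTE_REQUIRED = %w[reviewer from].freeze
NOTE_OPTIONAL = %w[date to].freeze

# Hypothetical validation of a reviewer note's attribute hash.
def check_reviewer_note(attrs)
  missing = NOTE_REQUIRED.reject { |key| attrs.key?(key) }
  unknown = attrs.keys - NOTE_REQUIRED - NOTE_OPTIONAL
  # With no "to" anchor, the note covers only the "from" element.
  span = [attrs["from"], attrs.fetch("to", attrs["from"])]
  { missing: missing, unknown: unknown, span: span }
end

p check_reviewer_note("reviewer" => "Nick Nicholas", "from" => "A", "to" => "B")
p check_reviewer_note("from" => "clause_address_profile_definition")
```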
Strikethrough and Small Caps
The following formatting macros are used for strikethrough and small caps text:
[strike]#strike through text#
[smallcap]#small caps text#
Count of table header and footer rows
In Asciidoc, a table can have at most one header row or footer row. In ISO, a nominal single header row is routinely broken up into multiple rows in order to accommodate units or symbols that line up against each other, though they are displayed as merged cells with no grid between them. To address this, tables can be marked up with an optional headerrows attribute:
[headerrows=2]
|===
.2+|Defect 4+^| Maximum permissible mass fraction of defects in husked rice +
stem:[w_max]
| in husked rice | in milled rice (non-glutinous) | in husked parboiled rice | in milled parboiled rice
| Extraneous matter: organic footnote:[Organic extraneous matter includes foreign seeds, husks, bran, parts of straw, etc.] | 1,0 | 0,5 | 1,0 | 0,5
|===
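Conceptually, the headerrows attribute tells the processor how many leading rows belong to the table header. A minimal Ruby sketch of that split follows; the function name `split_header` and the row representation (arrays of cell strings) are assumptions for illustration only.

```ruby
# Sketch: split already-parsed table rows into header rows and body rows,
# given the count supplied in the headerrows attribute.
def split_header(rows, headerrows)
  [rows.take(headerrows), rows.drop(headerrows)]
end

rows = [
  ["Defect", "Maximum permissible mass fraction of defects"],
  ["", "in husked rice"],
  ["Extraneous matter: organic", "1,0"]
]
head, body = split_header(rows, 2)
puts "header rows: #{head.length}, body rows: #{body.length}"
```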
Inline clause numbers
For some clauses (notably test methods), the clause heading appears inline with the clause, instead of being separated on a different line. This is indicated in Asciidoc by the option attribute inline-header:
[%inline-header]
[[AnnexA-2-1]]
==== Sample divider,
consisting of a conical sample divider
Bibliographic details
Citations can include details of where in the document the citation is located; these are entered by suffixing the type of locality, followed by the reference. Multiple instances of locality and reference can be provided, delimited by comma or colon. For example:
<<ISO712,section 5, page 8-10>> # renders as: ISO 712, Section 5, Page 8-10
<<ISO712,section 5, page 8-10: 5:8-10>> # renders as ISO 712, 5:8-10
<<ISO712,whole>> # renders as: ISO 712, Whole of text
The references cannot contain spaces. Any text following the sequence of localities will be displayed instead of the localities.
A custom locality can be entered by prefixing it with locality: as follows:
<<ISO712,locality:frontispiece 5, page 8-10>> # renders as: ISO 712, Frontispiece 5, Page 8-10
Custom localities may not contain commas, colons, or spaces. Localities with the locality: prefix are recognised in internationalisation configuration files.
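The rendering shown in the examples above can be approximated with a few lines of Ruby. This is a simplified sketch under stated assumptions: `render_localities` is a hypothetical name, only comma-delimited localities are handled, and neither colon delimiters nor trailing override text are supported.

```ruby
# Illustrative only: renders comma-delimited localities as in the examples
# above. A "locality:" prefix is stripped; "whole" becomes "Whole of text".
def render_localities(spec)
  spec.split(",").map do |item|
    type, ref = item.strip.sub(/\Alocality:/, "").split(" ", 2)
    next "Whole of text" if type == "whole"
    [type.capitalize, ref].compact.join(" ")
  end.join(", ")
end

puts "ISO 712, " + render_localities("section 5, page 8-10")
puts "ISO 712, " + render_localities("locality:frontispiece 5, page 8-10")
puts "ISO 712, " + render_localities("whole")
```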
Additional warning types
Asciidoctor natively supports the ISO admonitions "Caution", "Warning", and "Important" through its admonition syntax:
CAUTION: This is a single-block caution
[WARNING]
====
This is a
multiple-block warning
====
If the admonitions "Danger" and "Safety Precaution" are needed, they should be indicated through a type attribute, which will override the admonition type appearing in the Asciidoc:
[type=Danger]
CAUTION: This is a single-block caution
[WARNING,type=Safety Precaution]
====
This is a
multiple-block warning
====
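The override rule amounts to a simple precedence: the type attribute wins when present, otherwise the native admonition style is used. A hypothetical Ruby sketch (the name `admonition_type` and the capitalised fallback are assumptions, not the gem's code):

```ruby
# Sketch: an explicit type attribute takes precedence over the AsciiDoc
# admonition style (CAUTION, WARNING, IMPORTANT).
def admonition_type(style, attrs = {})
  attrs.fetch("type", style.capitalize)
end

p admonition_type("WARNING", { "type" => "Safety Precaution" })
p admonition_type("CAUTION")
```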
Block Quotes
As in normal Asciidoctor, block quotes are preceded with an author and a citation; but the citation is expected to be in the same format as all other citations, a cross-reference optionally followed by text, which may include the bibliographic sections referenced:
[quote, ISO, "ISO7301,section 1"]
_____
This International Standard gives the minimum specifications for rice (_Oryza sativa_ L.)
which is subject to international trade. It is applicable to the following types: husked rice
and milled rice, parboiled or not, intended for direct human consumption. It is neither
applicable to other products derived from rice, nor to waxy rice (glutinous rice).
_____
Features not visible in HTML preview
The gem uses built-in Asciidoc formatting as much as possible, so that users can retain the ability to preview documents; for Terms and Definitions clauses, which have a good deal of explicit structure, macros have been introduced for semantic markup (admitted terms, deprecated terms, etc).
The default HTML output of an Asciidoc-formatted ISO document is quite close to the intended final output, with the following exceptions, in addition to the ISO-specific markup extensions described above. Note that the final outputs of the conversion (Microsoft Word, PDF, HTML) do not have these exceptions, and comply with the ISO Standard specifications.
-
Terms and Definitions: each term is marked up as an unnumbered subclause, the semantic markup of alternate and other terms is not rendered visually.
-
Formulas: Asciidoctor has no provision for the automated numbering of isolated block formulas ("stem"), and does not display the number assigned a block formula in its default HTML processor—although it does provide automated numbering of examples. Formula numbering is provided in the final outputs of the conversion.
-
Missing elements: The document model does not yet include Asciidoc elements that do not appear to be relevant to ISO Standards; these will be ignored in generating ISO XML. Those elements include:
-
sidebars (aside) (as distinct from warnings),
-
ASCII art/preformatted text (literal) (as distinct from sourcecode listings),
-
page breaks (thematic break).
-
Markup: Some connecting text which is used to convey markup structure is left out: in particular, DEPRECATED and SOURCE (replaced by formatting macros).
-
Tables: Table footnotes are treated like all other footnotes: they are rendered at the bottom of the document, rather than the bottom of the table, and they are not numbered separately.
-
Cross-references: Footnoted cross-references are indicated with the reference text fn in isolation, or fn: as a prefix to the reference text. The default HTML processor leaves these as is: if no reference text is given, only fn will be displayed (though it will still hyperlink to the right reference).
-
References: The convention for references is that ISO documents are cited without brackets by ISO number, and optionally year, whether they are normative or in the bibliography (e.g. ISO 20483:2013); while all other references are cited by bracketed number in the bibliography (e.g. [1]). The default HTML processor treats all references the same, and will bracket them (e.g. [ISO 20483:2013]). For the same reason, ISO references listed in the bibliography will be listed under an ISO reference, rather than a bracketed number.
-
References: References are rendered the same way wherever they are cited, since their rendering is automated. For that reason, if reference is to be made to both an undated and a dated version of an ISO reference, these need to be explicitly listed as separate references. (This is not done in the Rice model document, which lists ISO 6646, but under Terms and Definitions cites the dated ISO 6646:2011.)
-
References: ISO references that are undated but published have their date indicated under the ISO standards format in an explanatory footnote. Because of constraints introduced by Asciidoctor, that explanation is instead given in square brackets in Asciidoc format.
-
Annexes: Subheadings cannot preserve subsection numbering, while also appearing inline with their text (e.g. Rice document, Annex B.2): they appear as headings in separate lines.
-
Annexes: Cross-references to Annex subclauses are automatically prefixed with Clause rather than Annex or nothing.
-
Metadata: Document metadata such as document numbers, technical committees and title wording are not rendered in the default HTML output.
-
Patent Notice: Patent notices are treated and rendered as a subsection of the introduction, with an explicit subheading.
-
Numbering: The numbering of figures and tables is sequential in the default HTML processor: it does not include the Clause or Annex number. Thus, Figure 1, not Figure A.1.
-
Notes: There is no automatic note numbering by the default HTML processor.
-
Review Notes: The reviewer on the review note is not displayed.
-
Keys: Keys to formulas and figures are expected to be marked up as definition lists consistently, rather than as inline prose.
-
Figures: Simple figures are marked up as images, figures containing subfigures as examples. Numbering by the default HTML processor may be inconsistent. Subfigures are automatically numbered as independent figures.
-
Markup: The default HTML processor does not support CSS extensions such as small caps or strike through, though these can be marked up as CSS classes through custom macros in Asciidoctor: a custom CSS stylesheet will be needed to render them.
Document Attributes
The gem relies on Asciidoctor document attributes to provide necessary metadata about the document. These include:
:nodoc:
-
Do not generate Word and HTML output, only generate XML output. Can be used as a command-line option (like all other document attributes):
asciidoctor -a nodoc -b iso -r "asciidoctor-iso" a.adoc
:novalid:
-
Suppress validation.
:i18nyaml:
-
Name of YAML file of internationalisation text, to use instead of the built-in English, French or Chinese text used to label parts of the document (e.g. "Table", "Foreword", boilerplate text for Normative References, etc.) Use if you wish to output an ISO standard in a language other than those three. A sample YAML file for English, with "Foreword" replaced with "Frontispiece", is available at spec/examples/english.yaml.
:docnumber:
-
The ISO document number (mandatory)
:tc-docnumber:
-
The document number assigned by the Technical committee
:partnumber:
-
The ISO document part number. (This can be "part-subpart" if this is an IEC document.)
:edition:
-
The document edition
:revdate:
-
The date the document was last updated
:draft:
-
The document draft (used in addition to document stage, for multiple iterations: expected format X.Y)
:copyright-year:
-
The year which will be claimed as when the copyright for the document was issued
:library-ics:
-
The ICS (International Classification for Standards) number for the standard. There may be more than one ICS for a document; if so, they should be comma-delimited. (The ICS identifier is added to the document metadata, but is not output to the current document templates.)
:title-intro-en:
-
The introductory component of the English title of the document
:title-main-en:
-
The main component of the English title of the document (mandatory). (The first line of the AsciiDoc document, which contains the title introduced with =, is ignored.)
:title-part-en:
-
The English title of the document part
:title-intro-fr:
-
The introductory component of the French title of the document. (This document template presupposes authoring in English; a different template will be needed for French, including French titles of document components such as annexes.)
:title-main-fr:
-
The main component of the French title of the document (mandatory).
:title-part-fr:
-
The French title of the document part
:doctype:
-
The document type (see ISO deliverables: The different types of ISO publications) (mandatory). The permitted types are: international-standard, technical-specification, technical-report, publicly-available-specification, international-workshop-agreement, guide.
:docstage:
-
The stage code for the document status (see International harmonized stage codes)
:docsubstage:
-
The substage code for the document status (see International harmonized stage codes)
:iteration:
-
The iteration of a stage, in case there have been multiple drafts (e.g. 2 on a CD: this is the second iteration through the CD stage).
:secretariat:
-
The national body acting as the secretariat for the document in the drafting stage
:technical-committee-number:
-
The number of the relevant ISO technical committee
:technical-committee-type:
-
The type of the relevant technical committee. Defaults to TC if not supplied. Values: TC, PC, JTC, JPC.
:technical-committee:
-
The name of the relevant ISO technical committee (mandatory)
:subcommittee-number:
-
The number of the relevant ISO subcommittee
:subcommittee-type:
-
The type of the relevant ISO subcommittee. Defaults to SC if not supplied. Values: SC, JSC.
:subcommittee:
-
The name of the relevant ISO subcommittee
:workgroup-number:
-
The number of the relevant ISO workgroup
:workgroup-type:
-
The type of the relevant ISO workgroup. Defaults to WG if not supplied. Example values: JWG, JAG, AG (advisory group), AHG, SWG, SG, MA (maintenance agency), CORG, JCG, CAG.
:workgroup:
-
The name of the relevant ISO workgroup
:language:
-
The language of the document (en or fr) (mandatory).
:script:
-
The script of the document (defaults to Latn). Must be supplied as Hans for Simplified Chinese.
:publisher:
-
The standards agency publishing the standard; can be multiple (comma-delimited). Defaults to ISO.
:body-font:
-
Font for body text; will be inserted into CSS. Defaults to Cambria for Latin script, SimSun for Simplified Chinese.
:header-font:
-
Font for headers; will be inserted into CSS. Defaults to Cambria for Latin script, SimHei for Simplified Chinese.
:monospace-font:
-
Font for monospace; will be inserted into CSS. Defaults to Courier New.
The attribute :draft:, if present, includes review notes in the XML output; these are otherwise suppressed.
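Before running the gem, it can be handy to check a document header against the attributes marked mandatory above. The following Ruby sketch is hypothetical: the attribute names and permitted document types are copied from this README, but the `header_problems` checker itself is not part of the gem, and the gem's own validation may behave differently.

```ruby
# Attributes flagged "(mandatory)" in the list above.
MANDATORY_ATTRS = %w[docnumber title-main-en title-main-fr doctype
                     technical-committee language].freeze
# Permitted :doctype: values, as listed above.
DOCTYPES = %w[international-standard technical-specification technical-report
              publicly-available-specification international-workshop-agreement
              guide].freeze

# Hypothetical checker: returns a list of problems found in a header hash.
def header_problems(attrs)
  problems = MANDATORY_ATTRS.reject { |name| attrs.key?(name) }
                            .map { |name| "missing :#{name}:" }
  doctype = attrs["doctype"]
  problems << "unknown doctype #{doctype}" if doctype && !DOCTYPES.include?(doctype)
  problems
end

puts header_problems("docnumber" => "17301", "doctype" => "standard",
                     "language" => "en")
```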
The document proper can reference the values of document attributes, which is convenient for reusability. For example, the Rice Model document references the editorial groups that have contributed to the document as
This document was prepared by Technical Committee ISO/TC {technical-committee-number}, _{technical-committee}_, Subcommittee SC {subcommittee-number}, _{subcommittee}_.
If the corresponding document attributes are not populated in the header, then the references themselves will not be populated.
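The substitution behaviour can be illustrated with a small Ruby sketch. It imitates attribute references with a regular expression and is not Asciidoctor's own implementation; the name `substitute_attrs` is an assumption. Unresolved references are left in place, in line with the behaviour described above.

```ruby
# Sketch of {name} substitution: known attributes are replaced, unknown
# references are left untouched.
def substitute_attrs(text, attrs)
  text.gsub(/\{([\w-]+)\}/) { attrs.fetch($1, $&) }
end

line = "Technical Committee ISO/TC {technical-committee-number}, _{technical-committee}_"
puts substitute_attrs(line, "technical-committee-number" => "34")
```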
Data Models
The IsoDoc data model (IsoStandardDocument) is instantiated from the StandardDocument model. For details please visit that page.
Code Structure
The gem invokes the following other gems as a division of labour.
-
This gem generates the IsoDoc XML proper
-
https://github.com/riboseinc/isodoc renders IsoDoc XML into HTML
-
https://github.com/riboseinc/html2doc converts HTML into Microsoft Word
-
https://github.com/riboseinc/isodoc-models derives the ISO Standard grammar used for validation from the generic IsoDoc grammar
Examples
The gem has been tested to date against the "Rice document", the ISO’s model document of an international standard. Sample representation of the Rice document in Asciidoctor, and output formats, are included in the https://github.com/riboseinc/isodoc-rice repository.
See also spec/asciidoctor-iso for individual features.