YAMultilingualMarkdown
- English / Japanese
YAMultilingualMarkdown is a utility to convert Yet Another Multilingual Markdown to HTML and other formats.
Yet Another Multilingual Markdown (format) is a Markdown dialect designed for hosting multilingual content. YAMultilingualMarkdown (tool) converts Yet Another Multilingual Markdown to other formats while extracting only the content in the specified language(s).
Usage
Synopsis
ya_multilingual_markdown [OPTIONS] [FILE]
Options
Run ya_multilingual_markdown --help to show the command-line options.
Convert Yet Another Multilingual Markdown to HTML and other formats
Usage:
ya_multilingual_markdown [OPTIONS] [FILE]
Options:
  --output-format=FORMAT         Output format
                                 (html|kramdown|markdown)
                                 (default: html)
  --langs=LANG1,LANG2,...        Languages to be included
                                 (omitting this option implies all)
  --heading-lang-sep=STRING      Languages separator in headings
                                 (default: " / ")
  --lang-attr-name=STRING        Attribute name for language
                                 (default: lang)
  --html-output-type=TYPE        Output type for HTML output
                                 (fragment|document)
                                 (default: fragment)
  --html-template-file=FILE      HTML document template file in eRuby format
  --html-link-suffixes=FROM:TO,...
                                 Link suffixes to rewrite in HTML output
                                 (default: .md:.html)
  --show-default-html-template   Show default HTML document template
  --log-level=SEVERITY           Log level
                                 (unknown|fatal|error|warn|info|debug)
                                 (default: warn)
  --help                         Show help
  --version                      Show version
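The --html-link-suffixes option takes comma-separated FROM:TO pairs. The following is a minimal Ruby sketch of what such a rewrite could look like, not the tool's actual code: the helper names parse_link_suffixes and rewrite_link are hypothetical, and the handling of #fragment parts is an assumption.

```ruby
# Parse a "FROM:TO,..." specification into [from, to] pairs,
# e.g. ".md:.html" => [[".md", ".html"]].
def parse_link_suffixes(spec)
  spec.split(",").map { |pair| pair.split(":", 2) }
end

# Rewrite the suffix of a link target using the first matching rule.
# Assumption: a trailing "#fragment" is kept as-is; targets that match
# no rule are returned unchanged.
def rewrite_link(href, rules)
  path, fragment = href.split("#", 2)
  path ||= ""
  from, to = rules.find { |suffix, _| path.end_with?(suffix) }
  path = path.delete_suffix(from) + to if from
  fragment ? [path, fragment].join("#") : path
end

rules = parse_link_suffixes(".md:.html")
puts rewrite_link("snow_white.md", rules)
puts rewrite_link("snow_white.md#chapter-1", rules)
puts rewrite_link("cover.png", rules)
```

With the default rule, links between Markdown sources would keep working after each file is converted to HTML.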
Examples
Multilingual contents: headings, paragraphs, and other elements
A simple Yet Another Multilingual Markdown document looks like the following (snow_white.md):
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
Es war einmal mitten im Winter,...
{: lang="de"}
Once upon a time in the middle of winter,...
{: lang="en"}
(You can use kramdown-style, i.e. PHP Markdown Extra-style, extended syntax such as {: name="value"} to add attributes to block elements.)
Keep all languages
Without language-related options, the output will contain all languages.
ya_multilingual_markdown snow_white.md
Excerpt from the output:
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
<p lang="de">Es war einmal mitten im Winter,...</p>
<p lang="en">Once upon a time in the middle of winter,...</p>
In a browser, the output above may look like the following:
Schneeweißchen / Little Snow-white
Es war einmal mitten im Winter, ...
Once upon a time in the middle of winter, ...
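Judging from the output above, consecutive headings in different languages end up in a single heading element, one span per language, joined by the heading separator. The Ruby sketch below illustrates that idea only; merge_headings is a hypothetical helper, not part of the tool, and its sep and attr_name parameters merely mirror the defaults of --heading-lang-sep and --lang-attr-name.

```ruby
require "erb"

# Combine same-level headings written in different languages into one
# HTML heading. `headings` is an array of [lang, text] pairs.
def merge_headings(headings, level: 1, sep: " / ", attr_name: "lang")
  spans = headings.map do |lang, text|
    escaped_lang = ERB::Util.html_escape(lang)
    escaped_text = ERB::Util.html_escape(text)
    %(<span #{attr_name}="#{escaped_lang}">#{escaped_text}</span>)
  end
  "<h#{level}>#{spans.join(sep)}</h#{level}>"
end

puts merge_headings([["de", "Schneeweißchen"], ["en", "Little Snow-white"]])
```

With the default arguments, this prints the same h1 line as in the excerpt.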
Extract single language
With option --langs=en, the output will contain only elements whose lang attribute is set to en (and elements without a lang attribute).
ya_multilingual_markdown --langs=en snow_white.md
Excerpt from the output:
<h1><span lang="en">Little Snow-white</span></h1>
<p lang="en">Once upon a time in the middle of winter, ...</p>
In a browser, the output above may look like the following:
Little Snow-white
Once upon a time in the middle of winter, ...
Extract multiple languages
With option --langs=de,en, the output will contain elements with lang set to de or en (and elements without lang).
ya_multilingual_markdown --langs=de,en snow_white.md
Excerpt from the output:
<h1><span lang="de">Schneeweißchen</span> / <span lang="en">Little Snow-white</span></h1>
<p lang="de">Es war einmal mitten im Winter,...</p>
<p lang="en">Once upon a time in the middle of winter,...</p>
In a browser, the output may look like the following:
Schneeweißchen / Little Snow-white
Es war einmal mitten im Winter, ...
Once upon a time in the middle of winter, ...
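The selection rule described in the examples above (keep elements whose lang is listed, always keep elements without lang, keep everything when --langs is omitted) can be sketched in a few lines of Ruby. This is an illustration only: the Element struct and filter_by_langs are hypothetical names, and the tool itself works on a kramdown document tree rather than a flat list.

```ruby
# Hypothetical flat representation of a block element.
Element = Struct.new(:text, :lang)

# Keep elements whose lang is listed in `langs`, plus elements that have
# no lang at all. A nil `langs` (option omitted) keeps everything.
def filter_by_langs(elements, langs)
  return elements if langs.nil?
  elements.select { |el| el.lang.nil? || langs.include?(el.lang) }
end

elements = [
  Element.new("Es war einmal mitten im Winter,...", "de"),
  Element.new("Once upon a time in the middle of winter,...", "en"),
  Element.new("* * *", nil) # no lang attribute: always kept
]

langs = "en".split(",") # as if --langs=en had been given
filter_by_langs(elements, langs).each { |el| puts el.text }
```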
Metadata in YAML front matter
Document metadata can be stored in the document using Jekyll-style YAML front matter.
A simple Yet Another Multilingual Markdown document with YAML front matter looks like the following (snow_white_with_metadata.md):
---
title: Little Snow-white
author:
- Jacob Ludwig Karl Grimm
- Wilhelm Carl Grimm
meta:
- name: original title
content: Schneewei
(The key author is a shortcut for <meta name="author" .../>.)
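To make the front matter idea concrete, here is a small Ruby sketch that splits Jekyll-style front matter from the body and renders the title and author keys as head elements. It is not the tool's implementation: split_front_matter and head_elements are hypothetical helpers, the sample input was written for this sketch, and the language-tagged meta entries are left out on purpose.

```ruby
require "yaml"
require "erb"

# Sample input written for this sketch.
SOURCE = <<~MD
  ---
  title: Little Snow-white
  author:
    - Jacob Ludwig Karl Grimm
    - Wilhelm Carl Grimm
  ---
  Once upon a time in the middle of winter, ...
MD

# Split a document into [metadata Hash, body String].
# Documents without front matter yield an empty Hash.
def split_front_matter(source)
  match = source.match(/\A---\s*\n(.*?)\n---\s*\n/m)
  return [{}, source] unless match
  [YAML.safe_load(match[1]) || {}, match.post_match]
end

# Render the title and author keys as HTML head elements.
def head_elements(meta)
  lines = []
  lines << "<title>#{ERB::Util.html_escape(meta['title'])}</title>" if meta["title"]
  Array(meta["author"]).each do |name|
    lines << %(<meta name="author" content="#{ERB::Util.html_escape(name)}" />)
  end
  lines
end

meta, body = split_front_matter(SOURCE)
puts head_elements(meta)
puts body
```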
Let us include all languages in the output:
ya_multilingual_markdown snow_white_with_metadata.md
Excerpt from the output:
<title>Little Snow-white</title>
<meta name="author" content="Jacob Ludwig Karl Grimm" />
<meta name="author" content="Wilhelm Carl Grimm" />
<meta name="original title" content="Schneeweißchen" lang="de" />
<meta name="translator" content="Margaret Hunt" lang="en" />
<p>...</p>
You can filter metadata by language.
Let us include only en (and thus exclude de) in the output:
ya_multilingual_markdown --langs=en snow_white_with_metadata.md
Excerpt from the output:
<title>Little Snow-white</title>
<meta name="author" content="Jacob Ludwig Karl Grimm" />
<meta name="author" content="Wilhelm Carl Grimm" />
<meta name="translator" content="Margaret Hunt" lang="en" />
<p>...</p>
Output complete HTML document
Use --html-output-type=document to print a complete HTML document rather than an HTML fragment.
Input:
---
title: Little Snow-white
---
Once upon a time in the middle of winter, ...
Command line:
ya_multilingual_markdown --html-output-type=document snow_white_with_title.md
Output:
<!DOCTYPE html>
<html>
<head>
<title>Little Snow-white</title>
</head>
<body>
<p>Once upon a time in the middle of winter, ...</p>
</body>
</html>
You can provide a custom template using --html-template-file=FILE. Templates must be in eRuby format. Use --show-default-html-template to see the built-in default template.
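For readers unfamiliar with eRuby, the sketch below renders a small template with Ruby's standard ERB library. The variable names title and body are assumptions made for this example; the names the tool actually exposes to templates can be checked with --show-default-html-template. render_document is likewise a hypothetical helper.

```ruby
require "erb"

# Hypothetical template: `title` and `body` are assumed variable names.
TEMPLATE = <<~ERUBY
  <!DOCTYPE html>
  <html>
  <head>
  <title><%= ERB::Util.html_escape(title) %></title>
  </head>
  <body>
  <%= body %>
  </body>
  </html>
ERUBY

# Render an eRuby template with the given local variables.
def render_document(template, title:, body:)
  ERB.new(template).result_with_hash(title: title, body: body)
end

puts render_document(
  TEMPLATE,
  title: "Little Snow-white",
  body: "<p>Once upon a time in the middle of winter, ...</p>"
)
```

The title is escaped because it comes from plain-text metadata, while the body is inserted as-is because it is assumed to be HTML already.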
Installation
gem install ya_multilingual_markdown
or
git clone https://github.com/hisashim/ya_multilingual_markdown
cd ya_multilingual_markdown
rake install
Requirements
Runtime requirements:
Development requirements (in addition to runtime requirements):
Notes
Limitations and known problems
Only a small subset of kramdown's extended syntax is supported, although YAMultilingualMarkdown is built upon kramdown.
For multilingual headings, the ALD (Attribute List Definition) of each heading must be placed after the heading, not before it.
Supported:
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
Not supported:
{: lang="de"}
# Schneeweißchen
{: lang="en"}
# Little Snow-white
This compromise allows us to write id at the beginning of headings as well as at the end, with less code.
{: #title}
# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}

# Schneeweißchen
{: lang="de"}
# Little Snow-white
{: lang="en"}
{: #title}
Motivation
Yet Another Multilingual Markdown and its processor were born out of the need for a manuscript format for translated books.
Having a side-by-side version of the galley proof that includes both the original and translated texts helps translators review their work. Being able to search and edit manuscripts in a (sort of) side-by-side format is also useful.
While placing translated text in separate files from the original is a common and effective approach for localization/multilingualization projects, a format allowing multiple languages within a single file comes in handy for small projects. Yet Another Multilingual Markdown is an attempt to develop a proof of concept for such a format and a processing tool.
See also
Requirements for Japanese Text Layout is an excellent example of a multilingual document in HTML format.
Lightweight text formats and processing tools that allow multiple languages to be written in a single file (they do not necessarily feature or aim at extracting or representing multiple languages side by side):
License
This software is distributed under the terms of the MIT license.
Acknowledgments
Many thanks to:
- Koichi Sasada, whose manuscript preprocessor inspired me to come up with a lightweight markup format that features multilingualization.
- kramdown developers
Contributors
- Hisashi Morita - creator and maintainer