Tate ✍️
Tate converts accented characters and transliterates non-Latin scripts to their closest ASCII equivalents.
Tate is a productivity tool that behaves like a standard Unix application and can be chained with other Unix commands: it reads from standard input and writes to standard output. You can use it either as a command-line utility or as a library.
Examples
Let's say you have a French sentence with a lot of weird characters and you want to convert it into ASCII in the most representative way. You can use:
echo 'Le cœur de la crémière' | tate #=> Le coeur de la cremiere
Or some Bulgarian text you can't read:
echo 'Здравей!' | tate --lang=bg #=> Zdravey!
Set the language with the --lang option to apply custom filters, e.g. German:
echo 'Von großen Blöcken haut man große Stücke.' | tate --lang=de
The letters ö, ü and ß will be transliterated based on German transliteration rules:
Von grossen Bloecken haut man grosse Stuecke.
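The difference between the default mapping and a language-specific filter can be pictured as two lookup tables, with the language table taking precedence. The following is a minimal sketch of that idea only: DEFAULT_RULES, GERMAN_RULES, and sketch_transliterate are hypothetical names, and the tables are tiny illustrative samples, not tate's actual rules or implementation.

```ruby
# Hypothetical sample tables for illustration; not tate's real rule data.
DEFAULT_RULES = { 'ö' => 'o', 'ü' => 'u', 'ß' => 'ss' }.freeze
GERMAN_RULES  = { 'ö' => 'oe', 'ü' => 'ue' }.freeze

# Language-specific rules override the defaults when lang is 'de'.
def sketch_transliterate(text, lang = nil)
  rules = lang == 'de' ? DEFAULT_RULES.merge(GERMAN_RULES) : DEFAULT_RULES
  # Replace each non-ASCII character; characters without a rule are left as-is.
  text.gsub(/[^\x00-\x7F]/) { |char| rules.fetch(char, char) }
end

puts sketch_transliterate('Blöcke')       # prints "Blocke"
puts sketch_transliterate('Blöcke', 'de') # prints "Bloecke"
```

Merging the language table over the defaults keeps each language file small: it only needs to list the characters whose treatment differs.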
Language-specific punctuation is converted to its closest ASCII equivalent.
For example, in Catalan, notice how the quotes (cometes franceses) and the interpunct (punt volat) are transliterated:
«Dóna amor que seràs feliç!». Això, il·lús company geniüt, ja és un lluït rètol blavís d’onze kWh.
"Dona amor que seras felic!". Aixo, il-lus company geniut, ja es un lluit retol blavis d'onze kWh.
Installation
Add this line to your application's Gemfile:
gem 'tate'
And then execute:
$ bundle
Or install it yourself as:
$ gem install tate
Usage
Ruby Library
require 'tate'
# The second positional argument is the language code.
Tate::transliterate('Zəfər', 'az') #=> Zefer
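One common use of a library call like this is building ASCII-only identifiers such as URL slugs. The sketch below is an assumption about how that might look and is not part of tate: slugify is a hypothetical helper, and the transliteration step is passed in as a callable so the helper does not hard-code any particular library signature.

```ruby
# Hypothetical helper, not part of tate. `transliterator` is any callable
# that maps a string to its ASCII form.
def slugify(text, transliterator)
  transliterator.call(text)
                .downcase
                .gsub(/[^a-z0-9]+/, '-') # collapse runs of other characters
                .gsub(/\A-+|-+\z/, '')   # trim leading and trailing hyphens
end

# With the gem loaded, the callable could wrap the library call, e.g.:
#   to_ascii = ->(s) { Tate::transliterate(s, 'az') }
#   slugify('Zəfər', to_ascii)
```

Passing the transliterator in also makes the helper easy to test with a stub.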
Command-line Utility
Usage: tate [options]
-l, --lang=[LANGUAGE] Set language for custom filters
-h, --help Show this message
-v, --version Show version
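The option list above has the shape that Ruby's standard OptionParser prints. As a rough sketch of how such flags could be declared (whether tate itself uses OptionParser is an assumption, and parse_cli is a hypothetical name):

```ruby
require 'optparse'

# Hypothetical parser mirroring the documented flags; returns a Hash of
# the options that were present in argv.
def parse_cli(argv)
  options = {}
  OptionParser.new do |opts|
    opts.banner = 'Usage: tate [options]'
    opts.on('-l', '--lang=[LANGUAGE]', 'Set language for custom filters') do |lang|
      options[:lang] = lang
    end
    opts.on('-h', '--help', 'Show this message') { options[:help] = true }
    opts.on('-v', '--version', 'Show version') { options[:version] = true }
  end.parse(argv) # parse (not parse!) leaves argv unmodified
  options
end

p parse_cli(['--lang=bg'])
```

The square brackets in --lang=[LANGUAGE] mark the argument as optional, so the flag may appear without a value.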
Interactive Mode
If you call tate without providing any arguments, it reads from standard input (the keyboard). When you are done typing, press Ctrl + D to signal EOF (end of file), and the result will be printed on the next line.
Standard Streams
You can pipe the output of another command into tate.
curl gov.bg/bg | tate --lang=bg > index.html
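The pipe above works because a Unix filter reads standard input until end of file and writes its result to standard output. A minimal sketch of that pattern follows; run_filter is a hypothetical name, and the upcase transform is only a stand-in for a real transliteration step.

```ruby
require 'stringio'

# Hypothetical filter loop: read each line from `input`, transform it with
# the given block, and write the result to `output`.
def run_filter(input = $stdin, output = $stdout)
  input.each_line { |line| output.write(yield(line)) }
end

# Demonstration with in-memory streams instead of a real pipe.
source = StringIO.new("first line\nsecond line\n")
sink = StringIO.new
run_filter(source, sink) { |line| line.upcase }
puts sink.string
```

Because the streams are parameters, the same loop serves both piped input and interactive typing at the keyboard.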
Language Support
There are custom filters for:
Azeri, Bulgarian, Catalan, French, German, Hungarian, Polish, Romanian, Spanish, and Vietnamese.
The following languages are known to work (w/o custom filters):
Croatian, Czech, Danish, Esperanto, Estonian, Finnish, Icelandic, Latvian, Lithuanian, Norwegian, Portuguese, Scottish, Slovak, Slovenian, Swedish, Turkish, and Welsh.
What's next?
Russian, Irish, Arabic, and Yoruba.
Is it any good?
Yes.
Contributing
- Fork it (https://github.com/krmbzds/tate/fork)
- Create your feature branch (git checkout -b add-irish-support)
- Commit your changes (git commit -am 'Add Irish language support')
- Push to the branch (git push origin add-irish-support)
- Create a new Pull Request
Custom Filters
You can add custom language filters under the lib/rules directory.
Donations
You can donate to me on Liberapay. Thanks! ☕️
Trivia
tate is short for transliter*ate*.
Nobody has time to type transliterate in the terminal.
License
Copyright © 2019 Kerem Bozdas
This project is available under the terms of the MIT License.