Module: Sanscript::Transliterate

Defined in:
lib/sanscript/transliterate.rb,
lib/sanscript/transliterate/schemes.rb

Overview

Sanskrit transliteration module. Derived from Sanscript (github.com/sanskrit/sanscript.js), which is released under the MIT and GPL Licenses.

“Sanscript is a Sanskrit transliteration library. Currently, it supports other Indian languages only incidentally.”

Class Attribute Summary

Class Method Summary

Class Attribute Details

.all_alternatesHash (readonly)

Returns the alternate-character data for all schemes.

Returns:

  • (Hash)

    the alternate-character data for all schemes



# File 'lib/sanscript/transliterate.rb', line 27

def all_alternates
  @all_alternates
end

.brahmic_schemesArray<Symbol> (readonly)

Returns the names of all Brahmic schemes.

Returns:

  • (Array<Symbol>)

    the names of all Brahmic schemes



# File 'lib/sanscript/transliterate.rb', line 18

def brahmic_schemes
  @brahmic_schemes
end

.defaultsHash (readonly)

Returns the default transliteration options.

Returns:

  • (Hash)

    the default transliteration options



# File 'lib/sanscript/transliterate.rb', line 30

def defaults
  @defaults
end

.roman_schemesArray<Symbol> (readonly)

Returns the names of all roman schemes.

Returns:

  • (Array<Symbol>)

    the names of all roman schemes



# File 'lib/sanscript/transliterate.rb', line 21

def roman_schemes
  @roman_schemes
end

.scheme_namesArray<Symbol> (readonly)

Returns the names of all supported schemes.

Returns:

  • (Array<Symbol>)

    the names of all supported schemes



# File 'lib/sanscript/transliterate.rb', line 15

def scheme_names
  @scheme_names
end

.schemesHash (readonly)

Returns the data for all schemes.

Returns:

  • (Hash)

    the data for all schemes



# File 'lib/sanscript/transliterate.rb', line 24

def schemes
  @schemes
end

Class Method Details

.add_brahmic_scheme(name, scheme) ⇒ Hash

Add a Brahmic scheme to Sanscript.

Schemes are of two types: “Brahmic” and “roman”. Brahmic consonants have an inherent vowel sound, but roman consonants do not. This is the main difference between these two types of scheme.

A scheme definition is a Hash that maps a group name to a list of characters. For illustration, see `transliterate/schemes.rb`.

You can use whatever group names you like, but for the best results, you should use the same group names that Sanscript does.
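As a rough illustration of that shape, the sketch below builds a tiny Devanagari fragment. Only :vowels and :vowel_marks are group names that appear in this module's own code; the :consonants group and the TOY_SCHEME constant are assumptions made for the example, so take the real group names from transliterate/schemes.rb.

```ruby
# A toy scheme definition: group name (Symbol) => list of characters.
# TOY_SCHEME and the :consonants group are illustrative assumptions,
# not data shipped with the library.
TOY_SCHEME = {
  vowels:      %w[अ आ इ ई],
  vowel_marks: %w[ा ि ी],   # one entry fewer: the first vowel has no mark
  consonants:  %w[क ख ग]
}.freeze

# Print each group with its characters.
TOY_SCHEME.each do |group, chars|
  puts "#{group}: #{chars.join(' ')}"
end
```

A Hash of this form is what the scheme parameter is expected to receive.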

Parameters:

  • name (Symbol)

    the scheme name

  • scheme (Hash)

    the scheme data, constructed as described above

Returns:

  • (Hash)

    the frozen scheme data as it exists inside the module



# File 'lib/sanscript/transliterate.rb', line 73

def add_brahmic_scheme(name, scheme)
  name = name.to_sym
  scheme = scheme.deep_dup
  @schemes[name] = scheme.deep_freeze
  @brahmic_schemes.add(name)
  @scheme_names.add(name)
  scheme
end

.add_roman_scheme(name, scheme) ⇒ Hash

Add a roman scheme to Sanscript.

Parameters:

  • name (Symbol)

    the scheme name

  • scheme (Hash)

    the scheme data, constructed as in add_brahmic_scheme. The :vowel_marks field can be omitted, in which case it defaults to every entry of :vowels after the first

Returns:

  • (Hash)

    the frozen scheme data as it exists inside the module



# File 'lib/sanscript/transliterate.rb', line 88

def add_roman_scheme(name, scheme)
  name = name.to_sym
  scheme = scheme.deep_dup
  scheme[:vowel_marks] = scheme[:vowels][1..-1] unless scheme.key?(:vowel_marks)
  @schemes[name] = scheme.deep_freeze
  @roman_schemes.add(name)
  @scheme_names.add(name)
  scheme
end
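The defaulting step for :vowel_marks can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper name (with_default_vowel_marks) and made-up sample data; it reproduces only the one line from the listing above, not the registration itself.

```ruby
# Hypothetical helper mirroring the :vowel_marks defaulting step.
# It works on a shallow copy so the caller's Hash is left unchanged.
def with_default_vowel_marks(scheme)
  scheme = scheme.dup
  scheme[:vowel_marks] = scheme[:vowels][1..-1] unless scheme.key?(:vowel_marks)
  scheme
end

# Illustrative data only, not the library's real IAST table.
sample = { vowels: %w[a ā i ī u ū] }
marks = with_default_vowel_marks(sample)[:vowel_marks]
puts marks.join(" ")
```

Here the derived marks are every vowel except the first. A scheme that already has a :vowel_marks key keeps its own list.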

.brahmic_scheme?(name) ⇒ Boolean

Check whether the given scheme encodes Brahmic Sanskrit.

Parameters:

  • name (Symbol)

    the scheme name

Returns:

  • (Boolean)


# File 'lib/sanscript/transliterate.rb', line 46

def brahmic_scheme?(name)
  @brahmic_schemes.include?(name.to_sym)
end

.roman_scheme?(name) ⇒ Boolean

Check whether the given scheme encodes romanized Sanskrit.

Parameters:

  • name (Symbol)

    the scheme name

Returns:

  • (Boolean)


# File 'lib/sanscript/transliterate.rb', line 54

def roman_scheme?(name)
  @roman_schemes.include?(name.to_sym)
end
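Because both predicates call to_sym on their argument, a String name should behave the same as a Symbol. The stand-alone sketch below imitates that lookup with a hypothetical module (SchemeRegistrySketch) and placeholder scheme names; it is not the library's actual registry.

```ruby
require "set"

# Hypothetical stand-in for the module's internal bookkeeping.
# The names in the Set are placeholders, not the shipped scheme list.
module SchemeRegistrySketch
  @roman_schemes = Set[:iast, :itrans]

  # Same shape as roman_scheme?: coerce the name, then test membership.
  def self.roman_scheme?(name)
    @roman_schemes.include?(name.to_sym)
  end
end

puts SchemeRegistrySketch.roman_scheme?("iast")       # String is coerced
puts SchemeRegistrySketch.roman_scheme?(:itrans)      # Symbol works directly
puts SchemeRegistrySketch.roman_scheme?("devanagari") # not in the Set
```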

.transliterate(data, from, to, **opts) ⇒ String

Transliterate from one script to another.

Parameters:

  • data (String)

    the String to transliterate

  • from (Symbol)

    the source script

  • to (Symbol)

    the destination script

  • opts (Hash)

    a customizable set of options

Options Hash (**opts):

  • :skip_sgml (Boolean) — default: false

    skip SGML-style tags (such as HTML or XML markup) so that they are left untransliterated

  • :syncope (Boolean) — default: false

    activate Hindi-style schwa syncope

Returns:

  • (String)

    the transliterated string

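Raises:

  • (SchemeNotSupportedError)

    if the source or destination scheme is not registered (checked only when from and to differ)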



# File 'lib/sanscript/transliterate.rb', line 142

def transliterate(data, from, to, **opts)
  from = from.to_sym
  to = to.to_sym
  return data if from == to
  raise SchemeNotSupportedError, from unless @schemes.key?(from)
  raise SchemeNotSupportedError, to unless @schemes.key?(to)

  data = data.to_str.dup
  options = @defaults.merge(opts)
  map = make_map(from, to)

  data.gsub!(/(<.*?>)/, "##\\1##") if options[:skip_sgml]

  # Easy way out for "{\m+}", "\", and ".h".
  if from == :itrans
    data.gsub!(/\{\\m\+\}/, ".h.N")
    data.gsub!(/\.h/, "")
    data.gsub!(/\\([^'`_]|$)/, "##\\1##")
  end

  if map[:from_roman?]
    transliterate_roman(data, map, options)
  else
    transliterate_brahmic(data, map)
  end
end
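The preprocessing substitutions in the listing above can be examined on their own. The sketch below copies the regular expressions into two hypothetical helpers (mark_sgml_tags and preprocess_itrans, names invented here) and uses non-mutating gsub. It assumes, without confirmation from this page, that the "##" markers delimit spans the later transliteration pass leaves untouched.

```ruby
# Hypothetical helpers isolating two preprocessing steps of transliterate.

# The :skip_sgml step: wrap every SGML-style tag in "##" markers.
def mark_sgml_tags(text)
  text.gsub(/(<.*?>)/, "##\\1##")
end

# The ITRANS-only cleanup, applied in the same order as the source.
def preprocess_itrans(text)
  text
    .gsub(/\{\\m\+\}/, ".h.N")        # "{\m+}" becomes ".h.N"
    .gsub(/\.h/, "")                  # every ".h" is dropped
    .gsub(/\\([^'`_]|$)/, "##\\1##")  # "\x" becomes "##x##"
end

puts mark_sgml_tags("<b>rAma</b>")  # prints ##<b>##rAma##</b>##
puts preprocess_itrans("a{\\m+}")   # prints a.N
puts preprocess_itrans("ra\\ma")    # prints ra##m##a
```

Note that the first two ITRANS substitutions together reduce "{\m+}" to ".N", because the ".h" introduced by the first is removed by the second.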