Module: Sanscript::Transliterate

Defined in:
lib/sanscript/transliterate.rb,
lib/sanscript/transliterate/schemes.rb

Overview

Sanskrit transliteration module. Derived from Sanscript, released under the MIT and GPL Licenses. “Sanscript is a Sanskrit transliteration library. Currently, it supports other Indian languages only incidentally.”

Class Attribute Details

.all_alternates ⇒ Hash (readonly)

Returns the alternate-character data for all schemes.

Returns:

  • (Hash)

    the alternate-character data for all schemes



# File 'lib/sanscript/transliterate.rb', line 26

def all_alternates
  @all_alternates
end

.brahmic_schemes ⇒ Array<Symbol> (readonly)

Returns the names of all Brahmic schemes.

Returns:

  • (Array<Symbol>)

    the names of all Brahmic schemes



# File 'lib/sanscript/transliterate.rb', line 17

def brahmic_schemes
  @brahmic_schemes
end

.defaults ⇒ Hash (readonly)

Returns the default transliteration options.

Returns:

  • (Hash)

    the default transliteration options



# File 'lib/sanscript/transliterate.rb', line 29

def defaults
  @defaults
end

.roman_schemes ⇒ Array<Symbol> (readonly)

Returns the names of all roman schemes.

Returns:

  • (Array<Symbol>)

    the names of all roman schemes



# File 'lib/sanscript/transliterate.rb', line 20

def roman_schemes
  @roman_schemes
end

.scheme_names ⇒ Array<Symbol> (readonly)

Returns the names of all supported schemes.

Returns:

  • (Array<Symbol>)

    the names of all supported schemes



# File 'lib/sanscript/transliterate.rb', line 14

def scheme_names
  @scheme_names
end

.schemes ⇒ Hash (readonly)

Returns the data for all schemes.

Returns:

  • (Hash)

    the data for all schemes



# File 'lib/sanscript/transliterate.rb', line 23

def schemes
  @schemes
end

Class Method Details

.add_brahmic_scheme(name, scheme) ⇒ Hash

Add a Brahmic scheme to Sanscript.

Schemes are of two types: “Brahmic” and “roman”. Brahmic consonants have an inherent vowel sound, but roman consonants do not. This is the main difference between these two types of scheme.

A scheme definition is a Hash that maps a group name to a list of characters. For illustration, see ‘transliterate/schemes.rb’.

You can use whatever group names you like, but for the best results, you should use the same group names that Sanscript does.
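As a rough illustration of that shape, here is a toy scheme written as a plain Ruby Hash. The group names `:vowels`, `:vowel_marks` and `:consonants` are assumed to follow Sanscript's conventions, the character lists are deliberately tiny, and `valid_scheme?` is a hypothetical helper that is not part of the module.

```ruby
# Hypothetical shape check: every group name is a Symbol and every group
# is an Array of Strings. Sanscript itself does not ship this helper.
def valid_scheme?(scheme)
  scheme.is_a?(Hash) &&
    scheme.all? do |group, chars|
      group.is_a?(Symbol) && chars.is_a?(Array) && chars.all? { |c| c.is_a?(String) }
    end
end

# Toy Brahmic-style scheme (illustrative only, far from complete).
# The first vowel is inherent in a consonant, so it has no vowel mark.
toy_scheme = {
  vowels:      %w[अ आ इ ई],
  vowel_marks: %w[ा ि ी],
  consonants:  %w[क ख ग घ]
}

puts valid_scheme?(toy_scheme)
```

A Hash of this form is what would be passed as the `scheme` argument of `add_brahmic_scheme`.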

Parameters:

  • name (Symbol)

    the scheme name

  • scheme (Hash)

    the scheme data, constructed as described above

Returns:

  • (Hash)

    the frozen scheme data as it exists inside the module



# File 'lib/sanscript/transliterate.rb', line 72

def add_brahmic_scheme(name, scheme)
  name = name.to_sym
  scheme = scheme.deep_dup
  @schemes[name] = scheme.deep_freeze
  @brahmic_schemes.add(name)
  @scheme_names.add(name)
  scheme
end

.add_roman_scheme(name, scheme) ⇒ Hash

Add a roman scheme to Sanscript.

Parameters:

  • name (Symbol)

    the scheme name

  • scheme (Hash)

    the scheme data, constructed as in add_brahmic_scheme. The “vowel_marks” field can be omitted; it then defaults to every entry of “vowels” except the first

Returns:

  • (Hash)

    the frozen scheme data as it exists inside the module



# File 'lib/sanscript/transliterate.rb', line 87

def add_roman_scheme(name, scheme)
  name = name.to_sym
  scheme = scheme.deep_dup
  scheme[:vowel_marks] = scheme[:vowels][1..-1] unless scheme.key?(:vowel_marks)
  @schemes[name] = scheme.deep_freeze
  @roman_schemes.add(name)
  @scheme_names.add(name)
  scheme
end
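The defaulting step for “vowel_marks” can be tried in isolation. The sketch below is plain Ruby with no dependency on the gem; `with_default_vowel_marks` is a hypothetical name, and it uses a simple per-group `dup` in place of the `deep_dup` call in the method above.

```ruby
# Hypothetical helper that mirrors the defaulting line in add_roman_scheme.
def with_default_vowel_marks(scheme)
  # Copy each group so the caller's Hash and Arrays are left untouched.
  copy = scheme.transform_values(&:dup)
  # A roman scheme writes vowel marks with the same letters as its vowels,
  # minus the first vowel, which has no mark.
  copy[:vowel_marks] = copy[:vowels][1..-1] unless copy.key?(:vowel_marks)
  copy
end

toy_roman = { vowels: %w[a A i I], consonants: %w[k kh g gh] }
puts with_default_vowel_marks(toy_roman)[:vowel_marks].inspect
```

An explicitly supplied `:vowel_marks` group is kept as given, since the assignment only runs when the key is absent.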

.brahmic_scheme?(name) ⇒ Boolean

Check whether the given scheme encodes Brahmic Sanskrit.

Parameters:

  • name (Symbol)

    the scheme name

Returns:

  • (Boolean)


# File 'lib/sanscript/transliterate.rb', line 45

def brahmic_scheme?(name)
  @brahmic_schemes.include?(name.to_sym)
end

.roman_scheme?(name) ⇒ Boolean

Check whether the given scheme encodes romanized Sanskrit.

Parameters:

  • name (Symbol)

    the scheme name

Returns:

  • (Boolean)


# File 'lib/sanscript/transliterate.rb', line 53

def roman_scheme?(name)
  @roman_schemes.include?(name.to_sym)
end
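Both predicates call `to_sym` on the name before the lookup, so a String and a Symbol are interchangeable. The sketch below models that bookkeeping with a hypothetical `ToyRegistry` class. It assumes the scheme collections behave like a `Set` (the `add` calls in the registration methods suggest this, but the section does not show their construction), and it is not the module's own code.

```ruby
require "set"

# Hypothetical stand-in for the module's scheme bookkeeping.
class ToyRegistry
  def initialize
    @schemes = {}
    @brahmic_schemes = Set.new   # assumption: a Set-like collection
    @roman_schemes = Set.new
  end

  def add_brahmic_scheme(name, scheme)
    name = name.to_sym
    @schemes[name] = scheme
    @brahmic_schemes.add(name)
    scheme
  end

  def add_roman_scheme(name, scheme)
    name = name.to_sym
    @schemes[name] = scheme
    @roman_schemes.add(name)
    scheme
  end

  # Names are normalized to Symbols before the membership test.
  def brahmic_scheme?(name)
    @brahmic_schemes.include?(name.to_sym)
  end

  def roman_scheme?(name)
    @roman_schemes.include?(name.to_sym)
  end
end

registry = ToyRegistry.new
registry.add_brahmic_scheme("toy_deva", { vowels: %w[अ आ] })
registry.add_roman_scheme(:toy_latin, { vowels: %w[a A] })

puts registry.brahmic_scheme?(:toy_deva)   # registered under a String, queried with a Symbol
puts registry.roman_scheme?("toy_latin")   # registered under a Symbol, queried with a String
puts registry.brahmic_scheme?(:toy_latin)  # a roman scheme is not Brahmic
```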

.transliterate(data, from, to, **opts) ⇒ String

Transliterate from one script to another.

Parameters:

  • data (String)

    the String to transliterate

  • from (Symbol)

    the source script

  • to (Symbol)

    the destination script

  • opts (Hash)

    a customizable set of options

Options Hash (**opts):

  • :skip_sgml (Boolean) — default: false

    escape SGML-style tags in the text by wrapping each tag in ##…## markers

  • :syncope (Boolean) — default: false

    activate Hindi-style schwa syncope

Returns:

  • (String)

    the transliterated string



# File 'lib/sanscript/transliterate.rb', line 141

def transliterate(data, from, to, **opts)
  from = from.to_sym
  to = to.to_sym
  return data if from == to
  raise "Scheme not known ':#{from}'" unless @schemes.key?(from)
  raise "Scheme not known ':#{to}'" unless @schemes.key?(to)

  data = data.to_str.dup
  options = @defaults.merge(opts)
  map = make_map(from, to)

  data.gsub!(/(<.*?>)/, "##\\1##") if options[:skip_sgml]

  # Easy way out for "{\m+}", "\", and ".h".
  if from == :itrans
    data.gsub!(/\{\\m\+\}/, ".h.N")
    data.gsub!(/\.h/, "")
    data.gsub!(/\\([^'`_]|$)/, "##\\1##")
  end

  if map[:from_roman?]
    transliterate_roman(data, map, options)
  else
    transliterate_brahmic(data, map)
  end
end
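The pre-processing that runs before the mapping step can be exercised without the gem. The sketch below copies only the `gsub!` calls from the method above into a hypothetical `preprocess` helper. The `##` markers are presumably what the later stages use to leave a span untouched; that handling is not shown in this section and is not reproduced here.

```ruby
# Hypothetical helper reproducing only the pre-processing of transliterate.
def preprocess(data, from:, skip_sgml: false)
  data = data.dup
  # Wrap each SGML-style tag in ## markers (non-greedy, one tag per match).
  data.gsub!(/(<.*?>)/, "##\\1##") if skip_sgml

  if from == :itrans
    data.gsub!(/\{\\m\+\}/, ".h.N")        # "{\m+}" becomes ".h.N"
    data.gsub!(/\.h/, "")                  # every ".h" is then dropped
    data.gsub!(/\\([^'`_]|$)/, "##\\1##")  # backslash escapes become ## spans
  end
  data
end

puts preprocess("<b>rAma</b>", from: :hk, skip_sgml: true)
puts preprocess("a{\\m+}", from: :itrans)
puts preprocess("x\\ay", from: :itrans)
```

The three calls print `##<b>####rAma##</b>##`-style output only for tags: the first yields `##<b>##rAma##</b>##`, the second `a.N`, and the third `x##a##y`.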