Class: Typogrowth::Parser

Inherits:
Object
  • Object
Defined in:
lib/typogrowth.rb

Overview

Parses and corrects the typography in strings. It supports different language rules and easy user rules customization.

Constant Summary

DEFAULT_SET = 'typogrowth'
HTML_TAG_RE = /<[^>]*>/
@@instance = Parser.new  # Ready-to-use single instance


Constructor Details

#initialize(file = nil) ⇒ Parser

Returns a new instance of Parser.



# File 'lib/typogrowth.rb', line 163

def initialize file = nil
  file = DEFAULT_SET unless file
  @yaml = YAML.load_file "#{File.dirname(__FILE__)}/config/#{file}.yaml"
  @yaml.delete(:placeholder)
  @shadows = [HTML_TAG_RE, URI.regexp(['ftp', 'http', 'https', 'mailto'])]
end

Instance Attribute Details

#shadows ⇒ Object (readonly)

Returns the value of attribute shadows.



# File 'lib/typogrowth.rb', line 34

def shadows
  @shadows
end

#yaml ⇒ Object (readonly)

Returns the value of attribute yaml.



# File 'lib/typogrowth.rb', line 34

def yaml
  @yaml
end

Class Method Details

.is_ru?(str, shadows: []) ⇒ Boolean

Class-level shortcut for #is_ru? that delegates to the shared @@instance.

Returns:

  • (Boolean)


# File 'lib/typogrowth.rb', line 156

def self.is_ru? str, shadows: []
  @@instance.is_ru? str, shadows: shadows
end

.parse(str, lang: :default, shadows: []) ⇒ Object

Out-of-place version of string typography correction: runs #parse on a new Parser instance and returns the result. See #parse.



# File 'lib/typogrowth.rb', line 146

def self.parse str, lang: :default, shadows: []
  Parser.new.parse str, lang: lang, shadows: shadows
end

.parse!(str, lang: :default, shadows: []) ⇒ Object

In-place version of string typography correction: replaces the contents of str with the result of .parse.



# File 'lib/typogrowth.rb', line 151

def self.parse! str, lang: :default, shadows: []
  str.replace self.parse str, lang: lang, shadows: shadows
end
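
The difference between .parse and .parse! comes down to String#replace, which swaps the contents of the receiver while keeping the same object. A minimal sketch of that pattern, where `shout` and `shout!` are hypothetical stand-ins for the two class methods and are not part of Typogrowth:

```ruby
# Hypothetical stand-in for an out-of-place transformation such as .parse.
def shout(str)
  "#{str.upcase}!"
end

# Hypothetical stand-in for .parse!: overwrite the caller's string in place.
def shout!(str)
  str.replace(shout(str))
end

greeting = +'hello'               # unary plus yields an unfrozen string
original_id = greeting.object_id
copy = shout(greeting)            # greeting itself is untouched here
shout!(greeting)                  # greeting now holds the transformed text
```

Since String#replace mutates its receiver, a frozen string cannot be passed to a method built this way.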

.safe_delimiters(str) ⇒ Object

Returns a pair of delimiter strings that do not occur in str, doubling both until neither matches.



# File 'lib/typogrowth.rb', line 36

def self.safe_delimiters str
  delimiters = ['', '']
  loop do
    break delimiters unless str.match(/#{delimiters.join('|')}/)
    delimiters.map! {|d| d*2}
  end
end
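
The doubling loop above can be sketched on its own. The helper name `pick_delimiters` and the starting characters `❮` and `❯` are assumptions made for this illustration, not values confirmed by the listing; Regexp.union is used so that the delimiters are escaped before matching.

```ruby
# Hypothetical helper; the seed characters are placeholders for this sketch.
def pick_delimiters(str, seed = ['❮', '❯'])
  delimiters = seed.dup
  # Double both delimiters for as long as either one occurs in the string.
  delimiters.map! { |d| d * 2 } while str.match?(Regexp.union(delimiters))
  delimiters
end

p pick_delimiters('plain text')    # seed pair is already safe
p pick_delimiters('has ❮ inside')  # pair is doubled once
```

The loop terminates because a non-empty delimiter eventually grows longer than the string itself.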

Instance Method Details

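#add_shadows(re) ⇒ Object

Appends one regular expression, or an array of them, to the shadows: patterns whose matches are shielded from substitution.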



# File 'lib/typogrowth.rb', line 137

def add_shadows re
  @shadows.concat [*re]
end

#del_shadows(re) ⇒ Object

Removes the given regular expression, or each one in the given array, from the shadows.



# File 'lib/typogrowth.rb', line 141

def del_shadows re
  @shadows.delete_if { |stored| [*re].include? stored }
end
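
Both methods rely on the `[*re]` splat, which accepts either a single Regexp or an array of them. A minimal sketch of the same add and delete logic, using a hypothetical `ShadowList` class that is not part of Typogrowth:

```ruby
# Hypothetical container mirroring the add/delete logic shown above.
class ShadowList
  attr_reader :shadows

  def initialize(*initial)
    @shadows = initial
  end

  # [*re] wraps a lone Regexp in an array and leaves an array as it is.
  def add(re)
    @shadows.concat([*re])
  end

  # Regexp#== compares source and options, so an equal literal matches.
  def delete(re)
    @shadows.delete_if { |stored| [*re].include?(stored) }
  end
end

list = ShadowList.new(/<[^>]*>/)
list.add(/`[^`]*`/)                       # a single pattern
list.add([/\{\{.*?\}\}/, /\$\$.*?\$\$/])  # several patterns at once
list.delete(/<[^>]*>/)                    # removes the equal stored pattern
p list.shadows.size                       # → 3
```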

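#is_ru?(str, shadows: []) ⇒ Boolean

Checks whether more than a third of the characters left after stripping shadowed fragments are Cyrillic letters.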

Returns:

  • (Boolean)


# File 'lib/typogrowth.rb', line 130

def is_ru? str, shadows: []
  # Combine the shadows without mutating @shadows between calls.
  clean = (@shadows + [*shadows]).uniq.inject(str) { |memo, re|
    memo.gsub(re, '')
  }
  clean.scan(/[А-Яа-я]/).size > clean.length / 3
end
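
The heuristic can be tried in isolation. A minimal sketch, assuming a hypothetical standalone helper `mostly_cyrillic?` that takes the shadow patterns as a plain argument instead of reading them from a parser instance:

```ruby
# Hypothetical standalone version of the heuristic shown above.
def mostly_cyrillic?(str, shadows = [])
  # Strip every shadowed fragment before counting characters.
  clean = shadows.inject(str) { |memo, re| memo.gsub(re, '') }
  # Integer division: Cyrillic letters must exceed a third of what is left.
  clean.scan(/[А-Яа-я]/).size > clean.length / 3
end

p mostly_cyrillic?('Привет, мир!')                             # → true
p mostly_cyrillic?('Hello, мир!')                              # → false
p mostly_cyrillic?('<span class="x">Да</span>')                # → false
p mostly_cyrillic?('<span class="x">Да</span>', [/<[^>]*>/])   # → true
```

The last two calls show why markup is stripped first: the tags would otherwise dominate the character count. Note that the character class `[А-Яа-я]` does not cover `Ё` or `ё`.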

#merge(custom) ⇒ Object

Recursively merges the initial settings with custom.

To supply your own processing rules:

  • create a hash of additional rules in the same form as the standard typogrowth.yaml file shipped with the project
  • merge the hash with the standard one using this method

For instance, to add French rules, merge in the following YAML:

:quotes :
  :punctuation :
    :fr : "\\k<quote>\\k<punct>"
…


# File 'lib/typogrowth.rb', line 60

def merge custom
  yaml.rmerge!(custom)
end
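
The listing does not show how `rmerge!` is implemented. As an illustration of what a recursive merge does, here is a non-mutating sketch built on Hash#merge with a block; the helper name `deep_merge` and the placeholder rule values are assumptions made for this example only.

```ruby
# Hypothetical helper: merge nested hashes key by key instead of
# letting the custom hash overwrite whole subtrees.
def deep_merge(base, custom)
  base.merge(custom) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Placeholder rules, shaped like the YAML fragment above.
rules  = { quotes: { punctuation: { default: ['standard'] } } }
custom = { quotes: { punctuation: { fr: ['french'] } } }

merged = deep_merge(rules, custom)
p merged[:quotes][:punctuation].keys  # → [:default, :fr]
```

A plain Hash#merge without the block would replace the whole `:quotes` subtree and lose the `:default` entry.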

#parse(str, lang: :default, shadows: []) ⇒ Object

Returns a typographically corrected copy of the string.

Takes the string and converts typewriter quotes (double and single) to inch, minute, and second marks or to proper quotation marks.

Given input strings such as:

And God said "Baz heard "Bar" once" , and there was light.
That's a 6.3" man, he sees sunsets at 10°20'30" E.

It will produce:

And God said “Baz heard ‘Bar’ once,” and there was light.
That’s a 6.3″ man, he sees sunsets at 10°20′30″ E.

The utility also handles dashes.

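Parameters:

  • str (String)

    the string to be typographically corrected

  • lang (defaults to: :default)

    the language whose rules are applied

  • shadows (defaults to: [])

    additional regular expressions whose matches are protected from substitution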
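
When a rule offers several replacements, the method picks one by counting what precedes the match and indexing modulo the number of replacements. A much simplified sketch of that idea, handling only alternating double quotes; `curl_double_quotes` is a hypothetical helper and does not reproduce the nesting or compliance logic of the real rules:

```ruby
# Hypothetical helper: alternate opening and closing marks by counting
# how many straight quotes have been replaced so far.
def curl_double_quotes(text, marks = ['“', '”'])
  seen = 0
  text.gsub('"') do
    mark = marks[seen % marks.size]  # even count opens, odd count closes
    seen += 1
    mark
  end
end

puts curl_double_quotes('He said "yes" and "no".')
# prints: He said “yes” and “no”.
```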



# File 'lib/typogrowth.rb', line 85

def parse str, lang: :default, shadows: []
  lang = lang.to_sym
  delims = Parser.safe_delimiters str
  str.split(/\R{2,}/).map { |para|
    # Combine the shadows without mutating @shadows between calls.
    (@shadows + [*shadows]).uniq.each { |re|
      para.gsub!(re) { |m| "#{delims.first}#{Base64.encode64 m}#{delims.last}" }
    }
    @yaml.each { |key, values|
      values.each { |k, v|
        if !!v[:re]
          v[lang] = v[:default] if (!v[lang] || v[lang].size.zero?)
          raise MalformedRulesFile.new "Malformed rules file (no subst for #{v})" \
            if !v[lang] || v[lang].size.zero?
          substituted = !!v[:pattern] ?
              para.gsub!(/#{v[:re]}/) { |m| m.gsub(/#{v[:pattern]}/, v[lang].first) } :
              para.gsub!(/#{v[:re]}/, v[lang].first)
          # logger.warn "Unsafe substitutions were made to source:\n# ⇒ #{para}"\
          #  if v[:alert] && substituted
          if v[lang].size > 1
            para.gsub!(/#{v[lang].first}/) { |m|
              prev = $`
              obsoletes = prev.count(v[lang].join)
              compliants = values[v[:compliant].to_sym][lang] ||
                           values[v[:compliant].to_sym][:default]
              obsoletes -= prev.count(compliants.join) \
                if !!v[:compliant]
              !!v[:slave] ?
                obsoletes -= prev.count(v[:original]) + 1 :
                obsoletes += prev.count(v[:original])

              v[lang][obsoletes % v[lang].size]
            }
          end
        end
      }
    }
    para
  }.join(%Q(

))
  .gsub(/#{delims.first}(.*?)#{delims.last}/m) { |m|
    Base64.decode64(m).force_encoding('UTF-8')
  }
end
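
The shadowing step in the method above (encode protected fragments, run the substitutions, decode) can be sketched independently of the rules file. The helper `with_shadows` and the delimiters `⟦` and `⟧` are assumptions made for this illustration. The sketch uses `pack('m0')` and `unpack1('m0')`, the core equivalents of strict Base64 encoding and decoding, so no extra library is required.

```ruby
# Hypothetical helper: hide every shadowed fragment behind Base64,
# yield the masked text for processing, then restore the fragments.
def with_shadows(text, shadows, left: '⟦', right: '⟧')
  masked = shadows.inject(text) do |memo, re|
    memo.gsub(re) { |m| "#{left}#{[m].pack('m0')}#{right}" }
  end

  processed = yield masked

  pattern = /#{Regexp.escape(left)}(.*?)#{Regexp.escape(right)}/m
  processed.gsub(pattern) do
    # Decode the captured payload only, then mark the bytes as UTF-8.
    Regexp.last_match(1).unpack1('m0').force_encoding('UTF-8')
  end
end

result = with_shadows("<a title='x'>don't</a>", [/<[^>]*>/]) do |s|
  s.gsub("'", '’')  # stand-in for the real substitution rules
end
puts result
# prints: <a title='x'>don’t</a>
```

The apostrophes inside the tag survive because the Base64 alphabet contains no quote characters, so the substitution cannot touch the encoded fragment.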