Class: Typogrowth::Parser
- Inherits: Object
  - Object
    - Typogrowth::Parser
- Defined in: lib/typogrowth.rb
Overview
Parses and corrects the typography in strings. It supports language-specific rules and easy customization of user rules.
Constant Summary
- DEFAULT_SET = 'typogrowth'
- HTML_TAG_RE = /<[^>]*>/
- @@instance = Parser.new
  Ready-to-use single instance
Instance Attribute Summary
- #shadows ⇒ Object (readonly)
  Returns the value of attribute shadows.
- #yaml ⇒ Object (readonly)
  Returns the value of attribute yaml.
Class Method Summary
- .is_ru?(str, shadows: []) ⇒ Boolean
  Reports whether the string is predominantly Cyrillic, using the shared instance.
- .parse(str, lang: :default, shadows: []) ⇒ Object
  Out-of-place version of String typographing.
- .parse!(str, lang: :default, shadows: []) ⇒ Object
  In-place version of String typographing.
- .safe_delimiters(str) ⇒ Object
Instance Method Summary
- #add_shadows(re) ⇒ Object
- #del_shadows(re) ⇒ Object
- #initialize(file = nil) ⇒ Parser (constructor)
  A new instance of Parser.
- #is_ru?(str, shadows: []) ⇒ Boolean
- #merge(custom) ⇒ Object
  Recursively merges the initial settings with custom.
- #parse(str, lang: :default, shadows: []) ⇒ Object
  Returns a typographed copy of the string.
Constructor Details
#initialize(file = nil) ⇒ Parser
Returns a new instance of Parser.
    # File 'lib/typogrowth.rb', line 163
    def initialize file = nil
      file = DEFAULT_SET unless file
      @yaml = YAML.load_file "#{File.dirname(__FILE__)}/config/#{file}.yaml"
      @yaml.delete(:placeholder)
      @shadows = [HTML_TAG_RE, URI.regexp(['ftp', 'http', 'https', 'mailto'])]
    end
Instance Attribute Details
#shadows ⇒ Object (readonly)
Returns the value of attribute shadows.
    # File 'lib/typogrowth.rb', line 34
    def shadows
      @shadows
    end
#yaml ⇒ Object (readonly)
Returns the value of attribute yaml.
    # File 'lib/typogrowth.rb', line 34
    def yaml
      @yaml
    end
Class Method Details
.is_ru?(str, shadows: []) ⇒ Boolean
Reports whether the string is predominantly Cyrillic by delegating to the ready-to-use single instance. See #is_ru?
    # File 'lib/typogrowth.rb', line 156
    def self.is_ru? str, shadows: []
      @@instance.is_ru? str, shadows: shadows
    end
.parse(str, lang: :default, shadows: []) ⇒ Object
Out-of-place version of String typographing. See #parse
    # File 'lib/typogrowth.rb', line 146
    def self.parse str, lang: :default, shadows: []
      Parser.new.parse str, lang: lang, shadows: shadows
    end
.parse!(str, lang: :default, shadows: []) ⇒ Object
In-place version of String typographing: replaces the contents of str with the result of .parse. See #parse
    # File 'lib/typogrowth.rb', line 151
    def self.parse! str, lang: :default, shadows: []
      str.replace self.parse str, lang: lang, shadows: shadows
    end
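The pairing of .parse and .parse! follows a common Ruby idiom: the bang method overwrites the receiver's contents with String#replace, so the caller's object changes while its identity stays the same. A minimal sketch of that idiom, using a hypothetical `tidy` transformation that stands in for typographing and is not part of Typogrowth:

```ruby
# Hypothetical stand-in for a typographing routine (not Typogrowth's API):
# it only swaps straight apostrophes for curly ones.
def tidy(str)
  str.gsub("'", '’')
end

# In-place variant: String#replace overwrites the contents of the same object.
def tidy!(str)
  str.replace(tidy(str))
end

original = +"That's it"   # unary plus yields an unfrozen string
copy = tidy(original)     # original is left untouched
tidy!(original)           # original now holds the corrected text
```

Because String#replace raises on frozen strings, the in-place form presumably needs a mutable receiver.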
.safe_delimiters(str) ⇒ Object
    # File 'lib/typogrowth.rb', line 36
    def self.safe_delimiters str
      delimiters = ['❮', '❯']
      loop do
        break delimiters unless str.match(/#{delimiters.join('|')}/)
        delimiters.map! {|d| d*2}
      end
    end
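The listing above keeps doubling both delimiter strings until neither occurs in the input, so the markers that are later wrapped around protected fragments cannot collide with the text itself. The following standalone sketch re-creates that loop outside the class for illustration; the free-standing method and the sample strings are assumptions made here, not part of the library:

```ruby
# Standalone re-creation of the doubling loop, for illustration only.
def safe_delimiters(str)
  delimiters = ['❮', '❯']
  # Keep doubling both markers while either one occurs in the input.
  delimiters = delimiters.map { |d| d * 2 } while str.match?(Regexp.union(delimiters))
  delimiters
end

plain = safe_delimiters('no special marks')      # => ["❮", "❯"]
clash = safe_delimiters('already has ❮ inside')  # => ["❮❮", "❯❯"]
```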
Instance Method Details
#add_shadows(re) ⇒ Object
    # File 'lib/typogrowth.rb', line 137
    def add_shadows re
      @shadows.concat [*re]
    end
#del_shadows(re) ⇒ Object
    # File 'lib/typogrowth.rb', line 141
    def del_shadows re
      @shadows.delete_if { |stored| [*re].include? stored }
    end
#is_ru?(str, shadows: []) ⇒ Boolean
    # File 'lib/typogrowth.rb', line 130
    def is_ru? str, shadows: []
      clean = @shadows.concat([*shadows]).uniq.inject(str) { |memo, re|
        memo.gsub(re, '')
      }
      clean.scan(/[А-Яа-я]/).size > clean.length / 3
    end
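The check above strips every shadowed fragment and then tests whether Cyrillic letters outnumber one third of the remaining characters (integer division). A minimal sketch of that heuristic as a free-standing method; the name `mostly_cyrillic?` and the sample strings are assumptions for illustration. Unlike the listing, which uses Array#concat and therefore appends to @shadows in place, the sketch leaves the pattern list it is given unchanged:

```ruby
# Hypothetical free-standing version of the heuristic (not the library's API).
def mostly_cyrillic?(str, shadows = [])
  # Remove every fragment matched by a shadow pattern before counting.
  clean = shadows.inject(str) { |memo, re| memo.gsub(re, '') }
  # Integer division: Cyrillic letters must exceed one third of what is left.
  clean.scan(/[А-Яа-я]/).size > clean.length / 3
end

tagged = '<span class="x">Да</span>'
without_shadows = mostly_cyrillic?(tagged)               # markup dominates the count
with_shadows    = mostly_cyrillic?(tagged, [/<[^>]*>/])  # only "Да" is counted
```

The example suggests why HTML tags and URLs are shadowed by default: without them, markup can outweigh a short Cyrillic text.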
#merge(custom) ⇒ Object
Recursively merges the initial settings with custom.
To supply your own rules for processing:
- create a hash of additional rules in the same form as in the standard typogrowth.yaml file shipped with the project
- merge the hash with the standard one using this function
For instance, to add French rules, merge in the following YAML:
    :quotes :
      :punctuation :
        :fr : "\\k<quote>\\k<punct>"
    …
    # File 'lib/typogrowth.rb', line 60
    def merge custom
      yaml.rmerge!(custom)
    end
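The listing delegates to rmerge!, which is not shown on this page. Assuming it behaves as the description says (a recursive merge in which nested hashes are combined key by key), the effect can be sketched with a hypothetical `deep_merge` helper built on Hash#merge. The rule contents below are invented placeholders, not the shipped rules:

```ruby
# Hypothetical helper illustrating a recursive merge; not the library's rmerge!.
def deep_merge(base, custom)
  base.merge(custom) do |_key, old, new|
    # Descend only when both sides are hashes; otherwise the custom value wins.
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Placeholder rule sets, shaped like section -> rule -> language.
standard = { quotes: { punctuation: { re: 'placeholder', default: ['a'] } } }
custom   = { quotes: { punctuation: { fr: ['b'] } } }

merged = deep_merge(standard, custom)
# merged[:quotes][:punctuation] now holds :re, :default and :fr together.
```

A plain Hash#merge without the block would replace the whole :quotes section, which is why a recursive variant is needed for adding a single language to an existing rule.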
#parse(str, lang: :default, shadows: []) ⇒ Object
Typographs the string and returns the result.
Takes the string and changes all typewriter quotes (doubles and singles) to inches, minutes, seconds, or proper quotation marks.
Given input strings such as:
And God said "Baz heard "Bar" once" , and there was light.
That's a 6.3" man, he sees sunsets at 10°20'30" E.
It produces:
And God said “Baz heard ‘Bar’ once,” and there was light.
That’s a 6.3″ man, he sees sunsets at 10°20′30″ E.
The utility also handles dashes.
    # File 'lib/typogrowth.rb', line 85
    def parse str, lang: :default, shadows: []
      lang = lang.to_sym
      delims = Parser.safe_delimiters str
      str.split(/\R{2,}/).map { |para|
        @shadows.concat([*shadows]).uniq.each { |re|
          para.gsub!(re) { |m| "#{delims.first}#{Base64.encode64 m}#{delims.last}" }
        }
        @yaml.each { |key, values|
          values.each { |k, v|
            if !!v[:re]
              v[lang] = v[:default] if (!v[lang] || v[lang].size.zero?)
              raise MalformedRulesFile.new "Malformed rules file (no subst for #{v})" \
                if !v[lang] || v[lang].size.zero?
              substituted = !!v[:pattern] ?
                para.gsub!(/#{v[:re]}/) { |m| m.gsub(/#{v[:pattern]}/, v[lang].first) } :
                para.gsub!(/#{v[:re]}/, v[lang].first)
              # logger.warn "Unsafe substitutions were made to source:\n# ⇒ #{para}"\
              #  if v[:alert] && substituted
              if v[lang].size > 1
                para.gsub!(/#{v[lang].first}/) { |m|
                  prev = $`
                  obsoletes = prev.count(v[lang].join)
                  compliants = values[v[:compliant].to_sym][lang] ||
                               values[v[:compliant].to_sym][:default]
                  obsoletes -= prev.count(compliants.join) \
                    if !!v[:compliant]
                  !!v[:slave] ?
                    obsoletes -= prev.count(v[:original]) + 1 :
                    obsoletes += prev.count(v[:original])
                  v[lang][obsoletes % v[lang].size]
                }
              end
            end
          }
        }
        para
      }.join(%Q( ))
      .gsub(/#{delims.first}(.*?)#{delims.last}/m) { |m|
        Base64.decode64(m).force_encoding('UTF-8')
      }
    end
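One part of the listing that benefits from a smaller illustration is the shadow mechanism: fragments matched by a shadow pattern are wrapped in delimiters and Base64-encoded before the rules run, then decoded afterwards, so quotes inside HTML tags or URLs are never rewritten. The sketch below isolates that round trip. The method name `with_shadows`, the fixed delimiters, and the block standing in for the typography rules are all assumptions for illustration, and the strict Base64 variants are used here so that the encoded text carries no line breaks:

```ruby
require 'base64'

# Hypothetical helper: protects fragments matching any shadow pattern while
# the given block transforms the rest of the text.
def with_shadows(str, shadows, delims = ['❮', '❯'])
  # Replace each protected fragment by its Base64 form between delimiters.
  masked = shadows.inject(str) do |memo, re|
    memo.gsub(re) { |m| "#{delims.first}#{Base64.strict_encode64(m)}#{delims.last}" }
  end
  processed = yield masked
  # Restore the protected fragments from the captured Base64 payload.
  processed.gsub(/#{delims.first}(.*?)#{delims.last}/m) do
    Base64.strict_decode64(Regexp.last_match(1)).force_encoding('UTF-8')
  end
end

html = 'He said "hi" at <a href="http://example.com">this page</a>'
safe = with_shadows(html, [/<[^>]*>/]) { |text| text.gsub('"hi"', '“hi”') }
# The quotes in the prose are replaced; the quotes inside the tag survive.
```

Base64 output contains only letters, digits, `+`, `/` and `=`, so no typography rule aimed at quotes or dashes can match inside an encoded fragment.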