Class: Parsey

Inherits:
Object
Defined in:
lib/parsey.rb

Overview

Parsey is a simple class that matches a string against a pattern and retrieves data from it. It takes a string, a pattern, and a hash of regular expressions. The pattern is filled in with the regular expressions, and the result is then matched against the given string.

In the pattern, {} surrounds the name of the regex that should be substituted at that position. You can also use <> to surround parts of the pattern that are optional, though these must be nested properly.

Examples:


partials = {'folder'    => '([a-zA-Z0-9-]+)', 
            'file-name' => '([a-zA-Z0-9_ -]+)', 
            'ext'       => '(txt|jpg|png)'}

Parsey.parse('my-folder/my file.txt', '{folder}/{file-name}.{ext}', partials)
  #=> {"folder"=>"my-folder", "file-name"=>"my file", "ext"=>"txt"}

Parsey.parse('my file.txt', '<{folder}/>{file-name}.{ext}', partials)
  #=> {"file-name"=>"my file", "ext"=>"txt"}
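
As a rough illustration of what "filling" the pattern means, the sketch below expands the first pattern by hand using only Ruby's standard Regexp. It does not call Parsey; the gsub-based substitution and the names source, regex, names and result are assumptions made for this example. In this sketch, text outside {} is copied as-is, so the "." between file name and extension acts as an unescaped regex wildcard.

```ruby
# Hand-expanded equivalent of '{folder}/{file-name}.{ext}' (illustration only,
# not Parsey's own implementation).
partials = {
  'folder'    => '([a-zA-Z0-9-]+)',
  'file-name' => '([a-zA-Z0-9_ -]+)',
  'ext'       => '(txt|jpg|png)'
}

# Replace each {name} with its partial; everything else is copied verbatim.
source = '{folder}/{file-name}.{ext}'.gsub(/\{([^}]+)\}/) { partials.fetch($1) }
regex  = Regexp.new(source)

# Pair the block names, in order of appearance, with the capture groups.
names  = %w[folder file-name ext]
result = names.zip(regex.match('my-folder/my file.txt').captures).to_h
```

Because the regex in this sketch is not anchored, it can also match a substring of a longer input; add \A and \z to the pattern text if a full-string match is required.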

Defined Under Namespace

Classes: ParseError, ScanArray

Constructor Details

#initialize(to_parse, pattern, partials) ⇒ Parsey

Creates a new Parsey instance.

Parameters:

  • to_parse (String)

    the string which is to be parsed

  • pattern (String)

    the pattern that the string should match

  • partials (Hash{String => String})

    the regex patterns (as strings) to use when matching



# File 'lib/parsey.rb', line 43

def initialize(to_parse, pattern, partials)
  @to_parse = to_parse
  @pattern  = pattern
  @partials = partials

  @scanners = []
  @depth = -1
end

Instance Attribute Details

#depth ⇒ Object

Depth keeps track of how many levels deep the optional blocks go, so that the scanner currently in use can be looked up. Each level of recursion needs its own scanner object to refer to; otherwise the text stored for the outer level would be cleared.



# File 'lib/parsey.rb', line 32

def depth
  @depth
end

#partials ⇒ Object

Returns the value of attribute partials.



# File 'lib/parsey.rb', line 27

def partials
  @partials
end

#pattern ⇒ Object

Returns the value of attribute pattern.



# File 'lib/parsey.rb', line 27

def pattern
  @pattern
end

#scanners ⇒ Object

Returns the value of attribute scanners.



# File 'lib/parsey.rb', line 27

def scanners
  @scanners
end

#to_parse ⇒ Object

Returns the value of attribute to_parse.



# File 'lib/parsey.rb', line 27

def to_parse
  @to_parse
end

Class Method Details

.parse(to_parse, pattern, partials) ⇒ Hash{String => String}

This is a convenience method that creates a Parsey instance and parses the string in a single call.

Parameters:

  • to_parse (String)

    the string which is to be parsed

  • pattern (String)

    the pattern that the string should match

  • partials (Hash{String => String})

    the regex patterns (as strings) to use when matching

Returns:

  • (Hash{String => String})

    the data retrieved from to_parse



# File 'lib/parsey.rb', line 65

def self.parse(to_parse, pattern, partials)
  a = Parsey.new(to_parse, pattern, partials)
  a.parse
end

Instance Method Details

#parse ⇒ Hash{String => String}

Finds matches from to_parse using #regex. Then uses this data and the pattern created with #scan to match the data with names.

Returns:

  • (Hash{String => String})

    the data taken from to_parse



# File 'lib/parsey.rb', line 89

def parse
  match = @to_parse.match(self.regex).captures
  data = {}
  
  self.scan.flatten.each_with_type_indexed do |t, c, i|
    if (t == :block) && (match[i] != nil)
      data[c] = match[i]
    end
  end
  
  data
end
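
Note that String#match returns nil when nothing matches, so the method as listed raises NoMethodError on .captures for a non-matching string. The following is a minimal sketch, using only core Ruby, of pairing capture groups with names while guarding against that case. The helper named_captures_for, the hand-written regex, and the names array (with nil standing in for the optional wrapper group) are assumptions for illustration, not part of Parsey.

```ruby
# Hypothetical helper: pairs capture groups with names, skipping unnamed
# groups and groups that did not take part in the match.
def named_captures_for(string, regex, names)
  match = string.match(regex)
  return {} if match.nil? # calling .captures on nil would raise NoMethodError

  names.zip(match.captures).each_with_object({}) do |(name, value), data|
    data[name] = value unless name.nil? || value.nil?
  end
end

# Hand-written regex with an optional folder part (illustration only).
regex = %r{(([a-zA-Z0-9-]+)/)?([a-zA-Z0-9_ -]+)\.(txt|jpg|png)}
names = [nil, 'folder', 'file-name', 'ext'] # nil marks the optional wrapper group

with_folder    = named_captures_for('my-folder/my file.txt', regex, names)
without_folder = named_captures_for('my file.txt', regex, names)
no_match       = named_captures_for('???', regex, names)
```

Returning an empty hash is only one possible design; raising a descriptive error such as ParseError would be an equally reasonable choice.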

#r_place(pat) ⇒ String

Puts each partial's regexp in the correct place in the pattern. It returns a String rather than a Regexp so that it can call itself recursively for optional blocks.

Parameters:

  • pat (ScanArray)

    the pattern to turn into a regular expression

Returns:

  • (String)

    the regular expression as a string



# File 'lib/parsey.rb', line 205

def r_place(pat)
  str = ''
  pat.each_with_type do |t, c|
    case t
    when :block
      str << @partials[c]
    when :text
      str << c
    when :optional
      str << "(#{r_place(c)})?"
    end
  end
  
  str
end
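
The same recursive idea can be tried outside the class. This sketch assumes the scanned pattern is a plain nested array of [type, content] pairs rather than a ScanArray; build_source and the literal tokens array are hypothetical stand-ins written for this example.

```ruby
# Hypothetical stand-in for r_place that works on plain nested arrays.
def build_source(tokens, partials)
  tokens.map do |type, content|
    case type
    when :block    then partials.fetch(content)                      # named regex
    when :text     then content                                      # copied verbatim
    when :optional then "(#{build_source(content, partials)})?"      # recurse
    end
  end.join
end

partials = {
  'folder'    => '([a-zA-Z0-9-]+)',
  'file-name' => '([a-zA-Z0-9_ -]+)',
  'ext'       => '(txt|jpg|png)'
}

# Assumed token form of '<{folder}/>{file-name}.{ext}'.
tokens = [
  [:optional, [[:block, 'folder'], [:text, '/']]],
  [:block, 'file-name'],
  [:text, '.'],
  [:block, 'ext']
]

source   = build_source(tokens, partials)
captures = Regexp.new(source).match('my file.txt').captures
```

In this sketch the "(...)?" wrapper around an optional part is itself a capturing group, so it contributes an extra (possibly nil) entry to the captures array ahead of the groups inside it.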

#r_scan(str) ⇒ ScanArray

Creates a new StringScanner, then scans for blocks, optionals or text and adds the result to parsed until it reaches the end of str.

Parameters:

  • str (String)

    the string to scan through

Returns:

  • (ScanArray)



# File 'lib/parsey.rb', line 119

def r_scan(str)
  parsed = ScanArray.new
  
  @depth += 1
  @scanners[@depth] = StringScanner.new(str)
  until self.scanner.eos?
    a = scan_blocks ||  a = scan_optionals ||  a = scan_text
    parsed << a
  end
  @depth -= 1
  
  parsed
end
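
To see why each recursion level wants its own scanner, here is a simplified, self-contained tokenizer. It is a sketch under several assumptions: tokenize is a hypothetical helper, it returns plain arrays instead of a ScanArray, it keeps the scanner in a local variable instead of a depth-indexed list, and it does not support optionals nested inside other optionals.

```ruby
require 'strscan'

# Hypothetical helper: turns a pattern into [:block, name], [:text, string]
# and [:optional, tokens] entries.
def tokenize(pattern)
  scanner = StringScanner.new(pattern) # a fresh scanner for this level only
  tokens  = []

  until scanner.eos?
    if scanner.scan(/\{([^}]*)\}/)
      tokens << [:block, scanner[1]]
    elsif scanner.scan(/<([^<>]*)>/)
      # The recursive call creates its own scanner, so this one keeps its position.
      tokens << [:optional, tokenize(scanner[1])]
    elsif (text = scanner.scan(/[^{<]+/))
      tokens << [:text, text]
    else
      raise ArgumentError, "unclosed tag at position #{scanner.pos}"
    end
  end

  tokens
end

tokens = tokenize('<{folder}/>{file-name}.{ext}')
```

A local variable gives each call its own scanner automatically; keeping scanners in an array indexed by depth, as the class does, achieves the same isolation while letting the helper methods share state through the instance.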

#regex ⇒ Regexp

This is a front for #r_place so that a Regexp is returned as expected. The method takes no arguments; the pattern passed to #r_place is obtained from #scan.

Returns:

  • (Regexp)

    the regex that will be used for parsing

See Also:

  • #r_place



# File 'lib/parsey.rb', line 75

def regex
  Regexp.new(r_place(scan))
end

#scan ⇒ ScanArray

The scanners need to be reset after every full run, so this method provides a front for #r_scan that resets the scanners and still returns the scanned pattern.

Returns:

  • (ScanArray)

See Also:

  • #r_scan



# File 'lib/parsey.rb', line 108

def scan
  r = self.r_scan(@pattern)
  @scanners =[]
  r
end

#scan_blocks ⇒ Array

Finds the next {…} in the StringScanner, checks that it is closed, and checks that a partial exists for its name.

Returns:

  • (Array)

    an array of the form [:block, …]

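Raises:

  • (ParseError)

    if the block has no closing }

  • (NoPartialError)

    if partials has no entry for the block's name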



# File 'lib/parsey.rb', line 137

def scan_blocks
  return unless self.scanner.scan(/\{/)
  content = scan_until(:block)
  
  raise ParseError unless self.scanner.scan(/\}/) # no closing block
  raise NoPartialError unless @partials[content]
  
  [:block, content]
end

#scan_optionals ⇒ Array

Finds next <…> in the StringScanner, and checks that it is closed. Then scans the contents of the optional block.

Returns:

  • (Array)

    an array of the form [:optional, […]]

Raises:

  • (ParseError)

    if the optional block has no closing >



# File 'lib/parsey.rb', line 152

def scan_optionals
  return unless self.scanner.scan(/</)
  content = scan_until(:optional)
  
  raise ParseError unless self.scanner.scan(/>/) # no closing block
  
  [:optional, r_scan(content)]
end

#scan_text ⇒ Array

Finds plain text, and checks whether there are any blocks left.

Returns:

  • (Array)

    text before next block, or rest of text in the form [:text, …]



# File 'lib/parsey.rb', line 165

def scan_text
  text = scan_until(:open)
  
  if text.nil?
    text = self.scanner.rest
    self.scanner.clear
  end
  
  [:text, text]
end

#scan_until(type) ⇒ String?

Scans the string until a tag is found of the type given.

Parameters:

  • type (Symbol)

    the type of tag to look for: :block for a closing block tag (}), :optional for a closing optional tag (>), or :open for an opening tag ({ or <).

Returns:

  • (String, nil)

    the text before the tag, or nil if no match found



# File 'lib/parsey.rb', line 184

def scan_until(type)
  case type
  when :block
    regex = /\}/
  when :optional
    regex = />/
  when :open
    regex = /(\{|<)/
  end
  pos = self.scanner.pos
  if self.scanner.scan_until(regex)
    self.scanner.pos -= self.scanner.matched.size
    self.scanner.pre_match[pos..-1]
  end
end
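
The "scan up to a tag but leave the tag unread" technique used here can be reproduced with a bare StringScanner. The input string and the variable names below are made up for illustration.

```ruby
require 'strscan'

scanner = StringScanner.new('a{b}c')
scanner.scan(/a\{/)            # consume the opening tag; position is now 2

start = scanner.pos            # remember where the wanted text begins
if scanner.scan_until(/\}/)    # advances past the closing tag
  scanner.pos -= scanner.matched.size  # step back so the tag is still unread
  text = scanner.pre_match[start..-1]  # pre_match starts at the beginning of
                                       # the string, so slice from start
end

remaining = scanner.rest
```

Slicing pre_match from the saved position is what keeps previously scanned text (here "a{") out of the result. StringScanner#check_until, which matches without advancing the position, may be a simpler alternative in some cases.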

#scanner ⇒ StringScanner

Returns the current scanner to use.

Returns:

  • (StringScanner)

    the current scanner to use



# File 'lib/parsey.rb', line 80

def scanner
  @scanners[@depth]
end