Module: Puppet::Pops::Parser::SlurpSupport
- Included in: Lexer2
- Defined in: lib/puppet/pops/parser/slurp_support.rb
Overview
This module is an integral part of the Lexer. It defines the string slurping behavior: finding the string and non-string parts of interpolated strings, and translating escape sequences in strings to their single-character equivalents.
PERFORMANCE NOTE: The various kinds of slurping could be made even more generic, but doing so would require additional parameter passing and evaluation of conditional logic. TODO: More detailed performance analysis of excessive character escaping and interpolation.
Constant Summary
- SLURP_SQ_PATTERN =
/(?:[^\\]|^|[^\\])(?:[\\]{2})*[']/
- SLURP_DQ_PATTERN =
/(?:[^\\]|^|[^\\])(?:[\\]{2})*(["]|[$]\{?)/
- SLURP_UQ_PATTERN =
/(?:[^\\]|^|[^\\])(?:[\\]{2})*([$]\{?|\z)/
- SLURP_ALL_PATTERN =
/.*(\z)/m
- SQ_ESCAPES =
%w{ \\ ' }
- DQ_ESCAPES =
%w{ \\ $ ' " r n t s u}+["\r\n", "\n"]
- UQ_ESCAPES =
%w{ \\ $ r n t s u}+["\r\n", "\n"]
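The patterns above locate a terminator that is preceded by an even number of backslashes, so an escaped quote or dollar sign does not end the string. The following is a minimal sketch of that behavior using Ruby's standard StringScanner. The helper name scan_dq is hypothetical, the pattern is copied here only to keep the sketch self-contained, and scanning starts at the beginning of the text rather than after an opening quote as the lexer would.

```ruby
require 'strscan'

# Copy of SLURP_DQ_PATTERN from the constant summary (repeated for a self-contained sketch).
SLURP_DQ_PATTERN = /(?:[^\\]|^|[^\\])(?:[\\]{2})*(["]|[$]\{?)/

# Hypothetical helper: returns the consumed text (terminator included)
# and the terminator captured in group 1.
def scan_dq(text)
  scanner = StringScanner.new(text)
  consumed = scanner.scan_until(SLURP_DQ_PATTERN)
  [consumed, scanner[1]]
end

# An unescaped double quote terminates the scan.
p scan_dq('hello" rest')             # => ["hello\"", "\""]

# Escaped quotes (\") are skipped; the scan stops at the first unescaped quote.
p scan_dq('say \\"hi\\" now" rest')

# An interpolation start is also a terminator, captured as '${'.
p scan_dq('value is ${x}"')          # => ["value is ${", "${"]
```

Group 1 carries the terminator, which is what lets the callers distinguish the end of a string from the start of an interpolation.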
Instance Method Summary
- #slurp(scanner, pattern, escapes, ignore_invalid_escapes) ⇒ Object
Slurps a string from the given scanner up to the given pattern, then replaces each escape sequence listed in escapes with its control-character equivalent; an escaped line break (the pattern \r?\n) is replaced with an empty string.
- #slurp_dqstring ⇒ Object
- #slurp_sqstring ⇒ Object
- #slurp_uqstring ⇒ Object
Copy from old lexer - can do much better.
Instance Method Details
#slurp(scanner, pattern, escapes, ignore_invalid_escapes) ⇒ Object
Slurps a string from the given scanner up to the given pattern, then replaces each escape sequence listed in escapes with its control-character equivalent; an escaped line break (the pattern \r?\n) is replaced with an empty string. The returned string includes the terminating character. Returns nil if the scanner cannot scan up to the given pattern.
# File 'lib/puppet/pops/parser/slurp_support.rb', line 62
def slurp(scanner, pattern, escapes, ignore_invalid_escapes)
  str = scanner.scan_until(pattern) || return

  # Process unicode escapes first as they require getting 4 hex digits
  # If later a \u is found it is warned not to be a unicode escape
  if escapes.include?('u')
    str.gsub!(/\\u([\da-fA-F]{4})/m) {
      [$1.hex].pack("U")
    }
  end

  str.gsub!(/\\([^\r\n]|(?:\r?\n))/m) {
    ch = $1
    if escapes.include? ch
      case ch
      when 'r'   ; "\r"
      when 'n'   ; "\n"
      when 't'   ; "\t"
      when 's'   ; " "
      when 'u'
        Puppet.warning(("Unicode escape '\\u' was not followed by 4 hex digits"))
        "\\u"
      when "\n"  ; ''
      when "\r\n"; ''
      else ch
      end
    else
      Puppet.warning(("Unrecognized escape sequence '\\#{ch}'")) unless ignore_invalid_escapes
      "\\#{ch}"
    end
  }
  str
end
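To try the escape replacement step outside of Puppet, the sketch below reproduces the two substitution passes of #slurp as a stand-alone method. It is an illustration under assumptions, not the module's API: the method name unescape is hypothetical, it works on a copy of the input, and warnings are collected in an array instead of being passed to Puppet.warning.

```ruby
# Copy of DQ_ESCAPES from the constant summary (repeated for a self-contained sketch).
DQ_ESCAPES = %w{ \\ $ ' " r n t s u } + ["\r\n", "\n"]

# Hypothetical stand-alone variant of the replacement logic in #slurp.
# Returns the unescaped string; warnings are appended to the given array.
def unescape(str, escapes, warnings = [])
  result = str.dup

  # Pass 1: \uXXXX with exactly four hex digits becomes the UTF-8 character.
  if escapes.include?('u')
    result.gsub!(/\\u([\da-fA-F]{4})/) { [$1.hex].pack("U") }
  end

  # Pass 2: every remaining backslash followed by a character or a line break.
  result.gsub!(/\\([^\r\n]|(?:\r?\n))/m) do
    ch = $1
    if escapes.include?(ch)
      case ch
      when 'r' then "\r"
      when 'n' then "\n"
      when 't' then "\t"
      when 's' then " "
      when 'u'
        warnings << "Unicode escape '\\u' was not followed by 4 hex digits"
        "\\u"
      when "\n", "\r\n" then ''   # escaped line break: line continuation
      else ch                     # \\, \$, \' and \" yield the bare character
      end
    else
      warnings << "Unrecognized escape sequence '\\#{ch}'"
      "\\#{ch}"                   # unknown escapes are kept verbatim
    end
  end
  result
end

warnings = []
p unescape('a\tb', DQ_ESCAPES)             # => "a\tb" (a real tab character)
p unescape("line\\\nnext", DQ_ESCAPES)     # => "linenext"
p unescape('\q', DQ_ESCAPES, warnings)     # => "\\q", with one warning collected
```

Running the unicode pass first mirrors the original ordering, which is what allows a leftover \u in the second pass to be reported as malformed.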
#slurp_dqstring ⇒ Object
# File 'lib/puppet/pops/parser/slurp_support.rb', line 26
def slurp_dqstring
  scn = @scanner
  last = scn.matched
  str = slurp(scn, SLURP_DQ_PATTERN, DQ_ESCAPES, false)
  unless str
    lex_error("Unclosed quote after #{format_quote(last)} followed by '#{followed_by}'")
  end

  # Terminator may be a single char '"', '$', or two characters '${';
  # group match 1 (scn[1]) from the last slurp holds this
  terminator = scn[1]
  [str[0..(-1 - terminator.length)], terminator]
end
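The last line of the method separates the slurped text from its terminator with a negative range index. A small sketch of that slicing step on its own follows; the helper name split_terminator is hypothetical and exists only for illustration.

```ruby
# Hypothetical helper isolating the slicing expression used by
# slurp_dqstring and slurp_uqstring.
def split_terminator(slurped, terminator)
  # -1 - terminator.length is the index of the last character to keep;
  # with an empty terminator it is -1, which keeps the whole string.
  [slurped[0..(-1 - terminator.length)], terminator]
end

p split_terminator('hello"', '"')         # => ["hello", "\""]
p split_terminator('value is ${', '${')   # => ["value is ", "${"]
p split_terminator('plain text', '')      # => ["plain text", ""]
```

The same expression therefore handles one-character, two-character, and empty terminators without a conditional.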
#slurp_sqstring ⇒ Object
# File 'lib/puppet/pops/parser/slurp_support.rb', line 19
def slurp_sqstring
  # skip the leading '
  @scanner.pos += 1
  str = slurp(@scanner, SLURP_SQ_PATTERN, SQ_ESCAPES, :ignore_invalid_escapes) || lex_error("Unclosed quote after \"'\" followed by '#{followed_by}'")
  str[0..-2] # strip closing "'" from result
end
#slurp_uqstring ⇒ Object
Copy from old lexer - can do much better
# File 'lib/puppet/pops/parser/slurp_support.rb', line 40
def slurp_uqstring
  scn = @scanner
  last = scn.matched
  ignore = true
  str = slurp(scn, @lexing_context[:uq_slurp_pattern], @lexing_context[:escapes], :ignore_invalid_escapes)

  # Terminator may be a single char '$', two characters '${', or empty string '' at the end of input.
  # Group match 1 holds this.
  # The exceptional case is found by looking at the subgroup 1 of the most recent match made by the scanner (i.e. @scanner[1]).
  # This is the last match made by the slurp method (having called scan_until on the scanner).
  # If there is a terminating character it must be stripped and returned separately.
  #
  terminator = scn[1]
  [str[0..(-1 - terminator.length)], terminator]
end
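The empty terminator described in the comments can be observed directly with StringScanner. The sketch below assumes that SLURP_UQ_PATTERN is the pattern stored in @lexing_context[:uq_slurp_pattern], which the page does not confirm. The helper name scan_uq is hypothetical, and escape replacement is omitted.

```ruby
require 'strscan'

# Copy of SLURP_UQ_PATTERN from the constant summary (repeated for a self-contained sketch).
SLURP_UQ_PATTERN = /(?:[^\\]|^|[^\\])(?:[\\]{2})*([$]\{?|\z)/

# Hypothetical helper: scans unquoted text and returns [text, terminator].
def scan_uq(text)
  scanner = StringScanner.new(text)
  consumed = scanner.scan_until(SLURP_UQ_PATTERN)
  terminator = scanner[1]
  [consumed[0..(-1 - terminator.length)], terminator]
end

# No interpolation: the pattern matches at end of input and group 1 is empty.
p scan_uq('plain text')       # => ["plain text", ""]

# Interpolation start: the terminator is '${' and is stripped from the text.
p scan_uq('cost ${amount}')   # => ["cost ", "${"]

# An escaped dollar sign is skipped; the scan stops at the unescaped one.
p scan_uq('a \$b $c')
```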