Module: RightDevelop::CI::Util

Defined in:
lib/right_develop/ci/util.rb

Constant Summary

JAVA_CLASS_NAME =

Regular expression used to determine which characters of a string are allowed in Java class names.

/[A-Za-z0-9_]/
JAVA_PACKAGE_SEPARATOR =

The dot character gets special treatment: even if we XML-escape it, Jenkins will assume that it functions as a package separator. So, we’ll replace it with an equivalent Unicode character. Hooray homographic character attacks!

'.'
JAVE_PACKAGE_SEPARATOR_HOMOGLYPH =

Replacement codepoint that looks a bit like a period

'·'
INVALID_CDATA_CHARACTER =

Regular expression that matches characters that need to be escaped inside CDATA; cf. www.w3.org/TR/xml11/#charsets:

RestrictedChar ::= [#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]

Regexp.new '[\x01-\x08\x0b-\x0c\x0e-\x1f\x7f-\x84\x86-\x9f]', nil, 'n'

Class Method Summary

Class Method Details

.pseudo_java_class_name(name) ⇒ String

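Make a string suitable for parsing by the Jenkins JUnit display plugin. Characters that are valid in a Java class name pass through unchanged, the dot is replaced with a look-alike Unicode character, and every other character is escaped as an XML hex entity. This prevents Jenkins from interpreting “hi1.2” as a package-and-class name.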

Parameters:

  • name (String)

Returns:

  • (String)

    string with every dot replaced by a look-alike character and every other character outside [A-Za-z0-9_] replaced with an equivalent XML hex entity



# File 'lib/right_develop/ci/util.rb', line 32

def pseudo_java_class_name(name)
  result = ''

  name.each_char do |chr|
    if chr =~ JAVA_CLASS_NAME
      result << chr
    elsif chr == JAVA_PACKAGE_SEPARATOR
      result << JAVE_PACKAGE_SEPARATOR_HOMOGLYPH
    else
      chr = chr.unpack('U')[0].to_s(16)
      result << "&#x#{chr};"
    end
  end

  result
end
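
A minimal usage sketch of the three branches in the listing above. The module name `UtilSketch` is hypothetical: it stands in for RightDevelop::CI::Util so the example runs without the gem installed, and the constants and method body are copied from this page (the buffer is created with `+''` so it stays mutable under frozen string literals).

```ruby
# Hypothetical stand-in for RightDevelop::CI::Util; not part of the gem.
module UtilSketch
  JAVA_CLASS_NAME = /[A-Za-z0-9_]/
  JAVA_PACKAGE_SEPARATOR = '.'
  JAVE_PACKAGE_SEPARATOR_HOMOGLYPH = '·' # spelling follows the original constant

  module_function

  def pseudo_java_class_name(name)
    result = +'' # unfrozen, mutable buffer
    name.each_char do |chr|
      if chr =~ JAVA_CLASS_NAME
        result << chr                               # legal class-name character
      elsif chr == JAVA_PACKAGE_SEPARATOR
        result << JAVE_PACKAGE_SEPARATOR_HOMOGLYPH  # dot becomes the look-alike
      else
        result << "&#x#{chr.unpack('U')[0].to_s(16)};" # hex entity for the rest
      end
    end
    result
  end
end

puts UtilSketch.pseudo_java_class_name('hi1.2')   # hi1·2
puts UtilSketch.pseudo_java_class_name('my spec') # my&#x20;spec
puts UtilSketch.pseudo_java_class_name('a-b_c')   # a&#x2d;b_c
```

Underscores survive because they match JAVA_CLASS_NAME; only the dot gets the homoglyph treatment, and everything else becomes an entity.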

.purify(untrusted) ⇒ String

Strip invalid UTF-8 sequences from a string and entity-escape any character that can’t legally appear inside XML CDATA. If test output contains weird data, we could end up generating invalid JUnit XML which will choke Java. Preserve the purity of essence of our precious XML fluids!

Parameters:

  • untrusted (String)

    a string (of any encoding) that might contain invalid UTF-8 sequences

Returns:

  • (String)

    the input with all invalid UTF-8 removed and every character that is illegal in XML CDATA replaced with an XML hex entity



# File 'lib/right_develop/ci/util.rb', line 56

def purify(untrusted)
  # First pass: strip bad UTF-8 characters
  if RUBY_VERSION =~ /^1\.8/
    iconv = Iconv.new('UTF-8//IGNORE', 'UTF-8')
    result = iconv.iconv(untrusted)
  else
    result = untrusted.force_encoding(Encoding::BINARY).encode('UTF-8', :undef=>:replace, :replace=>'')
  end

  # Second pass: entity escape characters that can't appear in XML CDATA.
  result.gsub(INVALID_CDATA_CHARACTER) do |ch|
    "&#x%s;" % [ch.unpack('H*').first]
  end
end
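
The two passes can also be sketched for current Rubies. This is an illustration under stated assumptions, not the gem's implementation: `PurifySketch` and `RESTRICTED_CHAR` are hypothetical names, the first pass swaps in `String#scrub` (Ruby 2.1+) in place of the Iconv/`encode` calls above, and the pattern restates the RestrictedChar ranges as Unicode code points. It works on a copy, so the caller's string keeps its encoding.

```ruby
# Hypothetical sketch; names and the scrub-based first pass are assumptions.
module PurifySketch
  # XML 1.1 RestrictedChar ranges, written as Unicode code points.
  RESTRICTED_CHAR =
    /[\u0001-\u0008\u000B\u000C\u000E-\u001F\u007F-\u0084\u0086-\u009F]/

  module_function

  def purify(untrusted)
    # First pass: copy the input, treat it as UTF-8, drop invalid byte sequences.
    clean = untrusted.dup.force_encoding(Encoding::UTF_8).scrub('')
    # Second pass: replace each restricted character with a hex entity.
    clean.gsub(RESTRICTED_CHAR) { |ch| format('&#x%x;', ch.ord) }
  end
end

puts PurifySketch.purify("ok\x01")          # ok&#x1;
puts PurifySketch.purify("caf\xC3\xA9\xFF") # café
```

One design difference worth noting: converting from a binary encoding with `:undef => :replace`, as the listing does, discards every byte above 0x7F, while `scrub` keeps well-formed multibyte characters such as the "é" in the second call.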