Class: Arachni::Support::Signature

Inherits:
Object
Defined in:
lib/arachni/support/signature.rb

Overview

Represents a signature, used to maintain a lightweight representation of a String and refine it using similar Strings to remove noise.

Author:

Constant Summary

CACHE =
{
    tokens: Cache::LeastRecentlyPushed.new( 100 )
}


Constructor Details

#initialize(data, options = {}) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

Returns a new instance of Signature.

Parameters:

  • data (String, Signature)

    Seed data to use to initialize the signature.

  • options (Hash) (defaults to: {})

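Options Hash (options):

  • :threshold (Numeric)

    Default threshold of differences used by #similar?. Must be a number, otherwise an ArgumentError is raised.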



# File 'lib/arachni/support/signature.rb', line 31

def initialize( data, options = {} )
    @tokens  = tokenize( data )
    @options = options

    if @options[:threshold] && !@options[:threshold].is_a?( Numeric )
        fail ArgumentError, 'Option :threshold must be a number.'
    end
end
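The :threshold check in the constructor can be tried in isolation. The following is a minimal sketch, not the Arachni class itself: `validate_threshold!` is a hypothetical helper that only reproduces the validation shown above.

```ruby
# Hypothetical helper (not part of Arachni) mirroring the constructor's
# :threshold validation: any non-nil, non-false value must be Numeric.
def validate_threshold!(options)
  if options[:threshold] && !options[:threshold].is_a?(Numeric)
    raise ArgumentError, 'Option :threshold must be a number.'
  end
  options
end

# A Float passes validation and the options come back unchanged.
ok = validate_threshold!({ threshold: 0.25 })

# A String is rejected, even if it looks like a number.
begin
  validate_threshold!({ threshold: '0.25' })
  rejected = false
rescue ArgumentError => e
  rejected = true
  message = e.message
end

puts rejected
```

Note that an omitted or nil :threshold passes this check; it only becomes a problem later if #similar? is called without an explicit threshold.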

Instance Attribute Details

#tokens ⇒ Object (readonly)

Returns the value of attribute tokens.



# File 'lib/arachni/support/signature.rb', line 21

def tokens
  @tokens
end

Instance Method Details

#==(other) ⇒ Object

Parameters:

  • other (Signature)



# File 'lib/arachni/support/signature.rb', line 96

def ==( other )
    hash == other.hash
end

#differences(other) ⇒ Float

Returns the ratio of difference between signatures: 0 when they are equal, 1 when they share no tokens or when other is nil.

Parameters:

  • other (Signature)

Returns:

  • (Float)

    Ratio of difference between signatures.



# File 'lib/arachni/support/signature.rb', line 67

def differences( other )
    return 1 if other.nil?
    return 0 if self == other

    ((tokens - other.tokens) | (other.tokens - tokens)).size /
        Float((other.tokens | tokens).size)
end
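The ratio computed above is the size of the symmetric difference of the two token sets divided by the size of their union. The following standalone sketch works through the arithmetic. It assumes tokens behave like a Set of whitespace-separated words; the real tokenizer is not shown on this page, so `sketch_tokens` and `difference_ratio` are hypothetical stand-ins.

```ruby
require 'set'

# Hypothetical stand-in for the tokenizer: split on whitespace into a Set.
def sketch_tokens(string)
  Set.new(string.split)
end

# Same arithmetic as #differences:
# (tokens only in a, plus tokens only in b) / (all distinct tokens).
def difference_ratio(a, b)
  ((a - b) | (b - a)).size / Float((a | b).size)
end

a = sketch_tokens('the quick brown fox')
b = sketch_tokens('the quick red fox')

# Symmetric difference is {brown, red} (2 tokens); the union has 5 tokens.
ratio = difference_ratio(a, b)
puts ratio # prints 0.4
```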

#dup ⇒ Signature

Returns a copy of `self`.

Returns:

  • (Signature)

    Copy of `self`.



# File 'lib/arachni/support/signature.rb', line 87

def dup
    self.class.new( '' ).tap { |s| s.copy( tokens, @options ) }
end

#hash ⇒ Object



# File 'lib/arachni/support/signature.rb', line 91

def hash
    tokens.hash
end

#refine(data) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

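Returns a new, refined signature.

Parameters:

  • data (String, Signature)

    Data to use to refine the signature.

Returns:

  • (Signature)

    New, refined signature.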



# File 'lib/arachni/support/signature.rb', line 59

def refine( data )
    dup.refine!( data )
end
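The difference between #refine and #refine! is that the former works on a copy. The sketch below illustrates that split with a hypothetical stand-in class (`SketchSignature` is not the Arachni class). It assumes whitespace tokenization into a Set and relies on Ruby's default `dup` rather than the `copy` helper used above, which is not shown on this page.

```ruby
require 'set'

# Hypothetical stand-in, not Arachni::Support::Signature.
class SketchSignature
  attr_reader :tokens

  def initialize(data)
    @tokens = Set.new(data.split)
  end

  # Mutating: narrows this object's tokens and returns self.
  def refine!(data)
    @tokens &= Set.new(data.split)
    self
  end

  # Non-mutating: refines a copy and leaves the receiver untouched.
  # The copy's @tokens is reassigned by refine!, so the original Set
  # is never modified.
  def refine(data)
    dup.refine!(data)
  end
end

original = SketchSignature.new('a b c')
refined  = original.refine('a b d')

puts original.tokens.size # prints 3
puts refined.tokens.size  # prints 2
```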

#refine!(data) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

Returns `self`.

Parameters:

  • data (String, Signature)

    Data to use to refine the signature.

Returns:

  • (Signature)

    `self`.



# File 'lib/arachni/support/signature.rb', line 47

def refine!( data )
    @tokens &= tokenize( data )
    self
end
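Refinement is a set intersection: every new sample removes the tokens it does not share with the signature, so the parts that vary between samples (the noise) drop out. A minimal sketch, assuming whitespace tokenization into a Set (`sketch_tokens` is a hypothetical stand-in for the tokenizer, and the sample strings are invented for illustration):

```ruby
require 'set'

# Hypothetical stand-in for the tokenizer: split on whitespace into a Set.
def sketch_tokens(string)
  Set.new(string.split)
end

tokens = sketch_tokens('Hello user1234 welcome back')

# Each refinement keeps only the tokens shared with the new sample,
# mirroring `@tokens &= tokenize( data )`.
tokens &= sketch_tokens('Hello user5678 welcome back')
tokens &= sketch_tokens('Hello user9012 welcome back')

# The per-sample user token is gone; the stable tokens remain.
puts tokens.to_a.sort.inspect # prints ["Hello", "back", "welcome"]
```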

#similar?(other, threshold = @options[:threshold]) ⇒ Bool

Parameters:

  • other (Signature)
  • threshold (Numeric) (defaults to: @options[:threshold])

    Maximum ratio of differences, as returned by #differences, below which the signatures are considered similar.

Returns:

  • (Bool)
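As a rough illustration of the check: two signatures count as similar when they are equal or when their difference ratio is strictly below the threshold. The sketch below is standalone and assumes whitespace tokenization into a Set; `sketch_tokens`, `difference_ratio` and `similar_sketch?` are hypothetical stand-ins, not Arachni methods.

```ruby
require 'set'

# Hypothetical stand-in for the tokenizer: split on whitespace into a Set.
def sketch_tokens(string)
  Set.new(string.split)
end

# Symmetric difference over union, with the equality shortcut.
def difference_ratio(a, b)
  return 0.0 if a == b
  ((a - b) | (b - a)).size / Float((a | b).size)
end

# Similar when equal, or when the ratio is strictly below the threshold.
def similar_sketch?(a, b, threshold)
  raise 'No threshold given.' unless threshold
  a == b || difference_ratio(a, b) < threshold
end

a = sketch_tokens('the quick brown fox')
b = sketch_tokens('the quick red fox')

# The ratio between a and b is 2 / 5.0 = 0.4.
loose  = similar_sketch?(a, b, 0.5)
strict = similar_sketch?(a, b, 0.3)

puts loose  # prints true
puts strict # prints false
```

Because the comparison is strict, a threshold equal to the ratio does not count as similar unless the two token sets are identical.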


80
81
82
83
# File 'lib/arachni/support/signature.rb', line 80

def similar?( other, threshold = @options[:threshold] )
    fail 'No threshold given.' if !threshold
    self == other || differences( other ) < threshold
end