Class: Arachni::Support::Signature

Inherits:
Object
  • Object
Defined in:
lib/arachni/support/signature.rb

Overview

Represents a signature, used to maintain a lightweight representation of a String and refine it using similar Strings to remove noise.

Author:

Constant Summary

CACHE =
{
    tokens: Cache::LeastRecentlyPushed.new( 100 )
}

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(data, options = {}) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

Returns a new instance of Signature.

Parameters:

  • data (String, Signature)

    Seed data to use to initialize the signature.

  • options (Hash) (defaults to: {})

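Options Hash (options):

  • :threshold (Numeric)

    Default threshold of differences for #similar?. An ArgumentError is raised if the value is not a number.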

# File 'lib/arachni/support/signature.rb', line 31

def initialize( data, options = {} )
    @tokens  = tokenize( data )
    @options = options

    if @options[:threshold] && !@options[:threshold].is_a?( Numeric )
        fail ArgumentError, 'Option :threshold must be a number.'
    end
end

Instance Attribute Details

#tokens ⇒ Object (readonly)

Returns the value of attribute tokens.



# File 'lib/arachni/support/signature.rb', line 21

def tokens
  @tokens
end

Instance Method Details

#<<(data) ⇒ Object



# File 'lib/arachni/support/signature.rb', line 53

def <<( data )
    @hash_cache = nil
    @tokens.merge tokenize( data )
    self
end
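
The merge performed by #<< can be illustrated with plain Ruby Set objects. This is only a sketch, not Arachni's code: `tokenize_ws` is a hypothetical helper that assumes whitespace tokenization, as the notes on this page describe, and the library may represent tokens differently.

```ruby
require 'set'

# Hypothetical stand-in for the library's tokenizer: split on whitespace.
def tokenize_ws(string)
  Set.new(string.split)
end

tokens = tokenize_ws('login failed')

# Set#merge adds the new tokens to the receiver in place (a union),
# so the token set grows instead of shrinking as it does with #refine!.
tokens.merge(tokenize_ws('login succeeded'))

merged = tokens.to_a.sort
puts merged.inspect
```

Because the token set changes, any cached hash has to be discarded, which is what resetting @hash_cache in the method above does.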

#==(other) ⇒ Object

Parameters:

  • other (Signature)



# File 'lib/arachni/support/signature.rb', line 107

def ==( other )
    hash == other.hash
end

#differences(other) ⇒ Float

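Returns the ratio of difference between signatures.

Parameters:

  • other (Signature)

Returns:

  • (Float)

    Ratio of difference between signatures: the number of tokens found in only one of the two signatures, divided by the number of tokens found in either. The result ranges from 0 (equal signatures) to 1 (no tokens in common, or other is nil).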



# File 'lib/arachni/support/signature.rb', line 74

def differences( other )
    return 1 if other.nil?
    return 0 if self == other

    ((tokens - other.tokens) | (other.tokens - tokens)).size /
        Float((other.tokens | tokens).size)
end
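
To make the ratio concrete, the sketch below repeats the same arithmetic on plain Ruby Set objects. It is not Arachni's implementation: `tokenize_ws` and `difference_ratio` are hypothetical helpers, and whitespace tokenization is assumed.

```ruby
require 'set'

# Hypothetical stand-in for the library's tokenizer: split on whitespace.
def tokenize_ws(string)
  Set.new(string.split)
end

# Same arithmetic as #differences: tokens unique to either side,
# divided by the tokens present in either side.
def difference_ratio(a, b)
  ((a - b) | (b - a)).size / Float((a | b).size)
end

a = tokenize_ws('welcome back alice your session is active')
b = tokenize_ws('welcome back bob your session is active')

# Only "alice" and "bob" differ (2 tokens); the union holds 8 tokens.
ratio = difference_ratio(a, b)
puts ratio # => 0.25
```

Note that the sketch skips the early returns of the real method, so it does not handle a nil argument or two empty token sets.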

#dup ⇒ Signature

Returns a copy of self.

Returns:

  • (Signature)

    Copy of self.



# File 'lib/arachni/support/signature.rb', line 98

def dup
    self.class.new( '' ).tap { |s| s.copy( @hash_cache, tokens, @options ) }
end

#empty? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/arachni/support/signature.rb', line 92

def empty?
    @tokens.empty?
end

#hash ⇒ Object



# File 'lib/arachni/support/signature.rb', line 102

def hash
    @hash_cache ||= tokens.hash
end

#refine(data) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

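Returns a new, refined signature, leaving self unchanged.

Parameters:

  • data (String, Signature)

    Data to use to refine the signature.

Returns:

  • (Signature)

    New, refined signature.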



# File 'lib/arachni/support/signature.rb', line 66

def refine( data )
    dup.refine!( data )
end

#refine!(data) ⇒ Signature

Note:

The string will be tokenized based on whitespace.

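Returns self, refined in place so that only the tokens also present in data remain.

Parameters:

  • data (String, Signature)

    Data to use to refine the signature.

Returns:

  • (Signature)

    self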



# File 'lib/arachni/support/signature.rb', line 47

def refine!( data )
    @hash_cache = nil
    @tokens &= tokenize( data )
    self
end
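
The effect of repeated refinement can be sketched with plain Ruby Set objects. This is an illustration under assumptions rather than Arachni's code: `tokenize_ws` is a hypothetical helper, whitespace tokenization is assumed, and the sample strings are invented.

```ruby
require 'set'

# Hypothetical stand-in for the library's tokenizer: split on whitespace.
def tokenize_ws(string)
  Set.new(string.split)
end

# Three responses that differ only in a volatile request ID.
tokens = tokenize_ws('page not found request id 1001')

# Each refinement keeps only the tokens shared with the new sample
# (set intersection), so the changing ID drops out as noise.
tokens &= tokenize_ws('page not found request id 2002')
tokens &= tokenize_ws('page not found request id 3003')

refined = tokens.to_a.sort
puts refined.inspect
```

After the first intersection the ID is already gone; further samples can only shrink the set or leave it as is.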

#similar?(other, threshold = @options[:threshold]) ⇒ Bool

Parameters:

  • other (Signature)
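  • threshold (Numeric) (defaults to: @options[:threshold])

    Threshold of differences. The signatures are considered similar when they are equal or when #differences returns a ratio strictly below this value. An error is raised if no threshold is given.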

Returns:

  • (Bool)
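
The threshold check can be sketched as follows, again using plain Ruby Set objects instead of Arachni's class. `tokenize_ws`, `difference_ratio` and `similar_tokens?` are hypothetical helpers, whitespace tokenization is assumed, and the sketch compares sets directly where the library compares hashes.

```ruby
require 'set'

# Hypothetical stand-in for the library's tokenizer: split on whitespace.
def tokenize_ws(string)
  Set.new(string.split)
end

# Ratio of tokens unique to either side over tokens present in either side.
def difference_ratio(a, b)
  ((a - b) | (b - a)).size / Float((a | b).size)
end

# Mirrors the documented check: equal, or strictly below the threshold.
def similar_tokens?(a, b, threshold)
  fail 'No threshold given.' unless threshold
  a == b || difference_ratio(a, b) < threshold
end

a = tokenize_ws('welcome back alice your session is active')
b = tokenize_ws('welcome back bob your session is active')

# The ratio for these two samples is 2 / 8.0.
loose  = similar_tokens?(a, b, 0.3)  # ratio is below the threshold
strict = similar_tokens?(a, b, 0.25) # ratio equals the threshold, so not similar
puts loose, strict
```

Since the comparison is strict, a threshold has to be slightly larger than the largest ratio that should still count as similar.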


# File 'lib/arachni/support/signature.rb', line 87

def similar?( other, threshold = @options[:threshold] )
    fail 'No threshold given.' if !threshold
    self == other || differences( other ) < threshold
end