Class: Chars::CharSet

Inherits:
Set
  • Object
Defined in:
lib/chars/char_set.rb

Constructor Details

#initialize(*arguments) ⇒ CharSet

Creates a new CharSet object.

Parameters:

  • arguments (Array<String, Integer, Enumerable>)

    The chars for the CharSet.

Raises:

  • (TypeError)

    One of the arguments was not a String, Integer or Enumerable.



# File 'lib/chars/char_set.rb', line 17

def initialize(*arguments)
  super()

  @chars = Hash.new do |hash,key|
    hash[key] = if key > 0xff
                  key.chr(Encoding::UTF_8)
                else
                  key.chr(Encoding::ASCII_8BIT)
                end
  end

  arguments.each do |subset|
    case subset
    when String, Integer
      self << subset
    when Enumerable
      subset.each { |char| self << char }
    else
      raise(TypeError,"arguments must be a String, Integer or Enumerable")
    end
  end
end
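
The `@chars` table built by the constructor is a lazily populated cache that maps a byte (or code point) to its String form. The sketch below reproduces only that Hash outside the class so its behavior can be observed in isolation; the local variable names are illustrative and not part of the library.

```ruby
# Stand-alone copy of the default-block Hash from #initialize.
# Values up to 0xff become single-byte binary strings; larger values are
# treated as Unicode code points and encoded as UTF-8.
chars = Hash.new do |hash, key|
  hash[key] = if key > 0xff
                key.chr(Encoding::UTF_8)
              else
                key.chr(Encoding::ASCII_8BIT)
              end
end

letter = chars[0x41]   # computed and cached on first access
alpha  = chars[0x3b1]  # GREEK SMALL LETTER ALPHA, encoded as UTF-8

puts letter.inspect
puts alpha.inspect
puts chars.size        # only the two keys looked up so far are stored
```

Because entries are created on demand, the table only ever holds the values that have actually been requested or appended.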

Class Method Details

.[](*arguments) ⇒ CharSet

Creates a new Chars::CharSet.

Parameters:

  • arguments (Array<String, Integer, Enumerable>)

    The chars for the CharSet.

Returns:

  • (CharSet)

    The new character set.

See Also:

  • #initialize

Since:

  • 0.2.1



# File 'lib/chars/char_set.rb', line 65

def self.[](*arguments)
  new(*arguments)
end

Instance Method Details

#<<(other) ⇒ CharSet

Adds a character to the set.

Parameters:

  • other (String, Integer)

    The character(s) or byte to add.

Returns:

  • (CharSet)

    The modified character set.

Raises:

  • (TypeError)

    The argument was not a String or Integer.

Since:

  • 0.2.1



# File 'lib/chars/char_set.rb', line 83

def <<(other)
  case other
  when String
    other.each_char do |char|
      byte = char.ord

      @chars[byte] = char
      super(byte)
    end

    return self
  when Integer
    super(other)
  else
    raise(TypeError,"can only append Strings and Integers")
  end
end

#===(other) ⇒ Boolean Also known as: =~

Compares the bytes within a given string with the bytes of the Chars::CharSet.

Examples:

Chars.alpha === "hello"
# => true

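Parameters:

  • other (String, Enumerable)

    The string, or the collection of characters and bytes, to compare against the Chars::CharSet.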

Returns:

  • (Boolean)

    Specifies whether all of the bytes within the given string are included in the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 686

def ===(other)
  case other
  when String
    other.each_char.all? { |char| include_char?(char) }
  when Enumerable
    other.all? do |element|
      case element
      when String
        include_char?(element)
      when Integer
        include_byte?(element)
      end
    end
  else
    false
  end
end
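
The matching rules in the listing can be tried without the gem using a plain Set of code points. This is a hedged sketch: `matches?` and `alpha` are hypothetical stand-ins, and membership is reduced to a code point lookup rather than the library's `include_char?` and `include_byte?`.

```ruby
require 'set'

# Hypothetical stand-in for CharSet#===, operating on a Set of code points.
def matches?(set, other)
  case other
  when String
    # Every character of the string must be in the set.
    other.each_char.all? { |char| set.include?(char.ord) }
  when Enumerable
    # Elements may be single-character Strings or Integer bytes.
    other.all? do |element|
      case element
      when String  then !element.empty? && set.include?(element.ord)
      when Integer then set.include?(element)
      else false
      end
    end
  else
    false
  end
end

alpha = Set.new((('a'..'z').to_a + ('A'..'Z').to_a).map(&:ord))

puts matches?(alpha, "hello")
puts matches?(alpha, "hello1")
puts matches?(alpha, [0x41, "b"])
puts matches?(alpha, 42)
```

Note that an empty String matches vacuously, since `all?` over zero characters is true.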

#chars ⇒ Array<String>

The characters within the Chars::CharSet.

Returns:

  • (Array<String>)

    All the characters within the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 131

def chars
  map { |byte| @chars[byte] }
end

#each_char {|char| ... } ⇒ Enumerator

Iterates over every character within the Chars::CharSet.

Yields:

  • (char)

    If a block is given, it will be passed each character in the Chars::CharSet.

Yield Parameters:

  • char (String)

    Each character in the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 148

def each_char
  return enum_for(__method__) unless block_given?

  each { |byte| yield @chars[byte] }
end

#each_random_byte(n, **kwargs) {|byte| ... } ⇒ Enumerator

Passes random bytes from the Chars::CharSet to a given block.

Parameters:

  • n (Integer)

    Specifies how many times to pass a random byte to the block.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Yields:

  • (byte)

    The block will receive the random bytes.

Yield Parameters:

  • byte (Integer)

    The random byte from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 238

def each_random_byte(n,**kwargs,&block)
  return enum_for(__method__,n,**kwargs) unless block_given?

  n.times do
    yield random_byte(**kwargs)
  end
  return nil
end

#each_random_char(n, **kwargs) {|char| ... } ⇒ Enumerator

Passes random characters from the Chars::CharSet to a given block.

Parameters:

  • n (Integer)

    Specifies how many times to pass a random character to the block.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Yields:

  • (char)

    The block will receive the random characters.

Yield Parameters:

  • char (String)

    The random character from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an enumerator object will be returned.



# File 'lib/chars/char_set.rb', line 268

def each_random_char(n,**kwargs,&block)
  return enum_for(__method__,n,**kwargs) unless block_given?

  each_random_byte(n,**kwargs) do |byte|
    yield @chars[byte]
  end
end

#each_string_of_length(length) {|string| ... } ⇒ Enumerator

Enumerates through every possible string belonging to the Chars::CharSet and of the given length.

Parameters:

  • length (Range, Array, Integer)

    The desired length(s) of each string.

Yields:

  • (string)

    The given block will be passed each sequential string.

Yield Parameters:

  • string (String)

    A string made up of characters from the Chars::CharSet.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator will be returned.

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 625

def each_string_of_length(length,&block)
  return enum_for(__method__,length) unless block

  case length
  when Range, Array
    length.each do |len|
      StringEnumerator.new(self,len).each(&block)
    end
  else
    StringEnumerator.new(self,length).each(&block)
  end
end
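
`StringEnumerator` is not shown on this page, so the following is only an approximation of the idea: enumerating every string of a given length over a small alphabet with `Array#repeated_permutation`. The helper name `all_strings` is hypothetical, and the real enumerator may yield strings in a different order.

```ruby
# Hypothetical helper: builds every string of the requested length(s)
# from the given characters. Accepts an Integer, Range, or Array of lengths,
# mirroring the argument types documented for #each_string_of_length.
def all_strings(chars, length)
  lengths = case length
            when Range, Array then length.to_a
            else                   [length]
            end

  lengths.flat_map do |len|
    chars.repeated_permutation(len).map(&:join)
  end
end

pairs = all_strings(%w[a b], 2)
puts pairs.inspect

# For n characters, each length L contributes n**L strings.
puts all_strings(%w[a b], 1..2).size
```

The count grows exponentially with the length, which is why the library method returns an Enumerator instead of building an Array.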

#each_substring(data, **kwargs) ⇒ Enumerator

Enumerates over all substrings within the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator object will be returned.

See Also:

  • #each_substring_with_index

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 521

def each_substring(data,**kwargs)
  return enum_for(__method__,data,**kwargs) unless block_given?

  each_substring_with_index(data,**kwargs) do |substring,index|
    yield substring
  end
end

#each_substring_with_index(data, min_length: 4) {|match, index| ... } ⇒ Enumerator

Enumerates over all substrings, and their indices, within the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

  • min_length (Integer) (defaults to: 4)

    The minimum length of sub-strings found within the given data.

Yields:

  • (match, index)

    The given block will be passed every matched sub-string and its index.

Yield Parameters:

  • match (String)

    A sub-string containing the characters from the Chars::CharSet.

  • index (Integer)

    The index the sub-string was found at.

Returns:

  • (Enumerator)

    If no block is given, an Enumerator object will be returned.

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 435

def each_substring_with_index(data, min_length: 4)
  unless block_given?
    return enum_for(__method__,data, min_length: min_length)
  end

  return if data.size < min_length

  index = 0

  match_start = nil
  match_end   = nil

  while index < data.size
    unless match_start
      if self.include_char?(data[index])
        match_start = index
      end
    else
      unless self.include_char?(data[index])
        match_end    = index
        match_length = (match_end - match_start)

        if match_length >= min_length
          match = data[match_start,match_length]

          yield match, match_start
        end

        match_start = match_end = nil
      end
    end

    index += 1
  end

  # yield the remaining match, if it is long enough
  if match_start && (data.size - match_start) >= min_length
    yield data[match_start, data.size - match_start], match_start
  end
end
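
The scan can be illustrated without the gem by collecting runs of allowed characters from a String. This is a minimal sketch under stated assumptions: `runs_with_index` is a hypothetical helper, membership is tested against a plain Set of code points, results are returned as an Array rather than yielded, and the minimum-length check is applied to every run, including one that reaches the end of the data.

```ruby
require 'set'

# Hypothetical helper: returns [substring, start_index] pairs for each
# maximal run of characters whose code points are in `allowed`.
def runs_with_index(data, allowed, min_length: 4)
  runs      = []
  run_start = nil

  data.each_char.with_index do |char, index|
    if allowed.include?(char.ord)
      run_start ||= index          # a new run begins here
    elsif run_start
      length = index - run_start   # the run ended just before this index
      runs << [data[run_start, length], run_start] if length >= min_length
      run_start = nil
    end
  end

  # A run may extend to the end of the data.
  if run_start && (data.size - run_start) >= min_length
    runs << [data[run_start, data.size - run_start], run_start]
  end

  runs
end

digits = Set.new(('0'..'9').map(&:ord))
found  = runs_with_index("id=12345;pin=99;tel=5550100", digits)
puts found.inspect
```

With the default `min_length` of 4, the two-digit run `99` is skipped; passing `min_length: 2` includes it.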

#include_char?(char) ⇒ Boolean

Determines if a character is contained within the Chars::CharSet.

Parameters:

  • char (String)

    The character to search for.

Returns:

  • (Boolean)

    Specifies whether the character is contained within the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 117

def include_char?(char)
  unless char.empty?
    @chars.has_value?(char) || include_byte?(char.ord)
  else
    false
  end
end

#initialize_copy(other) ⇒ Object

Initializes the copy of another Chars::CharSet object.

Parameters:

  • other (CharSet)

    The character set being copied.



# File 'lib/chars/char_set.rb', line 46

def initialize_copy(other)
  super(other)

  @chars = other.instance_variable_get('@chars').dup
end

#inspect ⇒ String

Inspects the Chars::CharSet.

Returns:

  • (String)

    The inspected Chars::CharSet.



# File 'lib/chars/char_set.rb', line 712

def inspect
  "#<#{self.class.name}: {" + map { |byte|
    case byte
    when (0x07..0x0d), (0x20..0x7e)
      @chars[byte].dump
    when 0x00
      # sly hack to make char-sets more friendly
      # to us C programmers
      '"\0"'
    else
      sprintf("0x%02x",byte)
    end
  }.join(', ') + "}>"
end
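
The per-byte formatting rule used by `#inspect` can be tried on its own. In this sketch `format_byte` is a hypothetical helper that applies the same three cases to a single Integer, using `Integer#chr` in place of the `@chars` lookup.

```ruby
# Hypothetical helper mirroring the case expression in #inspect.
def format_byte(byte)
  case byte
  when 0x07..0x0d, 0x20..0x7e
    byte.chr.dump            # printable or common escape: quoted form
  when 0x00
    '"\0"'                   # NUL shown in C style
  else
    format("0x%02x", byte)   # everything else as a hex literal
  end
end

rendered = [0x00, 0x0a, 0x41, 0x1b, 0xff].map { |byte| format_byte(byte) }
puts "{" + rendered.join(', ') + "}"
```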

#map_chars {|char| ... } ⇒ Array<String>

Maps the characters of the Chars::CharSet.

Yields:

  • (char)

    The given block will be used to transform the characters within the Chars::CharSet.

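Yield Parameters:

  • char (String)

    The character to map.

Returns:

  • (Array<String>)

    The mapped characters of the Chars::CharSet.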



# File 'lib/chars/char_set.rb', line 184

def map_chars(&block)
  each_char.map(&block)
end

#random_byte(random: Random) ⇒ Integer

Returns a random byte from the Chars::CharSet.

Parameters:

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Integer)

    A random byte value.



# File 'lib/chars/char_set.rb', line 197

def random_byte(random: Random)
  self.entries[random.rand(self.length)]
end

#random_bytes(length, random: Random) ⇒ Array<Integer>

Creates an Array of random bytes from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random bytes.

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Array<Integer>)

    The randomly selected bytes.



# File 'lib/chars/char_set.rb', line 288

def random_bytes(length, random: Random)
  case length
  when Array
    Array.new(length.sample(random: random)) do
      random_byte(random: random)
    end
  when Range
    Array.new(random.rand(length)) do
      random_byte(random: random)
    end
  else
    Array.new(length) { random_byte(random: random) }
  end
end
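
The `random:` keyword makes the selection reproducible when a seeded `Random` instance is supplied. The sketch below imitates the length handling on a plain Array of bytes; `pick_bytes` is a hypothetical helper, not the library method.

```ruby
require 'securerandom'

# Hypothetical stand-in for CharSet#random_bytes over an Array of bytes.
def pick_bytes(bytes, length, random: Random)
  count = case length
          when Array then length.sample(random: random)  # one of the listed lengths
          when Range then random.rand(length)            # any length in the range
          else            length
          end

  Array.new(count) { bytes[random.rand(bytes.length)] }
end

lower = (0x61..0x7a).to_a  # bytes for "a".."z"

# The same seed gives the same sequence of bytes.
first  = pick_bytes(lower, 8, random: Random.new(1234))
second = pick_bytes(lower, 8, random: Random.new(1234))
puts first == second

# A Range chooses the length at random as well.
ranged = pick_bytes(lower, 4..8, random: Random.new(99))
puts ranged.size

# SecureRandom responds to rand, so it can be passed in the same way.
secure = pick_bytes(lower, 3, random: SecureRandom)
puts secure.size
```

A seeded generator suits tests and fixtures, while `SecureRandom` is the safer choice when the output must be unpredictable.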

#random_char(**kwargs) ⇒ String

Returns a random character from the Chars::CharSet.

Parameters:

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    A random char value.



# File 'lib/chars/char_set.rb', line 213

def random_char(**kwargs)
  @chars[random_byte(**kwargs)]
end

#random_chars(length, **kwargs) ⇒ Array<String>

Creates an Array of random characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (Array<String>)

    The randomly selected characters.



# File 'lib/chars/char_set.rb', line 343

def random_chars(length,**kwargs)
  random_bytes(length,**kwargs).map { |byte| @chars[byte] }
end

#random_distinct_bytes(length, random: Random) ⇒ Array<Integer>

Creates an Array of random non-repeating bytes from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random non-repeating bytes.

  • random (Random, SecureRandom) (defaults to: Random)

    The random number generator to use.

Returns:

  • (Array<Integer>)

    The randomly selected non-repeating bytes.



# File 'lib/chars/char_set.rb', line 315

def random_distinct_bytes(length, random: Random)
  shuffled_bytes = bytes.shuffle(random: random)

  case length
  when Array
    shuffled_bytes[0,length.sample(random: random)]
  when Range
    shuffled_bytes[0,random.rand(length)]
  else
    shuffled_bytes[0,length]
  end
end
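
Because the method shuffles the available bytes and then slices, the result can never be longer than the set itself. A short sketch of that behavior, using a plain Array as a stand-in for the bytes of a character set:

```ruby
digits = (0x30..0x39).to_a  # bytes for "0".."9"

# Shuffle once, then take a prefix: no byte can appear twice.
four = digits.shuffle(random: Random.new(7))[0, 4]
puts four.uniq.size

# Asking for more bytes than exist returns every byte exactly once.
all = digits.shuffle(random: Random.new(7))[0, 20]
puts all.size
```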

#random_distinct_chars(length, **kwargs) ⇒ Array<String>

Creates an Array of random non-repeating characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the Array of random non-repeating characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (Array<String>)

    The randomly selected non-repeating characters.



# File 'lib/chars/char_set.rb', line 385

def random_distinct_chars(length,**kwargs)
  random_distinct_bytes(length,**kwargs).map { |byte| @chars[byte] }
end

#random_distinct_string(length, **kwargs) ⇒ String

Creates a String containing randomly selected non-repeating characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the String of random non-repeating characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    The String of randomly selected non-repeating characters.

See Also:

  • #random_distinct_chars



# File 'lib/chars/char_set.rb', line 407

def random_distinct_string(length,**kwargs)
  random_distinct_chars(length,**kwargs).join
end

#random_string(length, **kwargs) ⇒ String

Creates a String containing randomly selected characters from the Chars::CharSet.

Parameters:

  • length (Integer, Array, Range)

    The length of the String of random characters.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments.

Options Hash (**kwargs):

  • :random (Random, SecureRandom)

    The random number generator to use.

Returns:

  • (String)

    The String of randomly selected characters.

See Also:

  • #random_chars



# File 'lib/chars/char_set.rb', line 365

def random_string(length,**kwargs)
  random_chars(length,**kwargs).join
end

#select_chars {|char| ... } ⇒ Array<String>

Selects characters from the Chars::CharSet.

Yields:

  • (char)

    If a block is given, it will be used to select the characters from the Chars::CharSet.

Yield Parameters:

  • char (String)

    The character to select or reject.

Returns:

  • (Array<String>)

    The selected characters from the Chars::CharSet.



# File 'lib/chars/char_set.rb', line 167

def select_chars(&block)
  each_char.select(&block)
end

#strings_in(data, options = {}) {|match, (index)| ... } ⇒ Array, Hash

Finds sub-strings within given data that are made of characters within the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

  • options (Hash) (defaults to: {})

    Additional options.

Options Hash (options):

  • :length (Integer) — default: 4

    The minimum length of sub-strings found within the given data.

  • :offsets (Boolean) — default: false

    Specifies whether to return a Hash of offsets and matched sub-strings within the data, or to just return the matched sub-strings themselves.

Yields:

  • (match, (index))

    The given block will be passed every matched sub-string, and the optional index.

Yield Parameters:

  • match (String)

    A sub-string containing the characters from the Chars::CharSet.

  • index (Integer)

    The index the sub-string was found at.

Returns:

  • (Array, Hash)

    If no block is given, an Array or Hash of sub-strings is returned.



# File 'lib/chars/char_set.rb', line 588

def strings_in(data,options={},&block)
  kwargs = {min_length: options.fetch(:length,4)}

  unless block
    if options[:offsets]
      return Hash[substrings_with_indexes(data,**kwargs)]
    else
      return substrings(data,**kwargs)
    end
  end

  case block.arity
  when 2
    each_substring_with_index(data,**kwargs,&block)
  else
    each_substring(data,**kwargs,&block)
  end
end
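
The listing decides whether to pass the index by inspecting the block's arity. The sketch below isolates that dispatch with a hypothetical `each_match` helper that walks a precomputed list of `[match, index]` pairs.

```ruby
# Hypothetical helper: passes the index only to blocks that declare
# exactly two parameters, as the arity check in #strings_in does.
def each_match(pairs, &block)
  case block.arity
  when 2 then pairs.each { |match, index| block.call(match, index) }
  else        pairs.each { |match, _index| block.call(match) }
  end
end

pairs = [["12345", 3], ["5550100", 20]]

with_index = []
each_match(pairs) { |match, index| with_index << [index, match] }

plain = []
each_match(pairs) { |match| plain << match }

puts with_index.inspect
puts plain.inspect
```

A block with an optional second parameter, such as `{ |match, index = nil| ... }`, reports a negative arity and therefore takes the single-argument path.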

#strings_of_length(length) ⇒ Enumerator

Returns an Enumerator that enumerates through every possible string belonging to the Chars::CharSet and of the given length.

Parameters:

  • length (Range, Array, Integer)

    The desired length(s) of each string.

Returns:

  • (Enumerator)

See Also:

  • #each_string_of_length


# File 'lib/chars/char_set.rb', line 649

def strings_of_length(length)
  each_string_of_length(length)
end

#substrings(data, **kwargs) ⇒ Array<String>

Returns an Array of all substrings within the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Array<String>)

    The array of substrings within the given data.

See Also:

  • #each_substring

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 549

def substrings(data,**kwargs)
  each_substring(data,**kwargs).to_a
end

#substrings_with_indexes(data, **kwargs) ⇒ Array<(String, Integer)>

Returns an Array of all substrings, and their indices, within the given string that are at least the minimum length and are made up of characters from the Chars::CharSet.

Parameters:

  • data (String)

    The data to find sub-strings within.

Options Hash (**kwargs):

  • :min_length (Integer)

    The minimum length of sub-strings found within the given data.

Returns:

  • (Array<(String, Integer)>)

    The array of substrings and their indices within the given data.

See Also:

  • #each_substring_with_index

Since:

  • 0.3.0



# File 'lib/chars/char_set.rb', line 497

def substrings_with_indexes(data,**kwargs)
  each_substring_with_index(data,**kwargs).to_a
end

#|(set) ⇒ CharSet Also known as: +

Creates a new CharSet object by unioning the Chars::CharSet with another Chars::CharSet.

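Parameters:

  • set (CharSet, String, Integer, Enumerable)

    The other character set to union with. Any value that is not a CharSet is first converted with CharSet.new.

Returns:

  • (CharSet)

    The new character set containing the union of both sets.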



# File 'lib/chars/char_set.rb', line 663

def |(set)
  set = CharSet.new(set) unless set.kind_of?(CharSet)

  return super(set)
end