Method: Wordlist::Builder#initialize

Defined in:
lib/wordlist/builder.rb

#initialize(path, format: Format.infer(path), append: false, **kwargs) ⇒ Builder

Creates a new word-list Builder object.

Parameters:

  • path (String)

    The path of the wordlist file.

  • format (:txt, :gz, :bzip2, :xz, :zip, :7zip, nil) (defaults to: Format.infer(path))

    The format of the wordlist. If not given, the format is inferred from the file extension.

  • append (Boolean) (defaults to: false)

    If true, new words are appended to the existing wordlist file; if false, the file is overwritten.

  • kwargs (Hash{Symbol => Object})

    Additional keyword arguments for Lexer#initialize.
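
    To illustrate what the append flag means for a plain-text wordlist, here is a minimal sketch using only Ruby's File API. The helpers write_words and demo_append_vs_overwrite are hypothetical and not part of Wordlist::Builder; how the Builder actually opens the file is not shown on this page, so treat this as a picture of the semantics only.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Wordlist::Builder): writes words to a
# plain-text file, appending to or overwriting whatever is already there.
def write_words(path, words, append: false)
  mode = append ? 'a' : 'w' # 'a' keeps existing lines, 'w' truncates first
  File.open(path, mode) do |file|
    words.each { |word| file.puts(word) }
  end
end

# Runs both modes against a temporary file and returns the file contents
# observed after the append and after the overwrite.
def demo_append_vs_overwrite
  Tempfile.create(['words', '.txt']) do |tmp|
    path = tmp.path

    write_words(path, %w[foo bar])
    write_words(path, %w[baz], append: true)
    appended = File.readlines(path, chomp: true)

    write_words(path, %w[qux])
    overwritten = File.readlines(path, chomp: true)

    [appended, overwritten]
  end
end

appended, overwritten = demo_append_vs_overwrite
puts appended.inspect
puts overwritten.inspect
```

    With append: true the earlier words survive the second write; with the default append: false only the most recent words remain.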

Options Hash (**kwargs):

  • :lang (Symbol)

    The language to use. Defaults to Lexer::Lang.default.

  • :stop_words (Array<String>)

    The explicit stop-words to ignore. If not given, default stop words will be loaded based on lang or Lexer::Lang.default.

  • :ignore_words (Array<String, Regexp>)

    Optional list of words to ignore. Can contain Strings or Regexps.

  • :digits (Boolean) — default: true

    Controls whether parsed words may contain digits or not.

  • :special_chars (Array<String>) — default: Lexer::SPECIAL_CHARS

    The additional special characters allowed within words.

  • :numbers (Boolean) — default: false

    Controls whether whole numbers will be parsed as words.

  • :acronyms (Boolean) — default: true

    Controls whether acronyms will be parsed as words.

  • :normalize_case (Boolean) — default: false

    Controls whether to convert all words to lowercase.

  • :normalize_apostrophes (Boolean) — default: false

    Controls whether apostrophes will be removed from the end of words.

  • :normalize_acronyms (Boolean) — default: false

    Controls whether acronyms will have . characters removed.

Raises:

  • (ArgumentError)

    The format could not be inferred from the file extension, or the ignore_words keyword contained a value other than a String or Regexp.
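
    The two failure cases above can be sketched in isolation. The extension table and the helper names sketch_infer_format and sketch_validate_ignore_words are assumptions made for illustration; the real mapping and checks live in Wordlist::Format and Lexer and may differ in detail.

```ruby
# Hypothetical extension table, modeled on the format symbols listed for the
# format parameter. The library's real table may use different keys.
SKETCH_FORMATS = {
  '.txt' => :txt,
  '.gz'  => :gz,
  '.bz2' => :bzip2,
  '.xz'  => :xz,
  '.zip' => :zip,
  '.7z'  => :'7zip'
}.freeze

# Looks up the format by file extension, raising ArgumentError when the
# extension is unknown or missing.
def sketch_infer_format(path)
  SKETCH_FORMATS.fetch(File.extname(path)) do
    raise ArgumentError, "could not infer the format of file: #{path.inspect}"
  end
end

# Accepts only Strings and Regexps, raising ArgumentError for anything else.
def sketch_validate_ignore_words(ignore_words)
  ignore_words.each do |pattern|
    next if pattern.is_a?(String) || pattern.is_a?(Regexp)

    raise ArgumentError,
          "ignore_words must contain only Strings or Regexps: #{pattern.inspect}"
  end
end

puts sketch_infer_format('words.txt.gz').inspect
sketch_validate_ignore_words(['the', /\Ahttp/])

begin
  sketch_infer_format('words.dat')
rescue ArgumentError => error
  puts "ArgumentError: #{error.message}"
end
```

    Passing format: explicitly sidesteps inference, which is useful when a wordlist file has no extension or an unconventional one.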

Since:

  • 1.0.0



# File 'lib/wordlist/builder.rb', line 90

def initialize(path, format: Format.infer(path), append: false, **kwargs)
  @path   = ::File.expand_path(path)
  @format = format
  @append = append

  @lexer = Lexer.new(**kwargs)
  @unique_filter = UniqueFilter.new

  load! if append? && ::File.file?(@path)
  open!
end
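
The keyword-forwarding pattern in the source above can be tried without the gem installed. The sketch below uses stand-in classes, SketchLexer and SketchBuilder, which are hypothetical and implement only a few of the documented options; the :en default for lang is an assumption, and no file is loaded or opened.

```ruby
# Stand-in for the lexer: accepts a handful of the documented keyword options.
# The :en default is an assumption made for this sketch.
class SketchLexer
  attr_reader :lang, :digits, :normalize_case

  def initialize(lang: :en, digits: true, normalize_case: false)
    @lang           = lang
    @digits         = digits
    @normalize_case = normalize_case
  end
end

# Stand-in for the builder: keeps its own keywords (format, append) and
# forwards every remaining keyword to the lexer, as the source above does.
class SketchBuilder
  attr_reader :path, :format, :lexer

  def initialize(path, format: :txt, append: false, **kwargs)
    @path   = File.expand_path(path) # relative paths become absolute
    @format = format
    @append = append
    @lexer  = SketchLexer.new(**kwargs)
  end

  def append?
    @append
  end
end

builder = SketchBuilder.new('words.txt', append: true, normalize_case: true)
puts builder.path
puts builder.append?
puts builder.lexer.normalize_case

begin
  SketchBuilder.new('words.txt', bogus: 1)
rescue ArgumentError => error
  # Unknown keywords are rejected by the lexer's own initializer.
  puts "ArgumentError: #{error.message}"
end
```

Because the builder does not inspect the extra keywords itself, any option the lexer does not recognize surfaces as an ArgumentError from the lexer's initializer in this sketch.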