Demystify

Demystify is a gem to help you deal with text, for text analysis or NLP projects.

Installation

Add this line to your application's Gemfile:

  gem 'demystify'

And then execute:

$ bundle

Or install it yourself as:

$ gem install demystify

Usage

Make a Text object using your text file.

  text = Demystify::Text.new('./my_text_file.txt')

Get an array of all characters, words or sentences:

  text.chars
  text.words
  text.sentences

Count the number of all characters, spaces, new lines, non-whitespace characters, punctuation, symbols, letters, non-letters, words and sentences:

  text.char_count
  text.spaces_count
  text.new_line_count
  text.non_whitespace_char_count
  text.punctuation_count
  text.symbol_count
  text.letter_count
  text.non_letter_count
  text.word_count
  text.sentence_count

Check for the number of occurrences of a particular sequence of characters:

text.sequence_count(sequence)

Get the first word or last word of every sentence in an array:

text.first_words
text.last_words

Get the average length of a word or average number of words per sentence:

text.average_word_length
text.average_sentence_length

Get a hash of every word in the text of pointing to an array of all of its following or preceding words in the text:

text.forwards_probability_hash
text.backwards_probability_hash