IndoorVoice: Lowercase all-caps strings excluding acronyms

DOES YOUR DATA CONTAIN ALL-CAPS TEXT THAT YOU WISH WAS PROPERLY CASED?

Have your data use its indoor voice.

require 'open-uri'

require 'indoor_voice'

# You can use any word list. Here we use Scrabble words. 
url = 'https://scrabblehelper.googlecode.com/svn/trunk/ScrabbleHelper/src/dictionaries/TWL06.txt'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer opens URLs.
words = URI.open(url).readlines.map(&:chomp)

# You can use any language. :en is the BCP 47 code for English.
model = IndoorVoice.new(words, :en)
model.setup # wait a moment

model.downcase('HP, IBM AND MICROSOFT ARE TECHNOLOGY CORPORATIONS.')
# => "HP, IBM and microsoft are technology corporations."

model.titlecase('HP, IBM AND MICROSOFT ARE TECHNOLOGY CORPORATIONS.')
# => "HP, IBM And Microsoft Are Technology Corporations."

model.titlecase('HP, IBM AND MICROSOFT ARE TECHNOLOGY CORPORATIONS.', except: %w(a an and as at but by en for if in of on or the to via))
# => "HP, IBM and Microsoft Are Technology Corporations."

model.titlecase('HP, IBM AND MICROSOFT ARE TECHNOLOGY CORPORATIONS.', except: words)
# => "HP, IBM and Microsoft are technology corporations."

This gem is magic.

IndoorVoice is based on the assumption that most acronyms contain character sequences that do not occur in ordinary words. For example, no English word ends in the character sequence bm, so IBM must be an acronym.
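The idea can be sketched in a few lines of Ruby. This is only an illustration under simplifying assumptions, not the gem's actual implementation: it looks at word-final two-letter sequences only, and the helper names final_bigrams and acronym_like? are hypothetical, as is the tiny word list.

```ruby
require 'set'

# Hypothetical helper (not part of the IndoorVoice API): collect every
# two-letter sequence that ends at least one word in the list.
def final_bigrams(words)
  words.each_with_object(Set.new) do |word, set|
    set << word.downcase[-2..-1] if word.length >= 2
  end
end

# A token looks like an acronym if its last two letters never end a known word.
def acronym_like?(token, endings)
  token.length >= 2 && !endings.include?(token.downcase[-2..-1])
end

# Toy word list for illustration; a real list would be far larger.
words = %w(and are corporations technology lamb)
endings = final_bigrams(words)

acronym_like?('IBM', endings)        # => true  ("bm" ends no word in the list)
acronym_like?('TECHNOLOGY', endings) # => false ("gy" ends "technology")
```

With a small word list almost any unfamiliar ending looks like an acronym, so the quality of the result depends heavily on the coverage of the list.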

Once you have a string with only acronyms in uppercase, you can (in your own code) selectively uppercase letters, such as the first letter of each sentence or the first letter of each word. Because most titlecasing gems recase acronyms, IndoorVoice also provides the titlecase method shown above.
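As a rough sketch of that kind of post-processing, the following uppercases the first letter of each sentence in a string that has already been lowercased. The helper capitalize_sentences is hypothetical and not part of the gem, and it assumes that sentences end in ".", "!" or "?" followed by whitespace.

```ruby
# Hypothetical helper: uppercase a lowercase letter at the start of the string
# or right after sentence-ending punctuation and whitespace. Letters that are
# already uppercase (such as acronyms) are left untouched.
def capitalize_sentences(text)
  text.gsub(/(\A|[.!?]\s+)(\p{Ll})/) { "#{$1}#{$2.upcase}" }
end

capitalize_sentences('the report is due. IBM will respond. thanks!')
# => "The report is due. IBM will respond. Thanks!"
```

Abbreviations that contain periods (for example "e.g.") would trigger a false sentence break, so real data may need a more careful rule.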

Why?

No gem for titlecasing dealt with acronyms well. In case this gem doesn't suit your needs, see:

Copyright (c) 2015 James McKinney, released under the MIT license