force_encoding

force_encoding is a Swiss Army knife for all the problems with Ruby 1.9 encodings.

Description

Ruby 1.9.1 introduces a really annoying mechanism: string encodings. In 1.8 we didn’t have to care about anything like this; now in 1.9 we have to call the force_encoding method before any operation on a String object:

"some string".length
#=> 11

We can’t trust the length method any more, because we don’t know the encoding of the string. To get the exact size of the string in bytes, we have to write:

"some string".force_encoding(Encoding::ASCII_8BIT).length
#=> 11
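
The example above uses an ASCII-only string, where characters and bytes happen to coincide. The following is a minimal sketch with a multibyte string, showing where the two counts diverge. The sample text and the variable names are assumptions chosen for illustration, not taken from the gem.

```ruby
# Illustrative only: a UTF-8 string with four two-byte characters ("zażółć").
text = "za\u017C\u00F3\u0142\u0107"

# length counts characters in the string's current encoding (UTF-8 here).
chars = text.length

# Retag a copy as binary so that length counts raw bytes instead.
bytes = text.dup.force_encoding(Encoding::ASCII_8BIT).length

# Cross-check with Ruby's built-in byte counter.
same = (bytes == text.bytesize)

puts "characters: #{chars}, bytes: #{bytes}, matches bytesize: #{same}"
```

Working on a copy (dup) keeps the original string's encoding tag unchanged.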

Using the force_encoding gem, we can just write:

"some string".feb.length
#=> 11
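
For readers curious how such shortcuts could work, here is a minimal sketch of feb and feu as thin wrappers around force_encoding. The module name ForceEncodingSketch is hypothetical, and the gem's actual source may look different.

```ruby
# Hypothetical sketch; not the gem's actual implementation.
module ForceEncodingSketch
  # Retag the receiver as binary (ASCII-8BIT) in place and return it.
  def feb
    force_encoding(Encoding::ASCII_8BIT)
  end

  # Retag the receiver as UTF-8 in place and return it.
  def feu
    force_encoding(Encoding::UTF_8)
  end
end

# Mix the shortcuts into String (send keeps this working on old Rubies).
String.send(:include, ForceEncodingSketch)

s = "some string".dup
byte_length = s.feb.length   # length now counts bytes
puts byte_length
puts s.encoding == Encoding::ASCII_8BIT   # feb changed the receiver itself
```

Note that in this sketch both methods mutate the receiver, just as force_encoding does.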

Now we can count the bytes saved in the source code (using the force_encoding gem, of course):

'"some string".force_encoding(Encoding::ASCII_8BIT).length'.feb.length
#=> 57
'"some string".feb.length'.feb.length
#=> 24

What a difference!

Because sometimes we don’t want to modify the original string (there are still some people who don’t call force_encoding before using a string!), the force_encoding gem also has the awesome methods dfeu and dfeb, which leave the receiver untouched:

a = "asd".feu
a.encoding
#=> #<Encoding:UTF-8>
a.dfeb.length
#=> 3
a.encoding
#=> #<Encoding:UTF-8>
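
The behaviour shown above is consistent with duplicating the string before retagging it. Here is a minimal sketch under that assumption. The module name ForceEncodingDupSketch is hypothetical, and the gem's real implementation may differ.

```ruby
# Hypothetical sketch; assumes dfeb/dfeu simply dup the receiver first.
module ForceEncodingDupSketch
  # Return a binary-tagged copy; the receiver keeps its encoding.
  def dfeb
    dup.force_encoding(Encoding::ASCII_8BIT)
  end

  # Return a UTF-8-tagged copy; the receiver keeps its encoding.
  def dfeu
    dup.force_encoding(Encoding::UTF_8)
  end
end

String.send(:include, ForceEncodingDupSketch)

a = "asd".dup.force_encoding(Encoding::UTF_8)
copy = a.dfeb

puts copy.encoding == Encoding::ASCII_8BIT   # the copy is binary
puts a.encoding == Encoding::UTF_8           # the original is unchanged
puts copy.length
```

The cost of the non-mutating variant is one extra string allocation per call.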

AWESOME! Think about the profit you can get from the library:

github.com/rails/rails/commit/e590508a9b7ab5cf99d7a7675a92a1257cb9f6f8

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself so I can ignore it when I pull.)
    
  • Send me a pull request. Bonus points for topic branches.

Copyright © 2009 Jakub Kuźma. See LICENSE for details.