force_encoding
force_encoding is a Swiss Army knife for all the problems with Ruby 1.9 encodings.
Description
Ruby 1.9.1 introduces a really annoying mechanism: string encodings. In 1.8 we didn’t have to care about anything like this; now in 1.9 we have to call the force_encoding method before any operation on a String object:
"some string".length
#=> 11
We can’t trust the length method any more, because we don’t know the encoding of the string. To get the exact size of the string in bytes, we have to write:
"some string".force_encoding(Encoding::ASCII_8BIT).length
#=> 11
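Since "some string" is pure ASCII, both calls above return the same number. Here is a small sketch of where the two counts actually drift apart. The example string is my own, not from the gem, and String#bytesize is the standard Ruby method that reports bytes without retagging anything:

```ruby
# Illustrative string, not from the gem: "cafe" with an accented e.
s = "caf\u00e9"      # 4 characters; the last one takes 2 bytes in UTF-8

chars = s.length     # counts characters under the string's current encoding
bytes = s.bytesize   # counts raw bytes, whatever the encoding tag says

# force_encoding retags the receiver in place, so work on a copy here.
forced = s.dup.force_encoding(Encoding::ASCII_8BIT).length

puts "chars=#{chars} bytes=#{bytes} forced=#{forced}"
```

Once the string is tagged as binary, every byte counts as one character, which is why the forced length matches bytesize.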
Using the force_encoding gem, we can just write:
"some string".feb.length
#=> 11
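For the curious, here is a minimal sketch of how shortcuts like feb and feu could be defined. This is an assumption about the shape of the code, not the gem's actual source: it simply reopens String and delegates to force_encoding, which retags the receiver in place.

```ruby
# Hypothetical sketch, NOT the gem's real source.
class String
  # "force encoding binary": retag the receiver as ASCII-8BIT, in place.
  def feb
    force_encoding(Encoding::ASCII_8BIT)
  end

  # "force encoding UTF-8": retag the receiver as UTF-8, in place.
  def feu
    force_encoding(Encoding::UTF_8)
  end
end

s = "caf\u00e9".dup        # dup so we never mutate a frozen literal
utf8_length   = s.length   # characters under UTF-8
binary_length = s.feb.length   # bytes, now that s is tagged as binary
binary_tag    = s.encoding     # s itself was retagged
restored      = s.feu.length   # back to counting UTF-8 characters
```

Note that both methods change the encoding tag of the string they are called on; the bytes themselves are never touched.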
Now we can count the bytes saved in the source code (using the force_encoding gem, of course):
'"some string".force_encoding(Encoding::ASCII_8BIT).length'.feb.length
#=> 57
'"some string".feb.length'.feb.length
#=> 24
What a difference!
Because sometimes we don’t want to modify the original string (there are still some people who don’t call force_encoding before using a string!), the force_encoding gem has the awesome methods dfeu and dfeb, which leave the receiver untouched:
a = "asd".feu
a.encoding
#=> #<Encoding:UTF-8>
a.dfeb.length
#=> 3
a.encoding
#=> #<Encoding:UTF-8>
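A plausible way to get this non-destructive behaviour is to duplicate the string first and retag only the copy. The sketch below assumes that is what the "d" prefix stands for; the method bodies are a guess, not the gem's real implementation.

```ruby
# Hypothetical sketch, NOT the gem's real source.
class String
  # Return a binary-tagged copy; the receiver keeps its encoding.
  def dfeb
    dup.force_encoding(Encoding::ASCII_8BIT)
  end

  # Return a UTF-8-tagged copy; the receiver keeps its encoding.
  def dfeu
    dup.force_encoding(Encoding::UTF_8)
  end
end

a = "caf\u00e9"            # a UTF-8 string: 4 characters, 5 bytes
b = a.dfeb                 # new object, tagged as binary

original_tag = a.encoding  # unchanged by the call above
copy_tag     = b.encoding
copy_length  = b.length    # bytes, because b is binary
same_object  = a.equal?(b) # false: dfeb returned a copy
```

Because dup returns an unfrozen copy, this also works when the receiver is a frozen string.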
AWESOME! Think about the profit you can get from the library:
github.com/rails/rails/commit/e590508a9b7ab5cf99d7a7675a92a1257cb9f6f8
Note on Patches/Pull Requests
- Fork the project.
- Make your feature addition or bug fix.
- Add tests for it. This is important so I don’t break it in a future version unintentionally.
- Commit, and do not mess with the rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a commit by itself so I can ignore it when I pull.)
- Send me a pull request. Bonus points for topic branches.
Copyright
Copyright © 2009 Jakub Kuźma. See LICENSE for details.