Encoda

Encoda is a simple file encoding converter.

Installation

sudo gem install encoda

Getting Started

Single file conversion

If from_path points to a file, that file is converted to the specified encoding.

Encoda.convert do
  from_path 'path/to/your/file'
  to_dir 'path/to/output/dir'
  from_encoding 'GB2312'
  to_encoding 'UTF8'
end

Note that from_path and to_dir are required. to_dir points to the output directory; if that directory doesn’t exist yet, Encoda will create it for you.
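To illustrate what a single-file conversion involves, here is a minimal sketch that uses only Ruby's standard library. It is not Encoda's implementation: the helper name convert_file_sketch is hypothetical, and the sketch spells the target encoding 'UTF-8', as Ruby's own Encoding class names it.

```ruby
require 'fileutils'

# Hypothetical helper, not part of Encoda: converts one file with Ruby's
# built-in transcoding and writes it into to_dir under the same file name.
def convert_file_sketch(from_path, to_dir, from_encoding, to_encoding)
  FileUtils.mkdir_p(to_dir)            # create the output directory if it is missing
  raw = File.binread(from_path)        # read the bytes without interpreting them
  converted = raw.force_encoding(from_encoding).encode(to_encoding)
  out_path = File.join(to_dir, File.basename(from_path))
  File.binwrite(out_path, converted)   # write the transcoded bytes as they are
  out_path
end
```

Calling convert_file_sketch('path/to/your/file', 'path/to/output/dir', 'GB2312', 'UTF-8') would return the path of the written file. Bytes that are not valid in the source encoding raise an exception here; how Encoda itself handles such files is not covered by this sketch.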

Multiple files conversion

If from_path points to a directory, all files under that directory are converted to the specified encoding.

Encoda.convert do
  from_path "/from/path"
  to_dir "path/to/output/dir"
  to_encoding 'UTF8'
  prefix 'converted_'
end

You can specify a prefix for the output file names. As you can see, if you omit the from_encoding option, Encoda will try to guess each file’s original encoding. This feature is based on the chardet library.

Note that when a file is already in the target encoding, no conversion happens; the file is simply copied to the output directory.
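The prefix and copy-instead-of-convert behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Encoda's code: convert_dir_sketch is a hypothetical name, the source encoding is passed in explicitly rather than guessed, and only files directly inside the directory are processed (no recursion).

```ruby
require 'fileutils'

# Hypothetical helper, not part of Encoda: processes every regular file
# directly inside from_dir and returns the list of output paths.
def convert_dir_sketch(from_dir, to_dir, from_encoding, to_encoding, prefix = '')
  FileUtils.mkdir_p(to_dir)
  # Encoding.find resolves names case-insensitively, so 'utf-8' equals 'UTF-8'.
  same_encoding = Encoding.find(from_encoding) == Encoding.find(to_encoding)

  Dir.children(from_dir).sort.each_with_object([]) do |name, outputs|
    source = File.join(from_dir, name)
    next unless File.file?(source)     # skip subdirectories in this sketch

    target = File.join(to_dir, prefix + name)
    if same_encoding
      FileUtils.cp(source, target)     # nothing to convert, copy the file as is
    else
      text = File.binread(source).force_encoding(from_encoding)
      File.binwrite(target, text.encode(to_encoding))
    end
    outputs << target
  end
end
```

With prefix 'converted_', a source file a.txt would end up as converted_a.txt in the output directory.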

Retrieve the conversion result

encoda = Encoda.convert do
  from_path "/from/path"
  to_dir "path/to/output/dir"
  to_encoding 'UTF8'
end

puts encoda.failed.inspect #=> ['failed_file1.txt', 'failed_file2.txt']
puts encoda.success.inspect #=> ['success_file1.txt', 'success_file3.txt', 'file4.srt']
puts encoda.guess_details.inspect #=> {"nedivx-tbl.gb.srt"=>{"confidence"=>0.99, "encoding"=>"GB2312"}, "nedivx-tbl.eng.srt"=>{"confidence"=>1.0, "encoding"=>"ascii"}, "utf8.gb.srt"=>{"confidence"=>0.94, "encoding"=>"utf-8"}, "nedivx-tbl.big5.srt"=>{"confidence"=>0.99, "encoding"=>"BIG5"}}
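
Because guess_details maps each file name to a confidence score and an encoding, it can be used to flag guesses worth double-checking. The following is a small sketch that assumes the hash has the shape shown above; the helper name low_confidence_guesses and the 0.95 threshold are hypothetical choices, not part of Encoda.

```ruby
# Hypothetical helper, not part of Encoda: returns the entries of a
# guess_details-style hash whose confidence is below the given threshold.
def low_confidence_guesses(guess_details, threshold = 0.95)
  guess_details.select { |_file, guess| guess["confidence"] < threshold }
end

# Sample data copied from the guess_details output shown above.
details = {
  "nedivx-tbl.gb.srt"   => { "confidence" => 0.99, "encoding" => "GB2312" },
  "nedivx-tbl.eng.srt"  => { "confidence" => 1.0,  "encoding" => "ascii" },
  "utf8.gb.srt"         => { "confidence" => 0.94, "encoding" => "utf-8" },
  "nedivx-tbl.big5.srt" => { "confidence" => 0.99, "encoding" => "BIG5" }
}

low_confidence_guesses(details).each do |file, guess|
  puts "#{file}: guessed #{guess['encoding']} (confidence #{guess['confidence']})"
end
# prints "utf8.gb.srt: guessed utf-8 (confidence 0.94)"
```

Files flagged this way could be rerun with an explicit from_encoding.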

TODOs

Add more tests and refactor.

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit; do not mess with the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010 Zipme. See LICENSE for details.