Class: File

Inherits:
Object
Defined in:
lib/file_with_bom.rb

Overview

Extends File with BOM (byte order mark) handling.

Constant Summary

BOM_LIST_hex =

BOMs for different encodings.

{
  'UTF_8'    => "\xEF\xBB\xBF", #"\uEFBBBF"
  'UTF_16BE' => "\xFE\xFF", #"\uFEFF",
  'UTF_16LE' => "\xFF\xFE",
  'UTF_32BE' => "\x00\x00\xFE\xFF",
  'UTF_32LE' => "\xFF\xFE\x00\x00",
}
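As an illustration of how such a table can be used, here is a minimal sketch that detects which BOM (if any) starts a byte string. The names BOM_BYTES and detect_bom are hypothetical and not part of this gem; the byte sequences are the standard Unicode BOMs. The UTF-32 entries are checked first because the UTF-32LE BOM begins with the same two bytes as the UTF-16LE BOM.

```ruby
# Hypothetical table (not the gem's BOM_LIST_hex), keyed by encoding name.
# Order matters: UTF-32 must be tested before UTF-16.
BOM_BYTES = {
  'UTF-8'    => "\xEF\xBB\xBF".b,
  'UTF-32BE' => "\x00\x00\xFE\xFF".b,
  'UTF-32LE' => "\xFF\xFE\x00\x00".b,
  'UTF-16BE' => "\xFE\xFF".b,
  'UTF-16LE' => "\xFF\xFE".b,
}.freeze

# Hypothetical helper: returns the name of the encoding whose BOM starts
# +bytes+, or nil if no known BOM is present.
def detect_bom(bytes)
  data = bytes.b  # compare as raw bytes, independent of the string's encoding
  BOM_BYTES.each { |name, bom| return name if data.start_with?(bom) }
  nil
end

p detect_bom("\xFF\xFEa\x00")
p detect_bom("plain text")
```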

Class Method Details

.open(filename, mode_string = 'r', options = {}, &block) ⇒ Object

Redefine open to support BOM.

This modification allows the use of encodings like “utf-8-bom”. These encodings can be used in both read and write mode.

Examples:

File.open("file.txt", "w:utf-16le-bom"){|f|
  f << 'some content'
}
File.open("file.txt", "w:utf-16le", :bom => true ){|f|
  f << 'some content'
}

Remark

Ruby 1.9.2 already supports BOMs in read mode (e.g. “r:bom|utf-8”).

The syntactic difference (utf-8-bom instead of bom|utf-8) is intentional: it keeps the two mechanisms separate.

  • This gem does not support Ruby 1.8 (it would make no sense: you could store the BOM, but the content would not be Unicode).

  • This gem also supports BOMs in write mode.
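For comparison, the built-in read mode mentioned above can be tried without this gem. The following is a sketch only: the helper name read_with_and_without_bom_mode is hypothetical, and the BOM is written by hand because Ruby's "bom|" prefix applies to reading.

```ruby
require 'tempfile'

# Hypothetical helper: writes a UTF-8 file that starts with a BOM, then reads
# it back with and without Ruby's built-in "bom|" prefix.
def read_with_and_without_bom_mode
  Tempfile.create('bom_demo') do |tmp|
    tmp.binmode
    tmp.write("\xEF\xBB\xBF".b + 'some content'.b)  # BOM bytes, then content
    tmp.flush

    stripped = File.open(tmp.path, 'r:bom|utf-8') { |f| f.read }  # BOM skipped
    raw      = File.open(tmp.path, 'r:utf-8') { |f| f.read }      # BOM kept
    [stripped, raw]
  end
end

stripped, raw = read_with_and_without_bom_mode
p stripped
p raw.bytes.first(3)
```

With "r:bom|utf-8" the returned string starts directly with the content; with plain "r:utf-8" the three BOM bytes remain at the start of the string.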



# File 'lib/file_with_bom.rb', line 103

def open(filename, mode_string = 'r', options = {}, &block)
  #~ puts "! %-10s %-20s %s" % [mode_string, filename, options.inspect] #only for tests

  #check for bom-flag in mode_string
  options[:bom] = true if mode_string.sub!('-bom','')

  f = open_old(filename, mode_string, options)

  if options[:bom]
    case mode_string
      when /\Ar/   #read mode -> remove BOM
        bom = f.read(f.utf_bom_hex.bytesize)
        #check, if it was really a bom
        if bom != f.utf_bom_hex
          f.rewind  #return to position 0 if BOM was no BOM
        end
      when /\Aw/  #write mode -> attach BOM
        f << f.utf_bom_hex
    end #mode_string
  end

  if block_given?
    yield f
    f.close
  end
end
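The read branch above (read as many bytes as the BOM is long, compare, and go back if they do not match) can be sketched in isolation as follows. This is not the gem's code: skip_bom is a hypothetical helper, it works on any IO-like object, and it compares raw bytes so that the result does not depend on string encodings.

```ruby
require 'stringio'

# Hypothetical helper: consumes +bom+ from the current position of +io+ if it
# is present. Returns true if a BOM was skipped, false otherwise (in which
# case the position is restored).
def skip_bom(io, bom)
  start = io.pos
  head = io.read(bom.bytesize)           # nil at end of file
  return true if head && head.b == bom.b # byte-wise comparison
  io.pos = start                         # no BOM: return to where we started
  false
end

utf8_bom = "\xEF\xBB\xBF"

with_bom = StringIO.new("\xEF\xBB\xBFabc")
p skip_bom(with_bom, utf8_bom)
p with_bom.read

without_bom = StringIO.new("abc")
p skip_bom(without_bom, utf8_bom)
p without_bom.read
```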

.open_old ⇒ Object

Stores the original File.open so that the redefined open can delegate to it.



# File 'lib/file_with_bom.rb', line 77

alias :open_old :open

Instance Method Details

#utf_bom_hex(encoding = external_encoding) ⇒ Object

Get BOM for the ‘external_encoding’.

You may use it like this:

File.open(filename, "w:utf-16le"){|f|
  f << f.utf_bom_hex  #add the BOM manually
  f << 'some content'
}


# File 'lib/file_with_bom.rb', line 71

def utf_bom_hex(encoding = external_encoding )
  BOM_LIST_hex[encoding].force_encoding(encoding) #ruby 1.9
end
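As a cross-check for a lookup table like this, the BOM of a Unicode encoding can also be derived by encoding the character U+FEFF. The sketch below uses a hypothetical helper bom_for that is not part of this gem. It assumes an encoding with explicit byte order (e.g. UTF-16LE rather than UTF-16), since Ruby's generic UTF-16 and UTF-32 encoders add a BOM of their own.

```ruby
# Hypothetical helper: returns the BOM for +encoding+ (a name or an Encoding
# object) by transcoding U+FEFF into that encoding.
def bom_for(encoding)
  "\uFEFF".encode(encoding)
end

# Print the BOM bytes of each encoding in hexadecimal.
%w[UTF-8 UTF-16BE UTF-16LE UTF-32BE UTF-32LE].each do |name|
  hex = bom_for(name).bytes.map { |b| format('%02X', b) }.join(' ')
  puts "#{name}: #{hex}"
end
```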