Module: RemoteTable::Plaintext

Defined in:
lib/remote_table/plaintext.rb

Overview

Helper methods that act on plaintext files before they are parsed

Constant Summary

UTF8_BOM =
'\xef\xbb\xbf'

UTF-8 byte order mark

EOL_TO_UNIX =
's/\r\n|\n|\r/\n/g'

Instance Method Details

#convert_eol_to_unix! ⇒ Object

No matter what the line endings are supposed to be, run the file through Perl with a regex that converts every EOL (\r\n, \n, or \r) to \n

Examples:

perl -pe 's/\r\n|\n|\r/\n/g'


# File 'lib/remote_table/plaintext.rb', line 50

def convert_eol_to_unix!
  local_copy.in_place :perl, EOL_TO_UNIX
end
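
As an illustration only, the substitution performed by the Perl one-liner can be sketched in plain Ruby. The method name `normalize_eol` is hypothetical and not part of RemoteTable; the library itself shells out to Perl as shown above.

```ruby
# Pure-Ruby stand-in for: perl -pe 's/\r\n|\n|\r/\n/g'
# (normalize_eol is a hypothetical helper, not a RemoteTable method)
def normalize_eol(text)
  # \r\n is listed first so a Windows line ending becomes one \n, not two
  text.gsub(/\r\n|\n|\r/, "\n")
end

mixed = "a\r\nb\rc\n" # Windows, classic Mac, and Unix endings in one string
puts normalize_eol(mixed).inspect
```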

#crop_rows! ⇒ Object

If the user has specified :crop, use a combination of tail and head to keep only the rows in that range

Examples:

:crop => (184..263)

tail +184 | head 80


# File 'lib/remote_table/plaintext.rb', line 68

def crop_rows!
  if crop
    local_copy.in_place :tail, "+#{crop.first}"
    local_copy.in_place :head, (crop.last - crop.first + 1)
  end
end
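
The arithmetic behind the tail and head arguments can be checked with a small in-memory sketch. `crop_lines` is a hypothetical helper that works on an array of lines rather than on a file; it only mirrors the same 1-based offsets.

```ruby
# In-memory equivalent of: tail +FIRST | head COUNT
# (crop_lines is a hypothetical helper, not a RemoteTable method)
def crop_lines(lines, crop)
  start = crop.first                  # tail +184 begins output at line 184 (1-based)
  count = crop.last - crop.first + 1  # head keeps this many lines, 80 for 184..263
  lines.drop(start - 1).first(count)
end

lines = (1..300).map { |n| "row #{n}" }
cropped = crop_lines(lines, 184..263)
puts cropped.size
puts cropped.first
puts cropped.last
```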

#cut_columns! ⇒ Object

If the user has specified :cut, use cut

Examples:

:cut => '13-'

cut -c 13-


# File 'lib/remote_table/plaintext.rb', line 79

def cut_columns!
  if cut
    local_copy.in_place :cut, cut
  end
end
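
For intuition, the effect of `cut -c 13-` on a single line can be approximated in Ruby. This is a sketch under the assumption of single-byte text; `cut_from_column` is a hypothetical name and handles only the open-ended "N-" form.

```ruby
# Approximates: cut -c N-   (keep everything from column N onward, 1-based)
# (cut_from_column is a hypothetical helper, not a RemoteTable method)
def cut_from_column(line, start_column)
  # Slicing past the end of the string returns nil, so fall back to ''
  line[(start_column - 1)..-1] || ''
end

puts cut_from_column('abcdefghijklmnop', 13)
puts cut_from_column('short', 13).inspect
```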

#delete_harmful! ⇒ Object

Remove bytes that are both useless and harmful in the vast majority of cases.



# File 'lib/remote_table/plaintext.rb', line 27

def delete_harmful!
  harmful = [ Plaintext.soft_hyphen(encoding), UTF8_BOM ]
  local_copy.in_place :perl, "s/#{harmful.join('//g; s/')}//g"
end
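
To see what Perl expression the join produces, here is a minimal sketch. The soft hyphen pattern is an assumption: it uses the UTF-8 byte sequence for U+00AD, whereas the real value comes from Plaintext.soft_hyphen(encoding), whose output is not shown in this document. The constant and method names are hypothetical.

```ruby
# Single-quoted so the backslashes reach Perl literally
BOM_PATTERN = '\xef\xbb\xbf'
SOFT_HYPHEN_PATTERN = '\xc2\xad' # assumption: UTF-8 bytes for U+00AD

# Builds one "s/PATTERN//g" substitution per pattern, separated by "; "
# (harmful_expression is a hypothetical helper, not a RemoteTable method)
def harmful_expression(patterns)
  "s/#{patterns.join('//g; s/')}//g"
end

puts harmful_expression([SOFT_HYPHEN_PATTERN, BOM_PATTERN])
```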

#skip_rows! ⇒ Object

If the user has specified :skip, use tail

Examples:

:skip => 6

tail +7


# File 'lib/remote_table/plaintext.rb', line 58

def skip_rows!
  if skip > 0
    local_copy.in_place :tail, "+#{skip + 1}"
  end
end
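
The off-by-one in the tail argument (skip 6 rows, so start at line 7) can be illustrated with an in-memory sketch. `skip_lines` is a hypothetical helper operating on an array of lines.

```ruby
# In-memory equivalent of: tail +(skip + 1)
# (skip_lines is a hypothetical helper, not a RemoteTable method)
def skip_lines(lines, skip)
  # Dropping `skip` rows leaves line skip + 1 as the first row
  skip > 0 ? lines.drop(skip) : lines
end

lines = (1..10).map { |n| "row #{n}" }
puts skip_lines(lines, 6).first
```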

#transliterate_whole_file_to_utf8! ⇒ Object

No matter what the file encoding is SUPPOSED to be, run it through the system iconv binary to make sure it’s UTF-8

Examples:

iconv -c -t UTF-8//TRANSLIT -f WINDOWS-1252


# File 'lib/remote_table/plaintext.rb', line 36

def transliterate_whole_file_to_utf8!
  if ::UnixUtils.available?('iconv')
    local_copy.in_place :iconv, RemoteTable::EXTERNAL_ENCODING_ICONV, encoding
  else
    ::Kernel.warn %{[remote_table] iconv not available in your $PATH, not performing transliteration}
  end
  # now that we've force-transliterated to UTF-8, act as though this is what the user had specified
  @encoding = RemoteTable::EXTERNAL_ENCODING
end
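
When iconv is not available, a rough in-process approximation is possible with Ruby's built-in String#encode. This is a swapped-in technique, not what the method above does: it drops unconvertible bytes (similar in spirit to iconv's -c flag) but performs no //TRANSLIT-style transliteration. `to_utf8` is a hypothetical name.

```ruby
# Re-encodes raw bytes from a declared source encoding to UTF-8,
# silently dropping anything that cannot be converted
# (to_utf8 is a hypothetical helper, not a RemoteTable method)
def to_utf8(bytes, source_encoding)
  bytes.dup
       .force_encoding(source_encoding)
       .encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
end

raw = "caf\xE9".b # 0xE9 is "é" in Windows-1252
puts to_utf8(raw, 'Windows-1252')
```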