Class: RFile

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/rfile.rb

Overview

This class is a line-oriented file object that operates without keeping the file in memory.

Enumerable is mixed in; see Enumerable for more information.
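
As a rough illustration of what mixing in Enumerable provides, the sketch below uses a hypothetical LineSource class (not part of RFile) that defines only #each; map, select, count and the rest then come from Enumerable.

```ruby
require "tempfile"

# Hypothetical stand-in for RFile: defines only #each and mixes in Enumerable.
class LineSource
  include Enumerable

  def initialize(path)
    @path = path
  end

  # Enumerable builds map, select, count, etc. on top of this method.
  def each
    File.foreach(@path) { |line| yield line.chomp }
  end
end

# Usage: write a small throwaway file and query it through Enumerable.
file = Tempfile.new("lines")
file.write("alpha\nbeta\ngamma\n")
file.close

source = LineSource.new(file.path)
long_lines = source.select { |l| l.length > 4 }
upcased    = source.map(&:upcase)
```

Any class that yields its elements from #each gets the same methods, which is presumably how RFile exposes them.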

Constant Summary

VERSION = "0.2.0"

Constructor Details

#initialize(filename, recycle = false, sep_string = $/) ⇒ RFile

Parses and indexes filename. The index stores each line's length, offset, and line number.

If recycle is true, the randomline method reloads the index (fast) when it runs out of unique lines to produce.

If sep_string is passed, "lines" are delimited by sep_string instead of $/.



# File 'lib/rfile.rb', line 25

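def initialize(filename, recycle = false, sep_string = $/)
  @filename   = filename
  @recycle    = recycle
  @sep_string = sep_string
  @index      = []
  count       = 0
  offset      = 0

  # File.foreach closes the file once the block finishes;
  # File.open(...).each_line left the handle open.
  File.foreach(@filename, @sep_string) do |line|
    # #line passes these values to IO.read, which counts bytes,
    # so store bytesize rather than the character length.
    @index[count] = IndexElement.new([line.bytesize, offset, count + 1])
    offset += line.bytesize
    count  += 1
  end
  @rndindex = RandomStack.new(@index.clone)
end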
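
The indexing idea can be sketched on its own, without IndexElement or RandomStack (neither is shown on this page). The helpers build_index and read_record below are hypothetical names, and the "%\n" separator is only an example of a custom sep_string.

```ruby
require "tempfile"

# Hypothetical helper: one [bytesize, byte_offset] pair per record.
def build_index(path, sep = $/)
  index  = []
  offset = 0
  File.foreach(path, sep) do |record|
    index << [record.bytesize, offset]
    offset += record.bytesize
  end
  index
end

# Hypothetical helper: read one record (1-based) straight from disk.
def read_record(path, index, num, sep = $/)
  length, offset = index.fetch(num - 1)
  IO.read(path, length, offset).chomp(sep)
end

# Usage: records separated by a line containing only "%".
quotes = Tempfile.new("quotes")
quotes.write("first quote\n%\nsecond\nquote\n%\nthird quote\n")
quotes.close

index = build_index(quotes.path, "%\n")
puts index.length
puts read_record(quotes.path, index, 2, "%\n")
```

Only the index lives in memory; each read touches just the bytes of the requested record.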

Instance Attribute Details

#filename ⇒ Object

Returns the value of attribute filename.



# File 'lib/rfile.rb', line 12

def filename
  @filename
end

#recycle ⇒ Object

Returns the value of attribute recycle.



# File 'lib/rfile.rb', line 12

def recycle
  @recycle
end

Instance Method Details

#each ⇒ Object

Yields each line in the file, in turn.

Note: currently IO-intensive, because it opens and closes the file once per line.



# File 'lib/rfile.rb', line 97

def each # :yields:line
  @index.each do |entry|
    yield line(entry.linum) unless entry.nil?
  end
end
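
One possible way to reduce the IO cost noted above is to keep a single handle open and seek to each record. This is only a sketch: each_indexed_line is a hypothetical helper, and the index is assumed to be an array of [bytesize, byte_offset] pairs rather than RFile's IndexElement objects.

```ruby
require "tempfile"

# Hypothetical helper: yields every indexed record through one file handle.
# `index` is assumed to be an array of [bytesize, byte_offset] pairs.
def each_indexed_line(path, index, sep = $/)
  File.open(path, "rb") do |io|
    index.each do |length, offset|
      io.seek(offset)
      yield io.read(length).chomp(sep)
    end
  end
end

# Usage: build a small index by hand, then iterate with a single open/close.
data = Tempfile.new("data")
data.write("one\ntwo\nthree\n")
data.close

offset = 0
index = File.foreach(data.path).map do |l|
  pair = [l.bytesize, offset]
  offset += l.bytesize
  pair
end

each_indexed_line(data.path, index) { |line| puts line }
```

The block form of File.open closes the handle even if the caller's block raises.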

#length ⇒ Object

Returns the number of lines still available to the randomline-based methods in the current cycle. Useful for checking how close the file is to being recycled, or how close r_eof? is to returning true.



# File 'lib/rfile.rb', line 78

def length
  @rndindex.length
end

#line(num) ⇒ Object

Returns the line at num, where num is 1-based. Raises a RuntimeError if num is less than 1 or greater than the number of lines in the file.



# File 'lib/rfile.rb', line 84

def line(num)
  if num < 1 || num > @index.length
    raise "line number: #{num} is out of bounds"
  end
  entry = @index[num-1]
  IO.read(@filename, entry.length, entry.offset).chomp(@sep_string)
end
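
Because an out-of-range number raises rather than returning nil, callers may want a guard. The sketch below mirrors the bounds check with hypothetical helpers (fetch_line, fetch_line_or_nil) and assumes an index of [bytesize, byte_offset] pairs.

```ruby
require "tempfile"

# Hypothetical helper mirroring the bounds check in #line.
# `index` is assumed to hold [bytesize, byte_offset] pairs.
def fetch_line(path, index, num, sep = $/)
  unless num.between?(1, index.length)
    raise "line number: #{num} is out of bounds"
  end
  length, offset = index[num - 1]
  IO.read(path, length, offset).chomp(sep)
end

# Callers that prefer nil over an exception can wrap the call.
def fetch_line_or_nil(path, index, num, sep = $/)
  fetch_line(path, index, num, sep)
rescue RuntimeError
  nil
end

# Usage: two 3-byte lines, so the index is known in advance.
sample = Tempfile.new("sample")
sample.write("ab\ncd\n")
sample.close

index = [[3, 0], [3, 3]]
puts fetch_line(sample.path, index, 2)
puts fetch_line_or_nil(sample.path, index, 5).inspect
```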

#r_eof? ⇒ Boolean

Returns true if there are no lines left for randomline(s). Mainly useful when recycle is false, because with recycle enabled the next randomline call reloads the index.

Returns:

  • (Boolean)


# File 'lib/rfile.rb', line 58

def r_eof?
  @rndindex.length == 0
end

#randomline ⇒ Object

Returns a random line from the file without repeating lines. Returns nil when the file is exhausted, unless recycle is true, in which case the index is reloaded first. Note: does not modify the file.



# File 'lib/rfile.rb', line 44

def randomline
  entry = nil
  if @recycle && @rndindex.length == 0
    # Clone as in #initialize, so that popping can never touch @index itself.
    @rndindex = RandomStack.new(@index.clone)
  end
  # Skip over any nil entries on the stack.
  while entry.nil? && @rndindex.length > 0
    entry = @rndindex.pop
  end
  return nil if entry.nil?
  line(entry.linum)
end
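
RandomStack is not shown on this page, so the following is only a sketch of the draw-without-repeats behaviour under an assumed design: a shuffled copy of the items that is popped until empty and optionally reshuffled. ShuffledDeck is a hypothetical name.

```ruby
# Hypothetical stand-in for the RandomStack + recycle behaviour.
class ShuffledDeck
  def initialize(items, recycle = false)
    @items   = items
    @recycle = recycle
    @stack   = @items.shuffle # shuffle returns a new array; @items is untouched
  end

  # Number of items left in the current cycle.
  def length
    @stack.length
  end

  # True when the current cycle has no items left.
  def exhausted?
    @stack.empty?
  end

  # Returns a random unseen item, or nil once exhausted (unless recycling).
  def draw
    @stack = @items.shuffle if @recycle && @stack.empty?
    @stack.pop
  end
end

# Usage: drain a deck once, then observe nil.
deck = ShuffledDeck.new(%w[a b c])
puts deck.draw until deck.exhausted?
puts deck.draw.inspect
```

Shuffling once and popping keeps each draw cheap and guarantees no repeats within a cycle.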

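#randomlines(num) ⇒ Object

Yields num random lines to the block if one is given; otherwise returns them as an array. Entries are nil once the file is exhausted and recycle is false. See randomline for details.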



# File 'lib/rfile.rb', line 64

def randomlines(num) # :yields:line
  arr = []
  doyield = block_given?
  num.times do
    rline = randomline
    yield rline if doyield
    arr.push rline
  end
  arr unless doyield
end
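
The yield-or-collect convention used here can be shown in isolation. In this sketch, sample_lines is a hypothetical helper working on an in-memory array; unlike randomlines, it returns fewer items instead of nil entries when num exceeds what is available.

```ruby
# Hypothetical helper showing the yield-or-collect convention.
def sample_lines(lines, num)
  picked = lines.shuffle.first(num)
  return picked unless block_given?

  picked.each { |line| yield line }
  nil # block form returns nil, as randomlines does
end

# Usage: array form and block form.
words = %w[red green blue yellow]
two = sample_lines(words, 2)
puts two.length
sample_lines(words, 3) { |w| puts w }
```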