Module: TraceVisualization

Included in:
Algorithm
Defined in:
lib/trace_visualization/bwt.rb,
lib/trace_visualization.rb,
lib/trace_visualization/utils.rb,
lib/trace_visualization/assert.rb,
lib/trace_visualization/mapping.rb,
lib/trace_visualization/profile.rb,
lib/trace_visualization/reorder.rb,
lib/trace_visualization/version.rb,
lib/trace_visualization/algorithm.rb,
lib/trace_visualization/data/token.rb,
lib/trace_visualization/generators.rb,
lib/trace_visualization/repetitions.rb,
lib/trace_visualization/suffix_array.rb,
lib/trace_visualization/data/repetition.rb,
lib/trace_visualization/repetitions_psy.rb,
lib/trace_visualization/data/sorted_array.rb,
lib/trace_visualization/repetitions_score.rb,
lib/trace_visualization/repetitions/filter.rb,
lib/trace_visualization/repetitions/context.rb,
lib/trace_visualization/lexeme_overlap_filter.rb,
lib/trace_visualization/longest_common_prefix.rb,
lib/trace_visualization/repetitions/concatenation.rb,
lib/trace_visualization/repetitions/incrementation.rb,
lib/trace_visualization/repetitions_incrementation.rb,
lib/trace_visualization/visualization/console_color_print.rb

Overview

The Burrows–Wheeler transform (BWT, also called block-sorting compression) is an algorithm used in data compression tools such as bzip2. When a character string is transformed by the BWT, none of its characters change value; the transformation only permutes their order. If the original string contains substrings that occur often, the transformed string will contain places where a single character is repeated several times in a row. This is useful for compression, because a string with runs of repeated characters is easy to compress with techniques such as the move-to-front transform and run-length encoding.
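To illustrate why runs of repeated characters compress well, here is a minimal run-length encoding sketch. The helper name `run_length_encode` is hypothetical and is not part of this gem; it only shows the general technique.

```ruby
# Hypothetical helper (not part of TraceVisualization): collapse each run of
# identical characters into a [character, run_length] pair.
def run_length_encode(text)
  text.chars.chunk_while { |a, b| a == b }.map { |run| [run.first, run.length] }
end

p run_length_encode('aaaabbbcca')
```

The longer the runs, the fewer pairs are needed to describe the string, which is why the permutation produced by the BWT helps a later compression stage.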

The transform is done by sorting all rotations of the text in lexicographic order, then taking the last column. For example, the text “^BANANA|” is transformed into “BNN^AA|A”.
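The description above can be sketched naively as follows. This is an illustration only, not the implementation in `lib/trace_visualization/bwt.rb`; the method name `naive_bwt` is an assumption, and the sketch relies on Ruby's default byte-wise string ordering.

```ruby
# Naive sketch: build every rotation of the text, sort the rotations
# lexicographically, and read off the last character of each one.
# Hypothetical name; the gem's own transform may work differently.
def naive_bwt(text)
  rotations = Array.new(text.length) { |i| text[i..-1] + text[0...i] }
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts naive_bwt('^BANANA|')
```

Materializing all rotations costs quadratic memory, so practical implementations usually derive the same result from a suffix array instead.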

Defined Under Namespace

Modules: Algorithm, BurrowsWheelerTransform, Data, Generators, LexemeOverlapFilter, LongestCommonPrefix, Profile, Reorder, Repetitions, RepetitionsIncrementation, RepetitionsScore, SuffixArray, Utils, Visualization

Classes: Mapping

Constant Summary

TERMINATION_CHAR =

Must be greater than every other possible character in lexicographic order.

255.chr
FORBIDDEN_CHARS =
/\n/
VERSION =
'0.0.5'
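The ordering requirement on TERMINATION_CHAR can be checked with a short sketch. The constant is redefined locally so the example runs without the gem, and the sample text `abab` is arbitrary; how the library itself appends the terminator is not shown in this page and is assumed here.

```ruby
# Local copy of the documented constant value, so the sketch is standalone.
terminator = 255.chr

# Every other single byte compares strictly less than the terminator.
all_smaller = (0..254).all? { |byte| byte.chr < terminator }
puts all_smaller

# Assumption: the terminator is appended to the text before suffixes are sorted.
text     = 'abab'.b + terminator
suffixes = (0...text.length).map { |i| text[i..-1] }.sort

# The suffix made of the terminator alone sorts last.
puts suffixes.last == terminator
```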

Class Method Summary

Class Method Details

.assert_instance_of(object, expected_class) ⇒ Object



# File 'lib/trace_visualization/assert.rb', line 2

def self.assert_instance_of(object, expected_class)
  # Exact class match only: instances of subclasses are rejected.
  return if object.instance_of?(expected_class)

  raise "Illegal parameter type: expected #{expected_class}, actual #{object.class}. Object: #{object}"
end
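A short usage sketch of the check above. The `AssertSketch` module and the `Label` class are hypothetical stand-ins defined here so the example runs without the gem; the method body mirrors the documented behavior.

```ruby
# Hypothetical stand-in for the documented method, so the example is standalone.
module AssertSketch
  def self.assert_instance_of(object, expected_class)
    return if object.instance_of?(expected_class)

    raise "Illegal parameter type: expected #{expected_class}, actual #{object.class}. Object: #{object}"
  end
end

# Hypothetical subclass used to show that only exact class matches pass.
class Label < String; end

# Returns true when the assertion raises, false when it passes.
def rejected?(object, expected_class)
  AssertSketch.assert_instance_of(object, expected_class)
  false
rescue RuntimeError
  true
end

puts rejected?('abc', String)          # a plain String passes the check
puts rejected?(:abc, String)           # a Symbol is rejected
puts rejected?(Label.new('x'), String) # a subclass instance is rejected too
```

Because the check uses `instance_of?` rather than `kind_of?`, callers passing subclass instances will also get an error.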

.process(options = {}) ⇒ Object



# File 'lib/trace_visualization.rb', line 27

def self.process(options = {})
  options = set_default_options(options)
  logger  = options[:logger]

  # Preprocess
  file_name = options[:file_name]

  # Read the file and build the mapping
  mapping = TraceVisualization::Mapping.new
  mapping.process do
    from_file file_name
  end

=begin
  logger.info 'start process'

  str        = nil
  str_mapped = nil

  Benchmark.bm(14) do |x|
    x.report('read file') { str = options[:str] || TraceVisualization::Utils.read_file(options) }
    x.report('mapping') { str_mapped = TraceVisualization::Mapping.new(str) }
  end
  
  str_len = str.length
  map_len = str_mapped.length
  logger.info("str.length = #{str_len}, str_mapped.length = #{map_len}, compression = #{((str_len.to_f - map_len) / str_len.to_f).round(2)}%")

  return []

  rs = TraceVisualization::Repetitions.psy1(str_mapped, options[:p_min], true)

  logger.info 'PSY1 finish. build context'
  
  context = TraceVisualization::Repetitions::Context.new(str_mapped, rs)
  
  logger.info 'first concat step'
  
  TraceVisualization::RepetitionsConcatenation.process(rs, 1, context)
  
  # Approximate
  # Vissss   
=end
  #rs
end

.set_default_options(options) ⇒ Object



# File 'lib/trace_visualization.rb', line 73

def self.set_default_options(options)
  # Caller-supplied options override the defaults below.
  {
    :p_min  => 3,
    :logger => Logger.new(STDOUT)
  }.merge(options)
end
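The merge order above means caller-supplied values win over the defaults. A minimal standalone sketch of that behavior follows; the file name `trace.log` is a made-up placeholder and the defaults are copied from the listing.

```ruby
require 'logger'

# Defaults copied from the listing above.
defaults = { :p_min => 3, :logger => Logger.new(STDOUT) }

# 'trace.log' is a hypothetical file name used only for illustration.
merged = defaults.merge(:p_min => 5, :file_name => 'trace.log')

puts merged[:p_min]           # the caller's value replaces the default
puts merged[:logger].class    # the default logger is kept
puts merged[:file_name]       # extra keys pass through untouched
```

`Hash#merge` returns a new hash, so neither the defaults nor the caller's hash is modified.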