Module: TraceVisualization
- Included in:
- Algorithm
- Defined in:
- lib/trace_visualization/bwt.rb,
lib/trace_visualization.rb,
lib/trace_visualization/utils.rb,
lib/trace_visualization/assert.rb,
lib/trace_visualization/mapping.rb,
lib/trace_visualization/profile.rb,
lib/trace_visualization/reorder.rb,
lib/trace_visualization/version.rb,
lib/trace_visualization/algorithm.rb,
lib/trace_visualization/data/token.rb,
lib/trace_visualization/generators.rb,
lib/trace_visualization/repetitions.rb,
lib/trace_visualization/suffix_array.rb,
lib/trace_visualization/data/repetition.rb,
lib/trace_visualization/repetitions_psy.rb,
lib/trace_visualization/data/sorted_array.rb,
lib/trace_visualization/repetitions_score.rb,
lib/trace_visualization/repetitions/filter.rb,
lib/trace_visualization/repetitions/context.rb,
lib/trace_visualization/lexeme_overlap_filter.rb,
lib/trace_visualization/longest_common_prefix.rb,
lib/trace_visualization/repetitions/concatenation.rb,
lib/trace_visualization/repetitions/incrementation.rb,
lib/trace_visualization/repetitions_incrementation.rb,
lib/trace_visualization/visualization/console_color_print.rb
Overview
The Burrows–Wheeler transform (BWT, also called block-sorting compression) is an algorithm used in data compression techniques such as bzip2. When a character string is transformed by the BWT, none of its characters change value; the transformation only permutes their order. If the original string had several substrings that occurred often, the transformed string will have several places where a single character is repeated multiple times in a row. This is useful for compression, because a string with runs of repeated characters tends to be easy to compress with techniques such as the move-to-front transform and run-length encoding.
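To illustrate why runs of repeated characters compress well, here is a minimal run-length encoding sketch. The helper name `run_length_encode` is hypothetical and is not part of TraceVisualization.

```ruby
# Hypothetical helper (not part of the gem): collapse each run of identical
# characters into a [character, run_length] pair.
def run_length_encode(text)
  text.each_char.chunk { |c| c }.map { |char, run| [char, run.length] }
end

p run_length_encode("aaaabbbc")
```

Here eight input characters reduce to three pairs, which is the kind of saving that long runs make possible.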
The transform is done by sorting all rotations of the text in lexicographic order, then taking the last column. For example, the text “^BANANA|” is transformed into “BNN^AA|A”.
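The rotation-sorting description can be sketched in a few lines of Ruby. This is a naive illustration that builds every rotation explicitly, not the implementation in `BurrowsWheelerTransform`; the method name `naive_bwt` is hypothetical. It assumes rotations are compared bytewise, which is what `String#<=>` does.

```ruby
# Hypothetical helper (not part of the gem): naive Burrows-Wheeler transform.
# Builds all rotations, sorts them, and joins the last character of each.
def naive_bwt(text)
  rotations = (0...text.length).map { |i| text[i..-1] + text[0...i] }
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts naive_bwt("^BANANA|")  # prints BNN^AA|A
```

Building all rotations takes quadratic memory, so this form is only suitable for short strings; suffix-array-based approaches avoid materializing the rotations.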
Defined Under Namespace
Modules: Algorithm, BurrowsWheelerTransform, Data, Generators, LexemeOverlapFilter, LongestCommonPrefix, Profile, Reorder, Repetitions, RepetitionsIncrementation, RepetitionsScore, SuffixArray, Utils, Visualization
Classes: Mapping
Constant Summary
- TERMINATION_CHAR =
Should be greater than every other possible character in lexicographic order
255.chr
- FORBIDDEN_CHARS =
/\n/
- VERSION =
'0.0.5'
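A quick check of what these values imply, as a sketch. The constants are copied into local variables (hypothetical names) so the snippet runs without the gem installed. It relies on Ruby comparing strings bytewise.

```ruby
# Local copies of the documented constant values (assumption: these match
# TERMINATION_CHAR and FORBIDDEN_CHARS as listed above).
termination_char = 255.chr
forbidden_chars = /\n/

# Ruby compares strings bytewise, so "\xFF" is greater than every other
# single-byte character.
all_smaller = (0..254).all? { |byte| byte.chr < termination_char }
puts all_smaller

# The forbidden pattern matches any string that contains a newline.
has_forbidden = !("first\nsecond" =~ forbidden_chars).nil?
puts has_forbidden
```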
Class Method Summary
- .assert_instance_of(object, expected_class) ⇒ Object
- .process(options = {}) ⇒ Object
- .set_default_options(options) ⇒ Object
Class Method Details
.assert_instance_of(object, expected_class) ⇒ Object
# File 'lib/trace_visualization/assert.rb', line 2
def self.assert_instance_of(object, expected_class)
  raise "Illegal parameter type: expected #{expected_class}, actual #{object.class}. Object: #{object}" if not object.instance_of? expected_class
end
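A short usage sketch follows. `AssertDemo` is a hypothetical stand-in module that copies the method body from the listing so the example runs without the gem. It shows that the check uses `instance_of?`, which ignores ancestors, so an object of a subclass does not pass.

```ruby
# Stand-in module (hypothetical name) that copies the documented method body
# so the snippet runs without the gem installed.
module AssertDemo
  def self.assert_instance_of(object, expected_class)
    raise "Illegal parameter type: expected #{expected_class}, actual #{object.class}. Object: #{object}" if not object.instance_of? expected_class
  end
end

# Passes silently: "abc" is an instance of String.
AssertDemo.assert_instance_of("abc", String)

# Raises: instance_of? ignores ancestors, so an integer is rejected
# where Numeric is expected.
begin
  AssertDemo.assert_instance_of(1, Numeric)
rescue RuntimeError => e
  puts e.message
end
```

If subclass instances should be accepted, `is_a?` would be the looser alternative; the documented method deliberately or otherwise uses the strict check.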
.process(options = {}) ⇒ Object
# File 'lib/trace_visualization.rb', line 27
def self.process(options = {})
  options = set_default_options(options)
  logger = options[:logger]

  # Preprocess
  file_name = options[:file_name]

  # Read & mapping file
  mapping = TraceVisualization::Mapping.new
  mapping.process do
    from_file options[:file_name]
  end

=begin
  logger.info 'start process'

  str = nil
  str_mapped = nil

  Benchmark.bm(14) do |x|
    x.report('read file') { str = options[:str] || TraceVisualization::Utils.read_file(options) }
    x.report('mapping') { str_mapped = TraceVisualization::Mapping.new(str) }
  end

  str_len = str.length
  map_len = str_mapped.length
  logger.info("str.length = #{str_len}, str_mapped.length = #{map_len}, compression = #{((str_len.to_f - map_len) / str_len.to_f).round(2)}%")

  return []

  rs = TraceVisualization::Repetitions.psy1(str_mapped, options[:p_min], true)

  logger.info 'PSY1 finish. build context'
  context = TraceVisualization::Repetitions::Context.new(str_mapped, rs)

  logger.info 'first concat step'
  TraceVisualization::RepetitionsConcatenation.process(rs, 1, context)

  # Approximate

  # Vissss
=end

  #rs
end
.set_default_options(options) ⇒ Object
# File 'lib/trace_visualization.rb', line 73
def self.set_default_options(options)
  options = {
    :p_min => 3,
    :logger => Logger.new(STDOUT)
  }.merge options
end
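The effect of merging caller options over defaults can be sketched as follows. `with_defaults` is a hypothetical stand-in that mirrors the documented default values, assuming the defaults hash is merged with the caller's hash as the listing suggests. The file name `trace.log` is made up for illustration.

```ruby
require 'logger'

# Hypothetical stand-in mirroring the documented defaults; redefined here so
# the snippet runs without the gem installed.
def with_defaults(options)
  { :p_min => 3, :logger => Logger.new(STDOUT) }.merge(options)
end

defaults = with_defaults({})
custom = with_defaults(:p_min => 5, :file_name => 'trace.log')

puts defaults[:p_min]    # default is kept when the caller passes nothing
puts custom[:p_min]      # caller's value overrides the default
puts custom[:file_name]  # keys without a default pass through unchanged
```

Because `Hash#merge` gives precedence to its argument, any key supplied by the caller replaces the corresponding default.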