Class: Bioroebe::RawSequence
- Inherits: Object
  - Object
  - Bioroebe::RawSequence
- Defined in: lib/bioroebe/raw_sequence/raw_sequence.rb
Overview
Bioroebe::RawSequence
Direct Known Subclasses
Instance Method Summary
-
#+(i) ⇒ Object
# === +.
-
#<<(i) ⇒ Object
(also: #add, #append, #concat)
# === <<.
-
#[]=(start_position, end_position, new_content = '') ⇒ Object
# === []=.
-
#calculate_levensthein_distance(a, b = sequence?) ⇒ Object
# === calculate_levensthein_distance.
-
#chars? ⇒ Array
(also: #chars)
# === chars?.
-
#complement(i = @sequence) ⇒ Object
# === complement.
-
#composition? ⇒ Hash
(also: #composition)
# === composition.
-
#count(this_character) ⇒ Object
# === count.
-
#delete(i) ⇒ Object
# === delete.
-
#delete!(i) ⇒ Object
# === delete!.
-
#downcase ⇒ Object
(also: #lowercase, #lower)
# === downcase.
-
#each_char(&block) ⇒ Object
# === each_char.
-
#empty? ⇒ Boolean
# === empty?.
-
#find_substring_indices(this_substring) ⇒ Object
(also: #find_this_subsequence)
# === find_substring_indices.
-
#first_position=(i) ⇒ Object
(also: #first_nucleotide=)
# === first_position=.
-
#freeze ⇒ Object
# === freeze.
-
#gsub(replace_this, with_that) ⇒ Object
# === gsub.
-
#gsub!(replace_this, with_that) ⇒ Object
# === gsub!.
-
#include?(i) ⇒ Boolean
# === include?.
-
#initialize(commandline_arguments = ARGV) ⇒ RawSequence
constructor
# === initialize.
-
#insert_at_this_position(position, insert_this_new_content) ⇒ Object
# === insert_at_this_position.
-
#prepend(i) ⇒ Object
# === prepend.
-
#remove_n_characters_from_the_left_side(n_characters) ⇒ Object
# === remove_n_characters_from_the_left_side.
-
#reset ⇒ Object
# === reset (reset tag).
-
#reverse ⇒ Object
# === reverse.
-
#reverse! ⇒ Object
# === reverse!.
-
#reverse_complement(i = sequence?) ⇒ Object
# === reverse_complement.
-
#scan(i) ⇒ Object
# === scan.
-
#set_raw_sequence(i) ⇒ Object
(also: #assign)
# === set_raw_sequence.
-
#shuffle ⇒ Object
(also: #randomize)
# === shuffle.
-
#size? ⇒ Integer
(also: #size, #length, #length?)
# === size?.
-
#split(i) ⇒ Object
# === split.
-
#start_with?(i) ⇒ Boolean
# === start_with?.
-
#strip ⇒ Object
# === strip.
-
#subseq(start_position, end_position = :ask_the_user_for_an_end_position_number) ⇒ Object
(also: #[], #subsequence, #start_end)
# === subseq.
-
#to_s ⇒ Object
(also: #sequence?, #sequence, #string?, #seq, #seq?, #s?, #main_string?, #main_sequence_as_string?)
# === to_s.
-
#to_str ⇒ Object
# === to_str.
-
#tr!(a, b) ⇒ Object
# === tr!.
-
#upcase! ⇒ Object
(also: #upcase, #up, #upper)
# === upcase!.
Constructor Details
#initialize(commandline_arguments = ARGV) ⇒ RawSequence
#
initialize
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 21

def initialize(
    commandline_arguments = ARGV
  )
  reset
  if commandline_arguments and
     commandline_arguments.is_a?(Array) and
     !commandline_arguments.empty?
    set_raw_sequence(commandline_arguments)
  elsif commandline_arguments and
        commandline_arguments.is_a?(String)
    set_raw_sequence(commandline_arguments)
  end
end
Instance Method Details
#+(i) ⇒ Object
#
+
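This method combines (adds) two sequences. It returns the result as a new String, and returns nil if the argument is neither a Bioroebe::RawSequence nor an object that responds to sequence?.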
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 113

def +(i)
  if i.is_a?(Bioroebe::RawSequence) or
     i.respond_to?(:sequence?) # This line will match for Bioroebe::Sequence
    return @sequence + i.sequence?
  end
end
#<<(i) ⇒ Object Also known as: add, append, concat
#
<<
The method called << is an "input method": it appends its argument to the main sequence (stored as @sequence).
In simpler words: @sequence stores the DNA, RNA or amino acid sequence.
If a Sequence object is passed (Bioroebe::Sequence), this method first obtains the main String that it stores, through the .sequence? method, before continuing.
If the Symbol :stop is passed, a randomly chosen stop codon is appended instead. The method returns self, so calls can be chained.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 450

def <<(i)
  if i.is_a?(::Bioroebe::Sequence) or
     i.is_a?(::Bioroebe::Sequence)
    i = i.sequence?
  elsif i.is_a? Symbol
    case i
    # ===================================================================== #
    # === :stop
    # ===================================================================== #
    when :stop
      if Bioroebe.stop_codons.empty?
        Bioroebe.initialize_default_stop_codons
      end
      i = ::Bioroebe.stop_codons?.sample
    end
  end
  @sequence << i
  self # Returning self here since that will allow method-chaining.
end
#[]=(start_position, end_position, new_content = '') ⇒ Object
#
[]=
Note that we start counting at 1 here, since the first nucleotide of a given DNA/RNA strand is at position 1.
We will, however, NOT do so when a negative start position is passed to this method.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 481

def []=(
    start_position,
    end_position,
    new_content = ''
  )
  start_position = start_position.to_i
  end_position   = end_position.to_i
  unless start_position < 0
    start_position -= 1 unless start_position < 1
    end_position   -= 1 unless end_position < 1
  end
  @sequence[start_position, end_position] = new_content
end
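The index arithmetic in the listing above can be traced on a plain String. The following is a minimal sketch for illustration only: the helper name replace_at is hypothetical and not part of Bioroebe, and it works on a copy instead of @sequence. It relies on a documented property of Ruby's String#[]=: when two integers are given, the second one is a length rather than an end index, so after the adjustment the second argument acts as a character count.

```ruby
# Hypothetical helper that mirrors the arithmetic of the listing above.
def replace_at(sequence, start_position, end_position, new_content = '')
  start_position = start_position.to_i
  end_position   = end_position.to_i
  unless start_position < 0
    start_position -= 1 unless start_position < 1 # 1-based -> 0-based
    end_position   -= 1 unless end_position < 1
  end
  result = sequence.dup # leave the caller's String untouched
  result[start_position, end_position] = new_content # (start, length) form
  result
end

positive = replace_at('ATGCATGC', 1, 3, 'TTT') # replaces 2 characters from offset 0
negative = replace_at('ATGCATGC', -3, 2, 'NN') # negative start: no adjustment

puts positive # TTTGCATGC
puts negative # ATGCANNC
```

With a positive start, (1, 3) replaces the two characters "AT"; with a negative start, the arguments are passed to String#[]= unchanged.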
#calculate_levensthein_distance(a, b = sequence?) ⇒ Object
#
calculate_levensthein_distance
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 498

def calculate_levensthein_distance(a, b = sequence?)
  require 'bioroebe/calculate/calculate_levensthein_distance.rb'
  ::Bioroebe.calculate_levensthein_distance(a,b)
end
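The body of ::Bioroebe.calculate_levensthein_distance is not shown on this page. For readers unfamiliar with the metric, the following is a generic sketch of the classic dynamic-programming Levenshtein distance. It is an assumption about what such a function computes, not Bioroebe's implementation, and the name levenshtein_sketch is hypothetical.

```ruby
# Generic Levenshtein distance (row-by-row dynamic programming).
# Hypothetical helper; not taken from Bioroebe.
def levenshtein_sketch(a, b)
  previous_row = (0..b.length).to_a # distance from '' to each prefix of b
  a.each_char.with_index(1) do |char_a, i|
    current_row = [i] # distance from a[0, i] to ''
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current_row << [
        previous_row[j] + 1,       # deletion
        current_row[j - 1] + 1,    # insertion
        previous_row[j - 1] + cost # substitution (or match)
      ].min
    end
    previous_row = current_row
  end
  previous_row.last
end

puts levenshtein_sketch('ATGC', 'ATCC')     # 1
puts levenshtein_sketch('kitten', 'sitting') # 3
```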
#chars? ⇒ Array Also known as: chars
#
chars?
This method will return the characters of the main sequence, as an Array.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 83

def chars?
  @sequence.chars
end
#complement(i = @sequence) ⇒ Object
#
complement
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 199

def complement(
    i = @sequence
  )
  _ = ''.dup
  i.chars.each {|this_char|
    case this_char
    when 'G'
      _ << 'C'
    when 'C'
      _ << 'G'
    when 'A'
      _ << 'T'
    when 'T'
      _ << 'A'
    end
  }
  _
end
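A consequence of the case statement above is that any character other than uppercase G, C, A or T contributes nothing to the result. The sketch below reproduces that behaviour on a plain String with a lookup table. The name complement_sketch is hypothetical and the code is an illustration, not the library's method.

```ruby
# Hypothetical stand-in for the listing above, operating on a plain String.
def complement_sketch(sequence)
  pairs = { 'A' => 'T', 'T' => 'A', 'G' => 'C', 'C' => 'G' }
  # Characters without a table entry map to nil and are removed by compact,
  # just as they fall through the case statement without appending anything.
  sequence.each_char.map { |char| pairs[char] }.compact.join
end

puts complement_sketch('ATGC')        # TACG
puts complement_sketch('ATGN')        # TAC ('N' has no entry and is dropped)
puts complement_sketch('atgc').empty? # true (lowercase input is not matched)
```

If ambiguity codes or lowercase input matter, it may be worth normalising the sequence (for example via upcase) before calling complement.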
#composition? ⇒ Hash Also known as: composition
#
composition
This method will return a Hash showing the nucleotide or amino acid composition of the sequence at hand. Each distinct character is a key, and its value is the number of times it occurs.
Usage example:
seq = Bioroebe::Sequence.new("ATGC"); seq.composition # => {"A"=>1, "T"=>1, "G"=>1, "C"=>1}
seq = Bioroebe::Sequence.new("EFGGHHGG"); seq.is_a_protein_now; seq.composition # => {"E"=>1, "F"=>1, "G"=>4, "H"=>2}
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 149

def composition?
  hash = {} # This Hash will be returned for all the three cases defined below.
  available_keys = @sequence.chars.uniq
  available_keys.each {|this_key|
    hash[this_key] = @sequence.count(this_key)
  }
  return hash
end
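The same character counts can be obtained from a plain String with Enumerable#tally, which is available from Ruby 2.7 onwards. This is a sketch of an equivalent computation and not the library's code; composition_sketch is a hypothetical name.

```ruby
# Hypothetical stand-in: counts every distinct character in one pass.
def composition_sketch(sequence)
  sequence.chars.tally # requires Ruby 2.7 or newer
end

dna     = composition_sketch('ATGC')
protein = composition_sketch('EFGGHHGG')

puts dna == { 'A' => 1, 'T' => 1, 'G' => 1, 'C' => 1 }     # true
puts protein == { 'E' => 1, 'F' => 1, 'G' => 4, 'H' => 2 } # true
```

The listing above calls String#count once per distinct character, whereas tally walks the sequence a single time; for long sequences with many distinct characters that difference may be noticeable.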
#count(this_character) ⇒ Object
#
count
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 168

def count(this_character)
  @sequence.count(this_character)
end
#delete(i) ⇒ Object
#
delete
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 104

def delete(i)
  @sequence.delete(i)
end
#delete!(i) ⇒ Object
#
delete!
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 321

def delete!(i)
  @sequence.delete!(i)
end
#downcase ⇒ Object Also known as: lowercase, lower
#
downcase
This method will always downcase the sequence at hand. It modifies the stored sequence in place.
.lower() was added in September 2021 for (slight) compatibility with Biopython.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 244

def downcase
  @sequence.downcase! # Will always modify.
end
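One detail of the listing above is worth illustrating: Ruby's String#downcase! returns nil when the receiver contains nothing to change, so the return value of a method that ends with downcase! can be nil even though the sequence itself is intact. The sketch below demonstrates this with plain Strings rather than a RawSequence object.

```ruby
mixed = 'ATGc'.dup
lower = 'atgc'.dup

first_result  = mixed.downcase! # modifies in place and returns the String
second_result = lower.downcase! # nothing to change, so nil is returned

puts first_result          # atgc
puts second_result.inspect # nil
puts lower                 # atgc (the receiver is still intact)
```

For this reason it is probably safer to read the sequence back through .to_s or .sequence? after downcasing than to rely on the return value.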
#each_char(&block) ⇒ Object
#
each_char
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 97

def each_char(&block)
  @sequence.each_char(&block)
end
#empty? ⇒ Boolean
#
empty?
Determine whether our sequence is empty or not. It is empty if it is a String of zero length, an “empty” String such as ''.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 192

def empty?
  @sequence.empty?
end
#find_substring_indices(this_substring) ⇒ Object Also known as: find_this_subsequence
#
find_substring_indices
This method delegates to Bioroebe.find_substring_indices().
It returns an Array of all substring indices if any were found; otherwise it returns nil.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 530

def find_substring_indices(this_substring)
  require 'bioroebe/toplevel_methods/searching_and_finding.rb'
  return ::Bioroebe.find_substring_indices(string?, this_substring)
end
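The internals of ::Bioroebe.find_substring_indices are not shown here. The following sketch illustrates the general idea of collecting every match position, under two assumptions that may differ from the library: positions are reported as 0-based String offsets, and overlapping matches are included. The name substring_indices_sketch is hypothetical.

```ruby
# Hypothetical helper: returns all match offsets, or nil if there are none.
def substring_indices_sketch(string, substring)
  indices = []
  offset = 0
  # String#index returns nil once no further match exists.
  while (hit = string.index(substring, offset))
    indices << hit
    offset = hit + 1 # advance by one so overlapping matches are found too
  end
  indices.empty? ? nil : indices
end

p substring_indices_sketch('ATGCATGCAAAA', 'ATG') # [0, 4]
p substring_indices_sketch('AAAA', 'AA')          # [0, 1, 2]
p substring_indices_sketch('ATGC', 'TTT')         # nil
```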
#first_position=(i) ⇒ Object Also known as: first_nucleotide=
#
first_position=
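Use this method to replace the first position of the sequence. For DNA, this assigns a new first nucleotide. If the stored String is frozen, it is duplicated first so that the assignment does not raise an error.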
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 224

def first_position=(i)
  @sequence = @sequence.dup if @sequence.frozen? # Prevent frozen String error here.
  @sequence[0,1] = i
end
#freeze ⇒ Object
#
freeze
If you wish to freeze the sequence and thus disallow further modifications to it, use this method.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 287

def freeze
  @sequence.freeze
end
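Since the listing above freezes the underlying String, the effect can be illustrated with plain Ruby. The sketch below does not use Bioroebe. It shows that in-place modification of a frozen String raises FrozenError (Ruby 2.5 or newer), while non-destructive methods keep working.

```ruby
sequence = 'ATGC'.dup.freeze

begin
  sequence << 'A' # any in-place modification of a frozen String raises
  outcome = :modified
rescue FrozenError => error
  outcome = :rejected
  puts "Modification rejected: #{error.class}"
end

# Non-destructive methods still work and return new Strings.
puts sequence.reverse # CGTA
puts sequence.frozen? # true
```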
#gsub(replace_this, with_that) ⇒ Object
#
gsub
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 294

def gsub(replace_this, with_that)
  @sequence.gsub(replace_this, with_that)
end
#gsub!(replace_this, with_that) ⇒ Object
#
gsub!
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 338

def gsub!(replace_this, with_that)
  @sequence.gsub!(replace_this, with_that)
end
#include?(i) ⇒ Boolean
#
include?
Check whether our sequence includes some other sequence.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 126

def include?(i)
  @sequence.to_s.include? i.to_s
end
#insert_at_this_position(position, insert_this_new_content) ⇒ Object
#
insert_at_this_position
This method inserts content into a sequence object at a given position, for example a His6-tag sequence into a DNA sequence object.
The second argument is the new (DNA, RNA or amino acid) sequence that you wish to add. You can also use '|' tokens there if you like; they will be removed.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 428

def insert_at_this_position(
    position, insert_this_new_content
  )
  if insert_this_new_content.include? '|'
    insert_this_new_content.delete!('|')
  end
  @sequence[position, 0] = insert_this_new_content
end
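The His6-tag use case can be sketched on a plain String. The helper insert_sketch below is hypothetical and only mirrors the listing above: the position is handed to String#[]= unchanged, so it behaves as a 0-based offset, and a zero-length slice turns the assignment into a pure insertion. Unlike the listing, the sketch strips '|' with the non-mutating String#delete, so the caller's argument is not modified.

```ruby
# Hypothetical stand-in for the listing above, operating on plain Strings.
def insert_sketch(sequence, position, new_content)
  cleaned = new_content.delete('|') # non-mutating, unlike delete! in the listing
  result = sequence.dup
  result[position, 0] = cleaned     # zero-length slice: pure insertion
  result
end

# Six histidine codons, with '|' used as a visual separator only.
his6_tag = 'CAT|CAT|CAT|CAC|CAC|CAC'
tagged = insert_sketch('ATGAAATAA', 3, his6_tag)

puts tagged   # ATGCATCATCATCACCACCACAAATAA
puts his6_tag # CAT|CAT|CAT|CAC|CAC|CAC (argument left untouched)
```

Because the listing uses delete! on its argument, passing a frozen String that contains '|' would presumably raise FrozenError there; duplicating the argument first avoids this.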
#prepend(i) ⇒ Object
#
prepend
If you wish to prepend something to your target sequence then this is the right method to use.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 331

def prepend(i)
  @sequence.prepend(i)
end
#remove_n_characters_from_the_left_side(n_characters) ⇒ Object
#
remove_n_characters_from_the_left_side
This method will remove n characters from the left side (aka 5').
It can be applied to DNA, RNA and amino acid sequences alike, so it can be retained on the main Sequence class definition as-is.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 406

def remove_n_characters_from_the_left_side(n_characters)
  @sequence[0, n_characters] = ''
end
#reset ⇒ Object
#
reset (reset tag)
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 38

def reset
  # ======================================================================= #
  # === @sequence
  #
  # This instance variable keeps our whole sequence. It is the most
  # important variable for objects instantiated from this class.
  # ======================================================================= #
  @sequence = ''.dup
end
#reverse ⇒ Object
#
reverse
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 90

def reverse
  @sequence.reverse
end
#reverse! ⇒ Object
#
reverse!
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 161

def reverse!
  @sequence.reverse!
end
#reverse_complement(i = sequence?) ⇒ Object
#
reverse_complement
Return the reverse complement of the sequence: the complementary strand, reversed so that it reads in the 5' to 3' direction.
The complement thus refers to the "complementary DNA strand" of a 5'-NUCLEOTIDE-3' sequence.
Usage example:
x = Bioroebe::Sequence.new('ATTGCCACAACTGAGACA'); x.reverse_complement # => "TGTCTCAGTTGTGGCAAT"
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 517

def reverse_complement(i = sequence?)
  require 'bioroebe/toplevel_methods/nucleotides.rb'
  return ::Bioroebe.complementary_dna_strand(i).reverse
end
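The two steps in the listing above (complement, then reverse) can be sketched without Bioroebe. The sketch assumes that ::Bioroebe.complementary_dna_strand pairs A with T and G with C for uppercase DNA; its source is not shown on this page. The name reverse_complement_sketch is hypothetical.

```ruby
# Hypothetical stand-in: complement via String#tr, then reverse.
def reverse_complement_sketch(sequence)
  sequence.tr('ATGC', 'TACG').reverse
end

forward = 'ATTGCCACAACTGAGACA'
reverse_strand = reverse_complement_sketch(forward)

puts reverse_strand # TGTCTCAGTTGTGGCAAT
# Applying the operation twice yields the original strand again.
puts reverse_complement_sketch(reverse_strand) == forward # true
```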
#scan(i) ⇒ Object
#
scan
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 182

def scan(i)
  @sequence.scan(i)
end
#set_raw_sequence(i) ⇒ Object Also known as: assign
#
set_raw_sequence
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 72

def set_raw_sequence(i)
  i = i.flatten.compact.first if i.is_a? Array
  @sequence = i
end
#shuffle ⇒ Object Also known as: randomize
#
shuffle
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 232

def shuffle
  @sequence = @sequence.chars.shuffle.join
end
#size? ⇒ Integer Also known as: size, length, length?
#
size?
Return the size of the string/sequence in question.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 312

def size?
  @sequence.size
end
#split(i) ⇒ Object
#
split
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 175

def split(i)
  @sequence.split(i)
end
#start_with?(i) ⇒ Boolean
#
start_with?
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 133

def start_with?(i)
  to_s.start_with?(i)
end
#strip ⇒ Object
#
strip
Similar to the method .strip() on class String.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 303

def strip
  @sequence.strip
end
#subseq(start_position, end_position = :ask_the_user_for_an_end_position_number) ⇒ Object Also known as: [], subsequence, start_end
#
subseq
This method will obtain a subsequence of the given sequence object at hand.
We start to count at the first nucleotide. The second argument given to this method will denote the nucleotide position at where we will STOP. So (3,8) will translate to “take nucleotide 3, up to and including nucleotide 8, and then return this result”.
See the following examples to understand this more easily.
Usage examples:
seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(1, 3) # => "ATG"
seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(3, 8) # => "GCATGC"
seq = Bioroebe::RawSequence.new("atgcatgcaaaa"); seq.subseq(3, 8) # => "gcatgc"
seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(3, 833333333333) # => "GCATGCAAAA"
seq = Bioroebe::RawSequence.new("ATGCATGCAAATCCACAA"); seq.start_end(1, 10) # => "ATGCATGCAA"
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 383

def subseq(
    start_position,
    end_position = :ask_the_user_for_an_end_position_number
  )
  if end_position == :ask_the_user_for_an_end_position_number
    puts 'Please provide a valid end position (an Integer value).'
  else
    start_position -= 1
    end_position -= start_position
    sequence?[start_position, end_position]
  end
end
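The conversion from 1-based, inclusive positions to the (offset, length) pair that String#[] expects can be sketched on a plain String. The helper subseq_sketch is hypothetical. It repeats the arithmetic of the listing above and omits the prompt that is printed when no end position is given.

```ruby
# Hypothetical stand-in for the arithmetic in the listing above.
def subseq_sketch(sequence, start_position, end_position)
  offset = start_position - 1    # 1-based position -> 0-based offset
  length = end_position - offset # inclusive end -> number of characters
  sequence[offset, length]
end

puts subseq_sketch('ATGCATGCAAAA', 1, 3)   # ATG
puts subseq_sketch('ATGCATGCAAAA', 3, 8)   # GCATGC
puts subseq_sketch('ATGCATGCAAAA', 3, 100) # GCATGCAAAA
```

An end position beyond the last character is harmless, because String#[] simply stops at the end of the String.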
#to_s ⇒ Object Also known as: sequence?, sequence, string?, seq, seq?, s?, main_string?, main_sequence_as_string?
#
to_s
Returns the sequence that this class stores, as a String.
This method has several aliases, but it cannot be guaranteed that all aliases will continue to work for the remainder of this project's lifecycle. For example, the method s? as an alias for sequence? may be removed one day, but until then it will remain available.
It is recommended to use the slightly longer method names .sequence? or .to_s in your own scripts; the alias s? exists mostly for convenience in IRB and elsewhere, and there is no guarantee that it will be retained.
If you wish to test the output of this method, try:
require 'bioroebe'; x = Bioroebe::Seq.new('AGTACACTGGT'); puts x
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 270

def to_s
  @sequence.to_s
end
#to_str ⇒ Object
#
to_str
We need this method so that Sequence objects can be chained together in a String-like fashion.
Specifically, this allows us to make use of the '+' method call.
Objects in Ruby implement the to_str method so that they can be treated like a String, for all practical purposes.
This can be tested in this way:
x = Bioroebe::RawSequence.new('ATGGATCGATGC'); y = Bioroebe::RawSequence.new('TTTGATCGATGC'); z = x + y
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 64

def to_str
  # self # ← Old code since up to May 2020.
  @sequence.to_s # ← This became the new default as of May 2020 again.
end
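The role of to_str can be shown without Bioroebe: Ruby calls to_str for implicit conversion whenever a core method such as String#+ receives an argument that is not a String. The class TinySequence below is a hypothetical stand-in written for this sketch; it is not part of the library.

```ruby
# Hypothetical minimal class that only implements to_str.
class TinySequence
  def initialize(sequence)
    @sequence = sequence
  end

  # Implicit conversion hook used by String#+ and similar core methods.
  def to_str
    @sequence.to_s
  end
end

tail = TinySequence.new('TTTGATCGATGC')
combined = 'ATGGATCGATGC' + tail # works because tail responds to to_str

puts combined       # ATGGATCGATGCTTTGATCGATGC
puts combined.class # String
```

Without to_str, the same expression would raise a TypeError, since String#+ accepts only Strings or objects that convert implicitly.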
#tr!(a, b) ⇒ Object
#
tr!
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 413

def tr!(a, b)
  @sequence.tr!(a, b)
end
#upcase! ⇒ Object Also known as: upcase, up, upper
#
upcase!
This method will upcase the given sequence, so “atg” becomes “ATG”.
Note that .upcase() is an alias to .upcase!() - use whichever variant you want to, but keep in mind that the receiver will be modified in both variants.
.upper() was added in September 2021 for (slight) compatibility with Biopython.
#
# File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 354

def upcase!
  @sequence.upcase!
  return @sequence
end