Class: Bioroebe::RawSequence
- Inherits: Object
  - Object
  - Bioroebe::RawSequence
- Defined in: lib/bioroebe/raw_sequence/raw_sequence.rb
Overview
Bioroebe::RawSequence
Direct Known Subclasses
Instance Method Summary
- #+(i) ⇒ Object
- #<<(i) ⇒ Object (also: #add, #append, #concat)
- #[]=(start_position, end_position, new_content = '') ⇒ Object
- #calculate_levensthein_distance(a, b = sequence?) ⇒ Object
- #chars? ⇒ Boolean (also: #chars)
- #complement(i = @sequence) ⇒ Object
- #composition? ⇒ Boolean (also: #composition)
- #count(this_character) ⇒ Object
- #delete(i) ⇒ Object
- #delete!(i) ⇒ Object
- #downcase ⇒ Object (also: #lowercase, #lower)
- #each_char(&block) ⇒ Object
- #empty? ⇒ Boolean
- #find_substring_indices(this_substring) ⇒ Object (also: #find_this_subsequence)
- #first_position=(i) ⇒ Object (also: #first_nucleotide=)
- #freeze ⇒ Object
- #gsub(replace_this, with_that) ⇒ Object
- #gsub!(replace_this, with_that) ⇒ Object
- #include?(i) ⇒ Boolean
- #initialize(commandline_arguments = ARGV) ⇒ RawSequence (constructor)
- #insert_at_this_position(position, insert_this_new_content) ⇒ Object
- #prepend(i) ⇒ Object
- #remove_n_characters_from_the_left_side(n_characters) ⇒ Object
- #reset ⇒ Object
- #reverse ⇒ Object
- #reverse! ⇒ Object
- #reverse_complement(i = sequence?) ⇒ Object
- #scan(i) ⇒ Object
- #set_raw_sequence(i) ⇒ Object (also: #assign)
- #shuffle ⇒ Object (also: #randomize)
- #size? ⇒ Boolean (also: #size, #length, #length?)
- #split(i) ⇒ Object
- #start_with?(i) ⇒ Boolean
- #strip ⇒ Object
- #subseq(start_position, end_position = :ask_the_user_for_an_end_position_number) ⇒ Object (also: #[], #subsequence, #start_end)
- #to_s ⇒ Object (also: #sequence?, #sequence, #string?, #seq, #seq?, #s?, #main_string?, #main_sequence_as_string?)
- #to_str ⇒ Object
- #tr!(a, b) ⇒ Object
- #upcase! ⇒ Object (also: #upcase, #up, #upper)
Constructor Details
#initialize(commandline_arguments = ARGV) ⇒ RawSequence
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 23
    def initialize(
        commandline_arguments = ARGV
      )
      reset
      if commandline_arguments and
         commandline_arguments.is_a?(Array) and
         !commandline_arguments.empty?
        set_raw_sequence(commandline_arguments)
      elsif commandline_arguments and
            commandline_arguments.is_a?(String)
        set_raw_sequence(commandline_arguments)
      end
    end
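The constructor's input handling can be sketched without the gem. The class below is a hypothetical stand-in (RawSequenceSketch is not part of Bioroebe) that mirrors only the branching shown in the source: an Array is flattened and its first non-nil element is used, a String is stored as-is, and anything else leaves the sequence empty.

```ruby
# Hypothetical stand-in; mirrors only the argument handling of the constructor.
class RawSequenceSketch
  attr_reader :sequence

  def initialize(input = ARGV)
    @sequence = ''.dup # same starting state that reset() establishes
    if input.is_a?(Array) && !input.empty?
      # Same rule as set_raw_sequence: first non-nil element of the flattened Array.
      @sequence = input.flatten.compact.first
    elsif input.is_a?(String)
      @sequence = input
    end
  end
end
```

With this sketch, RawSequenceSketch.new('ATGC').sequence and RawSequenceSketch.new([nil, ['ATGC', 'TTT']]).sequence both give "ATGC", while RawSequenceSketch.new([]).sequence gives "".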
Instance Method Details
#+(i) ⇒ Object
This method can “combine” (that is, add) two sequences. As the source shows, the result is returned as a plain String.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 115
    def +(i)
      if i.is_a?(Bioroebe::RawSequence) or
         i.respond_to?(:sequence?) # This line will match for Bioroebe::Sequence
        return @sequence + i.sequence?
      end
    end
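A small sketch of what this implies for callers, using a hypothetical stand-in class (PlusSketch is not part of Bioroebe) that only checks respond_to?(:sequence?): combining two sequence objects yields a String, and an argument that does not answer sequence? yields nil.

```ruby
# Hypothetical stand-in that reproduces only the + logic.
class PlusSketch
  def initialize(sequence)
    @sequence = sequence
  end

  def sequence?
    @sequence
  end

  def +(other)
    # Only objects that answer sequence? are combined; anything else yields nil.
    return @sequence + other.sequence? if other.respond_to?(:sequence?)
  end
end

def plus_demo
  a = PlusSketch.new('ATG')
  b = PlusSketch.new('CCC')
  [a + b, a + 'CCC']
end
```

Here plus_demo returns ["ATGCCC", nil], so code that chains the result should expect a String rather than another sequence object.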
#<<(i) ⇒ Object Also known as: add, append, concat
The method called << is an “input method”, that is, it simply appends onto the main sequence (stored as @sequence).
In simpler words: @sequence stores the DNA, RNA or amino acid sequence.
If a Sequence object (Bioroebe::Sequence) is passed, then this method will first obtain the main sequence (the main String) that it stores, through the .sequence? method, before continuing.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 452
    def <<(i)
      if i.is_a?(::Bioroebe::Sequence) or
         i.is_a?(::Bioroebe::Sequence)
        i = i.sequence?
      elsif i.is_a? Symbol
        case i
        # === :stop
        when :stop
          if Bioroebe.stop_codons.empty?
            Bioroebe.initialize_default_stop_codons
          end
          i = ::Bioroebe.stop_codons?.sample
        end
      end
      @sequence << i
      self # Returning self here since that will allow method-chaining.
    end
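Because << returns self, appends can be chained. The sketch below uses a hypothetical stand-in class (AppendSketch is not part of Bioroebe) and leaves out the :stop Symbol branch, which depends on Bioroebe's stop codon registry.

```ruby
# Hypothetical stand-in showing only the chaining behaviour of <<.
class AppendSketch
  def initialize(sequence = '')
    @sequence = sequence.dup # dup so that frozen literals can be appended to
  end

  def sequence?
    @sequence
  end

  def <<(other)
    # Unwrap sequence-like objects to their String first.
    other = other.sequence? if other.respond_to?(:sequence?)
    @sequence << other
    self # returning self is what makes chaining possible
  end
end

def append_demo
  seq = AppendSketch.new('ATG')
  seq << 'AAA' << AppendSketch.new('TGA')
  seq.sequence?
end
```

Calling append_demo gives "ATGAAATGA".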
#[]=(start_position, end_position, new_content = '') ⇒ Object
Note that counting starts at 1 here, because the first nucleotide of a DNA/RNA strand is position 1.
However, this adjustment is NOT made when a negative start position is passed to this method.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 483
    def []=(
        start_position,
        end_position,
        new_content = ''
      )
      start_position = start_position.to_i
      end_position   = end_position.to_i
      unless start_position < 0
        start_position -= 1 unless start_position < 1
        end_position   -= 1 unless end_position < 1
      end
      @sequence[start_position, end_position] = new_content
    end
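The index arithmetic can be traced on a plain String. The helper below is a hypothetical sketch (assign_range_sketch is not part of Bioroebe) that copies the adjustments from the source and works on a duplicate. One detail worth noticing: Ruby's String#[]= with two Integers treats the second value as a length, so the adjusted end position is used as a character count.

```ruby
# Hypothetical helper; same arithmetic as []= but on a copy of a plain String.
def assign_range_sketch(sequence, start_position, end_position, new_content = '')
  result = sequence.dup
  start_position = start_position.to_i
  end_position   = end_position.to_i
  unless start_position < 0
    start_position -= 1 unless start_position < 1
    end_position   -= 1 unless end_position < 1
  end
  # String#[]=(start, length): the second value is a length, not an end index.
  result[start_position, end_position] = new_content
  result
end
```

For example, assign_range_sketch('ATGCATGC', 3, 5, 'NN') gives "ATNNGC" (four characters starting at the third are replaced), and assign_range_sketch('ATGCATGC', -2, 2, 'N') gives "ATGCATN".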
#calculate_levensthein_distance(a, b = sequence?) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 500
    def calculate_levensthein_distance(a, b = sequence?)
      require 'bioroebe/calculate/calculate_levensthein_distance.rb'
      ::Bioroebe.calculate_levensthein_distance(a,b)
    end
#chars? ⇒ Boolean Also known as: chars
This method will return the characters of the main sequence, as an Array.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 85
    def chars?
      @sequence.chars
    end
#complement(i = @sequence) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 201
    def complement(
        i = @sequence
      )
      _ = ''.dup
      i.chars.each {|this_char|
        case this_char
        when 'G'
          _ << 'C'
        when 'C'
          _ << 'G'
        when 'A'
          _ << 'T'
        when 'T'
          _ << 'A'
        end
      }
      _
    end
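The case statement maps each uppercase DNA nucleotide to its partner and silently skips every other character. A compact sketch of the same mapping, using a hypothetical lookup table and helper name (neither is part of Bioroebe) and Enumerable#filter_map from Ruby 2.7 or later:

```ruby
# Hypothetical lookup table with the same four pairs as the case statement.
COMPLEMENT_SKETCH = { 'A' => 'T', 'T' => 'A', 'G' => 'C', 'C' => 'G' }.freeze

def complement_sketch(sequence)
  # filter_map drops characters without a mapping, just as the case
  # statement appends nothing for them.
  sequence.chars.filter_map { |char| COMPLEMENT_SKETCH[char] }.join
end
```

complement_sketch('ATGC') gives "TACG", and complement_sketch('ATGNatg') gives "TAC", since 'N' and lowercase letters have no entry and are dropped.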
#composition? ⇒ Boolean Also known as: composition
This method returns a Hash with the nucleotide or amino acid composition of the sequence at hand.
Usage example:
    seq = Bioroebe::Sequence.new("ATGC"); seq.composition # => {"A"=>1, "T"=>1, "G"=>1, "C"=>1}
    seq = Bioroebe::Sequence.new("EFGGHHGG"); seq.is_a_protein_now; seq.composition # => {"E"=>1, "F"=>1, "G"=>4, "H"=>2}
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 151
    def composition?
      hash = {} # This Hash will be returned for all the three cases defined below.
      available_keys = @sequence.chars.uniq
      available_keys.each {|this_key|
        hash[this_key] = @sequence.count(this_key)
      }
      return hash
    end
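For sequences made of plain letters, the same counts can be obtained with Enumerable#tally (Ruby 2.7 or later). This is only a sketch with a hypothetical helper name, not the library's implementation; like the source, it lists keys in order of first appearance.

```ruby
# Hypothetical helper: character counts in order of first appearance.
def composition_sketch(sequence)
  sequence.chars.tally
end
```

composition_sketch('EFGGHHGG') gives {"E"=>1, "F"=>1, "G"=>4, "H"=>2}.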
#count(this_character) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 170
    def count(this_character)
      @sequence.count(this_character)
    end
#delete(i) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 106
    def delete(i)
      @sequence.delete(i)
    end
#delete!(i) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 323
    def delete!(i)
      @sequence.delete!(i)
    end
#downcase ⇒ Object Also known as: lowercase, lower
This method always downcases the sequence at hand, modifying it in place.
.lower() was added in September 2021 for (slight) compatibility with Biopython.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 246
    def downcase
      @sequence.downcase! # Will always modify.
    end
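One consequence of delegating to String#downcase! concerns the return value: Ruby returns nil from downcase! when nothing had to change. The sketch below shows this on a plain String; the helper name is made up for illustration.

```ruby
# Hypothetical helper: returns [return value of downcase!, resulting String].
def downcase_sketch(sequence)
  copy = sequence.dup
  [copy.downcase!, copy]
end
```

downcase_sketch('ATG') gives ["atg", "atg"], whereas downcase_sketch('atg') gives [nil, "atg"]. Reading the sequence back afterwards is therefore safer than relying on the return value.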
#each_char(&block) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 99
    def each_char(&block)
      @sequence.each_char(&block)
    end
#empty? ⇒ Boolean
Determine whether our sequence is empty or not. It is empty if it is a String of zero length, that is, an “empty” String such as ''.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 194
    def empty?
      @sequence.empty?
    end
#find_substring_indices(this_substring) ⇒ Object Also known as: find_this_subsequence
This method taps into the toplevel method Bioroebe.find_substring_indices().
It returns an Array of all substring indices if any were found; otherwise it returns nil.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 532
    def find_substring_indices(this_substring)
      require 'bioroebe/toplevel_methods/searching_and_finding.rb'
      return ::Bioroebe.find_substring_indices(string?, this_substring)
    end
#first_position=(i) ⇒ Object Also known as: first_nucleotide=
Use this method to replace the first position of the sequence. If this is DNA, then it is a new first nucleotide.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 226
    def first_position=(i)
      @sequence = @sequence.dup if @sequence.frozen? # Prevent frozen String error here.
      @sequence[0,1] = i
    end
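The dup-if-frozen guard is what allows this setter to work on a frozen String. A sketch of the same two steps on a plain String, with a hypothetical helper name:

```ruby
# Hypothetical helper: replace the first character, tolerating frozen input.
def replace_first_position_sketch(sequence, replacement)
  sequence = sequence.dup if sequence.frozen? # avoids FrozenError
  sequence[0, 1] = replacement
  sequence
end
```

replace_first_position_sketch('ATGC'.freeze, 'G') gives "GTGC". Since the replacement is spliced in over exactly one character, passing 'GG' gives "GGTGC" and lengthens the sequence.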
#freeze ⇒ Object
If you wish to freeze the sequence object and thus disallow further modifications to it, use this method.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 289
    def freeze
      @sequence.freeze
    end
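As the source shows, it is the internal String that gets frozen. What that means for later in-place edits can be sketched on a plain String (the method name is made up; FrozenError exists since Ruby 2.5):

```ruby
# Hypothetical demonstration: appending to a frozen String raises FrozenError.
def frozen_append_sketch
  sequence = 'ATG'.dup.freeze
  sequence << 'C'
  :modified
rescue FrozenError
  :frozen
end
```

frozen_append_sketch returns :frozen. Methods such as first_position= guard against this by duplicating a frozen String first.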
#gsub(replace_this, with_that) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 296
    def gsub(replace_this, with_that)
      @sequence.gsub(replace_this, with_that)
    end
#gsub!(replace_this, with_that) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 340
    def gsub!(replace_this, with_that)
      @sequence.gsub!(replace_this, with_that)
    end
#include?(i) ⇒ Boolean
Check whether our sequence includes some other sequence.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 128
    def include?(i)
      @sequence.to_s.include? i.to_s
    end
#insert_at_this_position(position, insert_this_new_content) ⇒ Object
This method can be specifically used to insert content into a sequence object. For example, a His6-tag sequence into a DNA sequence object.
The second argument is the new (DNA, RNA or amino acid) sequence that you wish to add. You can also use ‘|’ tokens there if you like; they will be removed.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 430
    def insert_at_this_position(
        position, insert_this_new_content
      )
      if insert_this_new_content.include? '|'
        insert_this_new_content.delete!('|')
      end
      @sequence[position, 0] = insert_this_new_content
    end
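A sketch of the insertion on a plain String, under two stated differences from the source: the helper (insert_sketch, a made-up name) works on copies, and it uses the non-destructive String#delete so the caller's argument is left untouched. The position is passed straight to String#[]=, so it is a zero-based offset and the new content lands before the character at that offset.

```ruby
# Hypothetical helper: zero-length splice at a zero-based offset.
def insert_sketch(sequence, position, new_content)
  result  = sequence.dup
  content = new_content.delete('|') # non-destructive, unlike delete! in the source
  result[position, 0] = content     # length 0 means nothing is overwritten
  result
end
```

insert_sketch('ATGTAA', 3, 'CAT|CAC') gives "ATGCATCACTAA".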
#prepend(i) ⇒ Object
If you wish to prepend something to your target sequence then this is the right method to use.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 333
    def prepend(i)
      @sequence.prepend(i)
    end
#remove_n_characters_from_the_left_side(n_characters) ⇒ Object
This method will remove n characters from the left side (the 5’ end).
It can be applied to DNA, RNA and amino acid sequences, so it can be retained on the main Sequence class definition as-is.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 408
    def remove_n_characters_from_the_left_side(n_characters)
      @sequence[0, n_characters] = ''
    end
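The slice assignment can be tried on a plain String. The helper below is a hypothetical sketch that works on a copy and returns the shortened String; the method in the source ends with the assignment itself, so its return value is the assigned empty String.

```ruby
# Hypothetical helper: drop the first n characters from a copy.
def remove_left_sketch(sequence, n)
  result = sequence.dup
  result[0, n] = '' # replace the first n characters with nothing
  result
end
```

remove_left_sketch('ATGCCC', 3) gives "CCC", and asking for more characters than exist, as in remove_left_sketch('ATG', 10), gives "".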
#reset ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 40
    def reset
      # ======================================================================= #
      # === @sequence
      #
      # This instance variable keeps our whole sequence. It is the most
      # important variable for objects instantiated from this class.
      # ======================================================================= #
      @sequence = ''.dup
    end
#reverse ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 92
    def reverse
      @sequence.reverse
    end
#reverse! ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 163
    def reverse!
      @sequence.reverse!
    end
#reverse_complement(i = sequence?) ⇒ Object
This method returns the complementary strand read in reverse, which is known as the “reverse complement”.
The complement thus refers to the “complementary DNA strand” of a 5’-NUCLEOTIDE-3’ sequence.
Usage example:
    x = Bioroebe::Sequence.new('ATTGCCACAACTGAGACA'); x.reverse_complement # => "TGTCTCAGTTGTGGCAAT"
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 519
    def reverse_complement(i = sequence?)
      require 'bioroebe/toplevel_methods/nucleotides.rb'
      return ::Bioroebe.complementary_dna_strand(i).reverse
    end
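The two steps (complement, then reverse) can be checked without the gem. In the sketch below String#tr stands in for Bioroebe.complementary_dna_strand, whose implementation is not shown here; the helper name is made up, and uppercase DNA input is assumed.

```ruby
# Hypothetical helper: complement via String#tr, then reverse.
def reverse_complement_sketch(sequence)
  sequence.tr('ATGC', 'TACG').reverse
end
```

reverse_complement_sketch('ATTGCCACAACTGAGACA') gives "TGTCTCAGTTGTGGCAAT".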
#scan(i) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 184
    def scan(i)
      @sequence.scan(i)
    end
#set_raw_sequence(i) ⇒ Object Also known as: assign
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 74
    def set_raw_sequence(i)
      i = i.flatten.compact.first if i.is_a? Array
      @sequence = i
    end
#shuffle ⇒ Object Also known as: randomize
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 234
    def shuffle
      @sequence = @sequence.chars.shuffle.join
    end
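Shuffling rearranges the characters but keeps the composition intact. For reproducible tests, Array#shuffle accepts a random: keyword; the seeded helper below is a hypothetical sketch, since the method in the source takes no seed.

```ruby
# Hypothetical helper: same chars.shuffle.join idea, with an explicit seed.
def shuffle_sketch(sequence, seed)
  sequence.chars.shuffle(random: Random.new(seed)).join
end
```

Two calls with the same seed return the same String, and sorting the characters of the result gives the same letters as sorting the input.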
#size? ⇒ Boolean Also known as: size, length, length?
Return the size of the string/sequence in question.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 314
    def size?
      @sequence.size
    end
#split(i) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 177
    def split(i)
      @sequence.split(i)
    end
#start_with?(i) ⇒ Boolean
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 135
    def start_with?(i)
      to_s.start_with?(i)
    end
#strip ⇒ Object
Similar to the method .strip() on class String.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 305
    def strip
      @sequence.strip
    end
#subseq(start_position, end_position = :ask_the_user_for_an_end_position_number) ⇒ Object Also known as: [], subsequence, start_end
This method will obtain a subsequence of the given sequence object at hand.
Counting starts at the first nucleotide (position 1). The second argument denotes the nucleotide position at which to STOP, inclusive. So (3, 8) means “take nucleotide 3, up to and including nucleotide 8, and then return this result”.
See the following examples to understand this more easily.
Usage examples:
    seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(1, 3) # => "ATG"
    seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(3, 8) # => "GCATGC"
    seq = Bioroebe::RawSequence.new("atgcatgcaaaa"); seq.subseq(3, 8) # => "gcatgc"
    seq = Bioroebe::RawSequence.new("ATGCATGCAAAA"); seq.subseq(3, 833333333333) # => "GCATGCAAAA"
    seq = Bioroebe::RawSequence.new("ATGCATGCAAATCCACAA"); seq.start_end(1, 10) # => "ATGCATGCAA"
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 385
    def subseq(
        start_position,
        end_position = :ask_the_user_for_an_end_position_number
      )
      if end_position == :ask_the_user_for_an_end_position_number
        puts 'Please provide a valid end position (an Integer value).'
      else
        start_position -= 1
        end_position -= start_position
        sequence?[start_position, end_position]
      end
    end
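The conversion from 1-based, inclusive positions to Ruby's zero-based String#[offset, length] can be reproduced on a plain String. The helper below is a hypothetical sketch of the else branch only; it does not handle a missing end position.

```ruby
# Hypothetical helper: 1-based inclusive positions to String#[offset, length].
def subseq_sketch(sequence, start_position, end_position)
  offset = start_position - 1     # position 1 is index 0
  length = end_position - offset  # inclusive end, so no further -1 is needed
  sequence[offset, length]        # a length past the end is simply truncated
end
```

subseq_sketch('ATGCATGCAAAA', 3, 8) gives "GCATGC", and an oversized end position such as subseq_sketch('ATGCATGCAAAA', 3, 833333333333) gives "GCATGCAAAA".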
#to_s ⇒ Object Also known as: sequence?, sequence, string?, seq, seq?, s?, main_string?, main_sequence_as_string?
Query method that returns the sequence stored by this class, as a String.
This method has several aliases, but it cannot be guaranteed that all aliases will continue to work for the remainder of this project’s lifecycle. For example, the method s? as alias for sequence? may be removed one day - but until then, it will remain available.
Still, it is recommended to use the slightly longer method name .sequence? or .to_s; the alias s? exists mostly so that we can be lazy in IRB and elsewhere. So perhaps it will be retained, but there is no guarantee - for your own scripts you should use either .to_s or .sequence?.
If you wish to test the output of this method, try:
    require 'bioroebe'; x = Bioroebe::Seq.new('AGTACACTGGT'); puts x
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 272
    def to_s
      @sequence.to_s
    end
#to_str ⇒ Object
We need this method so that Sequence objects can be chained together in a String-like fashion.
Specifically this allows us to make use of the ‘+’ method call.
Objects in Ruby implement the to_str method so that they can be treated like a String, for all practical purposes.
This can be tested in this way:
    x = Bioroebe::RawSequence.new('ATGGATCGATGC'); y = Bioroebe::RawSequence.new('TTTGATCGATGC'); z = x + y
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 66
    def to_str
      # self # ← Old code since up to May 2020.
      @sequence.to_s # ← This became the new default as of May 2020 again.
    end
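Ruby calls to_str for implicit conversion, for instance when an object appears on the right-hand side of String#+. A minimal sketch with a hypothetical stand-in class (ToStrSketch is not part of Bioroebe):

```ruby
# Hypothetical stand-in: any object with to_str can be added to a String.
class ToStrSketch
  def initialize(sequence)
    @sequence = sequence
  end

  def to_str
    @sequence.to_s
  end
end

def to_str_demo
  'ATG' + ToStrSketch.new('CCC') # String#+ converts its argument via to_str
end
```

to_str_demo gives "ATGCCC". Without to_str, the same expression would raise a TypeError.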
#tr!(a, b) ⇒ Object
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 415
    def tr!(a, b)
      @sequence.tr!(a, b)
    end
#upcase! ⇒ Object Also known as: upcase, up, upper
This method will upcase the given sequence, so “atg” becomes “ATG”.
Note that .upcase() is an alias to .upcase!() - use whichever variant you want to, but keep in mind that the receiver will be modified in both variants.
.upper() was added in September 2021 for (slight) compatibility with Biopython.
    # File 'lib/bioroebe/raw_sequence/raw_sequence.rb', line 356
    def upcase!
      @sequence.upcase!
      return @sequence
    end