Class: Google::Apis::GenomicsV1::Read
- Inherits:
- Object
- Google::Apis::GenomicsV1::Read
- Includes:
- Core::Hashable, Core::JsonObjectSupport
- Defined in:
- generated/google/apis/genomics_v1/classes.rb,
generated/google/apis/genomics_v1/representations.rb
Overview
A read alignment describes a linear alignment of a string of DNA to a
reference sequence, in addition to metadata about the fragment (the molecule
of DNA sequenced) and the read (the bases which were read by the sequencer). A
read is equivalent to a line in a SAM file. A read belongs to exactly one read
group and exactly one read group set. For more genomics resource definitions,
see Fundamentals of Google Genomics.

### Reverse-stranded reads

Mapped reads (reads having a non-null alignment) can be aligned to either the
forward or the reverse strand of their associated reference. Strandedness of a
mapped read is encoded by alignment.position.reverseStrand.

If we consider the reference to be a forward-stranded coordinate space of
[0, reference.length), with 0 as the left-most position and reference.length
as the right-most position, reads are always aligned left to right. That is,
alignment.position.position always refers to the left-most reference
coordinate, and alignment.cigar describes the alignment of this read to the
reference from left to right. All per-base fields such as alignedSequence and
alignedQuality share this same left-to-right orientation; this is true of
reads which are aligned to either strand. For reverse-stranded reads, this
means that alignedSequence is the reverse complement of the bases that were
originally reported by the sequencing machine.

### Generating a reference-aligned sequence string

When interacting with mapped reads, it's often useful to produce a string
representing the local alignment of the read to the reference. The following
pseudocode demonstrates one way of doing this:

    out = ""
    offset = 0
    for c in read.alignment.cigar
      switch c.operation
      case "ALIGNMENT_MATCH", "SEQUENCE_MATCH", "SEQUENCE_MISMATCH":
        out += read.alignedSequence[offset:offset+c.operationLength]
        offset += c.operationLength
        break
      case "CLIP_SOFT", "INSERT":
        offset += c.operationLength
        break
      case "PAD":
        out += repeat("*", c.operationLength)
        break
      case "DELETE":
        out += repeat("-", c.operationLength)
        break
      case "SKIP":
        out += repeat(" ", c.operationLength)
        break
      case "CLIP_HARD":
        break
    return out

### Converting to SAM's CIGAR string

The following pseudocode generates a SAM CIGAR string from the cigar field.
Note that this is a lossy conversion (cigar.referenceSequence is lost).

    cigarMap = {
      "ALIGNMENT_MATCH": "M",
      "INSERT": "I",
      "DELETE": "D",
      "SKIP": "N",
      "CLIP_SOFT": "S",
      "CLIP_HARD": "H",
      "PAD": "P",
      "SEQUENCE_MATCH": "=",
      "SEQUENCE_MISMATCH": "X",
    }
    cigarStr = ""
    for c in read.alignment.cigar
      cigarStr += c.operationLength + cigarMap[c.operation]
    return cigarStr
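The two pseudocode listings above translate fairly directly into Ruby. The sketch below is one possible rendering and is not part of the generated client. It uses small Struct stand-ins (`CigarUnit`, `Alignment`, `SketchRead`, all hypothetical names) so that it runs without the gem, and it assumes the Ruby client exposes the CIGAR fields as `operation` and `operation_length`.

```ruby
# Stand-ins for the client classes so the sketch runs without the gem.
# The snake_case field names are assumptions about the real accessors.
CigarUnit  = Struct.new(:operation, :operation_length)
Alignment  = Struct.new(:cigar)
SketchRead = Struct.new(:aligned_sequence, :alignment)

# Builds a string showing how the read lines up against the reference.
def reference_aligned_string(read)
  out = String.new
  offset = 0
  read.alignment.cigar.each do |c|
    len = c.operation_length
    case c.operation
    when "ALIGNMENT_MATCH", "SEQUENCE_MATCH", "SEQUENCE_MISMATCH"
      out << read.aligned_sequence[offset, len]
      offset += len
    when "CLIP_SOFT", "INSERT"
      offset += len        # bases in the read that consume no reference
    when "PAD"
      out << "*" * len
    when "DELETE"
      out << "-" * len
    when "SKIP"
      out << " " * len
    when "CLIP_HARD"
      # hard-clipped bases are not stored in aligned_sequence
    end
  end
  out
end

CIGAR_MAP = {
  "ALIGNMENT_MATCH" => "M", "INSERT" => "I", "DELETE" => "D",
  "SKIP" => "N", "CLIP_SOFT" => "S", "CLIP_HARD" => "H",
  "PAD" => "P", "SEQUENCE_MATCH" => "=", "SEQUENCE_MISMATCH" => "X"
}.freeze

# Lossy conversion to a SAM CIGAR string (reference_sequence is dropped).
def sam_cigar(read)
  read.alignment.cigar.map do |c|
    "#{c.operation_length}#{CIGAR_MAP.fetch(c.operation)}"
  end.join
end

read = SketchRead.new(
  "ACGTTAGG",
  Alignment.new([
    CigarUnit.new("CLIP_SOFT", 1),
    CigarUnit.new("ALIGNMENT_MATCH", 3),
    CigarUnit.new("DELETE", 2),
    CigarUnit.new("INSERT", 1),
    CigarUnit.new("ALIGNMENT_MATCH", 3)
  ])
)
puts reference_aligned_string(read)  # CGT--AGG
puts sam_cigar(read)                 # 1S3M2D1I3M
```

`CIGAR_MAP.fetch` raises on an unknown operation instead of silently emitting an incomplete string, which seems the safer default for a conversion that is already lossy.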
Instance Attribute Summary
-
#aligned_quality ⇒ Array<Fixnum>
The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM).
-
#aligned_sequence ⇒ String
The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM).
-
#alignment ⇒ Google::Apis::GenomicsV1::LinearAlignment
A linear alignment can be represented by one CIGAR string.
-
#duplicate_fragment ⇒ Boolean
(also: #duplicate_fragment?)
The fragment is a PCR or optical duplicate (SAM flag 0x400).
-
#failed_vendor_quality_checks ⇒ Boolean
(also: #failed_vendor_quality_checks?)
Whether this read did not pass filters, such as platform or vendor quality controls (SAM flag 0x200).
-
#fragment_length ⇒ Fixnum
The observed length of the fragment, equivalent to TLEN in SAM.
-
#fragment_name ⇒ String
The fragment name.
-
#id ⇒ String
The server-generated read ID, unique across all reads.
-
#info ⇒ Hash<String,Array<Object>>
A map of additional read alignment information.
-
#next_mate_position ⇒ Google::Apis::GenomicsV1::Position
An abstraction for referring to a genomic position, in relation to some already known reference.
-
#number_reads ⇒ Fixnum
The number of reads in the fragment (extension to SAM flag 0x1).
-
#proper_placement ⇒ Boolean
(also: #proper_placement?)
The orientation and the distance between reads from the fragment are consistent with the sequencing protocol (SAM flag 0x2).
-
#read_group_id ⇒ String
The ID of the read group this read belongs to.
-
#read_group_set_id ⇒ String
The ID of the read group set this read belongs to.
-
#read_number ⇒ Fixnum
The read number in sequencing.
-
#secondary_alignment ⇒ Boolean
(also: #secondary_alignment?)
Whether this alignment is secondary.
-
#supplementary_alignment ⇒ Boolean
(also: #supplementary_alignment?)
Whether this alignment is supplementary.
Instance Method Summary
-
#initialize(**args) ⇒ Read
constructor
A new instance of Read.
-
#update!(**args) ⇒ Object
Update properties of this object.
Methods included from Core::JsonObjectSupport
Methods included from Core::Hashable
Constructor Details
#initialize(**args) ⇒ Read
Returns a new instance of Read.
# File 'generated/google/apis/genomics_v1/classes.rb', line 1895
def initialize(**args)
  update!(**args)
end
Instance Attribute Details
#aligned_quality ⇒ Array<Fixnum>
The quality of the read sequence contained in this alignment record
(equivalent to QUAL in SAM). alignedSequence and alignedQuality may be
shorter than the full read sequence and quality. This will occur if the
alignment is part of a chimeric alignment, or if the read was trimmed. When
this occurs, the CIGAR for this read will begin/end with a hard clip operator
that will indicate the length of the excised sequence.
Corresponds to the JSON property alignedQuality
# File 'generated/google/apis/genomics_v1/classes.rb', line 1879
def aligned_quality
  @aligned_quality
end
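As a rough illustration of the hard-clip rule described above, the following sketch reads the leading and trailing CLIP_HARD operations of a CIGAR list to work out how many bases were excised. `ClipOp` is a hypothetical stand-in for the client's CIGAR unit class, and the `operation` / `operation_length` accessor names are assumptions.

```ruby
# Hypothetical stand-in for a CIGAR unit; not the generated client class.
ClipOp = Struct.new(:operation, :operation_length)

# Returns [leading, trailing] hard-clipped base counts for a CIGAR list.
def hard_clipped_lengths(cigar)
  clip = lambda do |op|
    op && op.operation == "CLIP_HARD" ? op.operation_length : 0
  end
  leading  = clip.call(cigar.first)
  # Avoid counting a single hard clip twice when the list has one entry.
  trailing = cigar.size > 1 ? clip.call(cigar.last) : 0
  [leading, trailing]
end

# Length of the full read: stored bases plus the excised (hard-clipped) ones.
def full_read_length(aligned_sequence, cigar)
  aligned_sequence.length + hard_clipped_lengths(cigar).sum
end

cigar = [
  ClipOp.new("CLIP_HARD", 5),
  ClipOp.new("ALIGNMENT_MATCH", 4),
  ClipOp.new("CLIP_HARD", 2)
]
p hard_clipped_lengths(cigar)        # [5, 2]
p full_read_length("ACGT", cigar)    # 11
```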
#aligned_sequence ⇒ String
The bases of the read sequence contained in this alignment record, without
CIGAR operations applied (equivalent to SEQ in SAM). alignedSequence and
alignedQuality may be shorter than the full read sequence and quality. This
will occur if the alignment is part of a chimeric alignment, or if the read
was trimmed. When this occurs, the CIGAR for this read will begin/end with a
hard clip operator that will indicate the length of the excised sequence.
Corresponds to the JSON property alignedSequence
# File 'generated/google/apis/genomics_v1/classes.rb', line 1869
def aligned_sequence
  @aligned_sequence
end
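Because per-base fields are stored left to right on the reference, recovering the bases as the sequencer reported them means undoing the reverse complement for reverse-stranded reads. The helper below is a minimal sketch of that step; `sequencer_orientation` is a hypothetical name, and it only complements A, C, G and T (any other symbol, such as N, is passed through unchanged).

```ruby
# Returns [bases, qualities] in the orientation reported by the sequencer.
# reverse_strand is the value of alignment.position.reverseStrand.
def sequencer_orientation(aligned_sequence, aligned_quality, reverse_strand)
  return [aligned_sequence, aligned_quality] unless reverse_strand

  # Reverse the bases, then swap each base for its complement.
  bases = aligned_sequence.reverse.tr("ACGTacgt", "TGCAtgca")
  # Qualities are per-base, so they are reversed but not otherwise changed.
  [bases, aligned_quality.reverse]
end

bases, quals = sequencer_orientation("AACG", [30, 31, 32, 33], true)
puts bases        # CGTT
p quals           # [33, 32, 31, 30]
```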
#alignment ⇒ Google::Apis::GenomicsV1::LinearAlignment
A linear alignment can be represented by one CIGAR string. Describes the
mapped position and local alignment of the read to the reference.
Corresponds to the JSON property alignment
# File 'generated/google/apis/genomics_v1/classes.rb', line 1833
def alignment
  @alignment
end
#duplicate_fragment ⇒ Boolean Also known as: duplicate_fragment?
The fragment is a PCR or optical duplicate (SAM flag 0x400).
Corresponds to the JSON property duplicateFragment
# File 'generated/google/apis/genomics_v1/classes.rb', line 1803
def duplicate_fragment
  @duplicate_fragment
end
#failed_vendor_quality_checks ⇒ Boolean Also known as: failed_vendor_quality_checks?
Whether this read did not pass filters, such as platform or vendor quality
controls (SAM flag 0x200).
Corresponds to the JSON property failedVendorQualityChecks
# File 'generated/google/apis/genomics_v1/classes.rb', line 1826
def failed_vendor_quality_checks
  @failed_vendor_quality_checks
end
#fragment_length ⇒ Fixnum
The observed length of the fragment, equivalent to TLEN in SAM.
Corresponds to the JSON property fragmentLength
# File 'generated/google/apis/genomics_v1/classes.rb', line 1809
def fragment_length
  @fragment_length
end
#fragment_name ⇒ String
The fragment name. Equivalent to QNAME (query template name) in SAM.
Corresponds to the JSON property fragmentName
# File 'generated/google/apis/genomics_v1/classes.rb', line 1791
def fragment_name
  @fragment_name
end
#id ⇒ String
The server-generated read ID, unique across all reads. This is different from
the fragmentName.
Corresponds to the JSON property id
# File 'generated/google/apis/genomics_v1/classes.rb', line 1773
def id
  @id
end
#info ⇒ Hash<String,Array<Object>>
A map of additional read alignment information. This must be of the form
map<string, string[]> (string key mapping to a list of string values).
Corresponds to the JSON property info
# File 'generated/google/apis/genomics_v1/classes.rb', line 1893
def info
  @info
end
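The description asks for string keys mapped to lists of string values, while the Ruby signature only promises `Hash<String,Array<Object>>`. One way to bridge the two before assigning to `info` is to coerce the hash first. `normalize_info` below is a hypothetical helper, not something the client provides.

```ruby
# Coerces a loosely typed hash into string keys mapped to arrays of strings.
def normalize_info(raw)
  raw.each_with_object({}) do |(key, value), out|
    # Array() wraps scalars and leaves arrays alone; nil becomes [].
    out[key.to_s] = Array(value).map(&:to_s)
  end
end

p normalize_info({ NM: 2, "XA" => ["chr1", "chr2"] })
# {"NM"=>["2"], "XA"=>["chr1", "chr2"]}
```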
#next_mate_position ⇒ Google::Apis::GenomicsV1::Position
An abstraction for referring to a genomic position, in relation to some
already known reference. For now, represents a genomic position as a reference
name, a base number on that reference (0-based), and a determination of
forward or reverse strand.
Corresponds to the JSON property nextMatePosition
# File 'generated/google/apis/genomics_v1/classes.rb', line 1887
def next_mate_position
  @next_mate_position
end
#number_reads ⇒ Fixnum
The number of reads in the fragment (extension to SAM flag 0x1).
Corresponds to the JSON property numberReads
# File 'generated/google/apis/genomics_v1/classes.rb', line 1820
def number_reads
  @number_reads
end
#proper_placement ⇒ Boolean Also known as: proper_placement?
The orientation and the distance between reads from the fragment are
consistent with the sequencing protocol (SAM flag 0x2).
Corresponds to the JSON property properPlacement
# File 'generated/google/apis/genomics_v1/classes.rb', line 1797
def proper_placement
  @proper_placement
end
#read_group_id ⇒ String
The ID of the read group this read belongs to. A read belongs to exactly one
read group. This is a server-generated ID which is distinct from SAM's RG tag
(for that value, see ReadGroup.name).
Corresponds to the JSON property readGroupId
# File 'generated/google/apis/genomics_v1/classes.rb', line 1780
def read_group_id
  @read_group_id
end
#read_group_set_id ⇒ String
The ID of the read group set this read belongs to. A read belongs to exactly
one read group set.
Corresponds to the JSON property readGroupSetId
# File 'generated/google/apis/genomics_v1/classes.rb', line 1786
def read_group_set_id
  @read_group_set_id
end
#read_number ⇒ Fixnum
The read number in sequencing. 0-based and less than numberReads. This field
replaces SAM flag 0x40 and 0x80.
Corresponds to the JSON property readNumber
# File 'generated/google/apis/genomics_v1/classes.rb', line 1815
def read_number
  @read_number
end
#secondary_alignment ⇒ Boolean Also known as: secondary_alignment?
Whether this alignment is secondary. Equivalent to SAM flag 0x100. A secondary
alignment represents an alternative to the primary alignment for this read.
Aligners may return secondary alignments if a read can map ambiguously to
multiple coordinates in the genome. By convention, each read has one and only
one alignment where both secondaryAlignment and supplementaryAlignment are
false.
Corresponds to the JSON property secondaryAlignment
# File 'generated/google/apis/genomics_v1/classes.rb', line 1843
def secondary_alignment
  @secondary_alignment
end
#supplementary_alignment ⇒ Boolean Also known as: supplementary_alignment?
Whether this alignment is supplementary. Equivalent to SAM flag 0x800.
Supplementary alignments are used in the representation of a chimeric
alignment. In a chimeric alignment, a read is split into multiple linear
alignments that map to different reference contigs. The first linear alignment
in the read will be designated as the representative alignment; the remaining
linear alignments will be designated as supplementary alignments. These
alignments may have different mapping quality scores. In each linear alignment
in a chimeric alignment, the read will be hard clipped. The alignedSequence
and alignedQuality fields in the alignment record will only represent the
bases for its respective linear alignment.
Corresponds to the JSON property supplementaryAlignment
# File 'generated/google/apis/genomics_v1/classes.rb', line 1858
def supplementary_alignment
  @supplementary_alignment
end
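Several of the attributes above are documented in terms of SAM flag bits, so it can help to see them folded back into a single FLAG value. The sketch below is an assumption-laden illustration rather than the service's own conversion. `FlagRead`, `FlagAlignment` and `FlagPosition` are hypothetical stand-ins; the 0x4 (unmapped) and 0x10 (reverse strand) bits come from the SAM specification rather than this page; the mate-related bits 0x8 and 0x20 are left out; and 0x40 / 0x80 are derived from `read_number` on the assumption that they mark the first and last read of a multi-read fragment.

```ruby
# Hypothetical stand-ins so the sketch runs without the client gem.
FlagPosition  = Struct.new(:reverse_strand)
FlagAlignment = Struct.new(:position)
FlagRead = Struct.new(
  :alignment, :number_reads, :read_number, :proper_placement,
  :secondary_alignment, :failed_vendor_quality_checks,
  :duplicate_fragment, :supplementary_alignment,
  keyword_init: true
)

# Folds the boolean and numeric read fields into a SAM-style FLAG integer.
def sam_flag(read)
  paired = read.number_reads.to_i > 1
  flag = 0
  flag |= 0x1   if paired
  flag |= 0x2   if paired && read.proper_placement
  flag |= 0x4   if read.alignment.nil?
  flag |= 0x10  if read.alignment&.position&.reverse_strand
  flag |= 0x40  if paired && read.read_number == 0
  flag |= 0x80  if paired && read.read_number == read.number_reads - 1
  flag |= 0x100 if read.secondary_alignment
  flag |= 0x200 if read.failed_vendor_quality_checks
  flag |= 0x400 if read.duplicate_fragment
  flag |= 0x800 if read.supplementary_alignment
  flag
end

mate2 = FlagRead.new(
  alignment: FlagAlignment.new(FlagPosition.new(true)),
  number_reads: 2, read_number: 1, proper_placement: true
)
puts sam_flag(mate2)   # 147
```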
Instance Method Details
#update!(**args) ⇒ Object
Update properties of this object.
# File 'generated/google/apis/genomics_v1/classes.rb', line 1900
def update!(**args)
  @id = args[:id] if args.key?(:id)
  @read_group_id = args[:read_group_id] if args.key?(:read_group_id)
  @read_group_set_id = args[:read_group_set_id] if args.key?(:read_group_set_id)
  @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
  @proper_placement = args[:proper_placement] if args.key?(:proper_placement)
  @duplicate_fragment = args[:duplicate_fragment] if args.key?(:duplicate_fragment)
  @fragment_length = args[:fragment_length] if args.key?(:fragment_length)
  @read_number = args[:read_number] if args.key?(:read_number)
  @number_reads = args[:number_reads] if args.key?(:number_reads)
  @failed_vendor_quality_checks = args[:failed_vendor_quality_checks] if args.key?(:failed_vendor_quality_checks)
  @alignment = args[:alignment] if args.key?(:alignment)
  @secondary_alignment = args[:secondary_alignment] if args.key?(:secondary_alignment)
  @supplementary_alignment = args[:supplementary_alignment] if args.key?(:supplementary_alignment)
  @aligned_sequence = args[:aligned_sequence] if args.key?(:aligned_sequence)
  @aligned_quality = args[:aligned_quality] if args.key?(:aligned_quality)
  @next_mate_position = args[:next_mate_position] if args.key?(:next_mate_position)
  @info = args[:info] if args.key?(:info)
end
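The `args.key?` guard in the listing above means that a key which is omitted leaves the attribute alone, while a key passed explicitly as `nil` clears it. The sketch below reproduces that pattern on a two-field stand-in class (`SketchRecord`, a hypothetical name) so the behaviour can be tried without the gem.

```ruby
# Minimal stand-in that copies the update! pattern for two attributes.
class SketchRecord
  attr_accessor :id, :fragment_name

  def initialize(**args)
    update!(**args)
  end

  # Assigns only the attributes whose keys were actually passed.
  def update!(**args)
    @id = args[:id] if args.key?(:id)
    @fragment_name = args[:fragment_name] if args.key?(:fragment_name)
  end
end

rec = SketchRecord.new(id: "r1", fragment_name: "frag-7")
rec.update!(fragment_name: "frag-8")   # id is left untouched
p rec.id                               # "r1"
rec.update!(id: nil)                   # an explicit nil overwrites
p rec.id                               # nil
p rec.fragment_name                    # "frag-8"
```

Keys that match no attribute are ignored by this pattern, so a misspelled keyword fails silently rather than raising.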