Extracts a defined range of elements from all sequences.
[sq]
An object this function is applied to.
[integer]
Indices to extract from each sequence. The function follows the normal R
conventions for indexing vectors, including negative indices.
Further arguments to be passed from or to other methods.
[character(1)]
A string used to interpret and display NA values in the context of the
sq class. The default value is "!".
["silent" || "message" || "warning" || "error"]
Determines how warning messages are handled. The default value is "warning".
An sq object of the same type as the input sq, where each element is a
subsequence created by indexing the corresponding sequence of the input sq
object with the input indices.
The bite function allows the user to access specific elements from multiple
sequences at once.
By passing positive indices, the user can choose which elements they want
from each sequence. If an index exceeds the length of a sequence, an NA
value is inserted into the result at that position and a warning is issued.
The user can control how this case is handled with the on_warning parameter.
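The out-of-range behavior described above resembles ordinary base R vector indexing. The sketch below is only an analogue on a plain character vector, not how tidysq implements bite internally. The guarded call at the end assumes the tidysq package is installed and uses only sq(), bite(), and on_warning as documented on this page; short_seq, picked, and sq_short are illustrative names.

```r
# Base R analogue: asking for positions past the end of a vector yields NA.
short_seq <- strsplit("ATG", "")[[1]]   # c("A", "T", "G")
picked <- short_seq[1:5]                # positions 4 and 5 do not exist
print(is.na(picked))                    # the last two entries are NA

# Assumption: tidysq is installed. The same request on an sq object,
# with the warning silenced through the on_warning parameter.
if (requireNamespace("tidysq", quietly = TRUE)) {
  sq_short <- tidysq::sq(c("ATG", "ATGCAGGA"), alphabet = "dna_bsc")
  print(tidysq::bite(sq_short, 1:5, on_warning = "silent"))
}
```

Choosing "silent" only hides the message; the missing positions are still filled with NA.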
Negative indices are supported as well. They are interpreted as "select all
elements except those at the positions given by these negative indices".
For example, the vector c(-1, -3, -5) bites all sequence elements except
the first, the third and the fifth. If a negative index points past the end
of a sequence, it is ignored for that sequence, because there is no element
at that position to exclude.
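The exclusion semantics can be checked on a plain character vector in base R. This is an analogue of what is described above, not tidysq code; seq_letters, kept, short_letters, and kept_short are illustrative names.

```r
# Base R analogue of negative indexing on the letters of one sequence.
seq_letters <- strsplit("ATGCAGGA", "")[[1]]

# Drop the first, third and fifth letters.
kept <- paste(seq_letters[c(-1, -3, -5)], collapse = "")
print(kept)
#> [1] "TCGGA"

# A negative index past the end of the vector is ignored.
short_letters <- strsplit("ATG", "")[[1]]
kept_short <- paste(short_letters[c(-1, -5)], collapse = "")
print(kept_short)
#> [1] "TG"
```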
As per the normal R convention, positive and negative indices cannot be
mixed, because such a selection has no sensible interpretation.
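The base R convention referred to here can be demonstrated directly. The sketch below uses a plain character vector and catches the resulting error; how bite itself reports mixed indices is not shown on this page, so that part is left out. The names x and outcome are illustrative.

```r
# Base R refuses to mix positive and negative subscripts in one call.
x <- c("A", "T", "G", "C")
outcome <- tryCatch(
  {
    x[c(1, -2)]      # mixed signs: this raises an error
    "no error"
  },
  error = function(e) "error"
)
print(outcome)
#> [1] "error"
```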
# Creating objects to work on:
sq_dna <- sq(c("ATGCAGGA", "GACCGNBAACGAN", "TGACGAGCTTA"),
alphabet = "dna_bsc")
sq_ami <- sq(c("MIAANYTWIL","TIAALGNIIYRAIE", "NYERTGHLI", "MAYXXXIALN"),
alphabet = "ami_ext")
sq_unt <- sq(c("ATGCAGGA?", "TGACGAGCTTA", "", "TIAALGNIIYRAIE"))
# Extracting first five letters:
bite(sq_dna, 1:5)
#> basic DNA sequences list:
#> [1] ATGCA <5>
#> [2] GACCG <5>
#> [3] TGACG <5>
# If a sequence is shorter than 5, then NA is introduced:
bite(sq_unt, 1:5)
#> Warning: some sequences are subsetted with index bigger than length - NA introduced
#> unt (unspecified type) sequences list:
#> [1] ATGCA <5>
#> [2] TGACG <5>
#> [3] !!!!! <5>
#> [4] TIAAL <5>
# Selecting the fourth, the seventh and then the fourth letter again:
bite(sq_ami, c(4, 7, 4))
#> extended amino acid sequences list:
#> [1] ATA <3>
#> [2] ANA <3>
#> [3] RHR <3>
#> [4] XIX <3>
# Selecting all letters except first four:
bite(sq_dna, -1:-4)
#> basic DNA sequences list:
#> [1] AGGA <4>
#> [2] G!!AACGA! <9>
#> [3] GAGCTTA <7>