Merges multiple sq and possibly character objects into one larger sq object.
[sq || character]
Multiple objects. For exact behavior, see the Details section. The first argument must be of class sq because of R's single-dispatch mechanism. If this is a problem, the recommended alternative is the vec_c method from the vctrs package.
An sq object whose length equals the sum of the lengths of the individual objects passed as parameters. Elements of sq are concatenated just as if they were ordinary lists (see c).
Whenever all passed objects share one of the standard types (that is, dna_bsc, dna_ext, rna_bsc, rna_ext, ami_bsc or ami_ext), the returned object is of the same class, as no changes to the alphabet are needed.
It is possible to mix basic and extended types within one call to c(); however, they must all be of the same type (that is, either dna, rna or ami). In this case, the returned object is of the extended type.
Mixing dna, rna and ami types is prohibited, as the interpretation of letters differs depending on the type.
Whenever all objects are of atp type, the returned object is also of this class and the resulting alphabet is equal to the set union of all input alphabets.
The unt type can be mixed with any other type, resulting in an unt object with an alphabet equal to the set union of all input alphabets. This makes it possible to concatenate dna and ami objects, for instance, by first concatenating one of them with an unt object. However, this is strongly discouraged, as it may result in unwanted concatenation of DNA and amino acid sequences.
Whenever a character vector appears, it does not influence the resulting sq type. Each element is treated as a separate sequence. If any letter in this vector does not appear in the resulting alphabet, it is silently replaced with NA.
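The type rules above can be summarized in a small sketch. The helper `resolve_sq_type()` is hypothetical and not part of the package; it only predicts the result type from the type names of the sq arguments, ignoring character vectors (which do not influence the result type). Combinations this page does not describe, such as atp mixed with a standard type, are reported as not covered rather than guessed.

```r
# Hypothetical helper, not part of the package: predicts the type returned by
# c() from the type names of its sq arguments, following the rules above.
resolve_sq_type <- function(types) {
  standard <- c("dna_bsc", "dna_ext", "rna_bsc", "rna_ext", "ami_bsc", "ami_ext")

  # Any unt argument turns the whole result into unt.
  if ("unt" %in% types) return("unt")

  # Only atp arguments: the result stays atp.
  if (all(types == "atp")) return("atp")

  # Assumption: anything else outside the standard types is left undecided here.
  if (!all(types %in% standard)) {
    stop("combination not covered by the rules on this page")
  }

  # dna, rna and ami must not be mixed with each other.
  families <- unique(substr(types, 1, 3))
  if (length(families) > 1) {
    stop("mixing dna, rna and ami types is prohibited")
  }

  # One family: basic only if every argument is basic, extended otherwise.
  if (all(endsWith(types, "_bsc"))) {
    paste0(families, "_bsc")
  } else {
    paste0(families, "_ext")
  }
}

resolve_sq_type(c("dna_bsc", "dna_bsc"))
resolve_sq_type(c("dna_bsc", "dna_ext", "dna_bsc"))
resolve_sq_type(c("ami_bsc", "unt"))
```

The sketch checks for unt first because, per the rules above, unt absorbs every other type; the family check therefore only runs when all arguments are standard types.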
Due to R's dispatch mechanism, passing a character vector as the first argument returns a class-less list. This behavior is effectively impossible to fix, and attempting to fix it is not recommended, as doing so would involve changing the c primitive. If this capability is needed, vec_c is a better alternative.
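The dispatch limitation can be reproduced in base R without the package. The sketch below uses a toy S3 class; `toy_seq`, `new_toy()` and `c.toy_seq()` are hypothetical names that merely stand in for sq and its `c()` method.

```r
# Toy S3 class standing in for sq; all names here are hypothetical.
new_toy <- function(x) structure(as.list(x), class = "toy_seq")

c.toy_seq <- function(...) {
  # Strip classes, concatenate as plain lists, then restore the class.
  parts <- lapply(list(...), function(x) unclass(as.list(x)))
  structure(do.call(c, parts), class = "toy_seq")
}

x <- new_toy(c("ATG", "CCA"))

# The classed object comes first, so c() dispatches to c.toy_seq().
class(c(x, "TTT"))

# The character vector comes first, so the default c() runs and the
# class attribute is dropped, leaving a plain list.
class(c("TTT", x))
```

Because S3 dispatch for `c()` looks only at the first argument, no method defined on the class can intercept the second call, which is why the class is lost there.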
Functions from utility module: ==.sq(), get_sq_lengths(), is.sq(), sqextract
# Creating objects to work on:
sq_dna_1 <- sq(c("GGACTGCA", "CTAGTA", ""), alphabet = "dna_bsc")
sq_dna_2 <- sq(c("ATGACA", "AC-G", "-CCAT"), alphabet = "dna_bsc")
sq_dna_3 <- sq(character(), alphabet = "dna_bsc")
sq_dna_4 <- sq(c("BNACV", "GDBADHH"), alphabet = "dna_ext")
sq_rna_1 <- sq(c("UAUGCA", "UAGCCG"), alphabet = "rna_bsc")
sq_rna_2 <- sq(c("-AHVRYA", "G-U-HYR"), alphabet = "rna_ext")
sq_rna_3 <- sq("AUHUCHYRBNN--", alphabet = "rna_ext")
sq_ami <- sq("ACHNK-IFK-VYW", alphabet = "ami_bsc")
sq_unt <- sq("AF:gf;PPQ^&XN")
# Concatenating dna_bsc sequences:
c(sq_dna_1, sq_dna_2, sq_dna_3)
#> basic DNA sequences list:
#> [1] GGACTGCA <8>
#> [2] CTAGTA <6>
#> [3] <NULL> <0>
#> [4] ATGACA <6>
#> [5] AC-G <4>
#> [6] -CCAT <5>
# Concatenating rna_ext sequences:
c(sq_rna_2, sq_rna_3)
#> extended RNA sequences list:
#> [1] -AHVRYA <7>
#> [2] G-U-HYR <7>
#> [3] AUHUCHYRBNN-- <13>
# Mixing dna_bsc and dna_ext:
c(sq_dna_1, sq_dna_4, sq_dna_2)
#> extended DNA sequences list:
#> [1] GGACTGCA <8>
#> [2] CTAGTA <6>
#> [3] <NULL> <0>
#> [4] BNACV <5>
#> [5] GDBADHH <7>
#> [6] ATGACA <6>
#> [7] AC-G <4>
#> [8] -CCAT <5>
# Mixing DNA and RNA sequences doesn't work:
if (FALSE) {
c(sq_dna_3, sq_rna_1)
}
# unt sq objects can be mixed with DNA, RNA and amino acids:
c(sq_ami, sq_unt)
#> unt (unspecified type) sequences list:
#> [1] ACHNK-IFK-VYW <13>
#> [2] AF:gf;PPQ^&XN <13>
c(sq_unt, sq_rna_1, sq_rna_2)
#> unt (unspecified type) sequences list:
#> [1] AF:gf;PPQ^&XN <13>
#> [2] UAUGCA <6>
#> [3] UAGCCG <6>
#> [4] -AHVRYA <7>
#> [5] G-U-HYR <7>
c(sq_dna_2, sq_unt, sq_dna_3)
#> unt (unspecified type) sequences list:
#> [1] ATGACA <6>
#> [2] AC-G <4>
#> [3] -CCAT <5>
#> [4] AF:gf;PPQ^&XN <13>
# Character vectors are also acceptable:
c(sq_dna_2, "TGCA-GA")
#> basic DNA sequences list:
#> [1] ATGACA <6>
#> [2] AC-G <4>
#> [3] -CCAT <5>
#> [4] TGCA-GA <7>
c(sq_rna_2, c("UACUGGGACUG", "AUGUBNAABNRYYRAU"), sq_rna_3)
#> extended RNA sequences list:
#> [1] -AHVRYA <7>
#> [2] G-U-HYR <7>
#> [3] UACUGGGACUG <11>
#> [4] AUGUBNAABNRYYRAU <16>
#> [5] AUHUCHYRBNN-- <13>
c(sq_unt, "&#JIA$O02t30,9ec", sq_ami)
#> unt (unspecified type) sequences list:
#> [1] AF:gf;PPQ^&XN <13>
#> [2] &!!IA!!!!!!!!!!! <16>
#> [3] ACHNK-IFK-VYW <13>