Generates an sq object containing a specified number of random sequences of a given length and alphabet.

random_sq(n, len, alphabet, sd = NULL, use_gap = FALSE)

Arguments

n

[integer(1)]
Number of sequences to generate; must be non-negative.

len

[integer(1)]
Length of each sequence if sd is not specified; mean length of the sequences if sd is specified. Must be non-negative.

alphabet

[character]
If the provided value is a single string, it is interpreted as an sq type (see details). If the provided value has length greater than one, it is treated as an atypical alphabet for the sq object, and the sq type will be atp.

sd

[integer(1)]
If specified, gives the standard deviation of the lengths of the generated sequences; must be non-negative.

use_gap

[logical(1)]
If TRUE, sequences will be generated with random gaps inside (commonly denoted as "-").

Value

An object of class sq with type as specified.

Details

The letter '*' is not used when generating ami sequences. If the sd parameter is passed, any negative sequence lengths that are generated are replaced with 0.
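
The replacement of negative lengths with 0 can be illustrated in base R. This is only a minimal sketch, not the package's implementation: draw_lengths() is a hypothetical helper, and it assumes lengths are drawn from a normal distribution and rounded, which the documentation above does not confirm.

```r
# Hypothetical helper, not part of the package. ASSUMPTION: lengths are
# drawn from a normal distribution and rounded to whole numbers.
draw_lengths <- function(n, len, sd = NULL) {
  if (is.null(sd)) {
    # Without sd, every sequence has exactly the requested length
    return(rep(as.integer(len), n))
  }
  raw <- round(rnorm(n, mean = len, sd = sd))
  # Negative draws are replaced with 0, as described in Details
  as.integer(pmax(raw, 0))
}

set.seed(16)
lens <- draw_lengths(1000, 2, sd = 3)
all(lens >= 0)
```

With a small len and a comparatively large sd, as in the call above, a noticeable share of the lengths ends up at 0, so the generated sq object may contain empty sequences.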

See also

Functions from input module: import_sq(), read_fasta(), sq()

Examples

# Setting seed for reproducibility
set.seed(16)

# Generating random sequences
random_sq(10, 10, "ami_bsc")
#> basic amino acid sequences list:
#>  [1] GSCSLRWKLV                                                             <10>
#>  [2] QFVGDSLHHW                                                             <10>
#>  [3] NKECSGNNID                                                             <10>
#>  [4] YLHLGCRLLQ                                                             <10>
#>  [5] YAAWCIMRTK                                                             <10>
#>  [6] MELQLGKIFK                                                             <10>
#>  [7] SNELCVTCVW                                                             <10>
#>  [8] FCHDEWFQAT                                                             <10>
#>  [9] WFNMFMQVRT                                                             <10>
#> [10] FKEGHSHCCN                                                             <10>
random_sq(25, 18, "rna_bsc", sd = 6)
#> basic RNA sequences list:
#>  [1] GGUGUCGCCGCUCGCCGCG                                                    <19>
#>  [2] UGCCUGCCUUCUUGCUGCCCCUU                                                <23>
#>  [3] CCCCUGCUCUCUCGCCGUUGUGU                                                <23>
#>  [4] CCCCGCCGUUGUCUUGUUUUCCC                                                <23>
#>  [5] CUGGGCU                                                                 <7>
#>  [6] GCUUUGCGCGCCCGGGCGUGUG                                                 <22>
#>  [7] UGCUCUUUUGGUUUGUUGCCUGCUU                                              <25>
#>  [8] GCGUCGCGUGU                                                            <11>
#>  [9] GGGUCUUUCGUGUCGUCCUGUUCCGGGCGUGCCGUGC                                  <37>
#> [10] CGCCGUGGGGGUCC                                                         <14>
#> printed 10 out of 25
random_sq(50, 8, "dna_ext", sd = 3)
#> extended DNA sequences list:
#>  [1] RKYCYAM                                                                 <7>
#>  [2] AKGDD                                                                   <5>
#>  [3] RACBTTGT                                                                <8>
#>  [4] YBYSTCCRCR                                                             <10>
#>  [5] DBYNYYB                                                                 <7>
#>  [6] BMAYCWVGVTC                                                            <11>
#>  [7] KYBKTDGNWVBBD                                                          <13>
#>  [8] SK                                                                      <2>
#>  [9] NKDBN                                                                   <5>
#> [10] MWRAG                                                                   <5>
#> printed 10 out of 50
random_sq(6, 100, "ami_bsc", use_gap = TRUE)
#> basic amino acid sequences list:
#> [1] EEEGYFYRFH-RCETQCQFTFYREHDYGSRC-GSGTIILSSEIVGQYYMWHDNNMSMQIGQQRILFN... <100>
#> [2] -W-MKDWICTIEFQGWNINNHTQIWETEIF-RRHYSFSLLEMQRLEILQWNWMIQMKTW-NQVTCAN... <100>
#> [3] VTVVHMWFH-VD-VD-FWNQH-MG-NFAVHRGHH-SSAIRQVDQENFFLSMQHNCQLESGDDKFC-Q... <100>
#> [4] GGQSWQSECQFDNAKIQLCVIKRSCWG-EHEADEGWCHVLASDRHYTGEEHMQMWCAWF-TAEFRNM... <100>
#> [5] YIQHAEMGDHELWES-C-SEMGIMIWKKAC-G-WFHAVKFRCT-IRNVF---MCESSNAEFLHMVMW... <100>
#> [6] VV-CQQAHSSHVHRELFYSQWQVYYYICMERQCLYHWGVTQNCHMKRQGSHDQW-EMLRARKSAHGE... <100>

# Passing whole alphabet instead of type
random_sq(4, 12, c("Pro", "Gly", "Ala", "Met", "Cys"))
#> atp (atypical alphabet) sequences list:
#> [1] Cys Met Met Ala Cys Ala Gly Met Ala Ala Ala Met                         <12>
#> [2] Gly Gly Met Gly Cys Met Cys Met Met Cys Met Gly                         <12>
#> [3] Ala Met Cys Ala Gly Gly Cys Ala Met Gly Ala Ala                         <12>
#> [4] Gly Met Met Ala Gly Gly Cys Gly Ala Ala Ala Ala                         <12>

# Generating empty sequences (rarely useful, but allowed)
random_sq(8, 0, "rna_ext")
#> extended RNA sequences list:
#> [1] <NULL>                                                                   <0>
#> [2] <NULL>                                                                   <0>
#> [3] <NULL>                                                                   <0>
#> [4] <NULL>                                                                   <0>
#> [5] <NULL>                                                                   <0>
#> [6] <NULL>                                                                   <0>
#> [7] <NULL>                                                                   <0>
#> [8] <NULL>                                                                   <0>