Filter lineage_defs for specific lineages, keeping mutations that are present in at least one lineage
Usage
filter_lineages(
lineage_defs = NULL,
lineages = c("B.1.526", "B.1.1.7", "B.1.351", "B.1.617.2", "B.1.427", "B.1.429", "P.1"),
return_df = FALSE,
path = NULL,
shared_order = TRUE
)
Arguments
- lineage_defs
The result of astronomize(). If NULL, tries to run astronomize().
- lineages
Vector of lineage names (must be in rownames(lineage_defs)). Defaults to lineages circulating in 2021-2022.
- return_df
Should the function return a data frame? Note that the returned data frame is transposed compared to lineage_defs. Default FALSE.
- path
Passed on to astronomize() if lineage_defs is NULL.
- shared_order
Put shared mutations first? Default TRUE.
Value
A lineage definition matrix with fewer rows and columns than lineage_defs. If return_df is TRUE, a data frame is returned instead, in which the columns represent lineage names and a mutations column is added.
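To make the transposed layout concrete, here is a minimal base R sketch of what a data frame of that shape could look like. It does not call filter_lineages(); the toy matrix and the names lin_A, lin_B, and mut_1 to mut_3 are made up for illustration, and the exact column order and row names of the real return value are assumptions.

```r
# Toy lineage definition matrix: rows are lineages, columns are mutations.
# All names are hypothetical.
toy_defs <- matrix(
  c(1, 1, 0,
    1, 0, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("lin_A", "lin_B"),
                  c("mut_1", "mut_2", "mut_3"))
)

# Transpose so that lineages become columns, then store the mutation
# names in their own column (assumed layout, not the package's code).
toy_df <- as.data.frame(t(toy_defs))
toy_df$mutations <- rownames(toy_df)
rownames(toy_df) <- NULL

print(toy_df)
```

In this layout each row is one mutation, which is the reverse of the matrix form, where each row is one lineage.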
Details
After some lineages are removed, some mutations might no longer be present in any of the remaining lineages. This function removes mutations that no longer belong to any lineage.
shared_order = TRUE places the mutations that are present in the highest number of lineages first. This is convenient for human inspection, but does not affect estimation.
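The behaviour described above can be sketched in base R on a toy matrix. This is an illustration of the idea only, not the package's implementation; the matrix and the names lin_A to lin_C and mut_1 to mut_4 are hypothetical.

```r
# Toy lineage definition matrix: rows are lineages, columns are mutations.
# All names are made up for illustration.
toy_defs <- matrix(
  c(1, 1, 0, 0,
    1, 0, 1, 0,
    0, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("lin_A", "lin_B", "lin_C"),
                  c("mut_1", "mut_2", "mut_3", "mut_4"))
)

keep <- c("lin_A", "lin_B")

# Keep only the requested lineages (rows).
sub_defs <- toy_defs[keep, , drop = FALSE]

# Drop mutations (columns) that none of the remaining lineages carry.
# Here mut_4 belonged only to lin_C, so it is removed.
sub_defs <- sub_defs[, colSums(sub_defs) > 0, drop = FALSE]

# Put mutations shared by the most lineages first (the shared_order idea).
n_lineages <- colSums(sub_defs > 0)
sub_defs <- sub_defs[, order(n_lineages, decreasing = TRUE), drop = FALSE]

print(sub_defs)
```

Both the row count and the column count shrink, and mut_1, the only mutation shared by both remaining lineages, ends up in the first column.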
Examples
# After cloning the constellations repo
lineage_defs <- astronomize(path = "../constellations")
#> Warning: Path does not exist. Using built-in definitions.
dim(lineage_defs)
#> [1] 37 325
lineage_defs <- filter_lineages(lineage_defs, c("B.1.1.7", "B.1.617.2"))
dim(lineage_defs) # rows and columns have changed
#> [1] 2 36