Filter lineage_defs for specific lineages, keeping mutations that are present in at least one lineage
Usage
filter_lineages(
  lineage_defs = NULL,
  lineages = c("B.1.526", "B.1.1.7", "B.1.351", "B.1.617.2", "B.1.427", "B.1.429", "P.1"),
  return_df = FALSE,
  path = NULL,
  shared_order = TRUE
)
Arguments
- lineage_defs
The result of astronomize(). If NULL, tries to run astronomize().
- lineages
Vector of lineage names (must be in rownames(lineage_defs)). Defaults to lineages circulating in 2021-2022.
- return_df
Should the function return a data frame? Note that the returned data frame is transposed compared to lineage_defs. Default FALSE.
- path
Passed on to astronomize() if lineage_defs is NULL.
- shared_order
Put shared mutations first? Default TRUE.
Value
A lineage definition matrix with fewer rows and columns than lineage_defs. If return_df is TRUE, a data frame is returned instead, in which the columns represent lineage names and a mutations column is added.
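To make the two return shapes concrete, here is a minimal base R sketch of how a transposed data frame with a mutations column could be built from a lineage definition matrix. The toy matrix and the mutation names m1 to m3 are hypothetical, and filling the mutations column from the row names is an assumption; the package may construct it differently.

```r
# Toy lineage definition matrix (hypothetical values): rows are lineages,
# columns are mutations.
toy_defs <- matrix(
  c(1, 1, 0,
    0, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("B.1.1.7", "B.1.617.2"), c("m1", "m2", "m3"))
)

# Transpose so that lineages become columns, then add a mutations column.
# ASSUMPTION: the mutation names are copied from the row names.
toy_df <- as.data.frame(t(toy_defs))
toy_df$mutations <- rownames(toy_df)

dim(toy_defs)  # lineages x mutations
dim(toy_df)    # mutations x (lineages + 1)
```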
Details
After some lineages are removed, some mutations might no longer be present in any of the remaining lineages. This function removes the mutations that no longer belong to any lineage.
With shared_order = TRUE, the mutations that are present in the highest number of lineages appear first. This is convenient for human inspection, but does not affect estimation.
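The behaviour described above can be sketched in base R on a toy matrix. This is not the package's implementation: the helper name filter_sketch, the toy matrix, and the mutation names m1 to m4 are hypothetical, and the sketch assumes lineage_defs is a 0/1 matrix with lineages as rows and mutations as columns.

```r
# Toy lineage definition matrix (hypothetical values): rows are lineages,
# columns are mutations, 1 means the lineage carries the mutation.
toy_defs <- matrix(
  c(1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("B.1.1.7", "B.1.617.2", "P.1"),
                  c("m1", "m2", "m3", "m4"))
)

# Hypothetical helper illustrating the idea; not the package's code.
filter_sketch <- function(defs, lineages, shared_order = TRUE) {
  # Keep only the requested lineages (rows).
  out <- defs[rownames(defs) %in% lineages, , drop = FALSE]
  # Drop mutations (columns) that no remaining lineage carries.
  out <- out[, colSums(out) > 0, drop = FALSE]
  if (shared_order) {
    # Mutations present in the most lineages come first.
    out <- out[, order(colSums(out), decreasing = TRUE), drop = FALSE]
  }
  out
}

filtered <- filter_sketch(toy_defs, c("B.1.1.7", "B.1.617.2"))
dim(filtered)
colnames(filtered)
```

In this toy case, m4 belongs only to P.1, so it is dropped along with that lineage, and m2 moves to the front because both remaining lineages carry it.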
Examples
# After cloning the constellations repo
lineage_defs <- astronomize(path = "../constellations")
#> Warning: Path does not exist. Using built-in definitions.
dim(lineage_defs)
#> [1] 37 325
lineage_defs <- filter_lineages(lineage_defs, c("B.1.1.7", "B.1.617.2"))
dim(lineage_defs) # rows and columns have changed
#> [1] 2 36