Custom epitopes
This mode is particularly useful in the context of B cell epitopes, as these can be examined using only a range on the protein, regardless of the HLA restriction on the targeted population. The user is provided with a panel, shown in the figure, for defining a candidate epitope by providing its name, a specific protein of the virus, and a position range (possibly discontinuous) on the protein. The epitope may be added to the list as is; in this case, the statistics will be computed over the full sequence population selected with the Metadata search.
Optionally, the user may select an additional condition on one amino acid change, so that the system computes statistics only over the fraction of the selected population that carries that amino acid change. The panel allows the selection of a specific protein, a range of coordinates, a type of variation (insertion, deletion, or substitution), an original amino acid residue, and the corresponding alternative residue. The filter selection may be:
- approved (ADD),
- cleared in order to choose a different one (CLEAR), or
- cleared together with the entire amino acid-related condition (CLEAR & CLOSE).
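The inputs collected by the two panels can be pictured as a small data structure. The following is only a sketch: the class names, field names, and the 1-based inclusive coordinate convention are assumptions made for illustration, not the internal model used by EpiViruSurf, and the example values are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AminoAcidCondition:
    """Optional filter on one amino acid change (hypothetical field names)."""
    protein: str
    start: int
    end: int
    change_type: str  # "insertion", "deletion", or "substitution"
    original: Optional[str] = None
    alternative: Optional[str] = None


@dataclass
class CustomEpitope:
    """User-defined epitope; ranges are assumed 1-based and inclusive."""
    name: str
    protein: str
    ranges: List[Tuple[int, int]]  # more than one pair = discontinuous epitope
    condition: Optional[AminoAcidCondition] = None

    def covers(self, position: int) -> bool:
        """Return True if the position falls in any of the epitope ranges."""
        return any(start <= position <= end for start, end in self.ranges)


# Illustrative values only.
epitope = CustomEpitope(
    name="my_epitope",
    protein="Spike",
    ranges=[(438, 446), (498, 506)],
    condition=AminoAcidCondition("Spike", 614, 614, "substitution", "D", "G"),
)
```

Leaving `condition` as `None` corresponds to adding the epitope as is, so that statistics refer to the full selected population.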
The choice of amino acid filters is supported by a practical add-on triggered by the "Analyze Substitutions" button, which allows the inspection of the characteristics of a specific replacement of the original amino acid with the alternative one. Each involved (source or target) residue is characterized by a series of structural categorical properties (such as polarity, charge, and flexibility) and of numerical properties (e.g., molecular weight, hydrophobicity); the pair of residues involved in the change is associated with a measure of its impact (Grantham distance~\cite{grantham1974amino}); a threshold on this impact classifies a change as either radical or conservative.
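The thresholding step can be sketched as follows. This is a minimal illustration: the function name is hypothetical, the cut-off value of 100 is an assumption (the threshold actually used by the tool is not stated here), and the table is only a small excerpt of commonly quoted Grantham distances that should be checked against the original matrix before any real use.

```python
# Small excerpt of Grantham distances (to be verified against the original matrix).
# Keys are unordered residue pairs, since the distance is symmetric.
GRANTHAM_EXCERPT = {
    frozenset("LI"): 5,
    frozenset("LM"): 15,
    frozenset("KR"): 26,
    frozenset("DE"): 45,
    frozenset("CW"): 215,
}

# Hypothetical cut-off; not taken from EpiViruSurf.
RADICAL_THRESHOLD = 100


def classify_substitution(original, alternative, threshold=RADICAL_THRESHOLD):
    """Return (distance, label) for a substitution found in the excerpt table."""
    if original == alternative:
        raise ValueError("original and alternative residues are identical")
    distance = GRANTHAM_EXCERPT[frozenset((original, alternative))]
    label = "radical" if distance > threshold else "conservative"
    return distance, label


print(classify_substitution("L", "I"))
print(classify_substitution("W", "C"))
```

With this excerpt and cut-off, the Leu/Ile change is labeled conservative and the Trp/Cys change radical; a different threshold would move the boundary.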
After addition, the new epitope is inserted in a list of user-defined epitopes. Each epitope is presented with summary information: its name, creation and refresh dates, protein and position range, virus/host taxon name, number of mutated sequences, and number of variants. The epitopes currently in the list can be downloaded as a JSON file, which can later be reloaded to restore the saved state of an interaction with EpiViruSurf; in this way, users may organize and manage the information collected about user-defined epitopes across many EpiViruSurf sessions.
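The download and reload cycle amounts to a JSON round trip. The sketch below assumes a hypothetical layout (a list of objects with made-up field names and values); the actual schema of the files exported by EpiViruSurf may differ.

```python
import json


def dump_epitopes(epitopes):
    """Serialize a list of epitope records to a JSON string."""
    return json.dumps(epitopes, indent=2)


def load_epitopes(text):
    """Parse a JSON string back into a list of epitope records."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of epitopes")
    return data


# Hypothetical record; field names and values are illustrative only.
saved = dump_epitopes([
    {
        "name": "my_epitope",
        "protein": "Spike",
        "ranges": [[438, 446], [498, 506]],
        "created": "2021-01-15",
        "refreshed": "2021-02-01",
    }
])
restored = load_epitopes(saved)
```

After reloading a file saved in an earlier session, the counters may be stale, which is why a refresh step is useful once the underlying sequence collection has changed.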
Out of the current epitope list, users may:
- read more information on the epitope (MORE INFO);
- refresh its counters (REFRESH) - this option is typically used after uploading an external JSON file as discussed above;
- reload all the values (originally used to create that epitope) into the drop-down menus of the sequence and epitope search panels (RELOAD) - this option facilitates the creation of a new epitope with different coordinates or for testing its conservancy on a different underlying population;
- delete the element from the list (DELETE).
The result table stores all the relevant information on the defined epitopes, connecting them with statistics on the sequences mutated in each epitope's range. The table can be downloaded as a CSV file for subsequent data analysis. The last five columns of the table describe:
- NUM SEQ POPULATION (1): the number of sequences available in the population where the epitope has been tested in EpiViruSurf (i.e., matching the filters in the Metadata and Amino Acid Condition columns);
- NUM MUT SEQ (2): the number of sequences in the selected population that have at least one amino acid change located in the epitope position range;
- TOT MUT (3): the number of total amino acid changes exhibited by the full population of sequences (note that any insertion counts as one);
- MUTATED FREQ: the ratio of total variants (3) over the number of mutated sequences (2);
- MUTATED SEQ RATIO: the ratio of mutated sequences (2) over the total of the selected population (1).
When an epitope has also been defined with an amino acid condition, counters (2) and (3) are computed by considering only the fraction of the population that exhibits the selected amino acid condition.
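The two derived columns follow directly from the three counters. The function below is a minimal sketch with hypothetical names; returning 0.0 when a denominator is zero is an assumption, since the behavior of the tool in that case is not described here.

```python
def epitope_ratios(num_seq_population, num_mut_seq, tot_mut):
    """Compute MUTATED FREQ and MUTATED SEQ RATIO from counters (1), (2), (3)."""
    # MUTATED FREQ: total changes (3) per mutated sequence (2).
    mutated_freq = tot_mut / num_mut_seq if num_mut_seq else 0.0
    # MUTATED SEQ RATIO: mutated sequences (2) over selected population (1).
    mutated_seq_ratio = (
        num_mut_seq / num_seq_population if num_seq_population else 0.0
    )
    return mutated_freq, mutated_seq_ratio


# Illustrative numbers: 1000 sequences, 50 of them mutated, 60 changes in total.
freq, ratio = epitope_ratios(1000, 50, 60)
print(freq, ratio)
```

In this made-up example each mutated sequence carries 1.2 changes on average, and 5% of the selected population is mutated in the epitope range.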
Clicking on the NUM MUT SEQ number shows the list of mutated sequences, with their metadata, in a table. From here, EpiViruSurf users may invoke VirusViz, which opens on a variant distribution that considers all the mutated sequences and highlights the chosen epitope.
Clicking on the TOT MUT number opens a new panel called "Epitope mutation statistics", where the number of mutated sequences can be observed in a custom breakdown, grouped by several attributes concerning location, collection time, and phylogenetic classification methods. The generated table has one row per specific amino acid change and reports the number of sequences exhibiting that change in each resulting group.
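The breakdown table can be pictured as a group-by count. The sketch below assumes a hypothetical record layout (each sequence is a dictionary with metadata fields and a `changes` list); it illustrates the grouping logic and is not the query run by EpiViruSurf.

```python
from collections import defaultdict


def mutation_breakdown(sequences, group_by):
    """For each amino acid change, count the sequences carrying it per group."""
    table = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        group = seq[group_by]
        # A set ensures a sequence is counted once per distinct change.
        for change in set(seq["changes"]):
            table[change][group] += 1
    return {change: dict(groups) for change, groups in table.items()}


# Illustrative records; field names and values are made up.
sequences = [
    {"country": "Italy", "lineage": "B.1", "changes": ["N439K"]},
    {"country": "Italy", "lineage": "B.1.1.7", "changes": ["N501Y"]},
    {"country": "UK", "lineage": "B.1.1.7", "changes": ["N501Y", "N439K"]},
]

by_country = mutation_breakdown(sequences, "country")
by_lineage = mutation_breakdown(sequences, "lineage")
```

Changing the `group_by` argument switches the breakdown between attributes such as location or lineage without touching the counting logic.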