Validation neutralization assays versus `polyclonal` fits

Compare the IC50s measured in validation neutralization assays for specific mutants to the values predicted by the `polyclonal` fits.

Import Python modules:

[1]:

import os
import pickle

import altair as alt

import pandas as pd

import yaml

Read configuration and validation assay measurements:

[2]:

with open("config.yaml") as f:
    config = yaml.safe_load(f)

# na_filter=False keeps an empty aa_substitutions field (unmutated) as "" rather than NaN
validation_ic50s = pd.read_csv(config["validation_ic50s"], na_filter=False)

validation_ic50s

[2]:

	antibody	aa_substitutions	measured IC50	lower_bound
0	267C		0.000877	False
1	267C	S371Y	0.001423	False
2	267C	R408S	0.000574	False
3	267C	R452I	0.000678	False
4	267C	F456L	0.001209	False
5	267C	E484A	0.003011	False
6	267C	F486I	0.002161	False
7	279C		0.000057	False
8	279C	S371Y	0.000181	False
9	279C	R408S	0.000045	False
10	279C	R452I	0.000038	False
11	279C	F456L	0.000074	False
12	279C	E484A	0.000112	False
13	279C	F486I	0.000125	False
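
The rows with an empty `aa_substitutions` entry above appear to be the unmutated reference for each antibody. The following is a minimal sketch of why disabling NA filtering matters when reading such a file, assuming that is the purpose of the `na_filter` argument. The inline CSV text is illustrative only, with values copied from the first rows of the table above:

```python
import io

import pandas as pd

# Illustrative CSV text; values copied from the first rows of the table above.
csv_text = (
    "antibody,aa_substitutions,measured IC50,lower_bound\n"
    "267C,,0.000877,False\n"
    "267C,S371Y,0.001423,False\n"
)

# Default behavior: the empty field is parsed as NaN.
with_na = pd.read_csv(io.StringIO(csv_text))

# With NA filtering disabled, the empty field stays an empty string.
without_na = pd.read_csv(io.StringIO(csv_text), na_filter=False)

print(with_na["aa_substitutions"].isna().tolist())
print(without_na["aa_substitutions"].tolist())
```

Keeping the unmutated entry as an empty string means it can be treated like any other substitution string downstream, rather than as a missing value.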

Now get the IC50s predicted by the averaged `polyclonal` model fits for each antibody:

[3]:

validation_vs_prediction = []
for antibody, antibody_df in validation_ic50s.groupby("antibody"):
    with open(os.path.join(config["escape_dir"], f"{antibody}.pickle"), "rb") as f:
        model = pickle.load(f)
    validation_vs_prediction.append(model.icXX(antibody_df))

validation_vs_prediction = pd.concat(validation_vs_prediction, ignore_index=True)

validation_vs_prediction

[3]:

	antibody	aa_substitutions	measured IC50	lower_bound	mean_IC50	median_IC50	std_IC50	n_models	frac_models
0	267C		0.000877	False	0.034641	0.034711	0.034816	4	1.0
1	267C	E484A	0.003011	False	0.459317	0.387717	0.473481	4	1.0
2	267C	F456L	0.001209	False	1.323086	0.695120	1.754687	4	1.0
3	267C	F486I	0.002161	False	0.548088	0.264798	0.718249	4	1.0
4	267C	R408S	0.000574	False	0.011290	0.011112	0.008520	4	1.0
5	267C	R452I	0.000678	False	0.006932	0.006755	0.004558	4	1.0
6	267C	S371Y	0.001423	False	0.934559	0.418290	1.359847	4	1.0
7	279C		0.000057	False	0.006658	0.006468	0.004207	4	1.0
8	279C	E484A	0.000112	False	0.019545	0.013754	0.019474	4	1.0
9	279C	F456L	0.000074	False	0.007652	0.007338	0.005839	4	1.0
10	279C	F486I	0.000125	False	0.007503	0.007076	0.004314	4	1.0
11	279C	R408S	0.000045	False	0.006452	0.005207	0.004716	4	1.0
12	279C	R452I	0.000038	False	0.000839	0.000766	0.000438	4	1.0
13	279C	S371Y	0.000181	False	0.019463	0.019499	0.012934	4	1.0
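
In the table above, the predicted IC50s are larger than the measured ones in every row, so it can help to also compare the two on a relative scale. The following is a minimal sketch, not part of the original analysis, that computes the fold change in IC50 relative to the unmutated row for each antibody. The column names `measured_fold_change` and `predicted_fold_change` are hypothetical, and the values are a subset copied from the 279C rows of the table:

```python
import pandas as pd

# Values copied from three of the 279C rows of the table above (subset of columns).
df = pd.DataFrame(
    {
        "antibody": ["279C", "279C", "279C"],
        "aa_substitutions": ["", "E484A", "R452I"],
        "measured IC50": [0.000057, 0.000112, 0.000038],
        "median_IC50": [0.006468, 0.013754, 0.000766],
    }
)

# Unmutated reference IC50s for each antibody (rows with no substitutions).
ref = df[df["aa_substitutions"] == ""].set_index("antibody")[
    ["measured IC50", "median_IC50"]
]

# Divide each row by its antibody's unmutated value (hypothetical column names).
df["measured_fold_change"] = df["measured IC50"] / df["antibody"].map(
    ref["measured IC50"]
)
df["predicted_fold_change"] = df["median_IC50"] / df["antibody"].map(
    ref["median_IC50"]
)

print(df[["aa_substitutions", "measured_fold_change", "predicted_fold_change"]])
```

A fold change above 1 indicates a mutant that is less potently neutralized than the unmutated reference, and a value below 1 indicates the opposite.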

Now plot the results. We plot the median predicted IC50 across the `polyclonal` models fit to the different deep mutational scanning replicates against the measured IC50, with both axes on a log scale. This is an interactive plot that you can mouse over for details:

[4]:

corr_chart = (
    alt.Chart(validation_vs_prediction)
    .encode(
        x=alt.X("measured IC50", scale=alt.Scale(type="log")),
        y=alt.Y(
            "median_IC50",
            title="predicted IC50 from DMS",
            scale=alt.Scale(type="log"),
        ),
        facet=alt.Facet("antibody", columns=4, title=None),
        color=alt.Color("lower_bound", title="lower_bound"),
        tooltip=[
            alt.Tooltip(c, format=".3g") if validation_vs_prediction[c].dtype == float
            else c
            for c in validation_vs_prediction.columns.tolist()
        ],
    )
    .mark_circle(filled=True, size=60, opacity=0.6)
    .configure_axis(grid=False)
    .resolve_scale(y="independent", x="independent")
    .properties(width=150, height=150)
)

corr_chart

/fh/fast/bloom_j/computational_notebooks/bdadonai/2022/vep_dms/SARS-CoV-2_Delta_spike_DMS_REGN10933/.snakemake/conda/7c022d2d81458b7fb39e0b59857b3086_/lib/python3.9/site-packages/altair/utils/core.py:317: FutureWarning: iteritems is deprecated and will be removed in a future version. Use .items instead.
  for col_name, dtype in df.dtypes.iteritems():

[4]:
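
As a numerical companion to the plot, the following is a minimal sketch that computes the Pearson correlation between measured and predicted IC50s on a log scale for each antibody. It is not part of the original notebook. The values are copied from the prediction table above, and the names `log_df` and `pearson_r` are hypothetical:

```python
import numpy as np
import pandas as pd

# Measured and median predicted IC50s copied from the prediction table above.
df = pd.DataFrame(
    {
        "antibody": ["267C"] * 7 + ["279C"] * 7,
        "measured IC50": [
            0.000877, 0.003011, 0.001209, 0.002161, 0.000574, 0.000678, 0.001423,
            0.000057, 0.000112, 0.000074, 0.000125, 0.000045, 0.000038, 0.000181,
        ],
        "median_IC50": [
            0.034711, 0.387717, 0.695120, 0.264798, 0.011112, 0.006755, 0.418290,
            0.006468, 0.013754, 0.007338, 0.007076, 0.005207, 0.000766, 0.019499,
        ],
    }
)

# Work on log10 values, matching the log-scaled axes of the plot.
log_df = df.assign(
    log_measured=np.log10(df["measured IC50"]),
    log_predicted=np.log10(df["median_IC50"]),
)

# Pearson correlation within each antibody (Series.corr defaults to Pearson).
pearson_r = {
    antibody: group["log_measured"].corr(group["log_predicted"])
    for antibody, group in log_df.groupby("antibody")
}

print(pearson_r)
```

With only seven points per antibody, any such correlation is a rough summary and should be read alongside the plot rather than in place of it.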
