Aim
The aim of our epidemiological study has two parts:
- To derive a model that predicts the likelihood of a rise in MDR-TB in India and the world (we can use an existing model or modify one for our purposes).
- To compare early and late detection of TB cases and their consequences for the quest to eradicate TB.
Introduction
With the increasing use of antibiotics in medicine, multidrug-resistant tuberculosis (MDR-TB) and extensively drug-resistant tuberculosis (XDR-TB) are growing concerns for the near future. Furthermore, the ongoing COVID-19 pandemic has further increased the use of antibiotics. As a result, there is a high probability that the spread of other existing diseases might rise drastically. One must be highly cautious about diseases like TB, as its drug-resistant strains are of utmost concern in different regions of the world.
Although intensive efforts have been made to stop the spread of TB around the globe through initiatives such as Directly Observed Treatment, Short-course (DOTS) and the Green Light Committee (GLC) [1], among many others, MDR-TB and XDR-TB remain an ever-growing problem.
Literature Reading
We referred to the mathematical model for TB emergence in India, the Philippines, Russia and South Africa [1]. This model was based on six primary stages in the spread of TB:
- Susceptibility
- Latent TB infections (LTBI)
- Active TB infections
- Detected active TB cases provided that the patients were receiving the correct treatment based on underlying drug resistance
- Completely treated TB cases
- TB cases that resolved spontaneously without any treatment
To improve the model, we included parameters such as Drug-resistant TB and HIV infections. Some other categories were also introduced. Drug resistance included the following categories:
- Drug sensitive strain (DS)
- Isoniazid (INH) resistant or rifampicin resistant (RR)
- MDR (Multi-drug resistant)
- Pre-MDR
- XDR (Extensively drug-resistant)
Figure 1: Mathematical model for TB; here LTFU = lost to follow-up
For people suffering from HIV infection, who were susceptible to TB infections, as well as the people with both HIV and TB disease, the following categories were introduced:
- HIV- negative
- HIV-positive but no Anti-retroviral therapy (ART)
- HIV-positive with Anti-retroviral therapy
The authors plotted the results as graphs that indicate a rise of MDR-TB and XDR-TB in India, the Philippines, Russia and South Africa. However, cases of MDR-TB and XDR-TB arising from acquired drug resistance show a steady decline in these regions.
The course of the disease has also been modeled for China [2] (Fig. 4), using the following compartments:
- Susceptible (\(S\))
- Exposed to general TB (\(E_s\))
- Exposed to MDR-TB (\(E_r\))
- Infected by general TB (\(I_s\))
- Infected by MDR-TB (\(I_r\))
Figure 2: Predicted trend of MDR-TB cases with incident tuberculosis and XDR-TB cases with incident MDR-TB cases
Figure 3: Predicted trend of incident MDR-TB and XDR-TB cases caused by acquired drug resistance.
Figure 4: Model used for China region
Figure 5: Simulated trend of TB in China over the course of 20 years
Figure 4 represents the model pictorially. The model assumes that, out of the total population, \(S\) people are susceptible to TB. Of these susceptible individuals, \(E_s\) are exposed to the bacteria that cause general TB and \(E_r\) are exposed to the mutant MDR-TB strain. From the exposed category, \(I_s\) individuals become infected with general TB and \(I_r\) with MDR-TB. Some of the infected individuals are treated and become healthy, while others succumb to the disease. This forms the basis of our model. We then solved the differential equations using initial values that we either obtained from reliable sources (listed in the references) or estimated from the available data. The results helped us analyze the trend of TB cases in China over the course of 20 years.
The simulated result shows the trend of TB in China over the course of 20 years (Fig. 5). Here, the red line is \(I_r\), the blue line is \(I_s\), and the black line is \(I\), defined as \(I = I_s + I_r\).
Our Epidemiological Model
Introduction
The main aim of our project is to emphasize that early detection of MDR-TB leads to less spread and faster recovery of MDR-TB patients, which will eventually lead to a decline in active MDR-TB cases.
Assumptions
- Mortality and infection rates are higher with no detection and with late detection.
- Background birth and mortality rates are not included.
- The population is well mixed, i.e., any infected individual has an equal probability of contacting any susceptible individual.
- New infections originate only from the interaction of susceptible individuals with infected individuals.
- All parameters are constant.
- The recovery time is the same for late-diagnosed, early-diagnosed and undiagnosed cases.
- All susceptible individuals are equally susceptible, i.e., everyone's immune system is equally strong.
Methodology
For this purpose, we designed a simple model with the following compartments:
- Susceptible (\(S\))
- Infected MDR-TB case with no detection (\(I_n\))
- Infected MDR-TB case with late detection (\(I_l\))
- Infected MDR-TB case with early detection (\(I_e\))
- Recovered (\(R\))
- Deceased (\(D\))
In our model, \(S\) is the number of individuals in the population who are susceptible to TB, and \(I_n\), \(I_l\) and \(I_e\) are the numbers of individuals infected with MDR-TB with no, late and early detection, respectively. \(R\) is the number of recovered individuals and \(D\) is the number of deceased. Solving the following differential equations with appropriate test values gives the required results. Differential equations used in our model:
where,
\(a, b, c\) = transmission rates of \(I_n, I_l, I_e\), respectively
\(d, f\) = fractions of the population with no diagnosis and late diagnosis, respectively
\(e\) = inverse of the recovery period
\(m_l, m_e, m_n\) = mortality rates of late-diagnosed, early-diagnosed and undiagnosed cases, respectively
Initial Conditions
The relevant ODEs are solved using the odeint function of Python's SciPy library with appropriate initial conditions. As a benchmark, the results below are for a simulation with the following initial conditions:
\[\begin{aligned}
S |_{t=0} &= 1-3\times10^{-6},\\
I_n |_{t=0} &= 10^{-6},\\
I_l |_{t=0} &= 10^{-6},\\
I_e |_{t=0} &= 10^{-6},\\
R |_{t=0} &= 0,\\
D |_{t=0} &= 0.
\end{aligned}\]
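The compartments, parameters and initial conditions above can be sketched in code. Because the differential equations themselves are not reproduced in the text, the right-hand sides below are an assumed form chosen only to be consistent with the listed assumptions and parameter definitions; they should not be read as the exact equations we used. All parameter values are hypothetical, and a hand-written fourth-order Runge-Kutta step stands in for odeint so that the sketch has no dependencies.

```python
# Minimal sketch of a six-compartment model (S, I_n, I_l, I_e, R, D).
# ASSUMPTION: the equation structure and all parameter values below are
# illustrative; the source text does not give them.

def derivatives(state, p):
    """Right-hand sides of the assumed ODE system."""
    S, I_n, I_l, I_e, R, D = state
    # New infections arise only from susceptible-infected contact.
    new_infections = (p["a"] * I_n + p["b"] * I_l + p["c"] * I_e) * S
    early_fraction = 1.0 - p["d"] - p["f"]
    return (
        -new_infections,
        p["d"] * new_infections - (p["e"] + p["m_n"]) * I_n,
        p["f"] * new_infections - (p["e"] + p["m_l"]) * I_l,
        early_fraction * new_infections - (p["e"] + p["m_e"]) * I_e,
        p["e"] * (I_n + I_l + I_e),
        p["m_n"] * I_n + p["m_l"] * I_l + p["m_e"] * I_e,
    )


def rk4_step(state, p, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2), p)
    k3 = derivatives(shifted(k2, dt / 2), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(
        s + dt / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
        for s, q1, q2, q3, q4 in zip(state, k1, k2, k3, k4)
    )


def simulate(d, f, t_end=400.0, dt=0.1):
    """Integrate from the benchmark initial conditions; return the final state."""
    params = {
        "a": 0.5, "b": 0.4, "c": 0.2,           # hypothetical transmission rates
        "d": d, "f": f,                          # no-detection / late-detection fractions
        "e": 0.1,                                # hypothetical inverse recovery period
        "m_n": 0.05, "m_l": 0.03, "m_e": 0.01,   # hypothetical mortality rates
    }
    state = (1 - 3e-6, 1e-6, 1e-6, 1e-6, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, params, dt)
    return state


if __name__ == "__main__":
    for fraction in (0.1, 0.3):
        S_final = simulate(fraction, fraction)[0]
        print(f"d = f = {fraction}: fraction ever infected = {1 - S_final:.3f}")
```

Under these assumed equations the six compartments always sum to one, which is a quick sanity check on any implementation. The absolute numbers depend entirely on the hypothetical parameters, so only the direction of the comparison between the two detection scenarios is meaningful.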
Result
From Figures 6 and 7, it is evident that as the rates of no detection and late detection increase, the total number of infections increases. When the no-detection and late-detection rates are each 0.1 (Figure 7), the total infected fraction of the population is approximately 0.35, but when they are each 0.3 (Figure 6), it increases to approximately 0.55. From this, we can infer that the lower the no-detection and late-detection rates, the fewer the total infections.
Python Code for Simulation
Click here to find our python code for epidemiology and enzyme kinetics simulations.
Figure 6: Simulated trend of TB in India when the fractions of \(I_n, I_l, I_e\) are 0.3, 0.3 and 0.4, respectively.
Figure 7: Simulated trend of TB in India when the fractions of \(I_n, I_l, I_e\) are 0.1, 0.1 and 0.8, respectively.
References
- A. Sharma, A. Hill, E. Kurbatova, M. van der Walt, C. Kvasnovsky, T. E. Tupasi, et al., Estimating the future burden of multidrug-resistant and extensively drug-resistant tuberculosis in India, the Philippines, Russia, and South Africa: a mathematical modelling study, The Lancet Infectious Diseases, Volume 17, Issue 7, 2017, Pages 707-715.
- Yi Yu, Yi Shi, Wei Yao, Dynamic model of tuberculosis considering multidrug resistance and their applications, Infectious Disease Modelling, Volume 3, 2018, Pages 362-372, ISSN 2468-0427.
Introduction
Enzyme kinetics is the branch of biochemistry that deals with the rates of enzyme-catalysed chemical reactions. In this modelling section, we tried to find the minimum amounts of the various reagents, such as the Fluorescent-Quencher Pair, Free Gene and Cas14, required to optimize our proposed diagnostic kit. The kinetics of an enzyme may reveal useful information about its function, role in metabolism, factors that affect its function, and inhibitory processes.
Equations
\[\begin{aligned}
\textrm{Min Cas required} &= \frac{x}{\alpha\beta} \\
\textrm{Min QP required} &= \frac{x}{\beta}\\
y &= \frac{\gamma xt}{\alpha\beta} = w +\tau \delta \\
\tau &= \frac{y-w}{\delta}
\end{aligned}\]
where the parameters are as follows:
\(x\) = threshold fluorescence
\(\alpha\) = QP (Fluorescent-Quencher Pair, in mol) cut per mol of Cas
\(\beta\) = fluorescence produced per mol of QP
\(\gamma\) = FG (Free Gene, in mol) cut per mol of Cas per unit time
\(\delta\) = rate of production of FG by LAMP
\(\phi\) = mol of FG produced by 1 mol of FG
\(t\) = time taken to complete the reaction
\(\tau\) = minimum time required to complete LAMP
\(y\) = minimum Free Gene required
\(w\) = Free Gene present in the sputum sample
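As a worked illustration, the four equations above can be evaluated directly. The sketch below is a plain transcription of those formulas; the function name and the numerical values are hypothetical placeholders rather than measured constants, and clamping \(\tau\) at zero (for a sample that already contains enough Free Gene) is an added assumption.

```python
def assay_minimums(x, alpha, beta, gamma, delta, t, w):
    """Evaluate the minimum reagent amounts and LAMP time from the equations.

    x: threshold fluorescence, alpha: QP cut per mol Cas,
    beta: fluorescence per mol QP, gamma: FG cut per mol Cas per unit time,
    delta: FG production rate by LAMP, t: reaction time, w: FG in the sample.
    """
    min_cas = x / (alpha * beta)          # Min Cas required
    min_qp = x / beta                     # Min QP required
    y = gamma * x * t / (alpha * beta)    # minimum Free Gene required
    # ASSUMPTION: no LAMP time is needed if the sample already holds >= y.
    tau = max(0.0, (y - w) / delta)
    return min_cas, min_qp, y, tau


if __name__ == "__main__":
    # Hypothetical constants chosen only to make the arithmetic easy to follow.
    min_cas, min_qp, y, tau = assay_minimums(
        x=100.0, alpha=10.0, beta=5.0, gamma=2.0, delta=4.0, t=3.0, w=2.0
    )
    print(min_cas, min_qp, y, tau)
```

With these placeholder values, the minimum Cas is 100 / (10 × 5) = 2 mol, the minimum QP is 100 / 5 = 20 mol, the minimum Free Gene is 2 × 100 × 3 / 50 = 12 mol, and the LAMP time is (12 − 2) / 4 = 2.5 time units.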
Methodology
The equations were derived by back-tracing from the endpoint, i.e., the threshold fluorescence. Although there is theoretically no minimum amount of fluorescence required to conclude that the sample contains the desired mutations, instruments have a least count (the smallest change they can measure), which we assumed to be equal to the threshold fluorescence.
The minimum/optimum amounts of the various reagents can then be calculated with the basic equations given in Section 3.
In eq. 1, there will be samples whose theoretical fluorescence is less than the threshold fluorescence.
- Assume a threshold fluorescence value.
- Calculate the minimum Cas14 required.
- Calculate the minimum Fluorescent-Quencher Pair (in mol) required for the assay.
- Calculate the amount of Free Gene required for the assay.
- Predict the minimum time required to run LAMP.
- Run a few simulations to see the trend of the values.
Future Prospects
Owing to COVID-19 restrictions, lab work was severely affected. Hence most of the lab work was delayed and only a few of the experiments were performed.
Once we had found the minimum/optimum amounts of reagents using the above equations, we would have tested them experimentally to correct for the error involved in measuring fluorescence with the hardware device. The other reagents could then be re-adjusted accordingly.
With this done, our hardware device and the mobile phone-based app would also have received updates to give conclusive results with maximum accuracy.
Simulation
With randomly assumed constants, we can plot graphs of
- Amount of initial Free Gene vs time taken to run LAMP.
- Threshold Fluorescence vs minimum amount of Free Gene.
Figure 2 plots the initial amount of Free Gene (in M) against the time taken to run LAMP. It is clear from the plot that \(\tau\) decreases linearly as the initial amount of Free Gene increases.
Figure 3 plots the Threshold Fluorescence against the minimum amount of Free Gene required. The plot has a positive slope, which indicates direct proportionality between the two.
Figure 2: Amount of initial Free Gene vs time taken to run LAMP
Figure 3: Threshold Fluorescence vs minimum amount of Free Gene
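The two trends can also be checked numerically without plotting. The sketch below sweeps the initial Free Gene and the threshold fluorescence using the relations \(\tau = (y - w)/\delta\) and \(y = \gamma x t/(\alpha\beta)\); the constants are hypothetical, in the same spirit as the randomly assumed constants used for the figures.

```python
import math


def lamp_time(w, y_min, delta):
    """Minimum LAMP time for initial Free Gene w: tau = (y - w) / delta."""
    return (y_min - w) / delta


def min_free_gene(x, alpha, beta, gamma, t):
    """Minimum Free Gene for threshold fluorescence x: y = gamma*x*t/(alpha*beta)."""
    return gamma * x * t / (alpha * beta)


# Hypothetical sweep values; none of these are measured quantities.
initial_free_gene = [0.0, 2.0, 4.0, 6.0]
tau_series = [lamp_time(w, y_min=12.0, delta=4.0) for w in initial_free_gene]

thresholds = [50.0, 100.0, 150.0]
y_series = [min_free_gene(x, alpha=10.0, beta=5.0, gamma=2.0, t=3.0) for x in thresholds]

if __name__ == "__main__":
    for w, tau in zip(initial_free_gene, tau_series):
        print(f"initial Free Gene = {w}: tau = {tau}")
    for x, y in zip(thresholds, y_series):
        print(f"threshold = {x}: minimum Free Gene = {y}")
```

Each equal step in the initial Free Gene lowers \(\tau\) by the same amount (slope \(-1/\delta\)), and the ratio \(y/x\) stays constant at \(\gamma t/(\alpha\beta)\), which is what the straight lines in the two plots express.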
Introduction
The evolution of multidrug-resistant TB (MDR-TB) strains and their continuing spread make it difficult to control the global prevalence of the disease and to increase cure rates. Tracing the evolutionary history of the strains will therefore:
1) Inform us of the trends in the mutational landscape that increased the fitness of these strains over time, and of how they became resistant to multiple drugs such as isoniazid and rifampicin.
2) Provide a basis to postulate whether additional strains are likely to become multidrug-resistant, thereby increasing their virulence and prevalence in the future.
To this end, we sought to understand the evolution of MDR-TB strains based on genes such as katG and rpoB, in which crucial mutations confer resistance to isoniazid and rifampicin, respectively. Based on the recent literature, we also extended our analysis to GyrA, as mutations in this gene underpin the emergence of extensively drug-resistant TB (XDR-TB) strains, which are resistant to fluoroquinolones and second-line injectable drugs.
Method and Analysis
Installation and outputs
Command lines used for installation -
- sudo apt-get install mafft
- sudo apt-get install iqtree
- sudo apt-get install fasttree
Example image (iqtree installation) -
MAFFT run
MAFFT output
MAFFT output alignment viewed in textpad
IQTREE run
IQTREE model finder ongoing
FIGTREE - the output is given in the tree links below.
Optimization
The optimization steps involved verifying and improving the alignments and testing different parameters for reconstructing the phylogenetic tree. To this end, we constructed alignments using multiple tools, including Kalign and MAFFT. We also tested the multiple alignment strategies (G-INS-i, Q-INS-i, L-INS-i and E-INS-i) implemented in the alignment software.
The optimisation also included manual trimming and the removal of poorly aligned stretches and sequences. The L-INS-i option implemented in MAFFT created a relatively better alignment with fewer gaps, and this was further improved by correcting the alignment manually in TextPad.
For the phylogenetic tree reconstruction, we initially used the FastTree software. Because FastTree is a quick alternative to more conventional methods that are slower but more accurate, we noticed inconsistencies in the tree topologies and therefore switched to the more comprehensive methods implemented in the IQTREE software.
However, FastTree served as a useful guide for quickly identifying troublesome sequences (those jumping from one major node to another in different tree topologies) and helped us remove from the alignment the sequences that caused tree artefacts.
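The workflow described above (L-INS-i alignment, a quick FastTree pass, then IQTREE) could be scripted roughly as follows. This is a sketch under stated assumptions, not the exact commands we ran: the file names are hypothetical, `--localpair --maxiterate 1000` is MAFFT's documented way of selecting L-INS-i, `-m MFP` asks IQTREE to run its model finder, and the executable names `fasttree` and `iqtree` are assumed to be those installed by the apt packages.

```python
import os
import shutil
import subprocess


def build_commands(fasta, aligned):
    """Return the MAFFT, FastTree and IQTREE command lines as argument lists."""
    mafft_cmd = ["mafft", "--localpair", "--maxiterate", "1000", fasta]  # L-INS-i
    fasttree_cmd = ["fasttree", aligned]              # quick first-pass tree
    iqtree_cmd = ["iqtree", "-s", aligned, "-m", "MFP"]  # model finder + ML tree
    return mafft_cmd, fasttree_cmd, iqtree_cmd


def run_to_file(cmd, out_path):
    """Run a tool that writes its result to stdout and save it to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical file names for the katG protein sequences.
    fasta, aligned = "katG.fasta", "katG.aln.fasta"
    mafft_cmd, fasttree_cmd, iqtree_cmd = build_commands(fasta, aligned)
    tools_present = all(
        shutil.which(cmd[0]) for cmd in (mafft_cmd, fasttree_cmd, iqtree_cmd)
    )
    if tools_present and os.path.exists(fasta):
        run_to_file(mafft_cmd, aligned)
        run_to_file(fasttree_cmd, "katG.fasttree.nwk")
        subprocess.run(iqtree_cmd, check=True)  # IQTREE writes its own output files
    else:
        print("Skipping run: tools or input file not found.")
```

Manual trimming of the alignment, which we did by hand, would sit between the MAFFT step and the two tree-building steps.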
Observation and Inference
Tree 1-katG Tree 2-rpoB Tree 3-GyrA
The trees obtained from the multiple sequence alignments of the katG, rpoB and GyrA proteins suggested that rpoB had the fewest substitutions per site (0.008), while katG and GyrA had 0.05 and 0.2 substitutions per site, respectively. This is reflected in most of the sequences having short branch lengths, except for a few potentially fast-evolving sequences with longer branches. While some of these sequences are truncated at the C-terminal end, many are full-length, suggesting that the longer branch lengths are not necessarily tree artefacts.
The tree topologies based on the three genes showed different trends. Consistent with the substitutions per site, the rpoB tree contained only a single cluster with longer branches, whereas the katG and GyrA trees contained more than one such cluster. Interestingly, in the katG tree, the strain with the highest number of substitutions per site was Mycobacterium tuberculosis strain TKK-01-0051, along with other sequences for which strain information is unavailable. Likewise, the GyrA and rpoB trees contained fast-evolving sequences (see tree topologies below).
katG fast-evolving branches
rpoB fast-evolving branches
rpoB fast-evolving branches
GyrA fast-evolving branches
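Long branches such as those shown above were identified by eye, but the idea can be illustrated with a small script. The sketch below reads terminal branch lengths from a Newick string and flags leaves whose branch is much longer than the median; the example tree, the leaf names and the factor of 5 are hypothetical, and the simple pattern used here assumes unquoted leaf names.

```python
import re
from statistics import median

# A leaf entry is "name:length" directly after "(" or ","; internal node
# labels follow ")" and are therefore skipped.
LEAF_PATTERN = re.compile(r"(?<=[(,])([^(),:;\s]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Map each leaf name to its terminal branch length."""
    return {name: float(length) for name, length in LEAF_PATTERN.findall(newick)}


def long_branch_leaves(newick, factor=5.0):
    """Return leaves whose terminal branch exceeds factor x the median length."""
    lengths = terminal_branch_lengths(newick)
    cutoff = factor * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)


if __name__ == "__main__":
    # Hypothetical tree: leaf D sits on a much longer branch than the others.
    tree = "((A:0.01,B:0.012)90:0.005,(C:0.009,D:0.2)85:0.004,E:0.011);"
    print(long_branch_leaves(tree))
```

Terminal branch length is only a rough proxy for a fast-evolving sequence; root-to-tip distance or a dedicated phylogenetics library would be a more careful choice for a real analysis.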
While these fast-evolving sequences/strains will not necessarily confer multidrug resistance in the future, it cannot be ruled out that some of their mutations could be key to developing resistance against the first- and second-line drugs used for treatment. In general, we can hypothesise that some of the aforementioned strains may acquire mutations for multidrug resistance (MDR) in the future. Although strains such as Mycobacterium tuberculosis strain TKK-01-0051 are currently not drug-resistant, they have the potential to evolve drug resistance.
Limitations and Scope for further improvement
While the phylogenetic analysis can inform us of the fast-evolving strains, determining which of their mutations could make a strain resistant to multiple drugs requires a comprehensive sequence-structure analysis of the proteins that confer resistance. For instance, although some of the above sequences are fast evolving, some of their point mutations may still not confer resistance. Thus, the fast-evolving strains identified by the phylogenetic analysis may not necessarily be the most virulent or the multidrug-resistant strains. A comprehensive sequence-structure analysis of the proteins can show how some of the key mutations could confer resistance. In addition, a thorough genome-wide survey of the mutational landscape could reveal the trends in fast-evolving strains more precisely.