Project Description
Overview:
Breast cancer (BC) is currently the most common malignancy among women in Egypt and the second most common malignancy across both sexes and all ages. According to Globocan and the WHO, an estimated 22,038 cases were diagnosed in 2020, accounting for about 32.4% of all cancers among females, and this number is forecast to reach 46,000 by 2050. The breast cancer mortality rate is estimated at around 10.3%, making it the second leading cause of cancer-related deaths after liver cancer.
Triple-negative breast cancer (TNBC) is an aggressive subtype with a very poor prognosis, even though it responds well to chemotherapy. The term triple-negative refers to the fact that the cancer cells lack oestrogen and progesterone receptors and produce little HER2 protein, so the cells test "negative" on all three tests. These cancers account for about 10-15% of all breast cancers and tend to be more common in women younger than 40 and in women who have a BRCA1 mutation.
This aggressive subtype is often associated with higher rates of recurrence than other types of breast cancer. Rates of metastasis to the visceral organs and central nervous system are also higher. As a result, survival is poor, with a 5-year relative survival rate of 77% according to the American Cancer Society. On top of this, fewer treatment options are available for TNBC than for other subtypes.
As medical students, we felt we had to step up and take action against a disease that threatens the health and wellbeing of women in our community. We call these women warriors to raise their morale as they fight a ferocious battle against this enemy.
Breast cancer affects not only the women who have it but also their families and the productivity of the community as a whole, given the role women play in the well-being of their families and society. An article published in December 2020 describes the income impact of breast cancer and assesses factors that influence income and productivity among women with BC. Data were collected over 6 months from 200 women with clinically confirmed BC in Southwest Nigeria. Sixteen percent of the women were absent from the workplace for an average of 10 days, corresponding to a 45.5% productivity loss at the workplace. In addition, 22% of the women were absent from the workplace for more than 2 weeks on average. We deduce from this that the increasing incidence of BC among women reduces productivity at the workplace.
Because of the rising incidence of breast cancer among Egyptians and the poor prognosis of TNBC, we felt it was essential to contribute something beneficial to treatment. That is why our team constructed a deep learning-assisted immunotherapy platform for TNBC. Our goal is to deliver safe and efficient therapy to TNBC patients, and to achieve it we relied mainly on the design of the vaccine circuits.
Last year, our project was concerned with the same problem and we approached it by designing a novel immunotherapeutic approach using DNA-launched RNA Replicons (DREP). We utilized our hotspot detection tool, “Custommune”, to generate a list of candidate neo-epitopes, validated them in silico, modelled the behaviour of our suggested circuits and optimized them to ensure maximal safety and efficacy.
This year, we aim to develop an enhanced alphavirus-based vaccine-delivery vector, using our deep learning-based approach to provide improved versions of the Equine Encephalosis (EEV) and Semliki Forest (SFV) alphavirus vectors. We also aim to design more potent promoters for the vector through computational design and testing. We want to create full fitness landscapes for the alphavirus regulatory proteins. These landscapes will describe the impact of mutations on the evolutionary fitness and structural stability of each protein.
Our hope is to create a modular logic-based design that controls the specificity of the vector towards immune cells and improves the delivery and amplification of the delivered genetic message. Antigenic constructs will be designed using our machine learning algorithms for neo-antigen and epitope prediction in order to elicit a potent immune response.
This year, we adopted three approaches in our mission to put a stop to this disease: vaccine design, vector design, and directed evolution.
Starting with vaccine design, our vaccine engineering algorithms could enable easy, rapid, and flexible design of potent vaccines against challenging cancers and infectious diseases, with higher accuracy than standard tools and pipelines. Our enhanced delivery vector could provide a basis not just for vaccine delivery but also for gene therapy and a variety of other applications.
As for our directed evolution algorithm, it should be helpful for a variety of applications, including antibody, aptamer, and vaccine design. Directed evolution enables us to tune the function of each part of our vaccine construct by introducing random mutations that produce new variants of each fragment, some with markedly increased functionality that meets our desired definition of fitness for the fragment's specific role. This could be useful in any scientific field that requires evolving proteins.
This will help us provide hope to patients in our local community and all over the world. In addition, it provides a solid scientific foundation for running and improving Custommune, the start-up we previously initiated in iGEM, which helps us bring our project to the people it can benefit. Our main motivation is to make immunotherapies affordable and accessible for all patients around the world and to provide them with this advanced therapeutic technology.
Vaccine engineering:
The design of the circuits depended not only on cell specificity but also on making logical decisions, delivering the antigenic message effectively, and eliciting a controllable, regulatable, and system-sensitive immune response.
When administered intra-tumorally, the vaccine has an oncolytic activity against TNBC cells, achieved via HBAX or FADD, that works in conjunction with its vaccine activity. In addition, riboswitches placed upstream of the vaccine regulate transcription to control downstream expression. To make the design system-sensitive, toehold switches are inserted upstream and downstream of the vaccine; they regulate expression in response to the RNAs they sense, which creates regulatory feedback. To keep the vaccine controllable, a small molecule, trimethoprim (TMP), can be used as a safety switch.
Vector design:
We chose alphaviruses as our vectors because they have proven their value as expression vectors: they are easily and quickly engineered and can be used to produce high levels of a protein of interest, in this case our vaccine. Using the evolved proteins obtained from our directed evolution approach, we are able to design a safer and more stable platform, with logic-gating systems and regulation methods for terminating vaccine expression that give us more control over our model. This also helped us make our platform more productive by increasing the expression and effectiveness of our vaccine.
Computational Directed Evolution:
Engineering enhanced versions of proteins requires complex processes that integrate various measurements. These processes are often hampered by challenges that are specific to each engineering case. Here, we have developed a computational approach that integrates predictions of mutational effects on evolutionary fitness with encodings of local and global feature representations of protein sequences. These features were used to design a machine learning framework that aims to accurately predict fit variants without requiring experimental inputs.
We present a generator function that can be used to generate mutant versions of the proteins used in our circuit. The generator covers all possible mutant versions of the protein and annotates each with two kinds of information: structural fitness and stability, from ΔΔG calculations, and evolutionary fitness, which depends both on the epistatic effect of mutations and on the independent contribution of each mutation at its specific position, obtained from frequency matrices computed from the alignment. A classifier that we developed using deep learning acts as a positive selective pressure system. We constructed a network that identifies the protein function and assigns a regression score to the contribution of each motif in the protein to that function, and we included a protein-protein interaction (PPI) network to test the degree to which the protein expresses its function through interactions with other proteins. From the indices included in the code, we can then calculate a fitness score between 0 and 1 and use it to rank our mutants, choosing which to proceed with and which to ignore.
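As a minimal sketch of the generate-score-rank loop described above, the Python below enumerates every single-residue substitution of a sequence and ranks the mutants by a combined score in [0, 1]. The function names, the weighted-mean combination, the equal weights, and `toy_scorer` are all assumptions made for illustration; in particular, `toy_scorer` is an arbitrary placeholder standing in for the ΔΔG, evolutionary, and classifier scores, not the project's actual models.

```python
from typing import Callable, Dict, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_mutants(wild_type: str) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) for every single-residue substitution."""
    for pos, original in enumerate(wild_type):
        for alt in AMINO_ACIDS:
            if alt != original:
                label = f"{original}{pos + 1}{alt}"  # 1-based position, e.g. "M1A"
                yield label, wild_type[:pos] + alt + wild_type[pos + 1:]


def combined_fitness(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of component scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def rank_mutants(
    wild_type: str,
    scorer: Callable[[str], Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Score every single mutant and sort from most to least fit."""
    scored = [
        (label, combined_fitness(scorer(seq), weights))
        for label, seq in single_mutants(wild_type)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_scorer(sequence: str) -> Dict[str, float]:
    """Placeholder component scores (NOT real stability or fitness models)."""
    return {
        "stability": sequence.count("A") / len(sequence),
        "evolution": sequence.count("L") / len(sequence),
    }


WEIGHTS = {"stability": 0.5, "evolution": 0.5}  # hypothetical equal weighting
ranking = rank_mutants("MKT", toy_scorer, WEIGHTS)
print(ranking[:3])
```

A sequence of length n yields 19 × n single mutants, so exhaustive enumeration is cheap for single substitutions but grows combinatorially once several positions are mutated together.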
To predict the effect of mutations on our proteins, we implemented four main indices: evolutionary fitness, local evolutionary contexts, global evolutionary contexts, and physicochemical parameters.
1. Firstly, we enabled our model to predict the evolutionary fitness of the mutants, expressed as independent fitness and epistatic fitness. For independent fitness, we applied a function that computes the independent effect of changing one amino acid into another at a specific position from the frequency of the alternative amino acid at that position in the sequence alignment. For epistatic fitness, we had to accurately estimate the global epistatic effect of mutations, whether in single-position deep mutational scanning or in combinatorial mutation libraries. We therefore adopted three approaches, based on evolutionary distances, residue frequencies, and residue couplings. The last of these was the most sensitive to representations of evolutionary fitness among the proposed methods.
2. Secondly, for the evolutionary contexts, we had to account for both the local and the global aspects of the mutant sequences. Local contexts are essential indicators of the activity, selectivity, and function of homologous sequences within protein families.
3. For the global contexts, on the other hand, we decided to use global protein representation models such as TAPE and UniRep, which are able to learn non-trivial features that are essential for training deep learning models on protein sequences. These features are integrated with the local context encoder so that each sequence is represented by three elements: physicochemical parameters, local evolutionary contexts, and global evolutionary contexts.
4. Finally, physicochemical parameters are very important for properly representing proteins in machine learning tasks because of their strong relationship with the functional and structural activity of the protein. We created functions that encode the values of protein components so that important physicochemical features of mutant sequences can be represented and inferred.
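The independent-fitness index from the first item can be sketched as a per-column frequency lookup over a multiple sequence alignment. This is only an illustration under stated assumptions: the function names, the toy alignment, and the pseudocount smoothing (added so that unseen residues do not score exactly zero) are not taken from the project's code.

```python
from collections import Counter
from typing import List


def column_frequencies(alignment: List[str]) -> List[Counter]:
    """Count residues in each column of an alignment of equal-length sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(seq[i] for seq in alignment) for i in range(length)]


def independent_fitness(
    alignment: List[str],
    position: int,
    alt: str,
    pseudocount: float = 1.0,
) -> float:
    """Smoothed frequency of residue `alt` at the 0-based column `position`.

    The pseudocount is spread over the 20 standard amino acids; setting it
    to 0 returns the raw observed frequency.
    """
    counts = column_frequencies(alignment)[position]
    total = len(alignment)
    return (counts[alt] + pseudocount) / (total + 20 * pseudocount)


# Toy alignment of four homologous sequences (hypothetical data).
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
raw_k = independent_fitness(toy_alignment, 1, "K", pseudocount=0.0)
raw_r = independent_fitness(toy_alignment, 1, "R", pseudocount=0.0)
print(raw_k, raw_r)
```

A substitution towards a residue that is common at that column in homologues scores higher than one towards a residue that is rare or never observed there. Gap characters are counted like any other symbol in this sketch, which a real pipeline would likely handle separately.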
We can validate our results by comparing them with experimental datasets. We focused on state-of-the-art validation datasets to confirm our results. AlphaFold was used to assess the Molecular Dynamics (MD) of our proteins and measure their stability.
Our next step in directed evolution was to apply saturation mutagenesis to the alphaviral non-structural proteins (NSPs), which coordinate intracellular viral replication. This not only allowed us to regulate the replication of our vector but also enhanced the stability, expression, and selectivity of our vaccine.
We were especially concerned with the precautionary measures needed to regulate our vaccine. That is why our circuit includes several levels of control, not only to make it safer but also to make sure that it is sensitive and responsive to the tumor micro-environment. Trimethoprim (TMP) was included in our circuit as a small-molecule inhibitor that stabilizes the destabilizing domain (DD) fused to the RNA-binding protein (MS2). Two toehold switches with different functions were also included, each responding to the micro-environment surrounding our vaccine. If carcinogenic mRNA is overexpressed, the ToeholdON switch activates to increase downstream vaccine expression. The other toehold switch was modified by adding degradation motifs whose primary function is to terminate the circuit if the vaccine is overexpressed.
To make our circuit even safer, the HBAX gene, an apoptosis-related gene, was integrated into our model to provide an oncolytic function in conjunction with the activity of the vaccine.
We constructed two models for predicting the fitness of our mutant proteins: an unsupervised model and a clustering model.
For the unsupervised approach, we aimed to create a Zero-N directed evolution of proteins using deep neural networks built from 1-dimensional Convolutional Neural Networks (1dCNN) and bidirectional Long Short-Term Memory (biLSTM) networks, a form of Recurrent Neural Network (RNN). This design can also be adapted to predict mutations that can increase mutant fitness but are undesirable because of global cellular constraints.
For the clustering model, we applied a k-means clustering algorithm to group encoded protein sequences into at least two main clusters: fit and unfit mutants. This greatly reduces the noise produced by unfit mutants, which contribute low fitness scores.
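The clustering step can be illustrated with a small, dependency-free k-means (Lloyd's algorithm) over encoded sequences. Everything specific here is an assumption for the sake of a runnable sketch: the two-feature encoding (mean Kyte-Doolittle hydropathy and fraction of charged residues), the deterministic farthest-first initialisation, and the toy sequences. The project's real encoder combines richer physicochemical, local, and global features.

```python
from typing import List, Sequence, Tuple

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

Point = Tuple[float, ...]


def encode(sequence: str) -> Point:
    """Toy encoding: (mean hydropathy, fraction of charged residues)."""
    mean_hydropathy = sum(HYDROPATHY[aa] for aa in sequence) / len(sequence)
    charged_fraction = sum(aa in "DEKR" for aa in sequence) / len(sequence)
    return (mean_hydropathy, charged_fraction)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int = 2, iterations: int = 50):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    labels: List[int] = []
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


toy_sequences = ["LLIVAL", "IVLLAV", "LIVVLA", "DEKRDE", "KRDEEK", "EDKRKD"]
labels, centroids = kmeans([encode(s) for s in toy_sequences], k=2)
print(labels)
```

K-means only separates the sequences into groups; deciding which cluster corresponds to "fit" and which to "unfit" still requires outside information, such as the fitness indices or a few reference sequences of known behaviour.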
Antibodies development classification:
Antibodies are one of the most important protein classes to be subjected to artificial design, affinity maturation, and development pipelines. A single model that fully processes libraries of therapeutic antibodies is still lacking; however, integrating different models can boost the performance of antibody-specific directed evolution models. For this purpose, we worked on a model that can direct a future generator function for antibody development. The training datasets were adapted from the IEDB and SAbDab databases (Dunbar et al., 2014). First, a paratope prediction classifier was built and tested for its accuracy in identifying paratope sequences. Next, the generated sequences can be scored against the target antigen by an affinity regressor based on the PPI affinity model. After candidate hits are obtained, the top sequences can be passed to a humanization classifier that keeps only sequences with a high humanization probability. Finally, it is essential to check the developability of the antibody sequence. For this purpose, we built a developability model that guides the final filtration step by selecting only sequences with a high developability probability.
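The four-stage flow described above (paratope, affinity, humanization, developability) can be pictured as a cascade of score-and-threshold filters. The sketch below shows only that control flow. The `Stage` class, the thresholds, and especially the lambda scorers are hypothetical stand-ins chosen so the example runs; they bear no relation to the trained models.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Stage:
    name: str
    score: Callable[[str], float]  # stand-in for a trained model's prediction
    threshold: float               # minimum score needed to pass this stage


def run_pipeline(candidates: Iterable[str], stages: List[Stage]) -> List[str]:
    """Pass candidates through each stage in order, keeping those that pass."""
    survivors = list(candidates)
    for stage in stages:
        survivors = [seq for seq in survivors if stage.score(seq) >= stage.threshold]
    return survivors


# Arbitrary placeholder scorers: NOT real paratope, affinity, humanization,
# or developability models.
stages = [
    Stage("paratope", lambda s: s.count("Y") / len(s), 0.2),
    Stage("affinity", lambda s: min(1.0, len(s) / 10), 0.5),
    Stage("humanization", lambda s: 0.0 if "X" in s else 1.0, 0.5),
    Stage("developability", lambda s: 1.0 - s.count("C") / len(s), 0.8),
]

candidates = ["YYSGA", "AAAAA", "YYXGA", "YYCCA"]
hits = run_pipeline(candidates, stages)
print(hits)
```

Ordering the stages from cheapest to most expensive model keeps the costly predictions for the few sequences that survive the earlier filters, which is one reason to chain separate models rather than score every candidate with all of them.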
Virus-Like Particles (VLPs):
VLPs are virus-derived nanoparticles made up of one or more different molecules that can self-assemble, mimicking the form and size of a virus particle but lacking the genetic material, so they cannot infect the host cell. They constitute a powerful and flexible platform that harnesses the immunogenicity of viruses without compromising the safety of the recipient. VLPs may be loaded with innate stimuli, further enhancing their immunogenic role. Because of these favourable characteristics, VLPs have been used extensively as vaccines. With the VLP platform, even short peptides identified in tumor cells (which would otherwise be quickly cleared) can be presented to Dendritic Cells (DCs) to induce a cellular or humoral response against cancer. VLPs can target solid tumors or Cancer Stem Cells (CSCs), with the potential to be used as prophylactic or therapeutic cancer vaccines, alone or in combination with chemotherapy, checkpoint inhibitors, or future therapies.
A foreign peptide can be inserted genetically into the coding sequence of a viral structural protein. When expressed in an appropriate host, the protein self-assembles into a VLP with the peptide exposed on its surface. For this purpose, we used the Hepatitis B Virus (HBV) core protein. Genetic insertion of foreign sequences sometimes prevents the recombinant protein from folding properly. In the case of HBV core proteins, this problem has been largely overcome through the use of a series of genetic variants that can be employed in a combinatorial fashion to find a VLP that tolerates almost any desired insertion.
References:
1. World Health Organization (WHO), Global Cancer Observatory (Globocan), https://gco.iarc.fr/today/data/factsheets/populations/818-egypt-fact-sheets.pdf.
2. Mbonigaba J, Akinola WG. Productivity and Income Effect of Breast Cancer among Women in Southwestern Nigeria. Economies. 2021; 9(3):129.
3. Donaldson, B., Lateef, Z., Walker, G. F., Young, S. L., & Ward, V. K. (2018). Virus-like particle vaccines: immunology and formulation for clinical translation. Expert review of vaccines, 17(9), 833–849.
4. Syomin, B. V., & Ilyin, Y. V. (2019). Virus-Like Particles as an Instrument of Vaccine Production. Molecular biology, 53(3), 323–334.
5. Aguilar, P. V., Adams, A. P., Wang, E., Kang, W., Carrara, A. S., Anishchenko, M., Frolov, I., & Weaver, S. C. (2008). Structural and nonstructural protein genome regions of eastern equine encephalitis virus are determinants of interferon sensitivity and murine virulence. Journal of virology, 82(10), 4920–4930.
6. Cobb, R. E., Sun, N., & Zhao, H. (2013). Directed evolution as a powerful synthetic biology tool. Methods (San Diego, Calif.), 60(1), 81–90.
7. Cobb, R. E., Chao, R., & Zhao, H. (2013). Directed Evolution: Past, Present and Future. AIChE journal. American Institute of Chemical Engineers, 59(5), 1432–1440.
8. Lundstrom K. (2019). Plasmid DNA-based Alphavirus Vaccines. Vaccines, 7(1), 29.
9. Moradi Vahdat, M., Hemmati, F., Ghorbani, A., Rutkowska, D., Afsharifar, A., Eskandari, M. H., Rezaei, N., & Niazi, A. (2021). Hepatitis B core-based virus-like particles: A platform for vaccine development in plants. Biotechnology reports (Amsterdam, Netherlands), 29, e00605.
10. Yang, K.K., Wu, Z. & Arnold, F.H. Machine-learning-guided directed evolution for protein engineering. Nat Methods 16, 687–694 (2019).
11. Dunbar J, Krawczyk K, Leem J, Baker T, Fuchs A, Georges G, Shi J, Deane CM. SAbDab: the structural antibody database. Nucleic acids research. 2014 Jan 1; 42(D1):D1140-6.
12. Gado, N., Ibrahim, D., Atef, D., & Kanaan, A. (2016). Clinical characteristics of triple negative breast cancer in Egyptian women: a hospital-based experience. Cancer Ther Oncol Int J, 4, 426.
13. Makar, W. S. (2019). Clinicopathological Characteristics and Survival of Triple-Negative Breast Cancer Patients: A single Institution Study from Egypt. Research in Oncology, 15(1), 31-34.
14. Jomaa MK, Nagy AA. Survival outcomes in Egyptian patients with triple-negative breast cancer: Single institute experience. J Clin Oncol [Internet]. 2015 Oct 1; 33(28_suppl):158.