
Deep Learning

Detection of Plastic Bottles

Plastic pollution is a severe problem that impacts both ecosystems and our daily lives. The ocean carries a huge share of it: an estimated 5.25 trillion pieces of plastic and microplastic are currently floating in the ocean, and about 15% of them will eventually land on our beaches[1]. In response, we set out to develop a deep learning PET bottle detection model for mapping plastic pollution on beaches. The resulting data lets governments, councils, and NGOs get an overview of the current situation and estimate how effective their proposals are at reducing the impact of plastic waste, and it helps researchers prepare effective prevention and cleanup plans and decide where those efforts should focus. Our ultimate goal is to help reduce the amount of plastic pollution in the ocean.

Our Workflow

Photo Taking

Using our drone and phones, we took 718 images along the coastlines of Hong Kong beaches, including Cheung Chau and Cheung Sha, and uploaded them to CVAT (Computer Vision Annotation Tool)[2], an annotation website provided to us by Clearbot, a company that builds marine plastic-clearing robots, to create ground-truth instances for the training process.

Fig 1. Students using a drone to capture images on beaches

Volunteer Plastic Tagging

Together with the human practices team, we organized a plastic-tagging activity at our school. We invited around five students from each class, 60 students in total, and taught them how to trace polygons around plastic objects in the images in CVAT. Together with some of our team members, they generated the training data.

We then trained models on the annotated data using Detectron2 from Facebook AI Research to obtain plastic-detecting models.
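CVAT can export the tagged polygons in COCO format, which Detectron2 reads directly; a minimal sketch of registering such an export (the file paths and dataset names here are illustrative placeholders, not our actual file names):

    from detectron2.data.datasets import register_coco_instances

    # Register the CVAT export (COCO JSON + image folder) under names
    # the trainer can refer to later.
    register_coco_instances("beach_bottles_train", {}, "beach_train.json", "beach_train/")
    register_coco_instances("beach_bottles_val", {}, "beach_val.json", "beach_val/")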

Model Description

The detection algorithm we use is Mask R-CNN[3], an object detection algorithm developed by Facebook AI Research. It extends Faster R-CNN, which only draws bounding boxes around detected objects, by also predicting a segmentation mask for each object. We use Detectron2[4] as the framework for building our model, and we start from the pretrained baselines in Detectron2's Model Zoo[5] for transfer learning to improve model performance.
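As an illustration, a Model Zoo baseline can be loaded for transfer learning roughly like this; the exact configuration we used is in our GitHub repository, so treat this as a minimal sketch:

    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    cfg = get_cfg()
    # Start from a Mask R-CNN baseline pretrained on COCO (here the X101-FPN variant)
    baseline = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
    cfg.merge_from_file(model_zoo.get_config_file(baseline))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(baseline)  # pretrained weights
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # a single category: PET bottle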

Structure

To get an intuitive understanding of the model structure, suppose we have a video from which we want to detect PET bottles. We process each frame of the video by passing it to a convolutional neural network, which uses learned filters as kernels to extract features from the image, such as shapes, reflections, and highlights, and produces a feature map.

Fig. 2 Structure of the feature-extracting backbone
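To make this concrete, here is a small illustration (not our training code) of a convolutional backbone turning one frame into a spatially smaller feature map, using a torchvision ResNet-50 with its classification layers removed:

    import torch
    from torchvision.models import resnet50

    # Keep only the convolutional stages of ResNet-50 (drop avgpool + fc)
    backbone = torch.nn.Sequential(*list(resnet50(pretrained=True).children())[:-2])
    frame = torch.randn(1, 3, 480, 640)   # one video frame as a (B, C, H, W) tensor
    with torch.no_grad():
        feature_map = backbone(frame)
    print(feature_map.shape)              # torch.Size([1, 2048, 15, 20])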

The feature map is then passed into a Region Proposal Network (RPN), a small neural network that proposes regions as bounding boxes and scores whether each one contains an object. Using the feature map from the previous stage, ROI Align is applied to each region of interest to obtain a fixed-size input for the next stage. The output of the ROI Align layer is passed into two heads, a fully connected layer and a small fully convolutional network, which together classify the object and generate its mask.

Fig. 3 Structure of the RPN, ROI Align, and the two following network heads
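The key property of ROI Align is that differently shaped proposals all come out at the same fixed size; a minimal demonstration with torchvision's roi_align, using made-up box coordinates:

    import torch
    from torchvision.ops import roi_align

    feature_map = torch.randn(1, 256, 50, 50)  # (batch, channels, H, W) from the backbone
    # Two proposals as (batch_index, x1, y1, x2, y2), in feature-map coordinates
    boxes = torch.tensor([[0., 4., 4., 20., 30.],
                          [0., 10., 2., 44., 18.]])
    pooled = roi_align(feature_map, boxes, output_size=(7, 7))
    print(pooled.shape)  # torch.Size([2, 256, 7, 7]): fixed size regardless of box shape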

Training

To optimize the results, we train the network until its output on the training data is close to the ground truth; in other words, the training objective is to minimize the difference between them. That difference is defined by a loss function, which for Mask R-CNN combines the errors of bounding-box prediction, classification, and mask prediction. To validate the training results, we compute the COCO mAP[6].
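In Detectron2, the COCO mAP of a model on the validation set can be computed with the built-in evaluator; a sketch, assuming the dataset names registered earlier and an already trained model (the exact evaluator signature varies slightly across Detectron2 versions):

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset

    evaluator = COCOEvaluator("beach_bottles_val", output_dir="./eval")
    val_loader = build_detection_test_loader(cfg, "beach_bottles_val")
    results = inference_on_dataset(model, val_loader, evaluator)  # model: the trained network
    print(results["segm"]["AP"])  # mask mAP, averaged over IoU thresholds 0.50:0.95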

Model Usage

Requirements:

A local Linux environment with:

  • Detectron2
  • Jupyter Notebook
  • PyTorch 1.8
  • torchvision
  • OpenCV
  • NumPy

or a Google Colab notebook.

Files and guidelines for the code can be found on our GitHub.
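For reference, running a trained model on a new beach photo with Detectron2 looks roughly like this, continuing from the configuration sketch above; the weight path and image name are placeholders:

    import cv2
    from detectron2.engine import DefaultPredictor

    cfg.MODEL.WEIGHTS = "output/model_final.pth"   # placeholder path to trained weights
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5    # confidence cutoff for detections
    predictor = DefaultPredictor(cfg)

    image = cv2.imread("beach_photo.jpg")          # hypothetical input image (BGR)
    outputs = predictor(image)
    print(len(outputs["instances"]), "PET bottles detected")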

Model Results

Training / Validation Configurations

We performed data augmentation on the training images, including random flipping and random brightness and contrast scaling between 0.9 and 1.1. Each iteration processes 2 images from the training data for mini-batch stochastic gradient descent. Our training and validation dataset contains 718 images in total, split in an 8:2 ratio: 574 images (1145 instances) for training and 144 images (193 instances) for validation. We trained the models on both Google Colaboratory and Kaggle, using Nvidia P100 and K80 GPUs, for 1000 iterations. For validation, we recorded the mean Average Precision (mAP) of the model on the validation dataset and the training loss every 20 iterations, and plotted them as line graphs.
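These settings map onto the Detectron2 config and augmentation API roughly as follows (a sketch, not a verbatim copy of our notebook; the dataset names are the ones registered above):

    from detectron2.data import transforms as T

    augmentations = [
        T.RandomFlip(),                   # random flipping
        T.RandomBrightness(0.9, 1.1),     # brightness scaled between 0.9 and 1.1
        T.RandomContrast(0.9, 1.1),       # contrast scaled between 0.9 and 1.1
    ]

    cfg.DATASETS.TRAIN = ("beach_bottles_train",)
    cfg.DATASETS.TEST = ("beach_bottles_val",)
    cfg.SOLVER.IMS_PER_BATCH = 2          # 2 images per iteration (mini-batch SGD)
    cfg.SOLVER.MAX_ITER = 1000            # 1000 training iterations
    cfg.TEST.EVAL_PERIOD = 20             # evaluate on the validation set every 20 iterations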

We trained the model from different baselines in the Detectron2 Model Zoo[5], including X101-FPN, R101-FPN, and R50-FPN. We also performed power estimations by training on fractions of 0.25, 0.5, and 0.75 of the training dataset, to find out how much training data is needed to maximize model performance. One way to set this up is sketched below.
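A simple way to run such a power estimation is to register random subsets of the training set and retrain on each; a sketch (register_fraction is our illustrative helper, not code from our repository):

    import random
    from detectron2.data import DatasetCatalog

    def register_fraction(name, frac, seed=42):
        # Register a random fraction of the full training set under a new name
        full = DatasetCatalog.get("beach_bottles_train")
        subset = random.Random(seed).sample(full, int(len(full) * frac))
        DatasetCatalog.register(name, lambda subset=subset: subset)

    for frac in (0.25, 0.5, 0.75):
        register_fraction(f"beach_bottles_train_{frac}", frac)
        # retrain with cfg.DATASETS.TRAIN set to this name and record the resulting AP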

Baseline results

Table 1: mAP results on the validation dataset for models trained from different baselines. X101-FPN outperforms the other models we tested.

Table 2: mAP results from the Mask R-CNN paper.

As shown in Table 1, the X101-FPN backbone, which combines the concepts of ResNet and Inception, outperforms the R50-FPN and R101-FPN baselines in overall mean Average Precision (+3.7 and +2.5 respectively). Compared with the results from the Mask R-CNN paper, where the model is trained on the COCO dataset of 330k images covering 80 object categories, our models clearly have a higher mAP. Sample detection images are shown below.

Fig. 4 Three sample images from the validation set, as detected by the model with the X101-FPN baseline

Potential reasons for high AP despite imperfect detections

Although we achieved a high Average Precision, the detections were not perfect; there are clearly some false positives in the sample images.

The first reason is probably that our model only detects a single object category, PET bottles; the Mask R-CNN benchmark detects 80 categories in total, which drags its AP down.

Our dataset also has flaws. The most obvious is the lack of both training and validation data compared with large-scale datasets such as Pascal VOC, ImageNet, and COCO, which contain more than 10,000 instances per category; this limits our model's ability to learn more features of PET bottles. Moreover, our dataset contains images of the same objects from different angles; the model comes to rely on these near-duplicates and can only detect bottles with similar features, introducing bias.

Finally, we did not use more robust validation methods such as k-fold cross-validation, where the dataset is divided into k sections and the AP is averaged over runs that each hold out a different section for validation; a minimal sketch is shown below.
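For completeness, k-fold cross-validation on our 718 images would look roughly like this, with train_and_evaluate_ap standing in for a full Detectron2 training-plus-evaluation run (a hypothetical helper, not code from our repository):

    import numpy as np
    from sklearn.model_selection import KFold

    image_ids = np.arange(718)            # indices of all annotated images
    kf = KFold(n_splits=5, shuffle=True, random_state=0)

    fold_aps = []
    for train_idx, val_idx in kf.split(image_ids):
        # hypothetical helper: train on one split, return AP on the held-out split
        fold_aps.append(train_and_evaluate_ap(train_idx, val_idx))

    print("cross-validated AP:", np.mean(fold_aps))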

Fig. 5 The training curves of X101-FPN, R101-FPN, and R50-FPN (top to bottom)

Fig. 5 shows the training curves for the models, where the blue line is the mAP and the purple line is the loss. The mAP of the R50-FPN and R101-FPN models stabilized at around 500 iterations, while the X101-FPN model stabilized at around 400 iterations, earlier than the rest. The losses all levelled off at around 600 iterations, and after that mark there is no sudden drop in mAP, so the models were not overfitting.

Power Estimations of the dataset

As the graphs clearly show, our small dataset is not nearly enough to reach maximum performance. The curve of AP against the percentage of training images used is still rising steeply, which means that if we further expand our dataset it should continue to show AP improvements.

Future Plans/Implementations

Plans

  • Increase the size of training and validation data to improve model accuracy and ensure a reliable mAP
  • Train other algorithms, e.g. YOLOv3, on our dataset.

Implementations

  • Verify the effects of actions taken against coastline plastic pollution by different stakeholders (e.g. producer responsibility schemes introduced by the government, or how long the effect of an NGO beach cleanup lasts)
  • Use the model as the backbone of a plastic cleanup robot operating along the coastline.
  • Map plastic waste along the coastline using drones to generate a cleanup plan.

References
[1] https://www.condorferries.co.uk/marine-ocean-pollution-statistics-facts
[2] https://github.com/openvinotoolkit/cvat
[3] https://arxiv.org/abs/1703.06870
[4] https://github.com/facebookresearch/Detectron
[5] https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md#coco-instance-segmentation-baselines-with-mask-r-cnn
[6] https://jonathan-hui.medium.com/map-mean-average-precision-for-object-detection-45c121a31173
[7] https://www.itc.gov.hk/en/fund_app/patent_app_grant.html
