Micronucleus Counter | iGEM HUST-China

It is the world's first micronucleus recognition counter for ordinary light microscope photographs. This tool greatly reduces experimental costs, in both equipment and labor, and the evaluation results it produces are highly consistent with the recognized evaluation standards.

PART 1 Algorithm structure

We chose the YOLOv5 algorithm framework.

This algorithm applies Mosaic data augmentation, adaptive anchor box calculation, adaptive image scaling and other processing methods to the input data. The backbone uses the Focus structure and the CSP structure, and the neck introduces the CSP2 structure, which strengthens the model's feature fusion. Finally, at the output end, GIOU_Loss is used as the loss function for the bounding boxes, and non-maximum suppression (NMS) is used to ensure that no object is counted twice.
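To make the last two ideas concrete, here is a minimal pure-Python sketch of the GIoU measure (the loss is 1 minus GIoU) and of greedy non-maximum suppression. It is an illustration only, not the code inside the framework; the function names and the 0.45 threshold are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corner format


def area(b: Box) -> float:
    """Area of a box; degenerate (inverted) boxes count as zero."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou_and_giou(a: Box, b: Box) -> Tuple[float, float]:
    """Return (IoU, GIoU) for two boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest box that encloses both a and b.
    enclosing = area((min(a[0], b[0]), min(a[1], b[1]),
                      max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (enclosing - union) / enclosing if enclosing > 0 else iou
    return iou, giou


def giou_loss(a: Box, b: Box) -> float:
    """GIoU loss: 0 for identical boxes, approaching 2 for distant ones."""
    return 1.0 - iou_and_giou(a, b)[1]


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.45) -> List[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou_and_giou(boxes[i], boxes[k])[0] <= iou_threshold
               for k in keep):
            keep.append(i)
    return keep
```

In a counting task, NMS is what prevents one micronucleus detected by two overlapping boxes from being counted twice.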

PART 2 Results

Our runtime environment is PyCharm, with Python + TensorFlow + Torch + Scikit-learn. You can download the source code and find the dependencies listed in requirements.txt. Install them as shown below:

pip install -r requirements.txt

Taking our micronucleus counting as an example, we wrote the following script:

import os
for dirname in os.listdir(r'F:\venv\work'):
    os.system('python ./ --weights F:/venv/MCNcounter/runs/train/exp2/weights/ --source F:/venv/work/{0}  --imgsz 2592  --save-txt --name F:/venv/results/{0} --line-thickness 1 >> F:/venv/logs/{0}_log.txt'.format(dirname))

Here, F:\venv\work is the directory that contains the pictures to be analyzed.

F:\venv\MCNcounter\runs\train\exp2\weights\ is the location of the weight file. All of these paths can be customized, as explained in PART 3.
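The same batch loop can also be written with subprocess, which avoids building one long shell string. The sketch below is only an illustration under assumptions: the script name and weight file are passed in as parameters because they are not spelled out above, and `build_command` and `run_batch` are hypothetical helper names. The flags are the ones used in the script above.

```python
import os
import subprocess
from typing import List


def build_command(script: str, weights: str, source_dir: str,
                  results_dir: str, imgsz: int = 2592) -> List[str]:
    """Build the detection command as an argument list (no shell quoting)."""
    return [
        "python", script,
        "--weights", weights,
        "--source", source_dir,
        "--imgsz", str(imgsz),
        "--save-txt",
        "--name", results_dir,
        "--line-thickness", "1",
    ]


def run_batch(script: str, weights: str, work_root: str,
              results_root: str, logs_root: str) -> None:
    """Run detection once per sub-folder of work_root, one log per folder."""
    os.makedirs(logs_root, exist_ok=True)
    for dirname in sorted(os.listdir(work_root)):
        source_dir = os.path.join(work_root, dirname)
        if not os.path.isdir(source_dir):
            continue  # skip stray files next to the image folders
        cmd = build_command(script, weights, source_dir,
                            os.path.join(results_root, dirname))
        log_path = os.path.join(logs_root, f"{dirname}_log.txt")
        # Append standard output to the log, like ">>" in the shell version.
        with open(log_path, "a", encoding="utf-8") as log:
            subprocess.run(cmd, stdout=log, check=False)
```

Passing the arguments as a list means that folder names containing spaces do not break the command.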

These are the running results:

Figure 2: Left: raw photo, Right: labeled photo

The output includes log.txt and the labeled images, so you can extract the labels directly and also check whether the labels mark the micronuclei accurately.
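As one way to extract the labels, the sketch below counts detections in a label file written with --save-txt, where each line has the form "class x_center y_center width height". The class order (0 for "cell", 1 for "micro") and the simple micro-per-cell ratio are assumptions for illustration; check them against your own data configuration and scoring standard.

```python
from collections import Counter
from typing import Dict, Iterable

# Assumed class order; it must match the order in your data configuration.
CLASS_NAMES = {0: "cell", 1: "micro"}


def count_labels(lines: Iterable[str]) -> Dict[str, int]:
    """Count detections per class from the lines of one label file."""
    counts = Counter({name: 0 for name in CLASS_NAMES.values()})
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        class_id = int(parts[0])
        counts[CLASS_NAMES.get(class_id, f"class_{class_id}")] += 1
    return dict(counts)


def micronucleus_rate(counts: Dict[str, int]) -> float:
    """Micronuclei per detected cell; 0.0 when no cell was detected."""
    cells = counts.get("cell", 0)
    return counts.get("micro", 0) / cells if cells else 0.0
```

For a real file, pass the open file object to `count_labels`, for example `count_labels(open(path, encoding="utf-8"))`.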

PART 3 Customization of data set

In general, the YOLOv5 framework can be trained to identify and label objects in any image (or in a video, read frame by frame). Because it is used here only as a micronucleus counter, and to keep the model efficient, we do not describe every customizable configuration file and model file. To simplify the explanation of how to customize the data set, we provide the training interface for the case where the only labels are "cell" and "micro" (referring to micronucleus).

Still, the source code is provided, and you can learn more about customization online.

The steps to customize the training set are listed below:

  • Input this at the command terminal:

  • If the "labelimg" package isn't installed, please run this beforehand:

    pip3 install labelimg
  • Put the pictures for training into ./MSNcounter/prepare_data_voc/images, then click "Open Dir" in the labelimg window and choose that image folder. Click "Change Save Dir" and save the label files in the folder ./MSNcounter/prepare_data_voc/Annotations.

  • Draw a box around each cell or micronucleus in the labelimg window, and name it "cell" or "micro" respectively.

  • Save, and exit labelimg.

  • Run this file from the directory where it is located:

    python3 ./
  • In the same way, run this file from the directory where it is located:

    python3 ./
  • Begin the training:

    python --img 640 --batch 16 --epoch 300 --data data/ab.yaml --cfg models/yolov5s.yaml --weights F:/venv/MCNcounter/runs/train/exp2/weights/ --device '0'

Tip: You can modify the parameters according to your actual configuration. The weight file after training is: ./runs/train/exp*/weights/

  • You can run our script directly with the directory settings shown in PART 2, or set your own directories with the command below:

    python ./ --weights {weights_file(.pt)} --source {the_dir_of_photos}  --imgsz {the_size_of_photos}  --save-txt --name {the_dir_of_results} --line-thickness 1 >> {the_dir_of_logs}
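For readers who want to see what happens between labeling and training, here is a hedged sketch of one conversion step. labelimg saves Pascal VOC XML by default, while YOLOv5 trains on text lines of the form "class x_center y_center width height" with values normalized to the image size. The function below is not the team's preparation script; `voc_to_yolo` and the class order in `CLASS_IDS` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumed class order; it must match the names list in the data YAML file.
CLASS_IDS = {"cell": 0, "micro": 1}


def voc_to_yolo(xml_text: str) -> List[str]:
    """Convert one Pascal VOC annotation (as text) into YOLO label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))
    lines: List[str] = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in CLASS_IDS:
            continue  # skip labels other than "cell" and "micro"
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        # Box centre and size, normalized to the range 0..1.
        x_c = (xmin + xmax) / 2 / img_w
        y_c = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{CLASS_IDS[name]} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each returned line corresponds to one labeled box, so a picture with 40 cells and 2 micronuclei yields 42 lines.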

PART 4 Several tips about the data set

Here is some advice about the data set:

  • Make sure that the samples are collected with the same apparatus and the same parameters.
  • Pay special attention to the quality of the samples. Extraneous stains, too few interphase cells and uneven staining (or staining that is too pale over most of the sample) can lead to unreliable test results. In this case, the positive and negative control groups can only be used as a relative reference, and the widely acknowledged data standards can no longer be applied.



Huazhong University of Sci. & Tech., Wuhan, China

1037# Luoyu Rd, Wuhan, P.R.China 430074

Copyright © HUST-China iGEM 2021