Team:UPF Barcelona/Software IRIS



A computer vision system embedded into a cross-platform application. Intended to act as the bridge between ARIA's computational mechanisms and the user, its goal is to automatically extract the key information from ARIA samples, to reduce that data to simple binary matrices, and to send them over the internet for further processing. Its source code is written so that it can be compiled for Windows, Linux, macOS, Android, and iOS.


With Alpha, we explored the possibility of turning data into useful information, and of translating that information into tools that can be practically brought into the lab. With OmegaCore, we have focused on how to convert this information into analytical power capable of evaluating the results produced by our bio-tools. The next step, then, consists of preparing an accessible interface with which the user can interact with all these systems.

This is the main objective of IRIS, the cross-platform application that we are developing to automatically scan the results of the detections and send them to the rest of Omega. In line with this premise, we divided the development of IRIS into two complementary stages: building the artificial vision system, and writing the source code of an application that can be easily compiled on different operating systems.


The artificial vision system is built around the well-known OpenCV library, specifically in its Python version. Applying different computer vision techniques, the system uses the device's camera to locate the array in question, analyze its content and extract a binary matrix with positive and negative detections in the correct positions. Next, we explain in detail the workflow of the system.


After initializing, the system accesses the first available camera on the device, reads its resolution, and calculates the position of the frame center. It then defines a series of flow control variables and loads parameters such as the detection array dimensions (rows and columns). Once this is done, the system enters the main loop.

Main Loop

In each iteration of the loop, different specialized functions are called to perform all the required tasks (these functions are explained in more detail in subsequent sections). On the one hand, what the user experiences is just a continuous, real-time video signal with different indicators depending on the situation. On the other hand, many processes run internally.

First, a frame is captured and passed to a search function that detects whether there is an array in the image. If so, the frame is sent to the analysis function, which evaluates each of the cells; if not, execution simply continues. Regardless of the previous steps, a guide is drawn on the frame with the approximate size and position that the matrix should present, as well as a small text indicating how it should be oriented. All this signaling is then displayed on the video signal in real-time to assist the user when making the detection. Subsequently, a control function is called to check for user inputs. Finally, a last management structure, based on the result of the control function, exports the matrix obtained, stops the execution of the program, or allows the whole cycle to start over.
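The control flow of one loop iteration can be sketched in plain Python. This is only an illustration of the order of operations described above, not the actual IRIS code: the function name `run_main_loop`, the injected callables, and the command strings `"export"` and `"stop"` are all hypothetical. It also assumes that the most recent matrix is kept until the user asks for it.

```python
def run_main_loop(capture, search, analyze, draw_guide, show, read_control,
                  max_iterations=None):
    """Hypothetical sketch of the loop: capture, search, analyze, guide, control.

    Returns the latest matrix when the user exports it, or None when stopped.
    """
    matrix = None
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        frame = capture()

        # The search function returns the extracted array image, or None.
        array_image = search(frame)
        if array_image is not None:
            matrix = analyze(array_image)  # assumption: keep the latest result

        # The guide is drawn and shown whether or not an array was found.
        show(draw_guide(frame))

        # Assumed command values: "export", "stop", or None (keep looping).
        command = read_control()
        if command == "export" and matrix is not None:
            return matrix
        if command == "stop":
            return None
    return None
```

Passing the camera, the vision functions, and the input handler as arguments keeps the sketch independent of any particular camera or user interface backend.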

Search module

The array search function operates continuously in real-time, taking each captured frame as input and using different transformations to determine whether the detection array appears in it. If the array is found, the function applies a second block of transformations to separate it from the background, reorient it, and process it for later analysis.

As a first step, the system converts the frame to grayscale and applies a Gaussian filter to homogenize areas of similar intensity and blur the secondary edges. Next, adaptive thresholding is applied to binarize the image, simplifying the geometric shapes that appear. Once this is done, the system searches for the contours of the different geometric shapes and filters out all those that occupy less than 10% of the image. At this point, the surviving geometric shapes are used to generate rectangles that contain them.

From these rectangles, those whose width-to-height ratio is not close to 1 are filtered out: that is, we keep only the squares. Finally, the square with the largest area among those that have passed the filter is selected. This entire procedure is carried out to distinguish the detection array, which is square, from other possible artifacts. The area criterion is applied because the guide shown on the screen instructs the user to hold the camera at the distance at which the array's area meets the established criteria. Through this series of conditions, we try to maximize selectivity when distinguishing the matrix from its environment.
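The selection rule can be illustrated with a few lines of plain Python operating on bounding rectangles given as `(x, y, width, height)` tuples. The function name `select_array_box` and the 15% tolerance on the aspect ratio are hypothetical; the text only says the ratio must be "close to 1".

```python
def select_array_box(boxes, ratio_tolerance=0.15):
    """Return the largest near-square box, or None if no box qualifies.

    boxes: iterable of (x, y, w, h) tuples that already passed the area filter.
    ratio_tolerance: assumed maximum deviation of w / h from 1.
    """
    squares = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if h > 0 and abs(w / h - 1.0) <= ratio_tolerance
    ]
    if not squares:
        return None
    # Among the squares, keep the one with the largest area.
    return max(squares, key=lambda box: box[2] * box[3])


candidates = [(0, 0, 300, 100), (10, 10, 100, 105), (50, 50, 200, 190)]
best = select_array_box(candidates)  # the 200 x 190 box
```

Returning `None` when nothing qualifies gives the caller a simple way to signal that no recognizable array is in view.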

If none of the candidates passes these filters, the function reports that there is no recognizable array in the field of vision, and its participation ends for the current iteration. If, on the contrary, the array has been found, we proceed with its extraction. To achieve this, a binary mask is created in which the edges detected for the matrix are drawn and their interior is filled, leaving the silhouette of the matrix in the mask as ones and the rest as zeros. Thanks to this, an AND operation can be applied to eliminate the entire background from the original image: the pixels of the image that correspond to the matrix are multiplied by 1, remaining unchanged, while those of the background are multiplied by 0, disappearing.
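The masking step can be reproduced with NumPy alone. In the sketch below the mask is a simple rectangular region for brevity, and the function name `remove_background` is hypothetical. With OpenCV, a filled contour mask would typically be drawn with `cv2.drawContours` and combined with the image using `cv2.bitwise_and`.

```python
import numpy as np


def remove_background(image, mask):
    """Multiply an image by a 0/1 mask so that only the masked region survives.

    image: H x W (grayscale) or H x W x 3 (color) array.
    mask:  H x W array of zeros (background) and ones (array region).
    """
    mask = mask.astype(image.dtype)
    if image.ndim == 3:
        # Add a channel axis so the mask broadcasts over the color channels.
        mask = mask[:, :, np.newaxis]
    return image * mask


image = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1  # illustrative region standing in for the filled contour
isolated = remove_background(image, mask)
```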

After doing this, the angle of deviation found in the detection array is used to compute a rotation matrix, which will be applied as an affine transformation to fix the alignment of the object in question. The result: a real-time automatic stabilization system that corrects the orientation of the matrix so that it is always aligned with the screen. After this, the matrix is directly cropped from the rest of the mask, resized, and ready for analysis.
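To make the rotation step concrete, the sketch below writes out a 2x3 affine rotation matrix about a center point, following the convention documented for `cv2.getRotationMatrix2D` with a scale of 1 (angle in degrees, positive values rotating counter-clockwise in image coordinates). The helper names are hypothetical. In practice one would likely call `cv2.getRotationMatrix2D` and `cv2.warpAffine` directly instead of hand-coding the matrix.

```python
import math


def rotation_matrix(center, angle_degrees):
    """2x3 affine matrix that rotates points about `center` (scale fixed to 1)."""
    cx, cy = center
    a = math.cos(math.radians(angle_degrees))
    b = math.sin(math.radians(angle_degrees))
    return [
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# The center of rotation is a fixed point of the transformation.
m = rotation_matrix((50.0, 40.0), 30.0)
fixed = apply_affine(m, (50.0, 40.0))
```

Rotating about the center of the detected array, rather than the image origin, keeps the array in place while its orientation is corrected.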

Analysis function

The analysis function also runs in real-time, but only while a detection array is present. Once it receives the preprocessed mask, it refines the edges and executes another Grayscale + Gaussian Filter + Adaptive Thresholding transformation block. This results in a binary image that only shows regions of contrasting intensity, such as the lines that form the array or the stains in the positive cells.

At this point, knowing the number of rows and columns in the array and thanks to the angle correction made before, a simple grid can be generated so that its cells overlap with those of the array. Next, for each of these cells in the binary image, the number of non-null pixels is evaluated. Because of how the transformations are parameterized, those cells where there is a region with intensity contrast have a greater number of non-null pixels. Thus, by establishing a counting threshold, the system can distinguish when a cell is positive and when it is negative, and this information can be transferred into a binary matrix: the output of the system.
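The grid-and-count idea can be sketched with NumPy as follows. The function name `read_cells` and the threshold, expressed here as a fraction of each cell's pixels (`min_fraction=0.2`), are assumptions, since the actual counting threshold is not given. The sketch also ignores the array's own grid lines, which a real implementation would need to exclude or account for.

```python
import numpy as np


def read_cells(binary, rows, cols, min_fraction=0.2):
    """Turn a binary image of the array into a rows x cols matrix of 0/1.

    A cell counts as positive when at least `min_fraction` of its pixels
    are non-zero (assumed threshold).
    """
    height, width = binary.shape
    result = np.zeros((rows, cols), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            # Integer grid boundaries of cell (r, c).
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = binary[y0:y1, x0:x1]
            if np.count_nonzero(cell) >= min_fraction * cell.size:
                result[r, c] = 1
    return result


# Synthetic 3 x 3 array with stains in cells (0, 1) and (2, 2).
binary = np.zeros((60, 90), dtype=np.uint8)
binary[5:15, 35:55] = 255
binary[45:55, 65:85] = 255
matrix = read_cells(binary, rows=3, cols=3)
```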


Once we have the central component that carries out the artificial vision process, we need to build a structure that enables its deployment and its connection with the rest of Omega. To solve this problem, we have been developing source code that acts as a template for a cross-platform application, written to be compiled on the main operating systems found on both personal computers and mobile devices.

Introduction to Kivy

To achieve this, we used the Kivy library as the heart of IRIS. Kivy is a set of open-source tools for the development of applications and user interfaces that can be run on Linux, Windows, OS X, Android, iOS, and Raspberry Pi. It allows us to write a common codebase that, with minimal adaptations, can later be compiled and distributed on all these platforms. In addition, Kivy has its own graphics engine, high flexibility when it comes to communicating and integrating with other subsystems, and a wide community of developers that supports and expands it. For all these reasons, we decided to base the IRIS infrastructure on this library.

Constructing the app

At the user experience level, the application shows three buttons: one to activate the camera, another to take captures, and another to stop its execution. In addition, when the camera is active, the real-time video is shown. Finally, a slot is also displayed on the screen to write the contact address that will be sent to Omega along with the processed data.

Regarding its ins and outs, the functionality and structure of the application are distributed as a series of classes. The main class contains a method to build the interface and orchestrate the app's general workflow, being also able to stop its execution. Each of the buttons is assigned to a specific control variable, and also a clock is used for the cyclical execution of the artificial vision system and the associated subsystems. The processing class contains the artificial vision system per se, with the properties and attributes explained above, and with links to communicate with the rest of the components.

Finally, the interface class is responsible for connecting the highest-level structures of the application, such as user interaction, with the computer vision system. This includes the rendering mechanisms that show the video signal on the screen (with specific Kivy methods, based on textures), the management of control variables according to internal processes and the data produced by the artificial vision system, the handling of user inputs, and the operation of the communications module with which IRIS links to Omega.