The award-winning solution, which bagged a cash prize of Rs. 3 lakhs, demonstrates the robustness of an optical tracking algorithm that can not only enhance navigation but also enable real-time tracking of autonomous flying objects.
As part of the Indian Navy’s Innovation and Indigenisation Seminar ‘Swavalamban 2024’, held in October, the Navy also kicked off a nationwide competition aimed at addressing real-world operational challenges with innovative technological solutions. Participants, who could enrol in teams of one to five, were presented with a number of problem statements to choose from. These ranged from developing an application load balancer on OpenStack for dynamic traffic distribution and health monitoring, to creating a decentralised system for drone swarm coordination, to building a solution for maritime situational awareness, to developing an AI/ML solution that can identify and separate speech from a group of simultaneous speakers, to enabling navigation and real-time tracking of flying objects.
Taking inspiration from his own research work presented at the IEEE International Conference on Robotics and Automation (ICRA) 2023, Rishabh Bhattacharya, a third-year student of the dual-degree BTech and MS by Research programme in Computer Science and Engineering, opted to challenge himself by developing an optical flow tracking algorithm capable of sub-pixel accuracy to enhance navigation and real-time tracking of flying objects. “One of the criteria laid out was for the solution to demonstrate resilience to varying lighting conditions, rapid movements, and complex textures while maintaining efficiency on platforms like drones or embedded systems,” he reveals, adding that the objective also called for a sophisticated algorithm whose robustness and scalability would be established through comprehensive testing and evaluation.
The Problem
In any autonomous system such as a drone or a self-driving vehicle, accurate motion detection and tracking are essential for navigating dynamic environments and avoiding obstacles. Real-time tracking of flying objects like birds, other drones, and UAVs is also crucial for collision avoidance and situational awareness. Optical flow, a key computer vision technique, estimates the motion of objects and surfaces by analysing the movement of pixels between consecutive image frames. According to Rishabh, achieving high precision at a sub-pixel level is challenging but vital for fine-grained motion estimation and tracking. “Additionally, tracking flying objects introduces complexities due to their rapid and unpredictable movements, necessitating advanced detection and tracking mechanisms that can operate seamlessly in real-time,” he remarks.
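To make the idea concrete, here is a minimal sketch of sub-pixel optical flow tracking using OpenCV’s pyramidal Lucas-Kanade tracker, which reports point positions as floating-point (sub-pixel) coordinates. This illustrates the general technique only; it is not Rishabh’s implementation, and the feature-detection parameters are illustrative.

```python
import cv2

def track_points(prev_frame, next_frame):
    """Track corner features between two frames at sub-pixel precision."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)

    # Corner features to follow; returned as float32 (sub-pixel) coordinates.
    points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                     qualityLevel=0.01, minDistance=7)

    # Pyramidal Lucas-Kanade estimates each point's new position; the
    # epsilon criterion (0.01 px) drives the sub-pixel refinement.
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, points, None,
        winSize=(21, 21), maxLevel=3,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

    # Keep only the points that were tracked successfully.
    good = status.ravel() == 1
    return points[good].reshape(-1, 2), new_points[good].reshape(-1, 2)
```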
What He Did
One of the first challenges Rishabh faced was the lack of comprehensive datasets covering a wide variety of flying objects, including planes, helicopters and UAVs. To address this, he integrated the Flying Objects dataset from Sekilab, which includes planes, helicopters and birds, with the UAV dataset available on Kaggle, resulting in a diverse and extensive dataset tailored to the specific needs of the project. In addition, to facilitate effective training and evaluation of the optical flow tracking algorithm, Rishabh created a synthetic dataset. This involved applying semantic separation techniques to isolate individual objects and systematically moving them across the screen. “By doing so, the dataset simulates various motion scenarios, providing a controlled environment for assessing the optical flow tracking capabilities. In fact, the combined dataset, totalling 7.7 gigabytes, is slated for public release to benefit the broader research community,” he mentions.
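The cut-and-paste idea behind such a synthetic dataset can be sketched roughly as follows: a segmented object (with an alpha mask) is composited onto a background at a sequence of known offsets, so every frame comes with ground-truth motion. The function name and mask format here are hypothetical simplifications, not the author’s actual pipeline.

```python
import numpy as np

def make_motion_sequence(background, obj_rgba, start_xy, velocity_xy, n_frames):
    """background: HxWx3 uint8; obj_rgba: hxwx4 uint8, alpha channel = object mask.

    Returns the frames plus the ground-truth top-left position of the object
    in each one. Bounds checking is omitted for brevity: the object is assumed
    to stay inside the frame.
    """
    h, w = obj_rgba.shape[:2]
    alpha = obj_rgba[:, :, 3:4].astype(np.float32) / 255.0  # per-pixel blend weight
    frames, positions = [], []
    for t in range(n_frames):
        x = int(start_xy[0] + velocity_xy[0] * t)
        y = int(start_xy[1] + velocity_xy[1] * t)
        frame = background.copy()
        patch = frame[y:y + h, x:x + w].astype(np.float32)
        blended = alpha * obj_rgba[:, :, :3] + (1.0 - alpha) * patch
        frame[y:y + h, x:x + w] = blended.astype(np.uint8)
        frames.append(frame)
        positions.append((x, y))  # known displacement = optical-flow ground truth
    return frames, positions
```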
To enhance the robustness of the object detection component of the optical flow tracker, Rishabh utilised the framework that he had initially proposed in Prof Madhava Krishna’s research paper titled “GDIP: Gated Differentiable Image Processing for Object Detection in Adverse Conditions”. The framework introduces a domain-agnostic network architecture that can be integrated into existing object detection models such as YOLOv8 to improve performance under challenging environmental conditions like fog and low lighting.
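The gist of GDIP, as described in the paper, is a bank of differentiable image-processing operations whose outputs are blended by learned gates before being fed to the detector. The PyTorch module below is a heavily simplified, hypothetical reading of that idea (two toy operations plus identity); it is not the released GDIP code.

```python
import torch
import torch.nn as nn

class GatedImageProcessing(nn.Module):
    """Toy gated image-processing block: a few differentiable enhancement
    operations run in parallel, and learned gates blend their outputs."""

    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(1))     # learnable gamma exponent
        self.contrast = nn.Parameter(torch.ones(1))  # learnable contrast scale
        # Tiny gate network: per-image channel means -> softmax weights,
        # one weight per operation (identity, gamma, contrast).
        self.gate_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3, 3), nn.Softmax(dim=1))

    def forward(self, x):  # x: N x 3 x H x W, values in [0, 1]
        ops = torch.stack([
            x,                                              # identity
            x.clamp(min=1e-6) ** self.gamma,                # gamma correction
            (self.contrast * (x - 0.5) + 0.5).clamp(0, 1),  # contrast stretch
        ], dim=1)                                           # N x 3ops x C x H x W
        gates = self.gate_net(x)                            # N x 3ops
        enhanced = (gates.view(-1, 3, 1, 1, 1) * ops).sum(dim=1)
        return enhanced  # fed to the downstream detector, e.g. YOLOv8
```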
Performance And Efficiency
The enhanced YOLOv8 model, augmented with the GDIP framework, was trained on the combined dataset over 50 epochs. Rishabh also employed optimisation techniques to reduce processing time, thereby ensuring the model’s suitability for real-time applications. He reports that the model was fine-tuned to process each frame of a GIF or incoming video stream in approximately 2 milliseconds. “When we tested the model in varying lighting conditions, with complex textures, and unpredictable object movements, the algorithm maintained a high level of performance, demonstrating its resilience to environmental challenges,” he remarks.
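For readers who want to reproduce the training setup in spirit, a run of this kind can be expressed in a few lines with the Ultralytics YOLOv8 API. The dataset YAML name is a placeholder, and the ONNX export is just one common route to lower per-frame latency; the article does not specify which optimisations were actually used.

```python
from ultralytics import YOLO

# Start from a pretrained YOLOv8 checkpoint and fine-tune it on the combined
# flying-objects dataset; "flying_objects.yaml" is a placeholder name.
model = YOLO("yolov8n.pt")
model.train(data="flying_objects.yaml", epochs=50, imgsz=640)

# Exporting to an optimised runtime (here ONNX with half precision) is one
# common way to cut per-frame inference latency on embedded hardware.
model.export(format="onnx", half=True)
```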
Speaking about the experience, Rishabh recalls meeting Navy admirals and commanders at the hackathon who discussed his winning solution and looked forward to integrating it into the Navy’s operations. “The various things that I worked on in the Machine Learning Lab under the guidance of Dr Naresh Manwani gave me exposure to different ideas, some of which I used in the hackathon; for instance, the paper we discussed for a project in the lab ended up being useful in the final solution,” he says.