Machine Learning

We are a team dedicated to machine learning, image processing, and robotics. Our focus is on image processing, particularly tracking and detection; on robotics, particularly reinforcement learning for locomotion and manipulation; and on deploying machine learning models on embedded systems. We work on both autonomous driving and walking robotic systems.

In image processing, we use advanced algorithms and techniques to detect and track objects in real time. By employing modern computer vision methods, our systems can analyze and interpret complex visual data.

In robotics, we concentrate on the field of reinforcement learning. Here, we enable our robots to learn autonomously and improve their motion patterns and manipulation abilities. We apply reinforcement learning to both mobile systems and walking robots that operate in various environments.

Our team combines expertise in artificial intelligence, robotics, and image processing to develop innovative solutions. Our goal is to expand the possibilities of these fascinating technologies and achieve practical, actionable results.

Multi-Object Tracking

Multi-Object Tracking is an advanced technique in image processing that allows for the tracking and analysis of multiple objects in a scene over time. This method finds applications in various fields, including optimizing logistical processes in 24/7 operations.

In the logistics domain, accurately tracking the location and movements of entities such as vehicles, packages, or goods is often crucial. With the help of Multi-Object Tracking, these entities can be continuously monitored to ensure precise and reliable real-time localization and tracking. This enables the optimization of operational workflows by identifying bottlenecks, planning routes, or improving the efficiency of warehouse management systems.

Multi-Object Tracking algorithms rely on advanced computer vision techniques such as object detection and motion estimation. First, objects are detected and localized in a video stream or image sequence. These detections are then associated across frames by estimating each object's motion, for example with a Kalman filter, often taking additional cues such as speed, shape, or size into account.
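
As a minimal illustration of the motion-estimation step, the sketch below implements a constant-velocity Kalman filter for a single tracked object. The class name and parameter values are illustrative assumptions, not our production tracker.

```python
import numpy as np

class ConstantVelocityTrack:
    """Minimal Kalman filter for one tracked object (illustrative only).

    State x = [px, py, vx, vy]: 2D position and velocity.
    Measurement z = [px, py]: a detection from an object detector.
    """

    def __init__(self, first_detection, dt=1.0):
        self.x = np.array([first_detection[0], first_detection[1], 0.0, 0.0])
        self.P = np.eye(4) * 10.0                      # state covariance
        self.F = np.array([[1, 0, dt, 0],              # constant-velocity motion model
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],               # we only observe position
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.01                      # process noise
        self.R = np.eye(2) * 1.0                       # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                              # predicted position for data association

    def update(self, detection):
        z = np.asarray(detection, dtype=float)
        y = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In a full multi-object tracker, the predicted positions are associated with new detections in every frame, for example via nearest-neighbour or Hungarian matching, and additional cues such as size or appearance can be folded into the association cost.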

The application of Multi-Object Tracking in the logistics field offers a range of benefits, including improved real-time monitoring, identification of bottlenecks or unexpected delays, optimization of delivery and transportation routes, and increased efficiency in warehouse management.

Furthermore, our approaches also allow the use of multiple cameras to enhance accuracy. Occlusions can degrade results when only one camera is used, whereas with multiple cameras the object is usually visible to at least one of them. Through dedicated handover procedures, a track can be passed seamlessly from one camera to the next, avoiding gaps in the tracking process.
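
The following sketch illustrates one simple way such a handover could be gated, by comparing track positions in a shared world coordinate frame. The function, threshold, and data layout are hypothetical; real handover logic would also use timing and appearance cues.

```python
import numpy as np

def hand_over_track(track_world_pos, candidate_tracks_cam_b, max_dist=0.5):
    """Illustrative handover rule: match a track leaving camera A to the
    closest track seen by camera B, compared in a shared world frame.

    track_world_pos        -- last known (x, y) of the track in world coordinates
    candidate_tracks_cam_b -- dict mapping track id -> (x, y) world position
    max_dist               -- gating threshold in metres (assumed calibration)
    """
    best_id, best_dist = None, max_dist
    for track_id, pos in candidate_tracks_cam_b.items():
        dist = float(np.linalg.norm(np.asarray(pos) - np.asarray(track_world_pos)))
        if dist < best_dist:
            best_id, best_dist = track_id, dist
    return best_id  # None means no handover partner was found within the gate
```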

Autonomous Mobile Robots

The locomotion of mobile robots presents a challenging task, especially when it comes to executing precise motion sequences. The dynamics of the robot play an important role, as they can introduce effects such as slipping or instability during fast turns. To address such challenges, we integrate machine learning into the planning process to enable accurate execution despite these effects.

By leveraging machine learning, we can better understand and model the behavior of the robot and its responses to different dynamic conditions. Reinforcement learning methods are employed, where the robot learns to choose suitable motion sequences to achieve the desired goal through interaction with its environment. Through continuous training and feedback, the robot learns how to adapt its movements to specific requirements and constraints.
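
A minimal sketch of such a training loop is shown below. The env and policy interfaces, as well as the reward design, are assumptions for illustration; the concrete RL algorithm (for example a policy-gradient or actor-critic method) is not specified here.

```python
def train_locomotion_policy(env, policy, episodes=1000):
    """Generic reinforcement learning loop (illustrative sketch).

    env    -- environment with reset() -> obs and step(action) -> (obs, reward, done)
    policy -- object with act(obs) -> action and update(transitions) for learning
    """
    for episode in range(episodes):
        obs = env.reset()
        transitions, done = [], False
        while not done:
            action = policy.act(obs)                   # e.g. wheel commands or joint targets
            next_obs, reward, done = env.step(action)  # reward penalises slip and tracking error
            transitions.append((obs, action, reward, next_obs, done))
            obs = next_obs
        policy.update(transitions)                     # e.g. a policy-gradient or actor-critic update
```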

In planning the motion sequences, we take into account the physical and, in particular, the dynamic properties of the robot to ensure that the desired motions are executed precisely and stably. Physical models and dynamic constraints are used to verify the feasibility of the planned motions. By utilizing machine learning, however, we can go beyond purely physical models and also consider the nonlinear effects of the dynamics.
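
As a simple illustration of such a feasibility check, the sketch below verifies velocity and acceleration limits along a planned trajectory. The limits and interfaces are placeholders; a learned model can augment checks like this to capture effects such as slip.

```python
import numpy as np

def is_dynamically_feasible(positions, dt, v_max, a_max):
    """Illustrative feasibility check on a planned trajectory.

    positions -- (N, 2) array of planned waypoints
    dt        -- time step between waypoints in seconds
    v_max     -- assumed velocity limit of the robot
    a_max     -- assumed acceleration limit of the robot
    """
    positions = np.asarray(positions, dtype=float)
    velocities = np.diff(positions, axis=0) / dt
    accelerations = np.diff(velocities, axis=0) / dt
    return (np.all(np.linalg.norm(velocities, axis=1) <= v_max) and
            np.all(np.linalg.norm(accelerations, axis=1) <= a_max))
```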

Manipulation

Grasping, the act of gripping objects, is a crucial capability for robots in object manipulation. The application of reinforcement learning combined with point clouds opens up new possibilities for developing effective grasping strategies.

Point clouds are 3D data that represent a scene or an object. They are often captured using sensors such as 3D cameras or LIDAR systems. By using point clouds, the algorithms gain detailed information about the environment and the objects they intend to grasp.
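
In its simplest form, a point cloud is just an array of 3D coordinates, as the following minimal example shows. The values are made up; real scans contain many thousands of points and are typically processed with dedicated libraries.

```python
import numpy as np

# A point cloud is an (N, 3) array of x, y, z coordinates,
# e.g. as delivered by a depth camera or LIDAR driver (values here are invented).
points = np.array([[0.42, 0.10, 0.73],
                   [0.41, 0.12, 0.74],
                   [0.44, 0.09, 0.72]])

centroid = points.mean(axis=0)                      # rough object position
extents = points.max(axis=0) - points.min(axis=0)  # rough object size
```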

Reinforcement learning (RL) is a machine learning method in which an agent learns, through interaction with its environment, to perform suitable actions to achieve a specific goal. In grasping with RL, the robot is trained to learn the optimal grasping strategy by utilizing point clouds to extract relevant information about the shape, position, and orientation of objects.

The RL agent repeatedly performs grasping actions and receives feedback in the form of rewards or penalties, depending on the success of the grasping process. Through this iterative learning, the agent improves its grasping strategies over time, developing more precise and effective movements.
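
A reward function for grasping might, for example, combine a sparse success bonus with shaping and safety terms, as in the hypothetical sketch below; the exact terms and weights are assumptions for illustration.

```python
def grasp_reward(grasp_succeeded, object_lifted_height, collision):
    """Illustrative reward shaping for a grasping agent (values are placeholders).

    grasp_succeeded      -- True if the gripper holds the object after closing
    object_lifted_height -- how far the object was lifted, in metres
    collision            -- True if the arm collided with the environment
    """
    reward = 0.0
    if grasp_succeeded:
        reward += 1.0                                        # sparse success bonus
    lifted = min(max(object_lifted_height, 0.0), 0.1)
    reward += 0.5 * lifted / 0.1                             # small shaping term
    if collision:
        reward -= 1.0                                        # penalise unsafe behaviour
    return reward
```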

This enables applications in various fields, such as industrial automation, logistics, or household robotics, where precise and reliable grasping actions are required.

Robot Swarms

Robot swarms are a fascinating application in which multiple autonomous mobile robots collaborate to achieve common goals. A particularly interesting approach is to use reinforcement learning to navigate the robots autonomously and collision-free to their destinations, without any direct communication between them.

Reinforcement learning allows the robots to learn, through experience and interaction with their environment, how to adapt their actions in order to navigate safely and efficiently. By receiving rewards based on their actions, the robots learn to avoid obstacles and reach their goals without collisions.

The key aspect of swarm navigation is that the robots do not need to communicate directly with each other to perform coordinated movements. Instead, their behavior is based on individual sensor observations and decisions. Each robot perceives its environment, gathers information about obstacles, goals, and other robots, and uses this information to plan its actions.
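
The sketch below illustrates what one decision step of a single robot could look like under this scheme: the robot builds an observation purely from its own sensors and queries its trained policy for a velocity command. The interfaces and observation layout are assumptions for illustration.

```python
import numpy as np

def robot_step(policy, lidar_scan, goal_vector, neighbour_positions):
    """One decision step of a single robot in the swarm (illustrative interface).

    Each robot acts only on its own observations; there is no communication.
    lidar_scan          -- 1D array of range measurements around the robot
    goal_vector         -- (dx, dy) to the robot's own goal in its local frame
    neighbour_positions -- relative positions of robots it currently perceives
    """
    observation = np.concatenate([
        np.asarray(lidar_scan, dtype=float),
        np.asarray(goal_vector, dtype=float),
        np.asarray(neighbour_positions, dtype=float).ravel(),
    ])
    # The trained RL policy maps the local observation to a velocity command.
    linear_vel, angular_vel = policy.act(observation)
    return linear_vel, angular_vel
```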

Through the use of reinforcement learning, the robots are trained to perform individually optimal movements while considering collision avoidance. They learn how to maneuver around obstacles and find efficient routes to their goals. Through repeated training and experience, the robots improve their navigation skills and develop strategies to handle complex scenarios with multiple robots.

The advantages of reinforcement learning-based swarms lie in the scalability and flexibility of the system. Because the robots do not rely on communication and therefore need no central infrastructure, they can operate autonomously and in a decentralized manner, which enables scaling to a larger number of robots. Furthermore, the individual adaptation of movements allows for flexible and robust navigation in complex environments.

Robot swarms that use reinforcement learning for collision-free navigation find applications in fields such as logistics, environmental monitoring, and collaborative robotics. Through autonomous and cooperative navigation, these robots can work efficiently and handle tasks in dynamic and challenging environments.

Machine Learning Compiler

Machine learning compilers are specialized tools that translate trained machine learning models into executable code. Their main goal is to optimize the execution of these models for various hardware platforms.

The process of translating a machine learning model involves several steps. First, the trained model is analyzed to understand its structure and the mathematical operations it uses. The model is then lowered into code or an intermediate representation, which is analyzed and optimized to enable efficient execution on the target hardware. These optimizations can include techniques such as simplifying calculations, removing redundant operations, or reordering code to improve memory access. The aim is to generate code that performs the model's computations with minimal resource requirements and maximum speed.
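
As a small illustration of the kind of graph-level simplification described above, the following sketch folds operations whose inputs are all constants. The opcode set and data structures are invented for this example and do not reflect any particular compiler's internals.

```python
def fold_constants(ops):
    """Illustrative compiler pass: fold operations whose inputs are all constants.

    ops is a list of (name, opcode, inputs) tuples in topological order;
    constants are plain Python numbers. This mimics, in a very reduced form,
    the kind of simplification an ML compiler performs on a computation graph.
    """
    values = {}
    folded = []
    for name, opcode, inputs in ops:
        args = [values.get(i, i) for i in inputs]
        if all(isinstance(a, (int, float)) for a in args):
            if opcode == "add":
                values[name] = args[0] + args[1]      # computed at compile time
            elif opcode == "mul":
                values[name] = args[0] * args[1]
            continue                                  # no runtime op emitted
        folded.append((name, opcode, args))           # keep op, with folded inputs
    return folded

# Example: (2 * 3) + x becomes 6 + x without any runtime multiplication.
program = [("t0", "mul", [2, 3]), ("t1", "add", ["t0", "x"])]
print(fold_constants(program))   # [('t1', 'add', [6, 'x'])]
```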

Another important aspect of machine learning compilers is their support for different hardware platforms. The compilers need to be able to adapt the optimized code to the specific characteristics of the target platform, such as available memory or support for specific instruction sets. This ensures optimal execution of the model on the target hardware.

Furthermore, machine learning compilers enable efficient execution of a model on different hardware platforms without the need for extensive frameworks. A component developed at IML, for example, translates models into the programming languages C and C++, enabling easy deployment on a wide range of hardware platforms and integration of the model into existing systems and applications written in C, C++, or other languages. Our particular focus is the use of ML models on embedded systems with limited resources. In this field, we actively contribute to the open-source development of the LLVM compiler infrastructure project and the OpenXLA ML compiler ecosystem initiated by Google, and we release our own developments as open-source software.
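
As a simple illustration of this idea, the sketch below generates dependency-free C code for a single dense layer with its trained weights baked in as constants. It is not the IML component itself; the function name, formatting, and layer choice are assumptions made purely for illustration.

```python
import numpy as np

def emit_dense_layer_c(weights, bias, name="dense_0"):
    """Illustrative code generation: emit a plain-C function for one dense layer.

    Only a sketch of the general idea of translating a trained layer
    (weights baked in as constants) into dependency-free C source code.
    """
    out_dim, in_dim = weights.shape
    w = ", ".join(f"{v:.6f}f" for v in weights.ravel())
    b = ", ".join(f"{v:.6f}f" for v in bias)
    return f"""
static const float {name}_w[{out_dim * in_dim}] = {{{w}}};
static const float {name}_b[{out_dim}] = {{{b}}};

void {name}(const float in[{in_dim}], float out[{out_dim}]) {{
    for (int o = 0; o < {out_dim}; ++o) {{
        float acc = {name}_b[o];
        for (int i = 0; i < {in_dim}; ++i) {{
            acc += {name}_w[o * {in_dim} + i] * in[i];
        }}
        out[o] = acc;
    }}
}}
"""

# Example: emit C source for a tiny 3-input, 2-output layer with zero weights.
print(emit_dense_layer_c(np.zeros((2, 3)), np.zeros(2)))
```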