To develop a fully autonomous robotic system capable of navigating real-world environments and detecting and handling objects using computer vision, depth sensing, and ROS2-based navigation.
About the Project
This project was developed during my time at Control One, where I served as the lead developer. The objective was to design and build a vision-based autonomous forklift capable of performing material handling tasks with minimal human intervention.
The system combined ROS2-based navigation, computer vision, and depth sensing to achieve safe, reliable, and intelligent operation inside dynamic warehouse environments. Alongside autonomy, it also supported teleoperation and fleet management, making it adaptable for real-world industrial use.
The key features of this autonomous forklift are Teleoperation Mode, Vision-Based Pallet Picking & Dropping, Autonomous Navigation & Obstacle Avoidance, a Fleet Management System, and Human-in-the-Loop Safety.
Key Features
In teleoperation mode, the forklift could be remotely operated through a 3-screen 4K video setup that streamed front, left, and right views in real time, giving the operator complete situational awareness. A driving console with steering wheel and accelerator pedal replicated the natural driving experience, while on-screen guidelines helped align the forklift’s path. All video and control signals were transmitted over a 5G cloud network with ultra-low latency, achieving ~100 ms video streaming delay and 20–30 ms response for steering and acceleration commands. With a single click, operators could seamlessly switch between manual teleoperation and full autonomy, ensuring both safety and flexibility. This setup allowed smooth, real-time remote driving while still enabling human-in-the-loop intervention whenever the forklift required assistance.
For vision-based pallet picking and dropping, the forklift uses a YOLOv5-based vision system trained on over 15,000 pallet images spanning varied lighting conditions, pallet shapes, and sizes, achieving 0.97 precision/recall and over 95% detection accuracy. Once a pallet is detected, it can be selected either manually in teleoperation mode or automatically in autonomous mode based on proximity and accessibility. After selection, the forklift performs semi-autonomous pallet alignment, using depth camera feedback and a high-precision IMU to align the forks with the pallet centroid and orientation. The system then inserts the forks, lifts the pallet, reverses safely, and hands control back to either autonomy or teleoperation. Safety overrides were also built in, allowing supervisors to halt the process instantly if required.
The forklift achieves autonomous navigation using a depth camera tilted at 20 degrees and a SLAM algorithm, ensuring accurate localization even in tight spaces with pallets only 10–20 cm apart. Navigation tasks are executed via ROS2 Nav2, with goals and mission sequences managed through a task executor system that breaks missions into stages: reaching the target, pallet selection, picking, delivery, and dropping. Obstacle avoidance runs as an independent module, detecting humans and objects in real time and applying immediate braking to ensure safety. Despite varying lighting conditions, the system achieves approximately 30 cm positional accuracy, balancing reliability with responsiveness in a heavy 3000 kg vehicle.
The forklift is integrated with a Fleet Management System featuring a cloud- or locally-deployable GUI for complete robot supervision and task planning. Through bi-directional API communication, operators can monitor each robot’s battery, position, health status, current task, and live camera/sensor data in real time. The system enables task scheduling and analytics, allowing users to plan daily operations, track pallet pick-and-drop performance, and review overall efficiency. Multiple forklifts can be managed simultaneously, with operators able to assign tasks, monitor progress, and optimize fleet performance through a single unified dashboard.
The forklift features a Human-in-the-Loop safety system, allowing a user to seamlessly take control via a teleoperation setup with three high-resolution camera feeds and a game-style console. Whenever the system encounters a challenging scenario or requires assistance, the user can intervene in real time, while still leveraging semi-autonomous operations for pallet picking, reversing, and precise dropping. Even under human control, obstacle detection and emergency braking remain active, ensuring complete safety during operations. This integration provides a real-time, highly responsive, and fail-safe mechanism for supervising and assisting the autonomous forklift.
Our autonomous forklift development has been a journey of carefully blending safety, intelligence, and scalability. From the earliest teleoperation trials to advanced fleet management and GPU optimization, every stage was designed with one principle in mind: safe, reliable, and efficient automation for industrial logistics.
We began by enabling teleoperation with a game console controller, supported by a multi-camera live feed on three separate screens. This provided operators with complete visibility of the forklift’s surroundings.
To improve control, we introduced dynamic steering guidelines overlaid on the screen. These guidelines adapt in real time to the steering angle, showing exactly how the forklift will move if the steering is held in its current position.
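As a rough illustration of how such guidelines can be generated, the sketch below projects a bicycle-model arc into the camera view for the current steering angle. The wheelbase, camera intrinsics, and mounting height are placeholder values, not our calibration.

```python
# Sketch: dynamic steering guidelines from a bicycle model, projected into
# the camera view. Wheelbase, intrinsics, and camera height are placeholder
# values, not the production calibration.
import numpy as np
import cv2

WHEELBASE_M = 1.8                      # assumed wheelbase
CAM_HEIGHT_M = 1.2                     # assumed camera height above ground
K = np.array([[800.0, 0.0, 960.0],     # assumed pinhole intrinsics
              [0.0, 800.0, 540.0],
              [0.0, 0.0, 1.0]])

def guideline_points(steer_rad, length_m=5.0, n=20):
    """Sample the ground-plane arc the forklift follows at this steering angle."""
    if abs(steer_rad) < 1e-3:                  # near-straight: a line
        return np.zeros(n), np.linspace(0.5, length_m, n)
    r = WHEELBASE_M / np.tan(steer_rad)        # bicycle-model turning radius
    theta = np.linspace(0.0, length_m / abs(r), n)
    x = r - r * np.cos(theta)                  # lateral offset
    z = abs(r) * np.sin(theta)                 # forward distance
    return x, z

def draw_guideline(frame, steer_rad):
    x, z = guideline_points(steer_rad)
    # Ground points (x, CAM_HEIGHT, z) in camera coordinates, projected by K.
    pts = np.stack([x, np.full_like(x, CAM_HEIGHT_M), z])
    uv = (K @ pts) / pts[2]
    pix = uv[:2].T.astype(np.int32).reshape(-1, 1, 2)
    cv2.polylines(frame, [pix], False, (0, 255, 0), 3)
    return frame
```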
Even in manual control, safety remains uncompromised. Obstacle avoidance stays active in the background, ensuring no collisions while the operator maneuvers. This “human-in-the-loop” mode guarantees that the system never becomes unsafe, even when a person takes over.
Once teleoperation was robust, we focused on perception and intelligence. Using deep learning models, we trained the forklift to:
Detect pallets and track their centroid positions for precise alignment.
Assign unique IDs to each pallet to avoid confusion when multiple pallets are in view (a minimal tracking sketch follows this list).
Overlay operational feedback — steering guidelines, speed indicators, and alignment markers — on the operator’s console.
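To make the ID-assignment idea concrete, here is a minimal nearest-neighbour tracking sketch; the distance threshold and names are illustrative, not the production tracker.

```python
# Sketch: nearest-neighbour centroid tracking with persistent pallet IDs.
# Threshold and names are illustrative, not the production tracker.
import numpy as np

class PalletTracker:
    def __init__(self, max_dist_px=80):
        self.next_id = 0
        self.tracks = {}                       # pallet_id -> last centroid (x, y)
        self.max_dist = max_dist_px

    def update(self, detections):
        """detections: list of (x, y) centroids from the detector.
        Returns {pallet_id: centroid} with IDs stable across frames."""
        assigned, unmatched = {}, list(detections)
        for pid, prev in self.tracks.items():
            if not unmatched:
                break
            dists = [np.hypot(c[0] - prev[0], c[1] - prev[1]) for c in unmatched]
            i = int(np.argmin(dists))
            if dists[i] < self.max_dist:       # same pallet, slightly moved
                assigned[pid] = unmatched.pop(i)
        for c in unmatched:                    # unseen pallets get fresh IDs
            assigned[self.next_id] = c
            self.next_id += 1
        self.tracks = assigned
        return assigned
```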
This vision-driven capability allowed us to transition smoothly from human-guided teleoperation to semi-autonomous pallet handling.
For navigation, we built a complete mapping and task execution workflow:
Mapping with RTAB-Map → generating a 2D binary occupancy map → integration with Nav2.
Waypoints were collected manually via teleoperation and stored as JSON entries, each tagged with a task nature (e.g., “stand,” “pick,” “drop”).
A task executor script reads these waypoints and executes them sequentially (sketched below).
This enabled the forklift to operate in semi-autonomous mode — where a single command executes a full sequence: drive to pallet, align, pick, reverse, and drop at a destination.
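The sketch below shows the waypoint format and executor loop in miniature; navigate_to and do_task are placeholders for the real Nav2 goal dispatch and pallet-handling calls, and the poses are examples.

```python
# Sketch: the waypoint JSON format and a sequential task executor.
# navigate_to/do_task stand in for the real Nav2 and pallet-handling calls.
import json

EXAMPLE_WAYPOINTS = '''
[
  {"name": "rack_a_entry", "pose": [4.2, 1.5, 0.00], "task": "stand"},
  {"name": "rack_a_slot3", "pose": [6.8, 1.5, 1.57], "task": "pick"},
  {"name": "dock_2",       "pose": [0.5, 9.0, 3.14], "task": "drop"}
]
'''

def navigate_to(pose):
    print(f"[nav2] goal -> {pose}")       # real system: send a NavigateToPose goal

def do_task(task):
    print(f"[task] executing '{task}'")   # real system: trigger pick/drop automation

def run_mission(waypoint_json):
    for wp in json.loads(waypoint_json):  # strictly sequential execution
        navigate_to(wp["pose"])
        do_task(wp["task"])

run_mission(EXAMPLE_WAYPOINTS)
```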
Our fleet management platform extends beyond a single forklift. Built with a flexible GUI that runs both locally and on the cloud, it provides:
Bi-directional communication with every robot in the fleet.
Real-time status: battery, health, task progress, position, warnings (a payload sketch follows this list).
Direct monitoring of cameras and sensor data.
Task scheduling: plan a full day’s workflow and assign the next 10+ tasks in advance.
Analytics & reporting: daily pallet counts, completed operations, performance benchmarks.
Multi-robot support with full control and coordination.
This layer transforms individual forklifts into a connected, intelligent fleet capable of large-scale deployment.
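For a sense of what flows over that API, here is a hypothetical per-robot status payload; the field names are illustrative, not the actual schema.

```python
# Hypothetical per-robot status payload for the bi-directional API;
# field names are illustrative, not the actual schema.
import json, time

status = {
    "robot_id": "forklift_01",
    "battery_pct": 82,
    "health": "ok",
    "pose": {"x": 12.4, "y": 3.1, "yaw": 1.57},
    "current_task": {"id": "T-104", "type": "pick", "progress_pct": 60},
    "warnings": [],
    "timestamp": time.time(),
}
payload = json.dumps(status)  # published to the fleet server; commands flow back the same way
```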
Performance is at the core of our architecture. To balance heavy AI workloads, we optimized both models and computation flow:
Vision models converted into TensorRT engines for maximum inference speed.
GPU utilization tuned (~70% load on NVIDIA A2000 GPU) with obstacle avoidance + pallet detection running in parallel.
CUDA and Numba applied to offload mathematical operations from CPU to GPU (a minimal kernel sketch follows this list).
Python drives AI and perception scripts, while C++ manages low-level controller logic for Arduino-based systems.
This ensures the forklift can handle real-time decision-making without latency, even under full operational load.
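As an example of the CPU-to-GPU offloading pattern, the sketch below uses a Numba CUDA kernel to scale a raw depth frame to metres, one thread per pixel. The frame size and scale factor are placeholders.

```python
# Sketch: offloading per-pixel math to the GPU with a Numba CUDA kernel.
# Frame size and scale are placeholders; requires a CUDA-capable GPU.
import numpy as np
from numba import cuda

@cuda.jit
def depth_to_metres(raw, out, scale):
    i = cuda.grid(1)                  # one thread per pixel
    if i < raw.size:
        out[i] = raw[i] * scale

raw = np.random.randint(0, 65535, 1920 * 1080).astype(np.uint16)
d_raw = cuda.to_device(raw)
d_out = cuda.device_array(raw.size, dtype=np.float32)

threads = 256
blocks = (raw.size + threads - 1) // threads
depth_to_metres[blocks, threads](d_raw, d_out, np.float32(0.001))
depth_m = d_out.copy_to_host()
```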
At the heart of our forklift lies a layered control and AI framework, integrating perception, planning, and actuation seamlessly:
Lower Control Layer
micro-ROS Servo Controller – manages levers, accelerator, and brake.
micro-ROS Sensor Feedback Controller – provides steering feedback and IMU yaw data.
Point LiDAR near forks – ensures pallet safety during the final approach (sub-1 m range, where cameras lose visibility).
Steering PID Controller – precise turning, linked with the forklift control script (a PID sketch follows this list).
Neon Light Controllers – color-coded status indicators (manual, semi-autonomous, fully autonomous).
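The sketch below captures the structure of such a steering PID loop, written in Python for readability even though the real low-level controllers run in C++ and micro-ROS; the gains and limits are illustrative.

```python
# Sketch: steering PID loop, in Python for readability (the real controllers
# run in C++/micro-ROS). Gains and limits are illustrative.
class SteeringPID:
    def __init__(self, kp, ki, kd, limit=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.limit = limit            # clamp to the actuator's range
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, target_angle, measured_angle, dt):
        err = target_angle - measured_angle
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt if dt > 0 else 0.0
        self.prev_err = err
        u = self.kp * err + self.ki * self.integral + self.kd * deriv
        return max(-self.limit, min(self.limit, u))

pid = SteeringPID(kp=1.2, ki=0.05, kd=0.1)
cmd = pid.step(target_angle=0.3, measured_angle=0.1, dt=0.02)  # 50 Hz loop
```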
AI & Autonomy Layer
Manager Script – receives tasks from the central server and ensures synchronization.
Task Planner Script – breaks tasks into ordered sub-tasks (a planner sketch follows this list).
Object Detection (YOLO-based) – detects pallets, tracks centroids, assigns unique pallet IDs.
Pickup/Drop Automation Script – executes precise handling sequences using centroid tracking.
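A toy version of the planner's expansion step might look like this; the stage names follow the mission stages described earlier, while the field names are assumptions.

```python
# Sketch: expanding one mission into ordered sub-tasks. Stage names follow
# the mission stages described earlier; field names are assumptions.
def plan_task(mission):
    return [
        {"stage": "navigate", "goal": mission["pickup_pose"]},
        {"stage": "select_pallet", "pallet_id": mission.get("pallet_id")},
        {"stage": "pick"},
        {"stage": "navigate", "goal": mission["drop_pose"]},
        {"stage": "drop"},
    ]

mission = {"pickup_pose": [6.8, 1.5, 1.57], "drop_pose": [0.5, 9.0, 3.14]}
for sub in plan_task(mission):
    print(sub)  # handed one by one to the executor and automation scripts
```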
Mode Switching Logic
Manual Mode → listens to operator joystick velocities.
Semi-Autonomous Mode → executes pallet pick/drop sequences.
Autonomous Mode → follows Nav2-generated navigation goals (a command-mux sketch follows below).
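Conceptually, the switching logic is a command multiplexer: one place decides which velocity source reaches the base. A minimal sketch, with illustrative names:

```python
# Sketch: mode switching as a command multiplexer; structure is illustrative.
def select_cmd(mode, joystick_cmd, pick_drop_cmd, nav2_cmd):
    """Pick the velocity source for the current mode. Obstacle-triggered
    braking is applied downstream and can override any of these."""
    if mode == "manual":
        return joystick_cmd       # operator joystick velocities
    if mode == "semi_auto":
        return pick_drop_cmd      # pallet pick/drop sequence output
    if mode == "autonomous":
        return nav2_cmd           # Nav2-generated goal following
    raise ValueError(f"unknown mode: {mode}")
```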
This system was built not just as a proof of concept, but as a scalable industrial solution. By combining human-in-the-loop safety, AI-driven perception, robust autonomy, and fleet-level orchestration, we’ve laid the foundation for a forklift system that is real-time, reliable, and deployment-ready.
Building on our research and autonomous forklift development, we successfully deployed a specialized vision-based pallet handling system for Roots MultiClean. Unlike our fully autonomous forklift, this project focused on precision pick-and-drop operations in a multi-level rack warehouse environment.
The forklift, already fully electric, was integrated with our system by hacking into its native console controls using digital potentiometers to simulate driver inputs. We installed cameras and custom wiring, enabling the forklift to detect and handle pallets without requiring servo retrofits.
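The exact potentiometer part isn't detailed here, but the pattern looks roughly like the following sketch, assuming an MCP41xxx-style SPI digipot driven from a Linux host via spidev; the wiring is an assumption.

```python
# Hedged sketch: simulating a pedal input through an SPI digital
# potentiometer. Assumes an MCP41xxx-style digipot (0x11 = write-wiper
# command) on SPI bus 0, chip-select 0; both are wiring assumptions.
import spidev

spi = spidev.SpiDev()
spi.open(0, 0)
spi.max_speed_hz = 1_000_000

def set_accelerator(fraction):
    """Map a 0..1 pedal fraction onto the 8-bit wiper position."""
    wiper = max(0, min(255, int(fraction * 255)))
    spi.xfer2([0x11, wiper])      # write-wiper command, then the value

set_accelerator(0.25)             # gentle acceleration
```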
The operation flow is hybrid:
The driver aligns the forklift with the target rack.
The operator enters the rack code (e.g., AA03, 05, 06, …).
The forklift automatically adjusts its alignment along the Z-axis (heights up to 21 meters), X-axis, and Y-axis using vision feedback.
It then detects the pallet number via OCR (sketched after these steps), positions the forks with centimeter-level accuracy, and securely picks the pallet.
Finally, the pallet is brought safely down to the ground.
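The OCR step can be sketched as follows, using OpenCV and Tesseract as stand-ins for the production pipeline; the region of interest is assumed to come from the detector.

```python
# Sketch: OCR of the pallet number, with OpenCV + Tesseract as stand-ins
# for the production pipeline. The ROI is assumed to come from the detector.
import cv2
import pytesseract

def read_pallet_code(frame, roi):
    x, y, w, h = roi
    gray = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2GRAY)
    # Otsu binarisation makes the printed code stand out for Tesseract.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary, config="--psm 7").strip()  # one text line
```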
This system was implemented using ROS2 with our perception and task execution stack, excluding SLAM and navigation, since forklift movement was manually driven. The key innovation here was the combination of vision-based alignment + OCR-based pallet identification, ensuring reliable multi-rack operations across 12 levels.
📹 A demonstration video of this deployment at Roots MultiClean is included below.
Building upon our forklift autonomy system, we extended the same vision-based approach to a BYD Battery Operated Pallet Truck (BOPT). Instead of adding servo controllers, we directly tapped into the BOPT’s native voltage-based controls for steering and acceleration using digital potentiometers managed by our controller.
A depth camera was mounted at the front to enable pallet detection and precise fork alignment. The system could automatically approach a pallet, align along the X, Y, and Z axes, insert the forks accurately, and execute lifting and dropping operations.
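A minimal sketch of that alignment math, assuming a pinhole depth camera with placeholder intrinsics: the detected centroid plus its measured depth is back-projected into lateral, vertical, and forward offsets.

```python
# Sketch: pinhole back-projection of the pallet centroid into fork-alignment
# offsets. Intrinsics are placeholders for the real depth-camera calibration.
FX, FY, CX, CY = 615.0, 615.0, 320.0, 240.0

def centroid_to_offset(u, v, depth_m):
    x = (u - CX) * depth_m / FX   # lateral offset (steer correction)
    y = (v - CY) * depth_m / FY   # vertical offset (fork height)
    return x, y, depth_m          # depth_m = forward distance to pallet face

dx, dy, dz = centroid_to_offset(352, 260, 1.8)
print(f"align: lateral {dx:+.2f} m, height {dy:+.2f} m, approach {dz:.2f} m")
```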
The same AI stack — including pallet detection, pallet ID allocation, task planner, and fleet management — was integrated, allowing the BOPT to perform fully autonomous pick-and-drop tasks with obstacle avoidance and safety mechanisms in place.
📹 A demonstration video of this deployment at Delhivery is included below.
This autonomous forklift and BOPT project was developed during my time at Control One, a robotics startup where I had the opportunity to lead the technical vision and guide a talented team of engineers. As the system architect, I was responsible for designing the complete robotics stack — from low-level servo control and micro-ROS feedback systems to high-level AI-driven perception, navigation, and fleet management.
While this work was a team effort, I played a central role in guiding the development, solving critical technical challenges, and ensuring that the project evolved from teleoperation to semi-autonomy and finally to fully integrated fleet management.
I am deeply grateful to my colleagues and the startup environment at Control One, which pushed me to grow both technically and personally. Even though the journey faced challenges on the organizational side, the experience gave me invaluable insights into building real-world robotics systems at scale, managing teams, and handling the complexity of turning research into deployable products.
This project represents not only a technological achievement but also a milestone in my career — a step closer to my vision of creating robotics systems that transform industrial automation.
Thank You