About Me
I'm Prashant Rawat, a Senior AI Engineer at Arithmer, Tokyo, with 7+ years building production-grade AI systems across robotics, computer vision, and generative AI. I specialize in autonomous navigation, real-time perception, and intelligent decision-making for robotic platforms — including humanoid and quadruped robots with LiDAR-camera sensor fusion, cloud-deployed CV pipelines on AWS, and generative AI for virtual apparel try-on.
My background spans defense AI at BEL, fintech fraud detection at Celusion, and cutting-edge robotics at Arithmer — giving me a rare combination of real-time embedded systems, large-scale cloud ML, and foundation model applications.
Bio
Professional Skills
Languages & Data
Deep Learning & Vision
Infrastructure, MLOps & API
Generative AI
Core Concepts & Fields
Work Experience
Senior AI Engineer at Arithmer, Tokyo (Japan)
- Architected autonomous navigation stack for the Unitree G1 humanoid using LiDAR SLAM, Intel RealSense depth sensing, and ROS2 — enabling robust mapping, localization, and path planning in dynamic indoor environments
- Integrated Vision-Language Models for camera-based real-time perception and interactive Q&A; developed a VLA pick-and-place pipeline for the Dobot Nova robotic arm with end-to-end language-to-action execution
- Architected 8+ AI modules for manufacturing and logistics with Docker, achieving 90%+ accuracy; deployed cloud-based CV solutions on AWS (EC2, S3, CloudWatch, RDS)
- Built Generative AI pipeline transforming static user images into dynamic videos simulating virtual apparel try-on
- Engineered autonomous navigation and AI-powered intrusion detection system for Unitree Go2 quadruped with LiDAR-camera sensor fusion, real-time person detection, face recognition, and intelligent patrol in unstructured indoor environments
Sr. Machine Learning Engineer at Celusion Technologies, Thane
- Led a team of engineers developing anomaly detection models for fraudulent credit card transactions
- Spearheaded facial recognition with passive liveness detection to prevent spoofing attacks, improving security of automated KYC workflows
Machine Learning Engineer at Celusion Technologies, Thane
- Engineered a document intelligence pipeline for Aadhaar and PAN card processing (CV, object detection, OCR) for automated KYC used by major Indian banks
- Developed fraud detection using K-means clustering and facial landmarks; automated bank reconciliation with ML algorithms
Member Research Engineer-II at Central Research Laboratory - B.E.L, Ghaziabad
- Developed ML-based weapon detection from real-time camera, radar, and electric fence feeds for border security — deployed in field operations
- Built coastal surveillance systems for unauthorized ship detection
- Led the Air Traffic Management System (ATMS) and implemented RTI-DDS for inter-module communication across projects
Education
M.Tech in Machine Learning & Intelligent Systems - IIIT-Allahabad
B.Tech in Computer Science - College of Engineering Roorkee
Portfolio
Humanoid Robot — Autonomous Navigation & Intelligent Interaction (Unitree G1)
Architected a production-grade autonomy stack for the Unitree G1 humanoid robot, enabling robust indoor navigation, real-time perception, and natural human-robot interaction. Built LiDAR-based SLAM for mapping and localization in dynamically changing indoor environments, fused with Intel RealSense D435i depth data for obstacle-aware checkpoint-based path planning via ROS2. Integrated camera-based perception using Vision-Language Models (Gemini & OpenAI APIs) for real-time scene understanding — enabling the robot to visually identify objects, describe them in natural language, and execute context-aware actions. Developed an interactive Q&A mode driven by hand-raise detection, allowing the humanoid to autonomously engage with live audiences. Demonstrated end-to-end system at client presentations with fully autonomous navigation, perception, and speech response loops.
Quadruped Robot — Autonomous Navigation & Intrusion Detection (Unitree Go2/Go2-W)
Engineered an end-to-end autonomous navigation and intelligent surveillance system for the Unitree Go2 quadruped robot. Built LiDAR-based mapping and localization with dynamic re-planning to adapt to changing indoor layouts, fused with onboard camera perception for robust sensor fusion (LiDAR + vision) via ROS2. Developed an AI-powered intrusion detection and surveillance module featuring autonomous random-walk patrol, real-time person detection, face recognition for authorized personnel verification, and instant alert generation on unknown intrusions. Integrated Bluetooth voice command interface and LLM-based agent for natural language control. System operates reliably in unstructured indoor environments with real-time decision-making and continuous environmental awareness.
Robotic Arm — Vision-Language-Action Pick-and-Place (Dobot Nova)
Developed a Vision-Language-Action (VLA) pipeline for the Dobot Nova robotic arm, bridging camera-based perception with intelligent action execution. The system processes natural language commands, leverages real-time computer vision to detect and localize target objects in cluttered scenes, and computes precise grasp-and-place trajectories for reliable pick-and-place operations. Designed for real-world deployment in dynamic environments where object positions and workspace configurations change frequently, demonstrating end-to-end integration of language understanding, visual perception, and robotic manipulation.
Generative AI - Virtual Apparel Try-On
Developed a Generative AI pipeline that transforms a user's static image into a dynamic video, realistically simulating them wearing selected garments in motion.
Cloud-Based Object Detection & Quality Inspection
Engineered and deployed cloud-based object detection and computer vision solutions on AWS (EC2, S3, CloudWatch, RDS). Improved automated quality inspection systems achieving 90% accuracy, with anomaly detection for manufacturing and logistics.
Manufacturing Defect Detection with Anomalib
Built an anomaly detection system for manufacturing defect identification using Anomalib. Customized and improved core Anomalib components to boost detection accuracy for domain-specific defect patterns, enabling automated quality inspection on production lines.
Foaming Depth Measurement for Enzyme Production
Developed a CNN and Detectron2-based computer vision system to measure the depth of foaming during the enzyme manufacturing process, enabling real-time monitoring and precise control of foaming levels for consistent production quality.
Human Pose Comparison
Compares the poses of an instructor and a student and gives the student real-time feedback on their form in dance, yoga, and other exercises, highlighting the areas that need improvement.
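The comparison step can be pictured as follows. This is a minimal sketch, not the project's actual code: it assumes 2D keypoints are already available from some pose estimator, and the joint names, the `JOINTS` table, and the 15-degree tolerance are illustrative assumptions.

```python
import math

# Hypothetical joint table: each joint angle is measured at the middle
# keypoint of the triple. The names are assumptions for illustration.
JOINTS = {
    "left_elbow": ("left_shoulder", "left_elbow", "left_wrist"),
    "left_knee": ("left_hip", "left_knee", "left_ankle"),
}


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate case: two keypoints coincide
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def pose_feedback(instructor, student, tolerance_deg=15.0):
    """Return {joint: signed angle difference} for joints outside the tolerance.

    Both poses are dicts mapping keypoint name -> (x, y). Joints whose
    keypoints are missing from either pose are skipped.
    """
    feedback = {}
    for joint, (a, b, c) in JOINTS.items():
        if not all(k in instructor and k in student for k in (a, b, c)):
            continue
        diff = (joint_angle(student[a], student[b], student[c])
                - joint_angle(instructor[a], instructor[b], instructor[c]))
        if abs(diff) > tolerance_deg:
            feedback[joint] = diff
    return feedback
```

Comparing joint angles rather than raw coordinates keeps the feedback independent of where each person stands in the frame and how large they appear.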
Identity Verification of Indian Aadhaar and PAN Card
Detects Indian Aadhaar and PAN cards quickly and accurately in a user's uploaded image, then runs OCR to extract the user's information, which is cross-checked against the user's request form in the banking software.
Face Liveness Detection
This project combines facial recognition with facial passive liveness detection to prevent spoofing of biometric samples in banking software.
Face DeDupe
Uses clustering and facial landmarks to check whether the user image on a new request form already exists in the banking database. If a match is found but the user has entered conflicting details, the software warns of possible fraud.
Coastal Surveillance System - Myanmar
Integrated sensors communicating over TCP, UDP, and RabbitMQ, then processed their data for sensor data fusion. Deployed in field operations for unauthorized ship detection.
Integrated Perimeter Security System - India
A weapon detection module that applies object detection to real-time camera feeds to strengthen the surveillance system. Deployed in field operations for border security.
Sensor Planning Deployment
An application to determine the minimum number of sensors required for effective surveillance of a designated area.
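One common way to approach this kind of coverage problem is greedy set cover. The sketch below is an assumption about how such planning could work, not the application's actual algorithm, and greedy selection yields a small sensor count rather than a guaranteed minimum. The function name, candidate site names, and coverage sets are hypothetical.

```python
def greedy_sensor_placement(candidates, targets):
    """Pick candidate sensor sites until every target cell is covered.

    candidates: dict mapping site name -> set of target cells that a sensor
                placed there would cover (hypothetical input format).
    targets:    iterable of cells that must be covered.
    Returns the chosen site names in selection order.
    """
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Choose the site that covers the most still-uncovered cells.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("remaining cells cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen
```

For example, with sites `A = {1, 2, 3}`, `B = {3, 4}`, `C = {4, 5, 6}`, and `D = {1}` and target cells 1 through 6, the greedy pass selects `A` and then `C`.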
Master's Thesis: Localization using ML in Partially Dynamic Environment
Proposed and tested a CNN-LSTM-based methodology for real-time localization of a robot in static and dynamic environments. Improved on state-of-the-art accuracy.
Self-Driving Car in Simulator
An NVIDIA-based CNN architecture that predicts steering angle, throttle, brake, and speed, the basic parameters for driving a car. The model infers these parameters in real time for each frame.
Robot Navigation using Overhead Camera and Reinforcement Learning
A real-time reinforcement learning solution that moves the robot intelligently from its initial position to the ball's location, collects the ball, and scores a goal.
Semi-Automated FAQ-Retrieval System
An FAQ system using data mining and NLP that quickly returns the top 10 best-matching questions for a user query.
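Top-k retrieval of this kind can be sketched with TF-IDF vectors and cosine similarity. TF-IDF is a stand-in here, since the description above does not name the ranking method, and the `FaqIndex` class and its methods are hypothetical names used only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FaqIndex:
    """Tiny in-memory TF-IDF index over a list of FAQ questions."""

    def __init__(self, questions):
        self.questions = list(questions)
        docs = [Counter(tokenize(q)) for q in self.questions]
        n = len(docs)
        # Document frequency: in how many questions each term appears.
        df = Counter(term for doc in docs for term in doc)
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._vectorize(doc) for doc in docs]

    def _vectorize(self, counts):
        """Build an L2-normalized TF-IDF vector; unknown terms are dropped."""
        vec = {t: c * self.idf[t] for t, c in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, k=10):
        """Return up to k (question, score) pairs, best match first."""
        q = self._vectorize(Counter(tokenize(query)))
        scored = [
            (sum(w * vec.get(t, 0.0) for t, w in q.items()), question)
            for vec, question in zip(self.vectors, self.questions)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(question, score) for score, question in scored[:k] if score > 0]
```

Because the vectors are normalized when the index is built, each query reduces to a sparse dot product per question, which keeps lookups fast for a small FAQ set.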
Hand Gesture Recognition
Real-time detection of hand gestures from a finite set of American Sign Language gesture classes.
Open Source
Tools I've built and published to PyPI — covering the full CV workflow from annotation to training to image processing.
alchemycv
Python library for streamlining computer vision workflows — image masking, edge detection, and common CV operations.
pip install alchemycv
AlchemyAnnotate
Desktop annotation tool for drawing bounding boxes on images. Exports to YOLO, Pascal VOC, and COCO formats for model training.
pip install alchemyannotate
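The three export formats differ mainly in how a box is encoded: Pascal VOC stores pixel corners (xmin, ymin, xmax, ymax), COCO stores pixel [x, y, width, height] from the top-left corner, and YOLO stores a center point and size normalized by the image dimensions. The sketch below illustrates those conversions only. It is not AlchemyAnnotate's API, and the function names are hypothetical.

```python
def voc_to_coco(xmin, ymin, xmax, ymax):
    """Pascal VOC corners -> COCO [x, y, width, height] in pixels."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pascal VOC corners -> YOLO (x_center, y_center, width, height), normalized to 0..1."""
    return (
        (xmin + xmax) / 2 / img_w,
        (ymin + ymax) / 2 / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


def yolo_to_voc(xc, yc, w, h, img_w, img_h):
    """YOLO normalized box -> Pascal VOC pixel corners (as floats)."""
    half_w = w * img_w / 2
    half_h = h * img_h / 2
    cx, cy = xc * img_w, yc * img_h
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For a 400x500 image, the VOC box (100, 50, 300, 250) becomes `[100, 50, 200, 200]` in COCO and approximately `(0.5, 0.3, 0.5, 0.4)` in YOLO.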