Prashant Rawat | Senior AI Engineer

About Me

I'm Prashant Rawat, a Senior AI Engineer at Arithmer, Tokyo, with 7+ years building production-grade AI systems across robotics, computer vision, and generative AI. I specialize in autonomous navigation, real-time perception, and intelligent decision-making for robotic platforms — including humanoid and quadruped robots with LiDAR-camera sensor fusion, cloud-deployed CV pipelines on AWS, and generative AI for virtual apparel try-on.

My background spans defense AI at BEL, fintech fraud detection at Celusion, and cutting-edge robotics at Arithmer — giving me a rare combination of real-time embedded systems, large-scale cloud ML, and foundation model applications.

Bio

Email prashantrawatmailbox@gmail.com

Phone +91-8791327684 / +81-8085450101

Location Tokyo, Japan

Professional Skills

Languages & Data

Python C / C++ ROS2 SQL Pandas NumPy Scikit-Learn Matplotlib HTML CSS JavaScript

Deep Learning & Vision

PyTorch TensorFlow Keras Hugging Face Transformers OpenCV Detectron2 PaDiM

Infrastructure, MLOps & API

Flask FastAPI AWS GCP Docker Git Jira CI/CD RabbitMQ ROS

Generative AI

Agentic AI OpenAI APIs

Core Concepts & Fields

Artificial Intelligence Machine Learning Deep Learning NLP Computer Vision Reinforcement Learning Object Detection Anomaly Detection SLAM Localization Point Cloud Data Science Data Mining Predictive Analytics Data Visualization

Work Experience

Senior AI Engineer at Arithmer, Tokyo (Japan)

Sep 2024 - Present

Architected autonomous navigation stack for the Unitree G1 humanoid using LiDAR SLAM, Intel RealSense depth sensing, and ROS2 — enabling robust mapping, localization, and path planning in dynamic indoor environments
Integrated Vision-Language Models for camera-based real-time perception and interactive Q&A; developed a VLA pick-and-place pipeline for the Dobot Nova robotic arm with end-to-end language-to-action execution
Architected 8+ AI modules for manufacturing and logistics with Docker, achieving 90%+ accuracy; deployed cloud-based CV solutions on AWS (EC2, S3, CloudWatch, RDS)
Built Generative AI pipeline transforming static user images into dynamic videos simulating virtual apparel try-on
Engineered autonomous navigation and AI-powered intrusion detection system for Unitree Go2 quadruped with LiDAR-camera sensor fusion, real-time person detection, face recognition, and intelligent patrol in unstructured indoor environments

Sr. Machine Learning Engineer at Celusion Technologies, Thane

Apr 2023 - Aug 2024

Led a team of engineers developing anomaly detection models for fraudulent credit card transactions
Spearheaded facial recognition with passive liveness detection to prevent spoofing attacks, improving security of automated KYC workflows

Machine Learning Engineer at Celusion Technologies, Thane

Sep 2021 - Mar 2023

Engineered a document intelligence pipeline for Aadhaar and PAN card processing (CV, object detection, OCR) for automated KYC used by major Indian banks
Developed fraud detection using K-means clustering and facial landmarks; automated bank reconciliation with ML algorithms

Member Research Engineer-II at Central Research Laboratory - B.E.L, Ghaziabad

Jul 2019 - Sep 2021

Developed ML-based weapon detection from real-time camera, radar, and electric fence feeds for border security — deployed in field operations
Built coastal surveillance systems for unauthorized ship detection
Led the Air Traffic Management System (ATMS) and implemented RTI-DDS for inter-module communication across projects

Education

M.Tech in Machine Learning & Intelligent Systems - IIIT-Allahabad

2017 - 2019

B.Tech in Computer Science - College of Engineering Roorkee

2012 - 2016

Portfolio

Humanoid Robot — Autonomous Navigation & Intelligent Interaction (Unitree G1)

Architected a production-grade autonomy stack for the Unitree G1 humanoid robot, enabling robust indoor navigation, real-time perception, and natural human-robot interaction. Built LiDAR-based SLAM for mapping and localization in dynamically changing indoor environments, fused with Intel RealSense D435i depth data for obstacle-aware checkpoint-based path planning via ROS2. Integrated camera-based perception using Vision-Language Models (Gemini & OpenAI APIs) for real-time scene understanding — enabling the robot to visually identify objects, describe them in natural language, and execute context-aware actions. Developed an interactive Q&A mode driven by hand-raise detection, allowing the humanoid to autonomously engage with live audiences. Demonstrated end-to-end system at client presentations with fully autonomous navigation, perception, and speech response loops.

Python ROS2 LiDAR SLAM Localization Point Cloud Sensor Fusion VLM (Gemini) OpenAI API Intel RealSense

Quadruped Robot — Autonomous Navigation & Intrusion Detection (Unitree Go2/Go2-W)

Engineered an end-to-end autonomous navigation and intelligent surveillance system for the Unitree Go2 quadruped robot. Built LiDAR-based mapping and localization with dynamic re-planning to adapt to changing indoor layouts, fused with onboard camera perception for robust sensor fusion (LiDAR + vision) via ROS2. Developed an AI-powered intrusion detection and surveillance module featuring autonomous random-walk patrol, real-time person detection, face recognition for authorized personnel verification, and instant alert generation on unknown intrusions. Integrated Bluetooth voice command interface and LLM-based agent for natural language control. System operates reliably in unstructured indoor environments with real-time decision-making and continuous environmental awareness.

Python ROS2 LiDAR SLAM Localization Point Cloud Sensor Fusion Computer Vision Intrusion Detection LLM Agent

Robotic Arm — Vision-Language-Action Pick-and-Place (Dobot Nova)

Developed a Vision-Language-Action (VLA) pipeline for the Dobot Nova robotic arm, bridging camera-based perception with intelligent action execution. The system processes natural language commands, leverages real-time computer vision to detect and localize target objects in cluttered scenes, and computes precise grasp-and-place trajectories for reliable pick-and-place operations. Designed for real-world deployment in dynamic environments where object positions and workspace configurations change frequently, demonstrating end-to-end integration of language understanding, visual perception, and robotic manipulation.

Python VLA Computer Vision AI Models Dobot Nova Robotic Arm

Generative AI - Virtual Apparel Try-On

Developed a Generative AI pipeline that transforms a user's static image into a dynamic video, realistically simulating them wearing selected garments in motion.

Generative AI Deep Learning Computer Vision

Cloud-Based Object Detection & Quality Inspection

Engineered and deployed cloud-based object detection and computer vision solutions on AWS (EC2, S3, CloudWatch, RDS). Improved automated quality inspection systems, achieving 90% accuracy, with anomaly detection for manufacturing and logistics.

AWS Docker Object Detection Anomaly Detection

Manufacturing Defect Detection with Anomalib

Built an anomaly detection system for manufacturing defect identification using Anomalib. Customized and improved core Anomalib components to boost detection accuracy for domain-specific defect patterns, enabling automated quality inspection on production lines.

Anomalib Anomaly Detection PyTorch Manufacturing

Foaming Depth Measurement for Enzyme Production

Developed a CNN and Detectron2-based computer vision system to measure the depth of foaming during the enzyme manufacturing process, enabling real-time monitoring and precise control of foaming levels for consistent production quality.

Detectron2 CNN Computer Vision Depth Estimation

Human Pose Comparison

Compares the poses of an instructor and a student and gives the student real-time feedback on their form in activities such as dance, yoga, or other exercises, helping them identify areas for improvement.

Pose Estimation Computer Vision Real-time
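
One way to score how closely a student's pose matches an instructor's is to compare joint angles computed from 2D keypoints. The sketch below is illustrative only: the keypoint names, the angle-difference metric, and the 15-degree tolerance are assumptions, not details taken from the project.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # degenerate case: two keypoints coincide
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))

# Hypothetical joints: (end point, vertex, other end point)
JOINTS = {
    "left_elbow": ("left_shoulder", "left_elbow", "left_wrist"),
    "left_knee": ("left_hip", "left_knee", "left_ankle"),
}

def pose_feedback(instructor, student, tolerance_deg=15.0):
    """Return {joint: signed angle error} for joints outside the tolerance."""
    feedback = {}
    for joint, (a, b, c) in JOINTS.items():
        ref = joint_angle(instructor[a], instructor[b], instructor[c])
        cur = joint_angle(student[a], student[b], student[c])
        if abs(cur - ref) > tolerance_deg:
            feedback[joint] = cur - ref
    return feedback
```

Both arguments are dictionaries mapping keypoint names to (x, y) coordinates, such as the output of any pose estimator. A negative error means the student's joint is more bent than the instructor's.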

Identity Verification of Indian Aadhaar and PAN Card

Fast and accurate detection of Indian Aadhaar and PAN cards in a user's uploaded image, followed by OCR to automatically extract the user's information, which is then cross-checked against the user's request form in banking software.

OCR Object Detection KYC

Face Liveness Detection

Combines facial recognition with passive face liveness detection to prevent spoofing of biometric samples in banking software.

Face Recognition Liveness Detection Anti-Spoofing

Face DeDupe

Uses clustering and facial landmarks to check whether the user image on a new request form already exists in the banking database. If a match is found but the user has entered conflicting details, the software warns of possible fraud.

Clustering Fraud Detection Face Recognition
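
The match-then-compare logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: faces are already reduced to numeric feature vectors, a nearest-centroid lookup stands in for the clustering step, and the 0.6 distance threshold and field names are hypothetical.

```python
import math

def find_match(vector, centroids, max_distance=0.6):
    """Return the id of the closest centroid within max_distance, else None."""
    best_id, best_dist = None, max_distance
    for person_id, centroid in centroids.items():
        dist = math.dist(vector, centroid)  # Euclidean distance
        if dist <= best_dist:
            best_id, best_dist = person_id, dist
    return best_id

def check_request(vector, details, centroids, records):
    """Classify a request as 'new', 'known', or 'possible_fraud'."""
    person_id = find_match(vector, centroids)
    if person_id is None:
        return "new"
    stored = records[person_id]
    # A conflict is any submitted field that disagrees with the stored record.
    conflicts = [k for k, v in details.items() if k in stored and stored[k] != v]
    return "possible_fraud" if conflicts else "known"
```

Here `centroids` maps each known person to a representative feature vector and `records` maps the same ids to their stored form details.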

Coastal Surveillance System - Myanmar

Integrated sensors over TCP, UDP, and RabbitMQ network links and processed their data for sensor fusion. Deployed in field operations for unauthorized ship detection.

Sensor Fusion RabbitMQ Surveillance Deployed

Integrated Perimeter Security System - India

A weapon detection module that applies object detection to real-time camera feeds to strengthen the surveillance system. Deployed in field operations for border security.

Weapon Detection Object Detection Real-time Deployed

Sensor Planning Deployment

An application to determine the minimum number of sensors required for effective surveillance of a designated area.

Optimization Surveillance Planning
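
Finding the fewest sensors that cover an area is an instance of the set-cover problem, and a common heuristic for it is greedy selection. The sketch below assumes the area has been discretized into cells and that each candidate site's visible cells are known. The function and parameter names are hypothetical, and the greedy result is an approximation rather than a guaranteed minimum. It is not necessarily the method the application uses.

```python
def greedy_sensor_placement(cells, coverage):
    """Pick sensor sites greedily until every cell is covered.

    cells: set of cell ids that must be observed.
    coverage: dict mapping candidate site -> set of cells it can observe.
    Returns the chosen sites in selection order.
    """
    uncovered = set(cells)
    chosen = []
    while uncovered:
        # Choose the site that observes the most still-uncovered cells.
        site = max(coverage, key=lambda s: len(coverage[s] & uncovered))
        gained = coverage[site] & uncovered
        if not gained:
            raise ValueError("remaining cells cannot be covered by any site")
        chosen.append(site)
        uncovered -= gained
    return chosen
```

For example, with sites A, B, and C observing cells {1, 2, 3}, {3, 4}, and {4, 5}, the greedy pass selects A first and then C, covering all five cells with two sensors.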

Master's Thesis: Localization using ML in Partially Dynamic Environment

Proposed and tested a CNN-LSTM methodology for real-time robot localization in static and dynamic environments. Improved on state-of-the-art accuracy.

CNN-LSTM Localization Robotics

Self-Driving Car in Simulator

A CNN based on NVIDIA's architecture that predicts steering angle, throttle, brake, and speed, the basic parameters for driving a car. The model infers these parameters in real time for each frame.

CNN Autonomous Driving Real-time Inference

Robot Navigation using Overhead Camera and Reinforcement Learning

A real-time reinforcement learning solution that guides the robot from its initial position to collect a ball and score a goal.

Reinforcement Learning Robotics Computer Vision

Semi-Automated FAQ-Retrieval System

An FAQ system using data mining and NLP that quickly returns the top 10 best-matching questions for a user query.

NLP Data Mining Information Retrieval
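
Top-k question retrieval of this kind is often built on TF-IDF vectors ranked by cosine similarity. The sketch below uses that approach as an assumption. The tokenizer, the smoothed IDF formula, and all function names are illustrative and are not taken from the project.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(questions):
    """Compute IDF weights and a TF-IDF vector for each stored question."""
    docs = [Counter(tokenize(q)) for q in questions]
    n = len(docs)
    # Document frequency: number of questions containing each term.
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_matches(query, questions, idf, vectors, k=10):
    """Return up to k stored questions ranked by similarity to the query."""
    counts = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = sorted(zip(vectors, questions),
                    key=lambda pair: cosine(q_vec, pair[0]), reverse=True)
    return [question for _, question in scored[:k]]
```

The index is built once from the stored questions, so each query only needs one pass over the precomputed vectors.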

Hand Gesture Recognition

Real-time detection of hand gestures from a fixed set of American Sign Language gesture classes.

Sign Language Object Detection Real-time

Open Source

Tools I've built and published to PyPI — covering the full CV workflow from annotation to training to image processing.

alchemycv

Python library for streamlining computer vision workflows — image masking, edge detection, and common CV operations.

$ pip install alchemycv

PyPI GitHub

Python OpenCV PyPI

AlchemyAnnotate

Desktop annotation tool for drawing bounding boxes on images. Exports to YOLO, Pascal VOC, and COCO formats for model training.

$ pip install alchemyannotate

PyPI GitHub

Python PySide6 YOLO COCO
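
The three export formats store the same bounding box differently: Pascal VOC uses pixel corners, COCO uses the top-left corner plus width and height in pixels, and YOLO uses a center point and size normalized by the image dimensions. The sketch below shows the arithmetic only. The function names are hypothetical and are not part of AlchemyAnnotate's API.

```python
def voc_to_coco(xmin, ymin, xmax, ymax):
    """Pascal VOC corners -> COCO [x, y, width, height] in pixels."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]

def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Pascal VOC corners -> YOLO (cx, cy, w, h), normalized to [0, 1]."""
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    return (cx, cy, (xmax - xmin) / img_w, (ymax - ymin) / img_h)
```

A YOLO label file additionally prefixes each box with its class index, which is omitted here.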

AlchemyDetect

Desktop GUI for training and running inference with Detectron2 models — Faster R-CNN, RetinaNet, and Mask R-CNN with real-time loss plots.

$ pip install alchemydetect

PyPI GitHub

Python Detectron2 PyTorch PyQt6

AlchemyCloud

Desktop editor and viewer for 3D point clouds — load PCD/PLY files, crop, downsample, measure, and export, with smooth interactive inspection.

Python Open3D PySide6 3D

Coming soon

AlchemyFace

Annotate faces, build a known-person database, and configure detection & recognition pipelines — label identities and export ready-to-use config.

Python OpenCV Face Recognition PySide6

Coming soon

Contact

+91-8791327684

prashantrawatmailbox@gmail.com

github.com/kouya-marino

linkedin.com/in/prashantrawat

Let's work together

Have a project, role, or collaboration in mind? Send me an email and I'll get back to you as soon as I can.
