Perception Engineer Remote Jobs (NOW HIRING)

Senior Machine Learning Engineer - Platform

$107K - $146K/yr

This is a remote position for candidates based in the US. You should apply if: * You want to impact ... Architect a Unified Perception Layer: Lead the transition from fragmented, task-specific models to ...

Flight Research Remote Pilot

Yuma, AZ · Remote

$91K - $125K/yr

We build and deploy autonomy, perception, planning, and radar systems across conventional, electric ... Flight test engineering * UAS operations within the National Airspace System beyond Part 107

Senior Machine Learning Engineer

Detroit, MI · On-site +1

$126K - $180K/yr

Requirements * 5+ years of professional experience developing and implementing ML for perception ... Proficiency in Unix-based environments (Linux, macOS) including working with remote servers and ...

Perception Engineer Remote information

How much do perception engineer remote jobs pay per hour?

As of Jun 20, 2026, the average hourly pay for a remote perception engineer in the United States is $55.99, according to ZipRecruiter salary data. Most workers in this role earn between $40.14 and $74.52 per hour, depending on experience, location, and employer.
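
To compare these hourly figures with the annual salaries shown in the listings, a rough conversion can help. The sketch below assumes a 2,080-hour work year (40 hours a week for 52 weeks); that figure is an assumption for illustration, not something stated on this page, and the helper name `hourly_to_annual` is hypothetical.

```python
# Assumption: a full-time year is 40 hours/week * 52 weeks = 2,080 paid hours.
HOURS_PER_YEAR = 40 * 52


def hourly_to_annual(hourly_rate: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate in dollars to an approximate annual salary."""
    return round(hourly_rate * hours_per_year, 2)


if __name__ == "__main__":
    # Hourly figures quoted in the answer above: typical low, average, typical high.
    for label, rate in [("low", 40.14), ("average", 55.99), ("high", 74.52)]:
        print(f"{label}: ${rate:.2f}/hr is about ${hourly_to_annual(rate):,.0f}/yr")
```

Under this assumption the $55.99 average works out to roughly $116K a year. Actual annual pay depends on hours worked, paid leave, and whether the role is salaried or contract.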

What does a Perception Engineer do?

A Perception Engineer designs and implements systems that allow machines, such as autonomous vehicles or robots, to interpret and understand their environment using sensors like cameras, LiDAR, or radar. They develop algorithms for object detection, classification, and tracking to enable real-time decision-making. Perception Engineers often work with machine learning, computer vision, and sensor fusion technologies to improve the accuracy and reliability of these systems. Working remotely, they collaborate with cross-functional teams using online tools and platforms. This role is critical in advancing autonomous systems and robotics.

What are some common challenges Perception Engineers face when working remotely, and how can they be addressed?

Perception Engineers working remotely often encounter challenges such as limited access to physical hardware (e.g., sensors, robots, or vehicles) for testing algorithms, and difficulties in real-time collaboration with team members across different time zones. To address these issues, many remote teams rely on advanced simulation tools, cloud-based platforms for data sharing, and regular virtual meetings to ensure alignment. Building strong communication habits and leveraging remote debugging tools can also help maintain productivity and foster effective teamwork.

What are the key skills and qualifications needed to thrive as a Perception Engineer (Remote), and why are they important?

To thrive as a Perception Engineer (Remote), you need a strong background in computer vision, machine learning, and sensor fusion, typically supported by a degree in computer science, robotics, or a related field. Proficiency in programming languages like Python and C++, experience with frameworks such as TensorFlow or PyTorch, and familiarity with tools like ROS are commonly required. Excellent problem-solving skills, effective communication, and the ability to work independently are vital soft skills for remote collaboration. These competencies are crucial for developing robust perception systems that enable autonomous technologies to interpret and interact with their environment accurately.

What is the difference between a Perception Engineer Remote and a Computer Vision Engineer?

Aspect | Perception Engineer Remote | Computer Vision Engineer
Required Credentials | Bachelor's or higher in CS, Electrical Engineering, or related; experience with perception algorithms | Bachelor's or higher in CS, Electrical Engineering, or related; strong programming skills in Python/C++
Work Environment | Remote, collaborative teams, often in autonomous vehicles, robotics, or AI | Remote or on-site, working on image processing, object detection, and visual data analysis
Industry Usage | Autonomous vehicles, robotics, AI startups | Automotive, robotics, surveillance, and AI research

While both roles focus on perception and visual data, remote Perception Engineers typically work on sensor data integration and perception algorithms for autonomous systems, whereas Computer Vision Engineers focus more broadly on image processing and visual data analysis, often in research or product development settings.

Infographic: Perception Engineer Remote job openings in the United States as of June 2026. Employment type is 100% full-time and job distribution is 100% remote, with an average salary of $116,463 per year, or $56 per hour.

AI Research Engineer: Vision AI / VLM / Physical AI

Centific

Remote

$209K/yr

Full-time

This job post expired 1 day ago. Applications are no longer accepted.


Job description

About Centific
Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem (comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets) to create contextual, multilingual, pre-trained datasets; fine-tuned, industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.
Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.
About Job
AI Research Engineer: Vision AI / VLM / Physical AI
Company: Centific
Location: Seattle, WA (or Remote)
Type: Full-time
Build the Future of Perception & Embodied Intelligence
Are you pushing the frontier of computer vision, multimodal large models, and embodied/physical AI, and have the publications to show it? Join us to translate cutting-edge research into production systems that perceive, reason, and act in the real world.
The Mission
We are building state-of-the-art Vision AI across 2D/3D perception, egocentric/360° understanding, and multimodal reasoning. As an AI Research Engineer, you will own high-leverage experiments from paper → prototype → deployable module in our platform.
We are seeking passionate engineers to join our cutting-edge labs. You could be part of:
  • Computer Vision team: As a Research Engineer, dive into the world of 3D reconstruction, scene understanding, and visual AI. You'll explore innovative techniques like those used to transform real-world spaces into immersive 3D models, such as the 3D Reconstruction projects, and work with cutting-edge architectures like VGG-T (Visual Geometry Grounded Transformers), known for advancing deep learning in vision tasks. This role is perfect for those excited to develop AI systems that interpret, reconstruct, and interact with the visual world, using state-of-the-art tools and methodologies.
  • Physical AI Robotics team: Work at the intersection of simulation, robotics, and AI. You'll leverage NVIDIA's Omniverse for advanced 3D simulation and collaboration, Isaac Sim for robotics training and testing, and GR00T for foundation models in robotics. Experience with Holoscan SDK for real-time medical and industrial robotics pipelines, Newton Physics for dynamic simulation, and NVIDIA's NERD for neural robot dynamics will be a plus. This role is ideal for those eager to push the boundaries of AI-driven robotics using state-of-the-art tools and frameworks.
What You'll Do
  • Advance Visual Perception: Build and fine-tune models for detection, tracking, segmentation (2D/3D), pose & activity recognition, and scene understanding (incl. 360° and multi-view).
  • Multimodal Reasoning with VLMs: Train/evaluate vision-language models (VLMs) for grounding, dense captioning, temporal QA, and tool use; design retrieval-augmented and agentic loops for perception-action tasks.
  • Physical AI & Embodiment: Prototype perception-in-the-loop policies that close the gap from pixels to actions (simulation + real data). Integrate with planners and task graphs for manipulation, navigation, or safety workflows.
  • Data & Evaluation at Scale: Curate datasets, author high-signal evaluation protocols/KPIs, and run ablations that make irreproducible results impossible.
  • Systems & Deployment: Package research into reliable services on a modern stack (Kubernetes, Docker, Ray, FastAPI), with profiling, telemetry, and CI for reproducible science.
  • Agentic Workflows: Orchestrate multi-agent pipelines (e.g., LangGraph-style graphs) that combine perception, reasoning, simulation, and code generation to self-check and self-correct.

Example Problems You Might Tackle
  • Long-horizon video understanding (events, activities, causality) from egocentric or 360° video.
  • 3D scene grounding: linking language queries to objects, affordances, and trajectories.
  • Fast, privacy-preserving perception for on-device or edge inference.
  • Robust multi-modal evaluation: temporal consistency, open-set detection, uncertainty.
  • Vision-conditioned policy evaluation in sim (Isaac/MuJoCo) with sim2real stress tests.

Minimum Qualifications
  • Master's/Ph.D. in CS/EE/Robotics (or related), actively publishing in CV/ML/Robotics (e.g., CVPR/ICCV/ECCV, NeurIPS/ICML/ICLR, CoRL/RSS).
  • Strong PyTorch (or JAX) and Python; comfort with CUDA profiling and mixed-precision training.
  • Demonstrated research in computer vision and at least one of: VLMs (e.g., LLaVA-style, video-language models), embodied/physical AI, 3D perception.
  • Proven ability to move from paper → code → ablation → result with rigorous experiment tracking.

Preferred Qualifications
  • Experience with video models (e.g., TimeSformer/MViT/VideoMAE), diffusion or 3D GS/NeRF pipelines, or SLAM/scene reconstruction.
  • Prior work on multimodal grounding (referring expressions, spatial language, affordances) or temporal reasoning.
  • Familiarity with ROS2, DeepStream/TAO, or edge inference optimizations (TensorRT, ONNX).
  • Scalable training: Ray, distributed data loaders, sharded checkpoints.
  • Strong software craft: testing, linting, profiling, containers, and reproducibility.
  • Public code artifacts (GitHub) and first-author publications or strong open-source impact.

Our Stack (you'll touch a subset)
  • Modeling: PyTorch, torchvision/lightning, Hugging Face, OpenMMLab, xFormers
  • Perception: YOLO/Detectron/MMDet, SAM/Mask2Former, CLIP-style backbones, optical flow
  • VLM / LLM: Vision encoders + LLMs, RAG for video, Toolformer/agent loops
  • 3D / Sim: Open3D, PyTorch3D, Isaac/MuJoCo, COLMAP/SLAM, NeRF/3DGS
  • Systems: Python, FastAPI, Ray, Kubernetes, Docker, Triton/TensorRT, Weights & Biases
  • Pipelines: LangGraph-like orchestration, data versioning, artifact stores

What Success Looks Like
  • A publishable or open-sourced outcome (with company approval) or a production-ready module that measurably moves a product KPI (latency, accuracy, robustness).
  • Clean, reproducible code with documented ablations and an evaluation report that a teammate can rerun end-to-end.
  • A demo that clearly communicates capabilities, limits, and next steps.

Why Centific
  • Real impact: Your research ships, powering core features in our MVPs and products.
  • Mentorship: Work closely with our Principal Architect and senior engineers/researchers.
  • Velocity + Rigor: We balance top-tier research practices with pragmatic product focus.

Salary: $90K Annually
Centific is an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.