Vision Language Model Jobs (NOW HIRING)

The ideal candidate will have advanced working knowledge of data analytics, modern machine learning algorithms, foundation models, large language models, vision-language models, small language models ...

We're looking for a Vision Language Model (VLM) & Visual Foundation Model (VFM) Forward Deployed Engineer to operate at the forefront of visual and multi-modal intelligence deployment in industry ...

You'll architect and implement Vision-Language-Action (VLA) models, advance reinforcement learning applications, and push the boundaries of multimodal AI integration. This role combines deep ...

You'll play a pivotal role in building advanced vision pipelines (detection, segmentation, transformers, 3D vision) and integrating them with large language models (LLMs) and vision-language models ...

Vision Language Model information

[Salary chart: hourly pay markers at $10, $31, and $67]

How much do vision language model jobs pay per hour?

As of Jun 4, 2026, the average hourly pay for vision language model jobs in the United States is $31.37, according to ZipRecruiter salary data. Most workers in this role earn between $18.99 and $39.18 per hour, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Vision Language Model Engineer, and why are they important?

To thrive as a Vision Language Model Engineer, you need a strong background in computer vision, natural language processing, machine learning, and often a graduate degree in computer science or a related field. Proficiency with deep learning frameworks such as TensorFlow or PyTorch, experience with large-scale datasets, and familiarity with model deployment tools are typically required. Strong problem-solving skills, creativity, and effective collaboration abilities help you stand out in this rapidly evolving field. These skills are essential for developing advanced AI systems that accurately interpret and generate language grounded in visual data, driving innovation in applications like image captioning and visual question answering.

What are some common challenges faced by professionals working with Vision Language Models, and how can they be addressed?

Professionals working with Vision Language Models often encounter challenges such as aligning visual and textual data, handling large-scale datasets, and ensuring model interpretability. Dealing with noisy or incomplete data from either modality can affect model performance, so strong data preprocessing and augmentation skills are essential. Collaboration with multidisciplinary teams—including data engineers, machine learning researchers, and domain experts—is key to refining models and deploying them effectively. Staying updated with the latest advancements and leveraging open-source resources can also help address these challenges.

What is a Vision Language Model?

A Vision Language Model (VLM) is an artificial intelligence system designed to understand and generate information using both visual data (like images or videos) and textual data (like written language). These models are trained on large datasets containing images paired with descriptive text, allowing them to perform tasks such as image captioning, visual question answering, and multimodal content generation. VLMs use advanced machine learning techniques to learn the relationships between visual elements and language, making them valuable for applications that require an integrated understanding of both modalities. They are widely used in fields such as robotics, accessibility technology, and automated content creation.

What is the difference between a Vision Language Model Engineer and a Computer Vision Engineer?

| Aspect | Vision Language Model | Computer Vision Engineer |
| --- | --- | --- |
| Required credentials | Advanced degrees in AI, Machine Learning, or related fields | Degree in Computer Science, Electrical Engineering, or related fields |
| Work environment | Research labs, AI startups, tech companies focusing on multimodal AI | Tech companies, research institutions, industries applying image analysis |
| Industry usage | Developing multimodal AI systems combining vision and language | Creating algorithms for image recognition, object detection, and analysis |
| Search and comparison intent | Understanding roles in AI development involving vision and language | Focus on technical image processing and computer vision applications |

While both roles involve working with visual data, a Vision Language Model Engineer specializes in integrating visual and textual information using advanced AI techniques, often in research or product development. In contrast, a Computer Vision Engineer focuses on developing algorithms for analyzing and interpreting visual data, primarily in applications like image recognition and object detection.

Infographic showing Vision Language Model job openings in the United States as of May 2026, with employment types broken down into 1% As Needed, 39% Full Time, 55% Part Time, 1% Temporary, 3% Contract, and 1% Nights. It highlights a job distribution of 91% Physical, 3% Hybrid, and 6% Remote, with an average salary of $65,246 per year, or $31.4 per hour.
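The annual and hourly figures above can be related with a quick conversion. The sketch below assumes a 40-hour week over 52 weeks (2,080 paid hours per year); that convention is an assumption made here for illustration and is not stated by the salary data, and the function names are hypothetical.

```python
# Assumption: a full-time year is 40 hours/week * 52 weeks = 2,080 hours.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual_salary, hours_per_year=HOURS_PER_YEAR):
    """Convert an annual salary to an hourly rate."""
    return annual_salary / hours_per_year


def hourly_to_annual(hourly_rate, hours_per_year=HOURS_PER_YEAR):
    """Convert an hourly rate to an annual salary."""
    return hourly_rate * hours_per_year


# The $65,246 annual average quoted above, expressed per hour.
print(f"{annual_to_hourly(65_246):.2f}")  # prints 31.37
```

Under that assumption, the $65,246 annual average works out to about $31.37 per hour, which agrees with the hourly figure quoted on this page after rounding.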
AI Solutions Architect

CNPC USA

Houston, TX • On-site

Other

This job post expired today. Applications are no longer accepted.


Job description

Company Profile:
CNPC USA is a subsidiary of China National Petroleum Corporation and serves as its North American headquarters. Our mission is to drive innovation through advanced research and development of next-generation technologies for oil and gas exploration and production.
Job Summary:
CNPC USA is seeking a highly experienced AI Solutions Architect to lead the design, prototyping, implementation, and integration of artificial intelligence, machine learning, generative AI, and industrial analytics solutions for oil and gas technology applications. This position is a key technical role responsible for translating open-ended business and technical challenges into scalable AI system architectures, decision-support tools, digital workflows, and production-ready analytical solutions.
The ideal candidate will have advanced working knowledge of data analytics, modern machine learning algorithms, foundation models, large language models, vision-language models, small language models, optimization methods, operations research, and modern decision science. This role will work closely with subject-matter experts, product champions, product managers, designers, and software engineers to develop AI-enabled solutions that support CNPC USA technology development, product commercialization, and energy-domain digital transformation.
Key Responsibilities:
  • Conduct exploratory and undirected technology development to address open-ended AI/ML problems and questions in the energy domain.
  • Participate in data science, artificial intelligence, machine learning, industrial analytics, decision science, and operations research initiatives.
  • Develop, prototype, and evaluate solutions using modern deep learning methods, foundation models, generative AI, modern NLP, vision-language models, small language models, and time-series analytics.
  • Research and assess next-generation technologies for inference, predictive modeling, general-purpose data-driven modeling, and optimization of complex systems.
  • Engineer appropriate system-level AI solutions in collaboration with subject-matter experts, product champions, product managers, designers, and software engineers.
  • Work with software engineering teams to integrate AI solutions into business workflows, cloud environments, data platforms, and production applications.
  • Prototype end-to-end data solutions across multiple cross-functional teams in high-visibility roles.
  • Generate innovative ideas, establish new technology development directions, and shape and execute technical projects from concept through deployment.
  • Maintain state-of-the-art knowledge and contribute to technical discussions, architecture reviews, project reviews, and expert assessments in related areas of responsibility.
  • Communicate sophisticated AI concepts, plans, recommendations, and results effectively to management, clients, technical stakeholders, and the broader business community.
  • Prepare oral and written reports, presentations, technical memoranda, project documentation, and executive-level summaries.
  • Work effectively with peers, management, operations groups, and outside organizations to advance technology development and deployment.
  • Participate in relevant technical reviews and audits of projects as requested.
  • Review, mentor, and coach junior team members while defining and promoting standards, best practices, reusable architectures, and lessons learned.
  • Actively disseminate knowledge through webinars, talks, tutorials, technical communities, and internal training activities.
Minimum Education & Experience Requirements:
  • Master's degree in Operations Research, Industrial Engineering, Applied Mathematics, Computer Science, or a related STEM field, or foreign equivalent.
  • Three (3) years of post-baccalaureate experience in the job offered or in any AI/data science-related job title.
Applicants must have three (3) years of experience in each of the following:
  • AI and data science in the decision science and operations research space using software implementation technology.
  • Markov decision process methods and applications.
  • Data mining for analytics and decision making.
  • LLM-based generative AI solution development.
  • Vision-language model and small language model system development.
  • Modern NLP development in AI.
  • Computational intelligence and non-convex optimization techniques.
  • Time-series analysis techniques using statistics and AI.
  • Applied mathematics and statistics.
  • Cloud development tools and cloud environments for AI, data mining, and large-scale data systems.
  • Optimization solver tools, including CPLEX.
  • Programming languages and frameworks for modern AI and data science, including Python, R, TensorFlow, and PyTorch.
Preferred Experience:
  • Experience applying AI/ML, optimization, and decision science to oilfield, drilling, completion, reservoir, production, or other oil and gas related domains.
  • Experience architecting end-to-end AI systems, including data pipelines, model development, model serving, evaluation, monitoring, and workflow integration.
  • Experience with generative AI application patterns such as retrieval-augmented generation, domain-specific copilots, multimodal AI workflows, and human-in-the-loop decision support.
  • Experience translating ambiguous business needs into technical roadmaps, architecture options, proof-of-concept demonstrations, and scalable implementation plans.
  • Experience leading cross-functional technical discussions and mentoring engineers or data scientists on AI solution design and best practices.

Physical Demands:
The physical demands described here are representative of those that must be met by an employee to successfully perform the essential job functions.
While performing the duties of this job, the employee is regularly required to talk or hear. This is a sedentary role; however, some filing, bending, and the ability to lift 20 lbs. are required.
Travel:
This position requires 5-10% domestic and international travel for internal workshops, project work sessions, technical workshops, conferences, and customer presentations. Local travel between other CNPC USA locations and testing or partner facilities may be required.
Work Arrangement:
Telecommuting is permitted for less than 50% of the work week, within the same geographic location as the assigned CNPC USA office.
Supervisory Responsibility:
This position has no direct supervisory responsibilities; however, it does act as a mentor and technical point of contact for less experienced engineers, data scientists, and AI/ML team members.
CNPC USA is an Equal Opportunity Employer (EOE). Qualified applicants are considered regardless of race, color, age, sex, sexual orientation, religion, disability, ethnicity, national origin, marital status, veteran status, or any other legally protected status.
Disclaimer: The job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee. Other duties, responsibilities and activities may change or be assigned at any time with or without notice.