Full Time Chatbot Tester Jobs (NOW HIRING)

On-site (5 days per week) Employment Type: Full-Time Compensation: $180,000-$350,000+ USD ... This is not a chatbot, prompt-engineering, or RAG-wrapper opportunity. The engineering team is ...

New

Lead Web Developer

Reston, VA · On-site

$125K/yr

Hybrid or Remote Terms: Full Time Clearance: U.S. Citizenship Preferred; Ability to Obtain Public ... Lead the technical design, development, configuration, testing support, and release of website ...

Lead Web Developer

Reston, VA · On-site +1

$125K/yr

Hybrid or Remote Terms: Full Time Clearance: U.S. Citizenship Preferred; Ability to Obtain Public ... Lead the technical design, development, configuration, testing support, and release of website ...

USRA is seeking a full-time Software Engineer to work in their Huntsville, AL office, collaborating ... testing and debugging procedures, technical documentation, and quality assurance • Provide ...

Full stack engineer

New York, NY · On-site

$100K - $180K/yr

An AI-powered Slack chatbot that keeps teams connected to their marketing metrics in real-time ... This is a full-time in-office position (NYC). Engineers are expected to be in office 3-4 days/week.

Web Administrator

Reston, VA · On-site +1

$85K/yr

Hybrid or Remote Terms: Full Time Clearance: U.S. Citizenship Preferred; Ability to Obtain Public ... Provide basic functional, browser, mobile, accessibility, and regression testing for content ...

Full Time Chatbot Tester information

Hourly pay scale (chart): $10 · $38 · $62

How much do full time chatbot tester jobs pay per hour?

As of Jul 4, 2026, the average hourly pay for a full-time chatbot tester in the United States is $38.36, according to ZipRecruiter salary data. Most workers in this role earn between $21.39 and $50.72 per hour, depending on experience, location, and employer.

What are some common challenges Full Time Chatbot Testers face when evaluating AI-driven conversations?

Full Time Chatbot Testers often encounter challenges such as identifying nuanced language errors, ensuring the chatbot handles unexpected user inputs gracefully, and verifying that responses are both accurate and contextually appropriate. Testers must also adapt to frequent updates in chatbot algorithms and features, requiring flexibility and keen attention to detail. Collaboration with developers and UX designers is essential to communicate findings and suggest improvements, making strong teamwork and documentation skills valuable in this role.

What are the key skills and qualifications needed to thrive as a Full Time Chatbot Tester, and why are they important?

To thrive as a Full Time Chatbot Tester, you need a solid understanding of software testing methodologies, attention to detail, and experience with conversational AI platforms, typically supported by a relevant degree or certifications in computer science or QA. Familiarity with testing tools like Selenium, Jira, and chatbot development environments such as Dialogflow or Rasa is often expected. Strong analytical thinking, communication, and problem-solving skills help testers identify issues and collaborate effectively with development teams. These skills are crucial to ensuring chatbots deliver accurate, natural, and user-friendly experiences while meeting quality standards.

What does a Full Time Chatbot Tester do?

A Full Time Chatbot Tester is responsible for evaluating and ensuring the quality and performance of chatbots before they are released to users. Their duties include creating and running test scenarios, identifying bugs or issues, and providing feedback to developers for improvements. They may test for functionality, user experience, natural language understanding, and integration with other systems. Their work helps ensure that chatbots deliver accurate, helpful, and seamless interactions for end users.

What is the difference between a Full Time Chatbot Tester and a QA Tester?

Aspect | Full Time Chatbot Tester | QA Tester
Credentials | Basic testing knowledge, sometimes certifications in software testing | Certifications like ISTQB, testing experience
Work Environment | Tech companies, remote or office-based, focus on AI/chatbot platforms | Varied industries, office or remote, focus on software quality
Industry Usage | Primarily in AI, customer service, and tech sectors | Across multiple industries including software, manufacturing, healthcare

The Full Time Chatbot Tester specializes in testing AI-driven chatbots, ensuring conversational accuracy and functionality, often within tech companies. QA Testers have a broader scope, testing various software products across industries. While both roles require testing skills and certifications, chatbot testers focus specifically on conversational AI, whereas QA testers handle diverse software quality assurance tasks.

Test Engineer-AI/LLM

OPPO US Research Center

Palo Alto, CA • On-site

Full-time

Posted 8 days ago


Job description

OPPO US Research Center is seeking a meticulous and innovative full-time AI/LLM Test Engineer to join our cutting-edge AI team. In this critical role, you will evaluate the performance, reliability, and safety of Large Language Models (LLMs) in real-world product scenarios and test end-to-end generative AI solutions. Your work will directly shape how users experience AI-powered features by ensuring robustness, accuracy, and alignment with product goals. This is a unique opportunity to pioneer testing methodologies for next-generation AI systems at the forefront of technology.

We are also seeking a contract-based LLM Evaluation & QA Engineer to support the testing and validation of large language model (LLM)-powered applications. You will help implement test strategies, execute evaluation workflows, and assist in model performance validation across diverse generative AI use cases.

This contract role is ideal for someone with hands-on experience in AI/ML evaluation, QA engineering, or data analysis who wants to deepen their exposure to generative AI systems.

Requirements

Full-time position requirements:

Core Testing & Evaluation

  • Design and execute performance tests for LLMs across diverse product use cases (e.g., chatbots, content generation).
  • Develop automated test frameworks to evaluate LLM outputs for accuracy, bias, safety, and coherence.
  • Conduct end-to-end testing of integrated generative AI solutions, including APIs, data pipelines, and user interfaces.

Optimization & Validation

  • Collaborate with ML engineers to validate fine-tuned models and optimize prompts for target scenarios.
  • Analyze model failures, edge cases, and adversarial inputs to identify risks and improvement areas.
  • Benchmark LLM performance against industry standards and product-specific KPIs.

Collaboration & Quality Assurance

  • Partner with product, engineering, and research teams to define test requirements and acceptance criteria.
  • Document defects, performance metrics, and test results to drive data-driven improvements.
  • Advocate for AI ethics and safety through rigorous testing of fairness, bias mitigation, and content moderation.

Innovation & Tooling

  • Build scalable tools for synthetic test data generation, prompt variation testing, and automated evaluation workflows.
  • Stay current with advancements in generative AI testing, including red-teaming techniques and evaluation frameworks (e.g., HELM, Dynabench).
  • Propose novel testing strategies for emerging challenges (e.g., hallucinations, context drift).

Basic Qualifications:

  • Bachelor’s degree in Computer Science, Data Science, Engineering, or a related technical field, or equivalent practical experience.
  • 1+ years of experience in software testing, data science, or ML validation, with exposure to AI/ML systems.
  • Proficiency in Python and testing frameworks (e.g., PyTest, Selenium).
  • Hands-on experience evaluating LLMs in production environments (e.g., GPT, Claude, Llama, Gemini).
  • Strong analytical skills for dissecting model behavior, statistical performance, and failure modes.
  • Familiarity with cloud platforms (GCP, Azure, or AWS) and MLOps tooling (e.g., MLflow, Weights & Biases).
  • Experience with version control (Git) and agile development methodologies.

Preferred Qualifications:

  • Master’s degree in AI, Machine Learning, or a related field.
  • Expertise in prompt engineering, LLM fine-tuning (e.g., LoRA, RLHF), or optimization techniques.
  • Experience with automated evaluation tools (e.g., LangChain, TruLens) or LLM-specific test suites.
  • Knowledge of data pipelines, SQL/NoSQL databases, and API testing (e.g., Postman).
  • Background in statistics, quantitative analysis, or data visualization for test insights.
  • Contributions to AI safety/ethics initiatives or open-source LLM evaluation projects.
  • Experience testing mobile-integrated AI solutions (Android/iOS).

Contractor position requirements:

Testing & Evaluation Support:

  • Execute pre-defined performance tests for LLMs across various tasks (e.g., summarization, Q&A, chatbot flows).
  • Run scripted evaluations to assess outputs for factuality, coherence, and safety.
  • Perform manual and automated test execution on APIs and LLM-integrated user interfaces.

Prompt & Model Validation:

  • Assist ML engineers in evaluating prompt variations and prompt-tuning outcomes.
  • Log and analyze failure cases, anomalies, and edge cases based on provided guidelines.

Collaboration & Documentation:

  • Work with QA leads, product managers, and ML engineers to understand test goals and criteria.
  • Report defects, compile evaluation summaries, and maintain testing logs.

Tooling & Automation:

  • Use existing internal tools or frameworks to automate test runs and result collection.
  • Contribute to prompt generation, input templating, or result tagging processes.

Basic Qualifications:

  • Bachelor's degree or equivalent work experience in a technical field (e.g., Computer Science, Engineering, Data Science).
  • 6+ months of experience in software QA, data labeling, LLM evaluation, or ML testing projects.
  • Basic Python proficiency, especially for data processing and automation tasks.
  • Familiarity with LLMs (e.g., GPT, Claude, Gemini) and prompt-based outputs.
  • Comfortable working with tools like Jupyter, Postman, or testing dashboards.
  • Detail-oriented with good documentation habits.

Contractor Details:

  • Duration: Long term
  • Rate: Commensurate with experience
  • Conversion Opportunity: High-performing contractors may be considered for full-time roles

Benefits

OPPO is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

The US base salary range for this full-time position is $100,000-$200,000 + bonus + long-term incentive benefits. Our salary ranges are determined by role, level, and location.