Senior Prompt Engineer Jobs in Raleigh, NC (NOW HIRING)

Agentic Delivery Senior Engineer

Raleigh, NC

$101.60K - $139.50K/yr

Coach junior practitioners in prompt engineering, engineering quality, delivery methods, and ... Recruiting for this role ends on 5/29/2026 Work You'll Do The Agentic Delivery Senior Engineer will ...

Senior Machine Learning Engineer

Raleigh, NC · On-site

$101.60K - $139.50K/yr

Job Summary We are seeking a Senior Machine Learning Engineer with strong expertise in production ... Design and optimize prompt engineering and tool-calling logic for Generative AI applications.

AI Engineer Senior Consultant

Raleigh, NC · Hybrid

$101.60K - $139.50K/yr

Implement safety, privacy, and access controls (PII handling, prompt-injection defenses, content ... AI Engineer Senior Consultant Our Deloitte Human Capital team transforms technology platforms ...

Senior Software Engineer

Chapel Hill, NC · On-site +1

$160K - $180K/yr

Senior Software Engineer 4 ($160,000-$180,000) depending on qualifications Job Summary: Upon ... Familiarity with prompt engineering, vector databases, and frameworks like LangChain or LlamaIndex ...

AI Data Engineer - Senior Consultant

Raleigh, NC · Hybrid

$101.60K - $139.50K/yr

Implement safety, privacy, and access controls (PII handling, prompt-injection defenses, content ... AI Engineer Senior Consultant Our Deloitte Human Capital team transforms technology platforms ...

Senior Prompt Engineer information

Raleigh, NC salary details: $57.8K to $178.4K, with an average of $123K

How much do senior prompt engineer jobs pay per year?

As of May 28, 2026, the average yearly pay for a senior prompt engineer in Raleigh, NC is $123,024, according to ZipRecruiter salary data. Most workers in this role earn between $101,600 and $139,500 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Senior Prompt Engineer, and why are they important?

To thrive as a Senior Prompt Engineer, you need expertise in natural language processing, prompt engineering techniques, and a solid background in computer science or a related field. Familiarity with large language model APIs, AI development platforms, and tools like Python, Jupyter, and version control systems is typically required. Exceptional analytical thinking, creativity, and strong communication skills help you craft effective prompts and work collaboratively with multidisciplinary teams. These skills are essential to optimize model outputs, solve complex user challenges, and drive impactful AI solutions.

How does a Senior Prompt Engineer typically collaborate with cross-functional teams to optimize AI system outputs?

Senior Prompt Engineers frequently work alongside data scientists, machine learning engineers, product managers, and UX designers to fine-tune AI models and ensure prompt effectiveness. Collaboration often involves iterative testing, gathering feedback, and adjusting prompts based on real-world user interactions and business goals. This teamwork helps ensure outputs are accurate, contextually appropriate, and aligned with user expectations. Regular communication and shared documentation are essential to maintain alignment and drive continuous improvement.

What are Senior Prompt Engineers?

Senior Prompt Engineers are professionals who specialize in designing, refining, and optimizing prompts for AI language models, such as ChatGPT or other generative AI systems. They combine expertise in natural language processing, programming, and domain knowledge to ensure AI outputs are accurate, relevant, and aligned with user goals. Senior Prompt Engineers typically lead teams, develop prompt engineering best practices, and collaborate with product managers, data scientists, and developers to improve AI-driven products. Their work is crucial for organizations leveraging AI to enhance user experiences, automate workflows, or generate content.

Which 3 jobs will survive AI?

Senior Prompt Engineers are likely to continue thriving as AI relies on skilled professionals to design, refine, and manage prompts for effective outputs. Jobs that require complex problem-solving, creativity, and emotional intelligence—such as healthcare providers, educators, and specialized technical roles—are also expected to persist despite AI advancements. These roles often involve tasks that are difficult for AI to fully replicate or automate.

Senior Artificial Intelligence Engineer

BlueAlly Technology Solutions, LLC

Cary, NC · On-site

$106.50K - $146.30K/yr

Other

This job post expired 1 day ago. Applications are no longer accepted.


Job description

Senior AI Engineer

We are hiring a Senior AI Engineer to design, build, and operate enterprise AI systems across our client portfolio. You will work end-to-end across the AI stack — from inference engines and platform infrastructure (vLLM, KV cache, Dynamo-style serving, GPU-accelerated AI Factory platforms) up through application-level engineering (RAG pipelines, agent workflows, prompt engineering, evaluation methodology).

This role is for an engineer who can lead workstreams independently, mentor more junior engineers, and serve as the technical authority that clients trust to deliver production AI outcomes. You'll engage directly with client architects, data scientists, application teams, and executives — and you'll leave each engagement having raised both the client's capability and BlueAlly's practice.

Key Responsibilities

  • Lead end-to-end design, build, and operation of AI systems on AI Factory platforms (HPE PCAI, Dell AI Factory, Nutanix Enterprise AI, and adjacent ecosystem layers) across multiple client engagements.
  • Engineer and tune LLM inference serving stacks — primary depth in vLLM with breadth across the inference ecosystem — for client latency, throughput, and cost targets.
  • Tune inference performance through KV cache management, paged attention, batching strategies, and Dynamo-based disaggregated serving.
  • Architect and operate MLOps pipelines covering model lifecycle, registries, deployment, rollback, and observability.
  • Design and engineer RAG applications on top of vector databases — chunking strategies, retrieval tuning, reranking, citation handling, and context-window management.
  • Build and tune prompt-engineering patterns at production scale — system prompts, structured output, tool and function calling.
  • Design and maintain LLM evaluation harnesses — golden sets, regression suites, and online quality metrics.
  • Engineer high-performance storage and networking for AI workloads — parallel filesystems, object storage tiers, and high-throughput, low-latency RDMA fabrics.
  • Operate Kubernetes clusters underpinning AI workloads — namespaces, RBAC, resource quotas, network policies, storage classes, and ingress.
  • Build and maintain container images, registries, and CI/CD pipelines for AI/ML services.
  • Implement monitoring, alerting, logging, and capacity planning across the AI stack.
  • Harden environments to meet client security and compliance requirements.
  • Lead troubleshooting across bare metal, BIOS/firmware, OS, containers, GPUs, frameworks, and models.
  • Engage directly with client stakeholders — technical and executive — to communicate status, root cause, options, and recommendations.
  • Mentor less senior engineers and review their code; raise the technical bar of every engagement you join.
  • Author runbooks, reference architectures, and knowledge base content; lead client knowledge transfer and enablement sessions.
  • Participate in on-call rotation and incident response for production AI workloads.
  • Contribute reusable patterns, tooling, and reference designs back to the practice.

Required Qualifications

  • Experience: 7+ years of software, data, or infrastructure engineering, with 3+ years specifically working with modern AI / LLM systems.
  • Software engineering: Production-quality Python at engineering level — testing, code review, version control fluency, and shipping code that other engineers depend on.
  • Linux engineering: Deep production Linux experience, including system internals, performance tuning, and troubleshooting.
  • Containers: Deep proficiency with Docker — image build, registry management, runtime tuning, and container security.
  • Hardware fundamentals: Strong server-platform skills including CPU/GPU topologies, PCIe, BMC management, BIOS/firmware lifecycle, and physical-to-logical troubleshooting.
  • AI Factory platforms: Hands-on experience deploying and operating one or more of HPE PCAI, Dell AI Factory, or Nutanix Enterprise AI.
  • Inference stack — vLLM: Production experience deploying, tuning, and operating vLLM.
  • Inference stack breadth: Working knowledge of multiple inference and model-serving frameworks beyond vLLM, with the ability to choose and tune the right tool for each workload.
  • High-performance storage and networking: Hands-on experience with high-throughput, low-latency storage and network fabrics for AI workloads — including RDMA-class interconnects, parallel/object storage tiers, KV cache management, and Dynamo-style disaggregated serving.
  • MLOps: Practical experience operating MLOps tooling and patterns — model registries, deployment pipelines, GitOps, lineage, and rollback.
  • Vector databases and RAG: Hands-on experience deploying, tuning, and integrating vector databases and RAG pipelines, including the application-level engineering that sits on top of them.
  • Prompt engineering and tool use: Production experience designing system prompts, structured output, function calling, and tool-using LLM patterns.
  • Evaluation methodology: Demonstrated experience designing LLM evaluation harnesses — golden sets, regression suites, and quality/cost metrics.
  • Client-facing skills: Demonstrated ability to engage directly with client stakeholders — running working sessions, presenting recommendations, and translating technical detail for non-technical audiences.
  • Communication: Strong written and verbal communication — clear reference architectures, runbooks, and incident reports.
  • Mentorship: Track record of mentoring more junior engineers and raising team technical quality through code review and pairing.
  • Networking fundamentals: TCP/IP, DNS, load balancing, VLANs, and firewall administration.
  • Multi-client delivery: Comfort working across multiple concurrent client environments and managing competing priorities under SLA.

Preferred Qualifications

  • GPU operations: Experience with GPU drivers, CUDA toolchains, GPU partitioning (MIG/vGPU), and GPU-level monitoring.
  • NVIDIA AI Enterprise: Deployment and operations experience with the NVAIE software stack.
  • Ray: Familiarity with Ray for distributed training and inference scaling.
  • Kubernetes: Working knowledge of Kubernetes administration — Helm, ingress, RBAC, storage classes.
  • Identity and access: Integrating SSO and enterprise identity (LDAP, AD, OIDC/SAML), secrets management, tenant isolation.
  • Fine-tuning: Familiarity with LoRA/QLoRA/PEFT and supervised fine-tuning workflows.
  • Token economics: Experience optimizing inference cost — caching, prompt caching, model routing, and distillation.
  • MSP / multi-tenant operations: Service-provider experience including chargeback/showback and tenant isolation patterns.
  • Compliance frameworks: SOC 2, HIPAA, FedRAMP, FISMA, or CMMC environments.
  • Public cloud and hybrid: Working experience with one or more public clouds and hybrid architectures.
  • Infrastructure as Code: Terraform, Ansible, Helm, or similar.

Certifications (Preferred)

  • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).
  • Cloud certifications — AWS, Azure, or Google Cloud.
  • Linux certifications — RHCE, RHCSA, or LFCS.
  • NVIDIA-Certified Associate: AI Infrastructure and Operations (NCA-AIIO) or higher NVIDIA certifications.
  • HPE, Dell Technologies, or Nutanix platform certifications.

What Sets You Apart

  • Genuine curiosity about how AI systems work end-to-end — from kernel and GPU up through frameworks and models.
  • Track record of restoring production AI services under pressure.
  • Ability to translate complex technical concepts into clear, client-facing communication.
  • Comfort with ambiguity and rapid change in the AI/LLM ecosystem.
  • Service-oriented mindset: you treat each client environment as if it were your own.
  • Bias toward leaving the practice better than you found it — patterns, tooling, and reference designs.

About BlueAlly