Job Summary:
Salesforce is the #1 AI CRM, where humans with agents drive customer success together. The software engineer role at Salesforce encompasses architecture, design, implementation, and testing to ensure high-quality product releases, focusing on building reliable and scalable machine learning infrastructure.
Responsibilities:
• Design, build, and operate systems to train, serve, and deploy machine learning models at scale, with a focus on reliability, performance, and operational simplicity
• Evolve GPU-backed inference infrastructure to support high-throughput, latency-sensitive workloads, including large-scale model serving
• Architect and optimize distributed training and data processing systems using platforms such as Ray, Airflow, Spark, or similar technologies
• Build and maintain Kubernetes-based platforms and orchestration layers using tools such as KubeRay, vLLM, and internally developed services
• Architect solutions that bridge legacy systems with modern technologies while maintaining monolithic application stability
• Develop robust monitoring, observability, and alerting for production ML workloads to ensure operational excellence
• Partner closely with AI Platform, ML modeling, security, and product engineering teams to design infrastructure that supports evolving AI use cases
• Provide technical leadership through design reviews and mentorship, and by setting engineering standards and long-term architectural direction for ML infrastructure
• Author technical design and architecture documentation, and contribute thought leadership through engineering blog posts
• Build and ship production-grade software using modern engineering practices, with AI as a core part of your development workflow, pushing the boundaries of AI development tools to deliver secure, optimized, and high-quality code
• Design and orchestrate complex systems where AI agents integrate seamlessly into human workflows, driving efficiency and innovation at scale
• Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably
• Critically evaluate code (human- or AI-generated) for correctness, quality, security, and performance
Qualifications:
Required:
• Significant professional experience in software engineering with a strong focus on infrastructure, backend systems, platform engineering, or MLOps
• Deep experience building and operating distributed systems, including expert-level knowledge of Kubernetes and container-based platforms
• Hands-on experience with modern ML infrastructure and serving stacks such as Ray or KubeRay, vLLM, or similar training and inference orchestration frameworks
• Experience working with GPU infrastructure, including performance optimization and operational management at scale
• Strong experience with data infrastructure and orchestration technologies such as Airflow, Spark, or similar systems
• Experience building and operating cloud-native systems on public cloud platforms such as AWS, GCP, or Azure, including infrastructure as code
• A demonstrated ability to drive technical direction for complex systems and balance short-term delivery with long-term architectural goals
• Excellent written communication and the ability to thrive in an asynchronous, globally distributed infrastructure team
• A related technical degree
• A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty
• Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor, etc.) in development workflows
• Advanced prompt engineering skills, including the ability to write precise, structured prompts and to cultivate the system context that makes AI outputs reliable, secure, and production-ready
Company:
Slack is a cloud-based communication and collaboration platform for teams. It is a sub-organization of Salesforce. Founded in 2009, the company is headquartered in San Francisco, USA, with a team of 1,001-5,000 employees. The company is currently late stage.