Job Summary:
Dotmatics is a company dedicated to accelerating scientific innovation through its comprehensive digital science platform. The company is seeking a Senior AI Software Engineer to build and advance the Luma platform, focusing on AI agent workflows and tooling that integrate with enterprise data systems.
Responsibilities:
• Design and implement AI agent workflows and tooling using LangChain/LangGraph, enabling AI models to plan actions, call tools, use APIs, search information, and reliably complete multi-step workflows.
• Build and maintain the tools, function interfaces, and system connectors that AI agents use to interact with databases, document stores, enterprise apps, and external APIs.
• Ensure AI agents operate safely, follow rules, respect permissions, and reliably execute within defined constraints.
• Lead and execute the design and implementation of core workflow orchestration and tooling features, including automated tasks and background processes.
• Build scalable FastAPI services with well-defined RESTful APIs and real-time streaming endpoints.
• Create modular, reusable service components with strong authentication, error handling, and pagination patterns.
• Develop React frontend components for real-time interactions and data visualization, and guide their design.
• Implement multi-tenant architecture with secure isolation, resource boundaries, and long-term scalability.
• Provide technical guidance to other engineers during implementation, ensuring high-quality, maintainable solutions.
• Evaluate risk when implementing new features or refactoring, and propose safe rollout strategies.
• Apply Clean Architecture principles with clear separation of concerns.
• Apply microservices design patterns, including service discovery, API gateways, and interservice communication.
• Build event-driven architecture with message queues and async processing.
• Use subprocess isolation patterns for credential management and security boundaries.
• Influence architectural direction across teams, helping bring clarity and structure to ambiguous problems.
• Architect robust AI agent execution layers that ensure determinism, observability, and reliable stepwise execution.
• Write comprehensive automated tests using pytest and Jest, including integration and behavior-driven tests.
• Implement structured logging, correlation IDs, and observability patterns to ensure system clarity and operability.
• Contribute to and improve CI/CD pipelines with automated testing, linting, and deployment workflows.
• Set up effective monitoring and alerting for production systems.
• Lead or support critical incident resolution with calm, context-driven decision-making.
• Drive platform-wide improvements in performance, reliability, and technical quality.
• Exercise independent judgment in methods, techniques, and evaluation criteria to ensure robust outcomes.
• Instrument AI agent systems with monitoring, tracing, and guardrails to ensure safe and predictable behavior in production.
• Document architectural decisions, engineering patterns, and approaches that become long-term references for the team.
• Provide approach summaries and technical proposals before major implementations to ensure alignment with product and engineering partners.
• Participate in planning and estimation, applying deep technical judgment and strong product awareness.
• Mentor engineers, raise team capabilities, and guide others through complex engineering workflows (feature branches, PRs, ticket management).
• Build relationships across engineering and product groups, influencing roadmaps and cross-team initiatives.
• Communicate risks, challenges, and opportunities proactively and clearly to stakeholders.
• Document and evangelize best practices for safe, reliable, and maintainable AI agent design.
Qualifications:
Required:
• 10+ years of professional software development experience, including significant experience owning and delivering large-scale technical systems.
• Ability to design durable architectures, independently lead high-impact engineering efforts, and mentor other engineers while maintaining exceptional coding standards.
• Expert-level Python 3.11+ with deep understanding of async/await, type hints, and modern Python best practices.
• Experience building AI agents using LangChain/LangGraph, including tool creation, step planning, function calling, retrieval workflows, and reliable agent-state management.
• FastAPI experience building production RESTful APIs, streaming endpoints (SSE), and async request handling.
• Strong PostgreSQL expertise (including performance tuning and schema design) and SQLAlchemy.
• Strong understanding of dependency injection, clean architecture, and functional programming concepts.
• Experience designing and scaling microservices in production environments.
• Ability to assess engineering risk, propose rollout strategies, and make high-impact architectural decisions.
• Experience building safe execution environments and guardrails for AI decision-making.
• React & TypeScript with modern hooks and state management patterns (Redux/Context).
• Experience with Webpack Module Federation and micro-frontend architectures.
• Ability to design responsive, maintainable UI components using SCSS/CSS.
• Familiarity with Jest for robust frontend testing practices.
• LangChain & LangGraph expertise for building AI agent workflows, tool orchestration, and LLM integration.
• Proven experience building LLM-powered applications with frameworks like LangChain, LangGraph, or similar.
• Understanding of Retrieval-Augmented Generation (RAG) patterns and vector embeddings.
• Experience with agent orchestration, tool creation, and multi-step reasoning workflows.
• Familiarity with LLM serving endpoints (Databricks, OpenAI, Anthropic, or similar).
• Knowledge of streaming responses, callback systems, and real-time feedback mechanisms.
• Understanding of Model Context Protocol (MCP) for tool integration.
Preferred:
• Production Kubernetes experience with Helm charts and orchestration.
• Experience with Databricks or similar cloud data platforms.
• LangChain/LangGraph production implementations.
Company:
Dotmatics is an R&D scientific software company connecting science, data, and decision-making. It is a sub-organization of Siemens. Founded in 2005, the company is headquartered in Boston, USA, with a team of 501-1000 employees. The company is currently late-stage.