
Data Engineer Flink Jobs in Raleigh, NC (NOW HIRING)

Data Engineer - Bilingual Mandarin required

Cary, NC · On-site

$106K - $127K/yr



Data Engineer Flink information

Salary chart for Raleigh, NC: $43.3K, $126.1K (average), $172.5K

How much do data engineer flink jobs pay per year?

As of Jun 23, 2026, the average yearly pay for a Data Engineer Flink role in Raleigh, NC is $126,095, according to ZipRecruiter salary data. Most workers in this role earn between $111,300 and $133,700 per year, depending on experience, location, and employer.

What are some common challenges Data Engineers face when working with Apache Flink in a production environment?

Data Engineers working with Apache Flink often encounter challenges such as managing stateful stream processing at scale, ensuring fault tolerance, and optimizing resource usage for real-time data pipelines. Handling late-arriving data and tuning Flink jobs for low latency and high throughput are also frequent hurdles. Collaborating closely with data scientists and application developers is key to aligning data models and ensuring smooth data flow throughout the system.
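
To make the late-arriving data challenge concrete, here is a minimal pure-Python sketch of event-time tumbling windows with a watermark. It does not call the Flink API; the function name, the 10-second window, and the 5-second out-of-orderness bound are illustrative assumptions, not values from any listing on this page.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size=10, max_out_of_orderness=5):
    """Count events per event-time window and divert events that arrive too late.

    events: iterable of (event_time_seconds, key) in ARRIVAL order.
    Returns (counts, late): counts maps window_start -> number of on-time
    events; late lists events whose window the watermark had already passed.
    """
    counts = defaultdict(int)
    late = []
    watermark = float("-inf")
    for event_time, key in events:
        # The watermark trails the largest event time seen so far.
        watermark = max(watermark, event_time - max_out_of_orderness)
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size
        if window_end <= watermark:
            # The window is considered closed: treat the event as late.
            late.append((event_time, key))
        else:
            counts[window_start] += 1
    return dict(counts), late

# Hypothetical arrival order: (3, "d") is out of order but still on time,
# (8, "f") arrives after the watermark has moved past its window.
events = [(1, "a"), (4, "b"), (12, "c"), (3, "d"), (27, "e"), (8, "f")]
counts, late = tumbling_window_counts(events)
```

In this sketch a larger out-of-orderness bound accepts more stragglers but delays results, which is the latency versus completeness trade-off that tuning a real streaming job involves.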

What is Flink in data engineering?

Apache Flink is an open-source stream processing framework used in data engineering to process large-scale, real-time data streams with low latency. Data engineers working with Flink develop and maintain data pipelines, often integrating it with tools like Kafka and Hadoop to enable real-time analytics and event-driven applications.

What are the key skills and qualifications needed to thrive as a Data Engineer Flink, and why are they important?

To thrive as a Data Engineer specializing in Flink, you need strong programming skills (especially in Java or Scala), a solid understanding of distributed data processing, and experience with data architecture. Familiarity with Apache Flink, stream processing frameworks, big data tools (like Kafka, Hadoop, or Spark), and cloud platforms is typically required, along with relevant certifications. Excellent problem-solving abilities, attention to detail, and effective teamwork and communication skills help you excel in complex data environments. These competencies are crucial for building reliable, scalable data pipelines that power real-time analytics and business decision-making.

Are data engineers still in demand?

Data engineers, including those skilled in Apache Flink, are in high demand due to the increasing need for real-time data processing and scalable data infrastructure. Organizations seek professionals with expertise in big data tools, cloud platforms, and programming languages like Java or Python to build and maintain data pipelines. The role remains critical across industries such as finance, healthcare, and technology.

What is a Data Engineer Flink?

A Data Engineer Flink is a data engineering professional who specializes in using Apache Flink, an open-source stream processing framework, to build, maintain, and optimize systems that process large-scale data in real time. They design and implement data pipelines, ensure data quality and consistency, and collaborate with other engineering teams to deliver reliable and scalable data solutions. Their expertise allows organizations to process, analyze, and react to data as it is generated, enabling real-time insights and decision-making.

Is Flink an ETL?

Apache Flink is a stream processing framework often used in data engineering roles to perform real-time data transformations and processing tasks, which are key components of ETL (Extract, Transform, Load) pipelines. While Flink handles the transformation and loading stages efficiently, it is not a complete ETL tool on its own but is commonly integrated with other systems for data ingestion and storage.

Which engineers make $500,000?

Senior data engineers, especially those with expertise in Apache Flink, cloud platforms, and large-scale data processing, can reach or exceed a $500,000 annual salary in high-demand markets. These roles often require advanced skills, certifications, and experience managing complex data pipelines and real-time analytics environments.

Data Engineer - Bilingual Mandarin required

CWILL

Cary, NC • On-site

$106K - $127K/yr

Full-time

Retirement, PTO

Posted 11 days ago


Job description

CWILL (pronounced "quill") is the post-purchase and retention suite built for Shopify.

With strong product-market fit and expanding US operations, we're building out our security and compliance capabilities to meet global data privacy standards.

Learn more: www.cwill.com

I. Basic Information

Work Authorization: Green Card / U.S. Citizen required (we do not sponsor)

Job Title: Data Engineer

Focus Areas: Data ingestion, data lakehouse, data warehouse, data platform, data service APIs, data quality & engineering agent development

Level: Junior to mid-level with high growth potential

Location: United States (on-site, remote, or hybrid, per company requirements)

Employment Type: Full-time

Collaborating Teams: CWILL Data Engineering, Data Analytics, Business, Product, and Technology teams

Language: English required; Mandarin is a strong plus

Cross-Timezone Work: Must maintain a regular collaboration window with the China team; strong async communication and documentation skills required (approx. 2 hrs/day overlap needed)

Collaboration Frequency: Every 1–2 days; approx. 2 hrs per session. Candidates in western US time zones preferred for scheduling.

II. Role Positioning

CWILL is building data infrastructure to support business operations, product capabilities, customer service, analytics, and intelligent applications. As a US-side data engineer, you will participate in multi-source data ingestion, data lakehouse and warehouse development, data quality governance, data platform capability building, and AI Agent engineering automation exploration.

We are looking for candidates with a solid foundation in SQL, Python, and data engineering: someone who can, with guidance from the existing data team, progressively take ownership of data ingestion, modeling, quality, and service tasks, while collaborating effectively with China-based data engineering, analytics, and business teams.

This is not a pure data analysis, BI reporting, or one-off scripting role. It is a comprehensive data engineering position focused on data integration, data warehouse development, data platform capabilities, data services, and engineering automation.

III. Role Mission

Through stable, well-structured, and scalable data engineering capabilities, help the company unify, govern, model, and serve data scattered across business systems, SaaS platforms, external channels, and internal systems — improving the usability, accuracy, timeliness, and reusability of CWILL’s data assets.

This role is expected to continuously drive:

• More standardized data source ingestion

• Clearer data lakehouse and warehouse structure

• More automated data quality monitoring

• More platform-driven data service capabilities

• Progressive adoption of agent-based and automated approaches for data development, troubleshooting, documentation, and quality checks

IV. Key Responsibilities

1. Data Ingestion & Pipeline Development

• Ingest data from internal and external business systems, third-party platforms, SaaS products, and external data sources; handle data collection, sync, cleansing, and loading

• Participate in building offline and real-time data pipelines using SeaTunnel, Kafka, Flink, Spark, or similar technologies to improve ingestion stability and processing efficiency

• Handle practical challenges in data sync: authentication, pagination, rate limiting, failure retry, incremental sync, backfill, schema changes, and task anomalies

2. Data Warehouse & Data Modeling

• Participate in layered data warehouse development across ODS, DWD, DWS, and ADS layers; build and maintain data models

• Support business domain modeling, metric standardization, shared data model development, and core table maintenance

• Optimize data organization and query performance on OLAP engines such as Doris to provide stable data support for product, operations, growth, customer success, and management analytics

3. Data Quality & Data Governance

• Build and maintain data quality rules for core data pipelines; ensure data accuracy, completeness, consistency, and timeliness

• Participate in data validation, anomaly detection, alerting, and issue resolution; help improve stability of critical data pipelines

• Contribute to data governance capabilities including DataHub or similar tools; improve metadata management, data lineage, data asset catalog, and data standards

4. Data Platform & Data Services

• Participate in building data platform capabilities including data development, task scheduling, monitoring, quality management, governance, and service delivery modules

• Use tools such as DolphinScheduler and StreamPark for task management, scheduling orchestration, and real-time task operations

• Support the data service layer by delivering standardized APIs, metric services, and data capabilities to internal systems, analytics applications, and business tools

• Support underlying data for tools like Superset; ensure data availability for BI dashboards, metric boards, and business monitoring

5. AI Agent & Engineering Automation

• Participate in designing and implementing data development automation tools and engineering agents

• Explore AI agent applications in data development, governance, quality detection, task operations, anomaly diagnosis, and documentation generation

• Leverage large language models and automation tools to improve data engineering efficiency, task stability, and platform intelligence

Requirements

Must-Have

Experience

• 1–4 years of experience in data engineering, data platforms, data warehousing, backend development, analytics engineering, or a related role

• Real project experience in data ingestion, data pipelines, data warehouse development, data modeling, data services, or data platform work

• Strong learning ability and execution skills; able to independently drive small-to-medium data engineering tasks with clear objectives

SQL Skills

• Proficient in SQL for querying, cleansing, aggregation, deduplication, comparison, validation, and metric calculation

• Familiar with joins, window functions, CTEs, aggregation analysis, incremental logic, and basic performance optimization

• Understands data warehouse layering concepts: fact tables, dimension tables, subject domains, metric definitions, and shared models
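
For a sense of the SQL level described above, here is a small sketch that combines a CTE, a window function, and deduplication to keep only the latest row per key. It runs against an in-memory SQLite database through Python's standard library (window functions need SQLite 3.25 or newer); the `orders_raw` table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_raw (order_id INTEGER, status TEXT, updated_at TEXT);
INSERT INTO orders_raw VALUES
  (1, 'created', '2024-01-01'),
  (1, 'shipped', '2024-01-03'),
  (2, 'created', '2024-01-02');
""")

# Rank rows per order by recency, then keep the newest one (rn = 1).
latest = conn.execute("""
WITH ranked AS (
  SELECT order_id, status, updated_at,
         ROW_NUMBER() OVER (
           PARTITION BY order_id ORDER BY updated_at DESC
         ) AS rn
  FROM orders_raw
)
SELECT order_id, status, updated_at
FROM ranked
WHERE rn = 1
ORDER BY order_id
""").fetchall()
conn.close()
```

The same pattern is a common way to build a deduplicated detail table from a raw ingestion table in which each sync appends new versions of a record.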

Data Development

• Proficient in Java or Python for API integration, data processing, automation scripting, and file handling

• Understands common engineering patterns: REST APIs, OAuth/API keys, pagination, rate limiting, retry logic, error handling, logging, and task idempotency

• Good code structure habits; writes clean, maintainable, and reusable code

• Familiar with Git, code review practices, README documentation, logging, testing, and collaborative engineering workflows

Pipeline & Platform Tools

• Familiar with one or more of: SeaTunnel, Kafka, Flink, Spark (data integration, real-time, or offline processing)

• Familiar with one or more of: Doris, ClickHouse, Snowflake, BigQuery, Redshift, Databricks, PostgreSQL (data warehouse, OLAP, or lakehouse systems)

• Familiar with one or more of: DolphinScheduler, StreamPark, Airflow, Dagster, Prefect, dbt (scheduling, development, or task management tools)

• Understands data pipeline operations: scheduling, dependencies, monitoring, failure retry, backfill, version management, and deployment processes

• Candidates are not expected to master all tools, but must have a solid data engineering foundation and the ability to quickly learn new tech stacks

Data Quality & Governance Mindset

• Understands data quality dimensions: accuracy, completeness, consistency, uniqueness, timeliness, and anomaly detection

• Proactively designs data validation rules and can identify and locate data anomalies

• Familiar with metadata management, data lineage, data asset catalogs, and data standards; experience with DataHub or similar platforms is a plus
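
A few of the quality dimensions above can be expressed as simple validation rules. The sketch below checks completeness, uniqueness, and timeliness over a list of rows; the function name, field names, and one-day freshness threshold are assumptions made for the example, not requirements from the posting.

```python
from datetime import datetime, timedelta, timezone

def check_quality(rows, key, required, ts_field, max_age, now):
    """Return a dict mapping rule name -> indexes of rows that violate it."""
    issues = {"completeness": [], "uniqueness": [], "timeliness": []}
    seen = set()
    for i, row in enumerate(rows):
        # Completeness: every required field is present and non-empty.
        if any(row.get(f) in (None, "") for f in required):
            issues["completeness"].append(i)
        # Uniqueness: the key has not appeared in an earlier row.
        if row.get(key) in seen:
            issues["uniqueness"].append(i)
        seen.add(row.get(key))
        # Timeliness: the row was updated within max_age of now.
        ts = row.get(ts_field)
        if ts is not None and now - ts > max_age:
            issues["timeliness"].append(i)
    return issues

now = datetime(2024, 1, 2, tzinfo=timezone.utc)
rows = [
    {"order_id": 1, "amount": 10.0, "updated_at": now - timedelta(hours=1)},
    {"order_id": 2, "amount": None, "updated_at": now - timedelta(hours=2)},
    {"order_id": 1, "amount": 5.0, "updated_at": now - timedelta(days=3)},
]
report = check_quality(
    rows,
    key="order_id",
    required=["order_id", "amount"],
    ts_field="updated_at",
    max_age=timedelta(days=1),
    now=now,
)
```

Returning row indexes rather than a single pass/fail flag makes it easier to locate the offending records, which is the "identify and locate data anomalies" part of the requirement.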

Collaboration & Communication

• Able to communicate data requirements with analysts, business stakeholders, backend engineers, and product managers

• Clearly describes problems, solutions, risks, progress, and deliverables

• Comfortable with cross-timezone collaboration; strong written and spoken English communication skills

• Willing to participate in regular fixed collaboration sessions with China-based teams and drive work through documentation and async communication

Nice-to-Have

• Experience integrating third-party SaaS data: CRM, ERP, marketing platforms, customer service systems, logistics, e-commerce, payment systems, or ad platforms

• Experience building data lakehouses, data middle platforms, data platforms, or enterprise-level data warehouses

• Experience developing data service APIs, metric services, internal data products, or lightweight backend services

• Experience with data quality frameworks, data lineage, metadata management, data catalogs, observability, or monitoring and alerting

• AWS, GCP, or Azure cloud platform experience

• Docker, CI/CD, Terraform, Kubernetes, or basic DevOps experience

• Experience with LLMs, AI Agents, code generation, automated testing, task inspection, data quality agents, or engineering efficiency tooling

• Experience with cross-border teams, international business, supply chain, e-commerce, logistics, marketing, or customer success data scenarios

Benefits

Starting Pay: $90K - $130K, depending on experience; open to negotiation

401(k)

PTO

Paid Holidays

Insurance