Metadata Library Jobs in Pennsylvania (NOW HIRING)

Data Engineering Lead

Colmar, PA · Hybrid

$113K - $136K/yr

... libraries) * Support existing HANA-based data models and integrations while modern cloud ... Implement data quality checks, lineage, and metadata standards * Develop data validation frameworks ...

Familiarity with SAS Admin functions like setting up SAS servers, application components, defining libraries, administering repositories, folder structures, metadata movement, managing users, groups ...

Data Engineering Lead

Colmar, PA · On-site

$113K - $136K/yr

... libraries) * Support existing HANA-based data models and integrations while modern cloud ... Implement data quality checks, lineage, and metadata standards * Develop data validation frameworks ...

Data Engineering Lead

Colmar, PA · Hybrid

$113K - $136K/yr

... libraries) * Support existing HANA‑based data models and integrations while modern cloud ... Implement data quality checks, lineage, and metadata standards * Develop data validation frameworks ...

Metadata Library information

What are metadata librarians and what do they do?

Metadata librarians are information professionals who manage and organize metadata, which is data that describes other data, for library collections. They create, edit, and maintain metadata records to ensure resources are discoverable, accessible, and properly described in library catalogs and digital repositories. Their work supports searchability, digital preservation, and resource sharing by applying standards and best practices for cataloging. Metadata librarians often collaborate with IT staff, archivists, and subject specialists to enhance user access to library materials.

What is the difference between Metadata Library and Metadata Specialist roles?

Aspect | Metadata Library | Metadata Specialist
Credentials | Typically requires a degree in library science, information management, or related fields | Requires similar credentials, often with additional certifications in data management or information systems
Work Environment | Libraries, archives, or information centers managing large metadata collections | Data-driven organizations, digital repositories, or information management teams
Employer & Industry | Libraries, museums, archives, academic institutions | Tech companies, publishing, digital content providers
Search & Comparison Intent | Understanding library metadata management roles | Specialized data and metadata management tasks

The main difference is that a Metadata Library role focuses on managing metadata within library and archival settings, while a Metadata Specialist role handles metadata in broader digital and data environments. Both roles require similar credentials but serve different industry needs.

What are some common challenges faced by professionals working in a metadata library role, and how can they be addressed?

Professionals in a metadata library role often encounter challenges such as maintaining consistency and accuracy in metadata standards across diverse collections, keeping up with evolving cataloging guidelines, and integrating new technologies or platforms. Addressing these challenges typically involves ongoing training, collaboration with colleagues to develop clear metadata policies, and staying informed about industry best practices. Regular communication with IT teams and subject specialists is also key to ensuring that metadata effectively supports discoverability and access for library users.

What are the key skills and qualifications needed to thrive as a Metadata Librarian, and why are they important?

To thrive as a Metadata Librarian, you need expertise in cataloging standards (such as MARC and Dublin Core), metadata schemas, and information organization, usually supported by a Master's in Library Science or a related field. Familiarity with integrated library systems (ILS), metadata management tools, and cataloging software such as OCLC Connexion is typically expected. Attention to detail, analytical thinking, and strong communication skills help ensure accuracy and facilitate collaboration with library staff. These skills and qualities are crucial to maintaining accessible, well-organized digital and print collections that support user discovery and research.
Infographic showing various Metadata Library job openings in Pennsylvania as of June 2026, with employment types broken down into 93% Full Time, and 7% Part Time. Highlights an 86% In-person, and 14% Remote job distribution.
Fabric Data Engineer - Workplace Engineering

Vanguard Group, Inc.

Wayne, PA • On-site

Full-time

Posted 9 days ago


Company rating: 8.7 out of 10

Based on 60 frontline employees who took The Breakroom Quiz

14th of 138 rated financial services


Job description

About the Role
Vanguard is standing up Microsoft Fabric as the enterprise data and analytics foundation that powers our Workplace AI, Power BI, and cross-cloud analytics estate. We are partnering with Microsoft on a CDAO-led Fabric Enablement engagement and are building this capability on an F256 Reserved capacity, integrated with the broader Vanguard data, identity, and security stack - including OneLake Direct Lake against AWS S3, Entra ID and Okta federation, and Microsoft Purview.
Role Summary
We are hiring a hands-on Fabric Data Engineer to own the data layer of that capability. This is a builder's role, not an architect-only role. The engineer designs and implements scalable data products in OneLake - lakehouses, warehouses, pipelines, notebooks, semantic-model-ready Delta tables - and is accountable for the lifecycle, governance, and operational health of the Fabric platform. The complementary AI Engineer role consumes that foundation to build agents, copilots, and Foundry orchestrations; this engineer makes sure the data underneath is governed, monitored, and ready.
You will partner closely with the AI Engineer on AI-ready data products and semantic-layer handoffs; with our Technical Project Manager on program delivery, enablement, and change management; and with our Cloud Domain Architect on platform alignment. You will work alongside the Microsoft CDAO Fabric Enablement team and Vanguard partners across CDAO and Workplace Engineering. You will be a core member of the emerging Workplace AI Fusion Team. This is a strategic engineering and implementation role, not a support position.
Key Responsibilities (Fabric Build & Data Engineering)
  • Design and implement scalable data storage in OneLake using Lakehouses (Delta) and Warehouses (T-SQL); choose the right item for each workload and configure SQL analytics endpoints, shortcuts, and OneLake security.

  • Build and maintain Spark notebooks (PySpark), Data Factory pipelines, Dataflows Gen2, Copy Jobs, and mirroring for batch and incremental ingestion at enterprise scale.

  • Build Real-Time Intelligence solutions: Eventstreams, Eventhouses / KQL databases, Activator reflexes, and Spark structured streaming for low-latency workloads.

  • Optimize Lakehouse tables (OPTIMIZE, V-Order, Z-Order, partitioning) and Direct Lake semantic-model-ready datasets so downstream Power BI and AI agents perform predictably.

ALM & Lifecycle Engineering
  • Implement source control, branching, and CI/CD using native Fabric Git integration (Azure DevOps and GitHub), Fabric Deployment Pipelines, and the Microsoft fabric-cicd Python library.

  • Automate Dev / Test / Prod promotion against the Fabric REST API using service principals and Workload Identity Federation; codify environment-aware bindings via Variable Libraries and parameter.yml.

  • Operate a Feature → Dev → UAT → Prod branching pattern - native Git on Feature and Dev workspaces, pipeline-pushed promotion to UAT and Prod - with mandatory PR review, cherry-pick promotion, and one repo per team to scope blast radius.

  • Own the lifecycle of Fabric data components from creation through retirement, ensuring every environment is reproducible from the GitHub pipeline rather than from the Fabric UI.
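
The branching pattern described above could be modeled as a small lookup, for example inside a CI guard that rejects out-of-order promotions. This is only a minimal sketch and not Vanguard's tooling: the stage names come from the posting, while `STAGES`, `GIT_SYNCED`, `mechanism`, `next_stage`, and `is_valid_promotion` are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical sketch of the Feature -> Dev -> UAT -> Prod promotion path.
# Stage names come from the posting; every identifier here is illustrative.
STAGES = ["Feature", "Dev", "UAT", "Prod"]

# Per the posting: native Git on Feature and Dev workspaces,
# pipeline-pushed promotion into UAT and Prod.
GIT_SYNCED = {"Feature", "Dev"}


def mechanism(stage: str) -> str:
    """Return how content reaches the given stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    return "native-git" if stage in GIT_SYNCED else "pipeline-push"


def next_stage(stage: str) -> Optional[str]:
    """Return the stage that follows, or None when already at Prod."""
    idx = STAGES.index(stage)  # raises ValueError for unknown stages
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None


def is_valid_promotion(source: str, target: str) -> bool:
    """Allow only single-step, forward promotions."""
    return source in STAGES and next_stage(source) == target
```

A guard like this would let a pipeline refuse a direct Feature-to-Prod push while still allowing Dev-to-UAT.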

Platform Operations & Monitoring
  • Operate the Fabric F256 capacity: monitor CU consumption with the Capacity Metrics App, manage smoothing windows, diagnose interactive and background throttling, and right-size workloads.

  • Build telemetry using the Monitoring Hub, per-workspace Workspace Monitoring (Eventhouse-based KQL logs), Eventhouse monitoring, and the Admin Monitoring Workspace to surface refresh failures, pipeline errors, and semantic-model health.

  • Define dashboards and alerts for ingestion, transformation, refresh, and capacity health; drive root-cause analysis on production incidents and feed lessons back into platform standards.

  • Define and operate the on-call model for production data pipelines and Fabric items in partnership with Tier 3 Engineering.
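
For a sense of the arithmetic behind capacity monitoring, the sketch below computes the CU-second budget of an F256 capacity per evaluation window and classifies a throttling stage. It is a rough illustration only: the 30-second window and the 10-minute, 60-minute, and 24-hour thresholds are assumptions drawn from Microsoft's published Fabric throttling policy and should be checked against current documentation; the function names are hypothetical.

```python
# Sketch of capacity arithmetic for an F256 SKU. The window length and the
# thresholds below are assumptions based on Microsoft's published throttling
# policy; confirm them against current documentation before relying on them.
CAPACITY_UNITS = 256      # an F256 SKU provides 256 capacity units (CUs)
TIMEPOINT_SECONDS = 30    # assumed evaluation window


def cu_seconds_per_timepoint(capacity_units: int = CAPACITY_UNITS) -> int:
    """CU-seconds available in one evaluation window."""
    return capacity_units * TIMEPOINT_SECONDS


def throttling_stage(future_minutes_consumed: float) -> str:
    """Map minutes of future capacity already consumed to a throttling stage."""
    if future_minutes_consumed <= 10:
        return "overage protection (no throttling)"
    if future_minutes_consumed <= 60:
        return "interactive delay"
    if future_minutes_consumed <= 24 * 60:
        return "interactive rejection"
    return "background rejection"
```

Under these assumptions, an F256 capacity has 256 × 30 = 7,680 CU-seconds per window, which is the baseline against which smoothed consumption is compared.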

Standards, Governance & Security
  • Define and enforce Fabric platform standards through Terraform-based IaC using the official microsoft/fabric provider (workspaces, capacities, domains, items), workspace templates, naming and tagging conventions, and automated CI policy checks against the Fabric REST API.

  • Manage tenant settings, domains, and capacity allocation in partnership with the Fabric Center of Excellence; align identity with Entra ID and Okta federation; rotate service principals and use PIM for elevated admin roles.

  • Implement RBAC patterns that separate workspace control-plane roles (Admin / Member / Contributor / Viewer) from OneLake data-plane roles (folder and table level); operate RLS, CLS, OLS, dynamic data masking, and item-level sharing.

  • Integrate Microsoft Purview for sensitivity labels, DLP, metadata scanning, lineage, and impact analysis; manage endorsement (Promoted / Certified) so AI agents and BI consumers only ground on trusted datasets.
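
An automated CI policy check for naming conventions might look something like the sketch below. The posting does not state the actual convention, so the `<domain>-<team>-<env>` pattern, the environment list, and the names `WORKSPACE_NAME` and `naming_violations` are all hypothetical; in practice the input list would come from the Fabric REST API rather than a literal.

```python
import re
from typing import List

# Hypothetical convention (not from the posting): <domain>-<team>-<env>,
# lowercase alphanumerics, with env drawn from a fixed set.
WORKSPACE_NAME = re.compile(r"[a-z0-9]+-[a-z0-9]+-(feature|dev|uat|prod)")


def naming_violations(workspace_names: List[str]) -> List[str]:
    """Return the names that do not match the assumed convention."""
    return [n for n in workspace_names if not WORKSPACE_NAME.fullmatch(n)]
```

A CI job could fail the build whenever `naming_violations` returns a non-empty list.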

Integration & Interoperability
  • Build cross-cloud integration patterns: OneLake Direct Lake against AWS S3, Mirrored Databases for Snowflake, SQL Server, and Cosmos, and shortcuts that avoid Athena and ODBC where Direct Lake delivers better performance.

  • Publish governed, AI-ready data products with Prep for AI configured on semantic models so Fabric Data Agents, Copilot Studio, and Azure AI Foundry can ground on certified Vanguard data.

  • Coordinate with Data, Cloud, Identity, and Security domain teams on data-sharing patterns, private link configuration, and on-prem data gateway operations across the current 6-8 gateway footprint.

Tier 3 Escalation & Expert Support
  • Serve as Tier 3 escalation for complex Fabric, OneLake, pipeline, capacity, and Direct Lake issues across the enterprise.

  • Provide deep technical consultation to Workplace Engineering, CDAO, and partner teams onboarding workloads to Fabric.

  • Build reusable patterns, reference implementations, and internal playbooks for ingestion, modeling, deployment, and capacity operations that scale beyond a single engineer.

Innovation & Strategic Oversight
  • Lead proof-of-concept work for new Fabric capabilities (Mirrored Databases, GraphQL APIs, the SQL Database item, Real-Time Intelligence enhancements, Fabric MCP integration, evolving Direct Lake and Prep-for-AI features).

  • Partner with the Microsoft CDAO Fabric Enablement engagement to bring product roadmap insights back into Vanguard's implementation.

  • Contribute to the Workplace AI and enterprise Data roadmap and operating model, and partner with champions and train-the-trainer initiatives to translate engineering work into adoption outcomes.

Required Qualifications and Skills
  • 8+ years of professional software / data / platform engineering experience, with 5+ years building production data solutions on the Microsoft and / or Azure data stack.

  • Hands-on production experience with at least three of: Microsoft Fabric (Lakehouse, Warehouse, Pipelines, Notebooks, Real-Time Intelligence), Azure Synapse, Azure Data Factory, Databricks, Power BI semantic models, Azure SQL / SQL Server.

  • Strong skills in SQL, PySpark, and KQL - the core Fabric language trio - and comfort moving between batch, streaming, and interactive analytics workloads.

  • Demonstrable experience designing and shipping CI/CD for data platforms: Git workflows, automated deployment, environment promotion, secret-less authentication, and infrastructure-as-code.

  • Working knowledge of Terraform (preferred) or Bicep for cloud platform automation, including provider versioning, state management, and policy-as-code patterns.

  • Experience implementing security and compliance controls in a regulated environment: Purview, Sentinel, Defender, Conditional Access, MIP, DLP, RBAC, RLS / CLS / OLS, dynamic data masking.

  • Identity fluency with Entra ID (Azure AD) and federated IdPs (Okta preferred); experience with service principals, managed identities, and Workload Identity Federation.

  • Experience working in financial services, healthcare, or another heavily regulated environment, or a credible plan to come up to speed quickly.

  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.

Preferred Attributes
  • DP-700 (Microsoft Certified: Fabric Data Engineer Associate) required or in-progress within 6 months of hire; DP-600 (Fabric Analytics Engineer Associate) and AZ-305 (Azure Solutions Architect Expert) preferred.

  • Hands-on experience with the Microsoft fabric-cicd Python library and the microsoft/fabric Terraform provider.

  • Experience operating a Fabric Center of Excellence, Power BI CoE, or comparable data-platform CoE.

  • Experience with cross-cloud data integration patterns (OneLake ↔ AWS S3, mirroring, shortcuts) and BCDR for analytics platforms at enterprise scale.

  • Experience configuring Prep for AI on semantic models and partnering with AI / agent engineers on certified data-product handoffs.

  • Background contributing to internal communities of practice, champions networks, or developer enablement programs.

  • Prior experience as a hands-on engineer in a Fusion Team (engineers + product + data + analysts) or Data / AI Center of Excellence model.

  • Additional vendor certifications welcomed but not required: AZ-204, SC-100, DP-203 (legacy, retired March 2025 but still relevant context).

Special Factors
Sponsorship
Vanguard is not offering visa sponsorship for this position.
About Vanguard
At Vanguard, we don't just have a mission - we're on a mission.
To work for the long-term financial wellbeing of our clients. To lead through product and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.
How We Work
Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.
