Data Engineer
Full-time Associate

Job Overview
We are building a next‑generation, business‑centric data intelligence and AI foundation that fuels Finance, from FP&A intelligence to product, compliance, and controllership decision‑making. As a Data Engineer in the Finance Technology – Data Intelligence organization, you will architect scalable data pipelines, semantic models, and end‑to‑end BI solutions, while safely operationalizing GenAI capabilities such as RAG, prompt engineering, evaluation frameworks, and agent‑based workflows. You will collaborate closely with analysts, data scientists, and engineering partners to deliver secure, reliable, auditable, and reusable data and AI services that materially improve decision quality, automation, and speed across Finance.
Responsibilities:
Build the data foundation
Collaborate with Data Analysts, Data Scientists, Software Engineers, and cross-functional partners to design, build, and deploy scalable data pipelines that deliver high‑quality, governed analytical datasets across Finance domains.
Engineer high‑quality batch/streaming data pipelines (SQL/Hive/PySpark) across Lake/Lakehouse to power curated finance domain marts and a governed semantic layer.
Design dimensional/semantic models that enable self‑service analytics (Power BI / Fabric semantic models / SSAS Tabular) with performant DAX measures and row‑level security.
Operationalize GenAI for Finance
Ship production‑grade GenAI features (retrieval‑augmented generation, prompt chaining, agents) on governed datasets; implement vectorization and chunking strategies that respect PII/SOX controls.
Partner with DS/ML to train, fine‑tune, and evaluate models; harden prompt templates, guardrails, and content filters; track hallucination, toxicity, and retrieval metrics (precision/recall, hit@k).
Build reusable components (prompt libraries, evaluation harnesses, vector store abstractions) and integration SDKs/APIs for adoption across Finance use cases.
Platform, reliability & DevOps
Implement CI/CD for data & AI (Git, Azure DevOps/GitHub Actions), data quality tests (Great Expectations or equivalent), and model/data deployment automation (MLflow/Fabric/Azure ML).
Define observability (lineage, drift, freshness, cost) with alerts/SLAs; drive continuous hardening for performance (SQL/DAX tuning), cost efficiency, and reliability.
Analytics enablement
Deliver high‑impact dashboards/scorecards (Power BI/Tableau) and governed, certified datasets; coach analysts on best‑practice modeling and performance tuning.
Risk, governance & documentation
Embed privacy‑by‑design (PII masking, purpose limitation), finance controls (SOX, audit trails), and robust documentation (runbooks, data dictionaries, model cards).
This is a hybrid position. The expected number of in‑office days will be confirmed by your Hiring Manager.