Data Engineer
Full-time, Mid-Senior Level
Job Overview
We are looking for a talented and motivated Data Engineer with strong experience in PySpark and Python to design, build, and maintain scalable data pipelines and infrastructure. The successful candidate will support the delivery of data-driven insights by transforming raw data into clean, curated datasets for analytics and machine learning applications. Java experience is a plus and will be useful in hybrid environments.
Key Responsibilities:
Develop and optimize robust, scalable data pipelines using PySpark and Python
Clean, transform, and enrich large-scale datasets from structured and unstructured sources
Implement data ingestion, ETL/ELT workflows, and integration strategies across cloud and on-prem platforms
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements
Ensure data quality, integrity, and lineage throughout the data lifecycle
Participate in performance tuning, troubleshooting, and production support
Contribute to best practices in data engineering, including code versioning, testing, and CI/CD