Senior Data Engineer (1043) - DataSF
Full-time

Job Overview
DataSF is seeking a Senior Data Engineer with 3+ years of experience to join our growing team. Reporting to the Principal Data Engineer, you will be instrumental in designing, building, and maintaining the City's data infrastructure, enabling robust data pipelines and reliable data access for analytical and operational needs. This is an exciting position for someone eager to apply advanced data engineering techniques to complex urban challenges, contributing directly to San Francisco's commitment to efficient, equitable, and ethical service delivery.
Learn more about DataSF’s recent work on our blog. If you are an entrepreneurial and passionate data enthusiast, join our team to improve government through good use of data!
Essential duties include, but are not limited to, the following:
- Platform Administration: Manage our central Snowflake data warehouse, including access control, security policies, resource monitoring, performance tuning, and cost optimization. Administer our platform with a focus on data democratization and accessibility, while protecting privacy and security.
- Pipeline Development: Build and maintain scalable and resilient pipelines to ingest and structure data from diverse sources. Design infrastructure to support both streaming and batch processes, and both structured and unstructured data sources.
- Infrastructure as Code (IaC): Use Terraform to define, deploy, and manage data infrastructure, ensuring our pipelines are reproducible, version-controlled, and production-ready.
- Best Practices & Innovation: Champion and implement best practices for documentation, data modeling, warehouse architecture, SQL optimization, and testing. Think creatively to find new ways to improve our data platform's capabilities and efficiency. Provide guidance to department partners on data engineering best practices.
- Collaboration: Work closely with data scientists, analysts, product managers, software engineers, and non-technical stakeholders in diverse domains to understand data requirements and build solutions that meet their needs.
- Monitoring & Support: Proactively monitor the health of the data platform and pipelines, troubleshoot issues, and ensure high standards of data quality and availability.
Desirable Qualifications
Technical Knowledge
- Hands-on experience administering and developing on managed cloud data platforms such as Snowflake, BigQuery, or Databricks.
- Demonstrated expertise in writing advanced, performant SQL and in using tools like dbt for SQL-based data transformation and modeling.
- Strong programming skills in Python (with libraries such as pandas and PySpark) for data processing and automation.
- Proficiency with an Infrastructure as Code tool, with a preference for Terraform.
- Experience building and deploying data pipelines using orchestration tools like Azure Data Factory, Airflow, Dagster, or similar technologies.
- Deep understanding of data warehousing concepts, data modeling, and modern ELT principles.
- Understanding of data governance, data security, and data privacy principles.
- Experience with real-time data streaming technologies (e.g., Kafka, Kinesis, Snowpipe).
- Experience deploying and managing data pipelines for machine learning models.
Collaboration and Communication Skills
- Strong problem-solving skills and the ability to design practical, effective data solutions.
- Excellent verbal and written communication skills, including the ability to explain technical concepts to non-technical stakeholders.
- A collaborative mindset and enthusiasm for working across diverse, cross-functional teams.
Mission Alignment
- Commitment to equity, transparency, and ethical data use.
- Passion for public service and for using data to improve government services.
- Empathy for San Francisco’s diverse communities and a drive to make data and services more accessible to SF residents.
- Interest or experience in public sector data or social impact work.