Senior Data Engineer

Holcim España
Madrid, Spain | On-site | Competitive | Added 1 month ago | Internship

Original Advert

SUMMARY OF THE JOB

We are seeking a seasoned Senior Data Engineer to design, build, and optimize our next-generation data platform. You will be responsible for architecting scalable data pipelines, managing large-scale distributed systems, and ensuring our data infrastructure in AWS and Databricks is robust and efficient. The ideal candidate is a Spark expert with a deep understanding of the AWS ecosystem and a passion for automation.

MAIN ACTIVITIES / RESPONSIBILITIES

  • Pipeline Architecture: Design and implement complex batch and streaming ETL/ELT pipelines using Python, SQL, and Spark to process massive datasets.

  • Cloud Infrastructure: Leverage AWS Data Analytics services to build scalable, secure, and cost-effective data solutions.

  • Orchestration & DevOps: Manage and automate data workflows using Airflow, while utilizing Docker and ECS for containerized application deployment.

  • System Optimization: Monitor and tune the performance of distributed systems (Spark Cluster) to ensure high availability and low latency.

  • Infrastructure as Code: Utilize AWS CloudFormation or Terraform to manage data infrastructure, ensuring repeatable and version-controlled environments.

  • Cost Optimization: Monitor and optimize AWS spend by selecting appropriate instance types (Spot vs. On-Demand) and refining data storage strategies.

  • Security & Compliance: Implement IAM roles, bucket policies, and encryption (KMS) to ensure data is secure at rest and in transit.

  • Collaboration: Work within an Agile framework to deliver iterative value, collaborating closely with Data Scientists and Stakeholders to translate business needs into technical reality.

JOB DIMENSIONS

List of direct reports:

  • Up to 2 direct reports and around 15 external staff

Key interfaces, stakeholders and relationships:

  • Internal:

    • GDS: product manager, application manager, data & analytics & AI team

    • Country business stakeholders

  • External: third-party vendors

PROFILE REQUIRED

  • Experience: 4+ years of hands-on experience in active Big Data environments and 2+ years specializing in Data Analytics within AWS.

    • Compute & Processing:

      • Amazon EMR: Architecting and managing Spark clusters for large-scale distributed processing.

      • AWS Glue: Developing serverless ETL jobs, managing the Data Catalog, and implementing Glue Crawlers.

    • Storage & Warehousing:

      • Amazon S3: Implementing "Data Lake" best practices, including partitioning, compression (Parquet/Avro), and lifecycle policies.

      • Amazon Redshift: Designing star/snowflake schemas and optimizing query performance for high-volume data warehousing.

      • Amazon Athena: Performing ad-hoc SQL analysis directly on S3 data.

      • Open table formats: Experience with Apache Iceberg or Delta Lake.

    • Orchestration & Integration:

      • Amazon MWAA (Managed Workflows for Apache Airflow): Deploying and scaling Airflow environments.

      • AWS Lambda: Building event-driven data triggers and micro-services.

    • Streaming (Advantage): Amazon Kinesis or MSK (Managed Streaming for Kafka) for real-time data ingestion.

  • Core Engineering: Expert-level proficiency in Spark, Python, and SQL.

  • Infrastructure & Tooling: Proven experience with Airflow for orchestration and Docker/ECS for containerization.

  • Good knowledge of Databricks and data mesh architectures. Good understanding of how to implement and maintain Lakehouse data models (bronze / silver / gold layers) using Delta Lake for reliability, ACID transactions, time travel, and schema evolution.

  • Solid software engineering practices: Git, CI/CD for data pipelines, automated testing, code quality and documentation.

  • Communication: Excellent written and oral English communication skills, with the ability to explain complex technical concepts to non-technical audiences.

  • Degree in Computer Science, Engineering, Mathematics or related field, or equivalent practical experience.

PREFERRED "PLUS" QUALIFICATION

  • Real-time Processing: Experience with streaming and distributed messaging applications like Flink and Kafka.

  • Core Tech: Java programming.

  • ML: Experience industrialising ML use cases.

  • Data Visualization: Experience with QlikView or QlikSense to support BI initiatives.

  • Agile: Experience working in a fast-paced Scrum or Kanban environment.

  • Certifications: AWS Certified Data Engineer - Associate/Professional or AWS Certified Solutions Architect; Databricks Certified Data Engineer (Associate/Professional).

  • DevOps: Experience with OpenShift, GitHub Actions, or Jenkins for CI/CD of data workflows.

Application managed by Holcim España