Senior Data Engineer

Holcim España
Madrid, Spain · On-site · Competitive · Internship
English Required · Added today

Original Advert

SUMMARY OF THE JOB

We are seeking a seasoned Senior Data Engineer to design, build, and optimize our next-generation data platform. You will be responsible for architecting scalable data pipelines, managing large-scale distributed systems, and ensuring our data infrastructure in AWS and Databricks is robust and efficient. The ideal candidate is a Spark expert with a deep understanding of the AWS ecosystem and a passion for automation.

MAIN ACTIVITIES / RESPONSIBILITIES

  • Pipeline Architecture: Design and implement complex batch and streaming ETL/ELT pipelines using Python, SQL, and Spark to process massive datasets.

  • Cloud Infrastructure: Leverage AWS Data Analytics services to build scalable, secure, and cost-effective data solutions.

  • Orchestration & DevOps: Manage and automate data workflows using Airflow, while utilizing Docker and ECS for containerized application deployment.

  • System Optimization: Monitor and tune the performance of distributed systems (Spark clusters) to ensure high availability and low latency.

  • Infrastructure as Code: Utilize AWS CloudFormation or Terraform to manage data infrastructure, ensuring repeatable and version-controlled environments.

  • Cost Optimization: Monitor and optimize AWS spend by selecting appropriate instance types (Spot vs. On-Demand) and refining data storage strategies.

  • Security & Compliance: Implement IAM roles, bucket policies, and encryption (KMS) to ensure data is secure at rest and in transit.

  • Collaboration: Work within an Agile framework to deliver iterative value, collaborating closely with Data Scientists and Stakeholders to translate business needs into technical reality.

JOB DIMENSIONS

List of direct reports:

  • Up to 2 direct reports and around 15 externals

Key interfaces, stakeholders and relationships:

  • Internal:

    • GDS: product manager, application manager, data, analytics & AI team

    • Country business stakeholders

  • External: third-party vendors

PROFILE REQUIRED

  • Experience: At least 4 years of hands-on experience in active Big Data environments and at least 2 years specializing in Data Analytics within AWS.

    • Compute & Processing:

      • Amazon EMR: Architecting and managing Spark clusters for large-scale distributed processing.

      • AWS Glue: Developing serverless ETL jobs, managing the Data Catalog, and implementing Glue Crawlers.

    • Storage & Warehousing:

      • Amazon S3: Implementing "Data Lake" best practices, including partitioning, compression (Parquet/Avro), and lifecycle policies.

      • Amazon Redshift: Designing star/snowflake schemas and optimizing query performance for high-volume data warehousing.

      • Amazon Athena: Performing ad-hoc SQL analysis directly on S3 data.

      • Open table formats: Experience with Apache Iceberg or Delta Lake.

    • Orchestration & Integration:

      • Amazon MWAA (Managed Workflows for Apache Airflow): Deploying and scaling Airflow environments.

      • AWS Lambda: Building event-driven data triggers and micro-services.

    • Streaming (Advantage): Amazon Kinesis or Amazon MSK (Managed Streaming for Apache Kafka) for real-time data ingestion.

  • Core Engineering: Expert-level proficiency in Spark, Python, and SQL.

  • Infrastructure & Tooling: Proven experience with Airflow for orchestration and Docker/ECS for containerization.

  • Good knowledge of Databricks and data mesh architectures. Good understanding of how to implement and maintain Lakehouse data models (bronze/silver/gold layers) using Delta Lake for reliability, ACID transactions, time travel, and schema evolution.

  • Solid software engineering practices: Git, CI/CD for data pipelines, automated testing, code quality and documentation.

  • Communication: Excellent written and oral English communication skills, with the ability to explain complex technical concepts to non-technical audiences.

  • Degree in Computer Science, Engineering, Mathematics or related field, or equivalent practical experience.

PREFERRED "PLUS" QUALIFICATIONS

  • Real-time Processing: Experience with streaming and distributed messaging applications like Flink and Kafka.

  • Core Tech: Java programming.

  • ML: Experience industrializing ML use cases.

  • Data Visualization: Experience with QlikView or Qlik Sense to support BI initiatives.

  • Agile: Experience working in a fast-paced Scrum or Kanban environment.

  • Certifications: AWS Certified Data Engineer (Associate/Professional), AWS Certified Solutions Architect, or Databricks Certified Data Engineer (Associate/Professional).

  • DevOps: Experience with OpenShift, GitHub Actions, or Jenkins for CI/CD of data workflows.

Application managed by Holcim España