Data Scientist
At Bayer we're visionaries, driven to solve the world's toughest challenges and striving for a world where 'Health for all, Hunger for none' is no longer a dream, but a real possibility. We're doing it with energy, curiosity and sheer dedication, always learning from the unique perspectives of those around us, expanding our thinking, growing our capabilities and redefining 'impossible'. There are so many reasons to join us. If you're hungry to build a varied and meaningful career in a community of brilliant and diverse minds to make a real difference, there's only one choice.
About the Role
Are you excited to own forecasting solutions from data extraction to production deployment? We're hiring a Senior Data Scientist for the Machine Learning & Artificial Intelligence unit within Bayer's Enterprise Data & Analytics Platform.
This is not a pure modeling role. You will extract data from source systems, wrangle messy real-world data, build statistical and ML forecast models, engineer production pipelines, evaluate results, present insights to business stakeholders, develop interactive applications, and maintain the entire infrastructure. You'll tackle unstructured problems independently and deliver measurable impact across Finance, Supply Chain, HR, Procurement, and Commercial Operations.
Our international team spans Poland, Germany, Spain, and India. We work with time series forecasting, statistical modeling, data engineering, and interactive analytics on a modern cloud-native stack. If you thrive on end-to-end ownership, from source systems to stakeholder dashboards, and enjoy solving ambiguous business problems with minimal guidance, we want to hear from you.
Key Responsibilities
End-to-End Solution Ownership
- Own the complete lifecycle from understanding business needs through production deployment, monitoring, and maintenance
- Build robust ETL/ELT pipelines on Databricks; clean, validate, and transform messy real-world data at scale
- Transform loosely defined business questions into structured solutions independently: identify data gaps, propose approaches, and iterate based on feedback
Forecasting & Modeling
- Design and deploy production-grade forecasting solutions using statistical models (ARIMA, ETS, BSTS) and ML approaches (XGBoost, LightGBM, neural networks)
- Engineer sophisticated features: lag features, rolling statistics, external signals, calendar effects, and domain-specific transformations
- Implement forecast reconciliation and hierarchical aggregation for complex business structures
- Establish rigorous evaluation frameworks: backtesting, time series cross-validation, accuracy metrics, prediction intervals, and drift monitoring
Software Engineering & Infrastructure
- Write production-grade Python and R code with modular architecture, comprehensive testing, error handling, and documentation
- Build and maintain sophisticated R Shiny applications with integrated JavaScript components
- Orchestrate ML pipelines using Kubeflow for automated training, validation, deployment, experiment tracking, and model versioning
- Manage infrastructure as code: Databricks workspaces, Azure resources, CI/CD pipelines (GitHub Actions, Azure DevOps), containerization, and secrets management
Analysis, Debugging & Monitoring
- Troubleshoot complex issues across the full stack: data pipeline failures, model degradation, API errors, and integration problems
- Implement continuous monitoring: automated data quality checks, feature drift detection, performance tracking, and alerting systems
- Conduct root cause analysis of forecast errors, identify data anomalies, validate business logic, and communicate findings clearly
Required Qualifications
Technical Foundation
- Education & Experience: Master's or PhD with 3+ years delivering end-to-end data science solutions in production
- Programming: Strong Python, R, and SQL proficiency
- Forecasting Expertise: Time series decomposition, seasonality, trend analysis, ensemble methods, probabilistic forecasting, hierarchical reconciliation
- Data Engineering: Databricks/Spark/PySpark, Delta Lake, ETL/ELT design, job orchestration, performance tuning
- KNIME: Building analytical workflows, data preprocessing, model pipelines, and system integration
End-to-End Capabilities
- MLOps: Kubeflow pipeline orchestration, experiment tracking, model registry, automated deployment
- Software Engineering: Git workflows, code reviews, testing frameworks (pytest, testthat), modular design, documentation
- Application Development: Build RESTful APIs and R Shiny applications from scratch; handle authentication, deployment, and optimization
- Cloud Infrastructure: Azure services (Databricks, Blob Storage, Data Factory, Key Vault, Functions), container orchestration, CI/CD
Essential Soft Skills
- Autonomy: Self-starter who can take vague requirements and independently drive projects from concept to production
- Problem-Solving: Systematic debugging across the full stack, from source data to infrastructure
- Business Acumen: Translate business needs into technical solutions; understand when "good enough" beats "perfect"
- Communication: Present technical work clearly to non-technical audiences; influence decisions with data; write comprehensive documentation
- Collaboration: Work effectively with cross-functional teams while maintaining end-to-end ownership
Preferred (Not Required)
SAP HANA/BW experience, advanced Kubeflow capabilities, Terraform, Kubernetes, PowerBI/Tableau integration, data governance frameworks, multilingual capability
Language Requirement: Fluent in English (written and spoken); additional languages from our team regions are a plus.
Your Application
This is your opportunity to tackle the world's biggest challenges with us: maintaining our health, feeding growing populations and slowing the rate of climate change. You have a voice, ideas and perspectives and we want to hear them. Because our success begins with you. Be part of something big. Be Bayer.
Location:
Spain : Cataluña : Barcelona
Division:
Enabling Functions
Reference Code:
870952
Application managed by Bayer