Specialist, DDP Domino Platform Engineer
Job Description
Required Skills:
Amazon SageMaker, Databricks Platform, Data Science, Domino Data Science Platform, Jupyter Notebook, Kubeflow, Machine Learning
Preferred Skills:
CI/CD, GitHub, GitLab, Jenkins (Software), Terraform
The Opportunity
- Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare.
- Be part of an organisation driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products.
- Drive innovation and execution excellence. Be part of a team with a passion for using data, analytics, and insights to drive decision-making, and for creating custom software that allows us to tackle some of the world's greatest health threats.
Our Technology Centers focus on creating a space where teams can come together to deliver business solutions that save and improve lives. An integral part of our company's IT operating model, Tech Centers are globally distributed locations where each IT division has employees to enable our digital transformation journey and drive business outcomes. These locations, in addition to our other sites, are essential to supporting our business and strategy.
A focused group of leaders in each Tech Center helps ensure we can manage and improve each location, from investing in the growth, success, and well-being of our people, to making sure colleagues from each IT division feel a sense of belonging, to managing critical emergencies. Together, we leverage the strength of our team to collaborate globally, optimizing connections and sharing best practices across the Tech Centers.
Role Overview
A Junior ML Platform/DevOps Engineer supports the build and operation of AWS and Kubernetes (EKS) infrastructure for data science and ML workloads. You'll learn to use Terraform and CI/CD/GitOps to automate environments, help manage Docker images, and assist with platforms like Domino Data Lab, Kubeflow, SageMaker Studio, JupyterHub, or Databricks on Kubernetes. Under guidance, you'll contribute to observability (Prometheus/Grafana, ELK/OpenSearch, CloudWatch), help with incident response, and follow security best practices (IAM, secrets, RBAC, network segmentation, encryption, compliance). You'll troubleshoot basic issues across network, Kubernetes, container, and application layers, collaborate with teams, document processes, and develop skills in cost and reliability optimization.
What will you do in this role
- Assist in setting up and maintaining AWS resources (VPCs, networking, security, storage, compute) for ML and data science workloads, following established standards.
- Help deploy and manage Kubernetes (EKS) clusters, including basic workload deployments, autoscaling configurations, and routine updates under mentorship.
- Support users on ML platforms (Domino Data Lab, Kubeflow, SageMaker Studio, JupyterHub, Databricks on Kubernetes) by provisioning environments and resolving common issues.
- Contribute to Infrastructure as Code using Terraform and help maintain CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins); learn and apply GitOps practices.
- Build and maintain Docker images and manage registries, focusing on secure, reproducible image lifecycle practices.
- Configure dashboards and alerts for metrics and logs using Prometheus/Grafana, ELK/OpenSearch, and CloudWatch; assist in maintaining SLOs.
- Participate in incident response by monitoring, triaging, and documenting issues; support root-cause analysis and postmortems.
- Apply security best practices in IAM, secrets management, RBAC, network segmentation, and encryption; assist with periodic audits and remediation.
- Collaborate with data scientists and ML engineers to understand requirements and deliver platform improvements that align with business goals.
- Help implement automated tests for infrastructure and platform components to improve reliability.
- Contribute to cost visibility and optimization efforts using FinOps guidance and tooling.
- Write and maintain clear documentation (runbooks, standards, onboarding guides) and share knowledge with the team.
- Participate in sprint planning and project delivery; seek feedback, grow skills, and take ownership of well-scoped tasks.
Required Experience or Skills:
- Bachelor's degree in Computer Science, Information Technology, or related field, or equivalent practical experience.
- 3+ years of experience with AWS fundamentals (networking, security, storage, compute).
- Exposure to Kubernetes (preferably EKS) concepts and/or hands-on practice with deployments and autoscaling of containerized applications.
Desired Experience or Skills:
- Familiarity with ML platforms or data science environments (Domino Data Lab, Kubeflow, SageMaker Studio, JupyterHub, Databricks) is a plus.
- Basic proficiency with Terraform and at least one CI/CD tool (GitHub Actions, GitLab CI, Jenkins); understanding of Git workflows and desire to learn GitOps.
- Linux fundamentals; experience with Docker and image creation, tagging, and registry usage.
- Introductory experience with observability tools (Prometheus/Grafana, ELK/OpenSearch, CloudWatch) and an interest in reliability engineering.
- Security-minded approach with foundational knowledge of IAM, secrets, RBAC, and encryption; willingness to learn compliance practices.
- Strong problem-solving skills, attention to detail, and the ability to troubleshoot issues across network, container, and application layers with guidance.
- Clear written and verbal communication; collaborative, customer-focused mindset.
- Bonus: Exposure to automated testing for infrastructure, awareness of cloud cost management/FinOps, and pursuit of certifications (AWS Cloud Practitioner/Associate, CKA/CKAD, Terraform Associate).
Our technology teams operate as business partners, proposing ideas and innovative solutions that enable new organizational capabilities. We collaborate internationally to deliver services and solutions that help everyone be more productive and enable innovation.
#HYDIT2026
Search Firm Representatives Please Read Carefully
Merck & Co., Inc., Rahway, NJ, USA, also known as Merck Sharp & Dohme LLC, Rahway, NJ, USA, does not accept unsolicited assistance from search firms for employment opportunities. All CVs / resumes submitted by search firms to any employee at our company without a valid written search agreement in place for this position will be deemed the sole property of our company. No fee will be paid in the event a candidate is hired by our company as a result of an agency referral where no pre-existing agreement is in place. Where agency agreements are in place, introductions are position specific. Please, no phone calls or emails.
Employee Status: Regular
Relocation:
VISA Sponsorship:
Travel Requirements:
Flexible Work Arrangements: Hybrid
Shift:
Valid Driving License:
Hazardous Material(s):
Job Posting End Date: 03/13/2026
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.
Application managed by MSD