
Yahoo
It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you’re looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business—and the world.
A Lot About You
The Yahoo Data Production Engineering team is seeking experienced DevOps/Cloud Infrastructure engineers with expertise in GCP (or AWS). If you are passionate about working in a dynamic environment and have the required skills, we would love to have you on board. Apply now to join our team!
We are looking for Production Engineers (SREs/DevOps) who are problem solvers at heart, with a solid ability to dig into code and own the reliability domain. As a member of the PE team, you will work with your developer partners and implement operability improvements, security, infrastructure
Responsibilities:
- Lead initiatives to enhance and optimize existing cloud infrastructure, drive improvements in scalability, efficiency, and resilience, and oversee large-scale projects related to cloud platforms, automation, and performance optimization.
- Develop and optimize tools for infrastructure management and automation on cloud platforms, applying Site Reliability Engineering (SRE) principles to write high-quality, maintainable code in languages such as Python, JavaScript, and Go.
- Collaborate with engineering teams to integrate SRE principles into the product lifecycle, ensuring improved site reliability and product functionality across cloud platforms.
- Develop and implement automation strategies across cloud/on-prem environments to enhance system deployment, monitoring, and operational efficiency. This includes designing and managing CI/CD pipelines and utilizing infrastructure-as-code tools like Terraform, Ansible, and CloudFormation.
- Maintain and support production systems and associated infrastructure, ensuring their availability, performance, and scalability through continuous monitoring and automation.
- Work closely with cross-functional teams to understand product and technical roadmaps, identifying potential impacts on system operability and proposing proactive solutions for cloud environments.
- Foster cross-functional collaboration between development, infrastructure, and operations teams to improve the overall performance and reliability of services in the cloud.
Minimum Qualifications
- BS/MS in Computer Science or equivalent degree
- At least 2 years of industry experience in site reliability engineering, systems engineering, or a related role, ideally in large-scale environments, with a focus on supporting 24×7 highly available systems.
- Familiarity & working experience with Kubernetes and container-based orchestration
- Intermediate-level coding expertise in one or more languages, including Node.js, Python, or Go
- Experience working with IaC (e.g., Terraform, Ansible)
- Experience with using Git to manage code
- Experience with building CI/CD pipelines
- Good knowledge of TCP/IP and networking
- Familiarity with observability tools and with metric design and implementation
Preferred Qualifications
- Experience in designing and optimizing GCP Dataproc and Composer (Airflow), or EMR on AWS, and in BigQuery slot management for big-data pipeline orchestration
- Experience in designing and managing large-scale infrastructure on either AWS (EKS, OpenSearch) or GCP (GKE, Observability/Monitoring), with multi-zone, multi-region deployments
- Deep understanding of UNIX/Linux system internals and tools for troubleshooting application stack dumps and networking
- Experience working with GitHub Actions
- Experience with storage solutions like Redis and DynamoDB
- Some experience with the Hadoop ecosystem, including Oozie, Pig, and Spark
- Prior experience in technical operations and exposure to tool/product development.
- Familiarity with observability tools and best practices, and hands-on experience with applications like Chronosphere, Splunk, OpenSearch, GCP Observability Suite, Grafana, and OpenTelemetry (OTel)