Job Information
The Trade Desk, Inc. Senior Site Reliability Engineer - Infrastructure in Seattle, Washington
The Trade Desk is changing the way global brands and their agencies advertise to audiences around the world. How? With a media buying platform that helps brands deliver a more insightful and relevant ad experience for consumers - and sets a new standard for global reach, accuracy, and transparency.

We are proud of the culture we have built. We value the unique experiences and perspectives that each person brings to The Trade Desk, and we are committed to fostering inclusive spaces where everyone can bring their authentic selves to work every day. So, if you are talented, driven, creative, and eager to join a dynamic, globally connected team, then we want to talk!

What you'll do:

This is a Senior Site Reliability Engineer (SRE) position, responsible for the reliability, performance, and efficiency of The Trade Desk systems and applications. You will participate actively in all aspects of designing, building, and delivering reliable infrastructure and tools for our clients, partners, and employees.

The Trade Desk infrastructure is "hybrid" in both operating system (Linux, Windows) and environment (bare metal, cloud). An SRE should be well rounded and "technology agnostic", with a pragmatic approach to choosing the best tool for the situation. You will have the opportunity to support thousands of hosts throughout the world, with petabyte-scale data challenges. Working within an agile methodology, you will work closely with your teammates to monitor capacity, replace manual workflows with tools and automation, and evangelize scalable long-term solutions to technical problems.

Who you are:

You have a passion for efficient operations. You've made significant contributions to large and impactful technical projects. You think beyond the task at hand to deeply understand the 'why' behind what you are doing.

You can code. At our scale we are not interested in "boutique" manual management of servers and software. This is an engineering position, not an administrator or operator role.
You code with languages such as C#, Python, Go, PowerShell, or Ruby. When a problem needs a software solution, you roll up your sleeves and get to work.

You design for scale. You manage cattle, not pets. In other words, you understand that the only way to scale is to avoid special snowflakes of systems and applications. You design systems to auto-scale and auto-heal. Via automation, you relentlessly strive to eliminate manual toil.

You are a broadly skilled engineer with an interest in service reliability, automation, monitoring, and/or capacity planning. You also have the breadth of knowledge necessary to support a wide variety of software and systems.

You understand modern architectures. You know why Docker and containers are more than just buzzwords, but you are wary of overcomplexity and overengineering. You are able to use traditional configuration management such as Chef, Ansible, or Terraform as well as modern infrastructure schedulers like Kubernetes and Mesos. You enjoy working with the latest monitoring and metrics platforms such as Prometheus.

You are comfortable working on physical gear or in the cloud. Our hybrid environment requires objective knowledge of infrastructure: you are equally comfortable with traditional physical servers and with the software abstractions present in cloud platforms such as AWS.

You work with confidence and without ego. Our engineers have deep knowledge and exercise a high degree of leadership in their daily work. You have strongly held, defensible ideas, and advocate for what you believe is right. You are also adept at identifying and evaluating trade-offs, willing to be proven wrong, and quick to walk through fire to support your fellow teammates. Your opinions are often strong but weakly held.

You value, seek out, and foster diversity. We are a global team from many diverse backgrounds, with different experiences and perspectives. To complement this tea