Job Information
Ford Motor Company Site Reliability Engineer - Observability Expert in Mexico
The Ford Credit Command Center is seeking a highly skilled and passionate Observability Expert to join our team. You will play a critical role in designing, implementing, and maintaining our comprehensive observability strategy across our infrastructure. This includes leveraging existing and emerging technologies to provide actionable insights into the performance, reliability, and security of our applications and services. You'll collaborate closely with engineering teams to improve observability practices and contribute to a culture of proactive monitoring and incident prevention.
The people of Ford Motor Credit Company have a 60-year commitment to helping put people behind the wheels of great Ford and Lincoln vehicles. By partnering with dealerships, we provide financing, personalized service and professional expertise to five thousand dealers and more than four million customers in over 100 countries around the world. If you’re customer-focused, driven and seeking the opportunity to experience exciting challenges and growth, look no further.
Responsibilities:
Design, implement, and maintain our observability platform utilizing Splunk, Dynatrace, GCP Monitoring, Grafana and Nobl9.
Develop and maintain custom dashboards, alerts, and reports to provide real-time insights into application and infrastructure performance.
Integrate observability tools with CI/CD pipelines to ensure automated monitoring and alerting.
Work with engineering teams to troubleshoot and resolve performance issues, leveraging your expertise in distributed tracing, logging, and metrics.
Develop and maintain custom integrations and applications using Java, Spring Boot, and PostgreSQL to enhance our observability capabilities. This may include building custom agents or processors for our existing observability stack.
Deploy and manage observability components on Google Cloud Run.
Collaborate with SRE and engineering teams to establish and maintain service level objectives (SLOs) and error budgets using Nobl9.
Mentor and guide other engineers on best practices for observability and monitoring.
Stay up to date with the latest trends and technologies in the observability space.
Participate in troubleshooting on critical application bridges to perform root cause analysis and improve observability.
Automate manual processes and develop a reliability reporting strategy.
Evaluate alternative tools and run proofs of concept (POCs) with tools such as Datadog.
Qualifications:
Bachelor’s degree in computer science, engineering, or a related field.
5+ years of experience in a DevOps, SRE, or similar role with a strong focus on observability.
Extensive experience with Splunk, Dynatrace, and GCP Monitoring. Experience with Nobl9 is a significant plus.
Proven experience designing and implementing comprehensive observability solutions for large-scale applications.
Strong programming skills in Java and experience with Spring Boot.
Experience with relational databases, particularly PostgreSQL.
Experience deploying and managing applications on Google Cloud Platform (GCP), especially Cloud Run.
Deep understanding of distributed systems, microservices architectures, and cloud-native technologies.
Excellent communication, collaboration, and problem-solving skills.
Nice To Have:
ServiceNow experience.
Knowledge of ITIL/ITSM processes.
Incident management experience.
DISCLAIMER:
Ford Motor Company is an Equal Opportunity Employer. We are committed to a diverse workforce and do not discriminate against any employee or applicant for employment on the basis of race, color, sex, age, national origin, religion, sexual orientation, gender identity and/or expression, veteran status, or disability.
Requisition ID: 38014