- Design and implement infrastructure for logging, monitoring, tracing, alerting, …
- Write well-thought-out, maintainable code and configuration.
- Automate workflows.
- Communicate technical decisions through design docs, tech talks, and code reviews.
- Practice sustainable incident response and blameless postmortems.
- 3+ years with UNIX/Linux systems administration.
- 1+ years of production experience with Docker and Kubernetes.
- Experience with a public cloud (GCP, AWS, Azure); GCP is a big plus.
- Experience in system administration.
- Experience with Bash scripting or Python.
- Experience with at least one monitoring tool (Thanos/Prometheus/Grafana is a big plus).
- Experience with at least one log-gathering tool (ELK/EFK/Loki+Promtail+Grafana).
- Experience with Git.
- Experience in relational database administration.
- Experience with configuration management software such as Ansible, Puppet, or Chef.
- Experience with CI/CD tools, including Jenkins, CircleCI, TravisCI or others.
- Experience with CNCF projects and related cloud-native tools, including Helm, Istio, Terraform, Argo, Thanos, or others.
- Experience with Grafana, Kibana, Stackdriver.
- Experience setting SLOs, SLIs, and SLAs for enterprise-level SaaS.
To apply for this job, email your details to email@example.com