Grafana Labs

Senior Site Reliability Engineer

at Grafana Labs
Technology & Programming Full-Time EMEA
545 days ago

Description

In June 2021 Grafana Labs acquired k6 - a Stockholm-based startup behind the open-source load testing tool for engineering teams. With k6, Grafana Labs adds extensible testing to its open and composable Grafana observability stack.

About k6

At k6, we build next-generation performance testing tools for developers and DevOps engineers. 

We are pushing forward the state of the art in our industry, creating open-source tools that have great developer experience and enable engineers to build reliable systems. k6 is an open-source tool that we built to reinvent the engineering principles around performance testing and enable engineers to build systems that scale.

About the role

You will be part of the k6 platform team, which focuses on the infrastructure of k6 Cloud, our commercial product built around k6 OSS and targeted at users who want to run performance tests at scale. Our SaaS application allows customers to load test their systems by running distributed load tests from 15+ regions across the world, using hundreds of thousands of virtual users able to send millions of requests per second. We ingest a huge volume of data generated by k6, which can be used to view, correlate and analyze metrics from each test.

The k6 platform team, which currently has 3 members, is being expanded due to the overall growth of the company. It is a small team whose main responsibility is maintaining the infrastructure layer of k6 Cloud and providing assistance and guidance to other teams (back-end, front-end, k6 OSS). Our infrastructure mostly consists of EKS clusters along with other AWS services (e.g., S3, RDS, ElastiCache). Currently, we are looking for a Kubernetes expert who can participate in developing and maintaining our Kubernetes-based platform. Readiness to investigate, communicate and develop platform solutions is crucial for this role.

Your main responsibilities will be to:

  • Develop, innovate and maintain cloud infrastructure

  • Be on call during working hours, monitoring and troubleshooting cloud infrastructure

  • Collaborate with other teams (back-end, front-end, k6 OSS) on delivering cross-functional features to the cloud product

  • Take part in shaping the k6 platform team roadmap

Required skills

  • Expert in Kubernetes and Docker in production environments

  • Expert in cloud infrastructure management: AWS, Terraform, Ansible

  • Excellent scripting skills in Python, Go and Bash

  • Experience in modern logging and monitoring: Loki, Prometheus, Grafana

  • Experience with GitOps approach (e.g., ArgoCD, FluxCD) and GitHub

  • Experience with SQL databases and messaging systems: PostgreSQL, RabbitMQ, Kafka

  • Experience comprehensively securing and monitoring AWS

  • Good technical communication skills

Not required, but great to have

  • Experience deploying IAM (Identity and Access Management) solutions (e.g., Okta, Auth0)

  • Experience with Kubernetes in other cloud providers (e.g., GCP, Azure) or on-premises

  • AWS/GCP/Azure Security Engineering, Architecting or Security Specialty certification

