affix is delighted to partner with 99designs once again, this time helping to scale their site reliability engineering team.
Who are they?
99designs is a real success story and a highly regarded workplace, for good reason. They’re a global creative platform that makes it easy for designers and clients to work together to make designs they love.
What makes them so great?
Well, there are a lot of reasons! Their people, their collaboration, and their motivation to help designers work across borders with greater flexibility and freedom, allowing people to create their own success. Then there’s their pursuit of customer excellence: they want every customer and creative to be delighted by the 99designs experience.
They’ve now entered another exciting period of growth, and to support them in this, we’re on the lookout for a self-starting, passionate & collaborative Senior Site Reliability Engineer who is eager to improve the entire life cycle of their services, scale their systems sustainably, and be part of a team that has the perfect balance of hard work and fun. Oh, and they will definitely know their way around & love all things AWS.
The 99designs team take site reliability and the culture around it seriously, so all engineers in their delivery teams are on call to support the code they themselves ship. Continuous deployment is hugely important to them too: engineers make daily releases & any software engineer on the team can deploy into production.
What is the role?
As a Senior Site Reliability Engineer at 99designs you’re at the heart of their mission to deliver the most trusted global creative platform for professional creators to find and do work online. This role can be broken down into a few key areas of responsibility across engineering, site reliability, and advocacy. So, perhaps it’s best that we explain the areas in which they need you to have the most impact:
Engineering
You will spend 50% of your time in a delivery team contributing as a Senior Engineer, supporting them with their delivery challenges and helping them deploy to production quickly & with confidence. You will contribute to team rituals, such as planning, stand-ups and retrospectives, and provide subject matter expertise in the field of Site Reliability. Depending on your personal preferences, you might also contribute to their product directly as a software engineer & jump into the code base; or your focus could be solely on site reliability support within the team… or some combination of the two! This will all come down to what your passions & preferences are.
Site Reliability
This is all about finding opportunities to modernise and simplify the operations of their stack & reduce technical debt. You will help scale systems through mechanisms like automation, evolve systems by pushing for changes that improve reliability and velocity, and establish best practices around cost-saving measures. You will standardise and automate monitoring, alerting, metrics and observability, and work with teams on a minimum set of mandatory metrics, incident response practices and SLOs. You will also design & champion tooling that allows them to operate on services at scale. There’s also some upcoming CloudOps work for you to jump into: think major large-scale AWS migrations.
Advocacy
As a Senior member of the team, you keep your finger on the pulse of the development community at large and its emerging trends & best practices, and you will contribute to the overall engineering direction & strategy. You will work with management and cross-functional leadership groups to influence and direct culture change around best practice SRE. On the ground in the delivery teams, you will foster a culture of reliability and ownership amongst engineers. This will see you run technical training to roll out best practices, contribute to thought leadership with blog posts & articles, and be a true culture champion for SRE.
Why 99designs?
99designs is committed to building an inclusive and diverse team as they continue to scale – they want people to feel like they belong and can bring their whole selves to work. Below are just some of the things they do to support their people:
The techy bits
The 99designs platform consists of a fleet of microservices written in Go, Ruby and PHP with a GraphQL backend-for-frontend aggregation layer and a TypeScript/React frontend. 99designs is an early AWS customer, and has been an early adopter of Docker, ECS and, most recently, Fargate as the hosting platform of choice. Supporting the developers in their ability to ship multiple times a day is a large body of Bash-based tooling, Buildkite for continuous deployment and Datadog for observability.
A little bit about you
You are currently a Site Reliability Engineer or a DevOps Engineer (they’d love to help you make the transition into SRE). You have a background as a Software Engineer and commercial experience shipping products on any tech stack, along with a love + appreciation of all things open source.
Next steps
Just a little bit curious?! Let’s have a chat to see if this can be your happy (work) place.
Kushla Egan
[email protected]