Profile

Sidnei da Silva
Works at Google
Attended Universidade de Caxias do Sul
Lives in Zürich, Switzerland
1,666 followers | 1,098,755 views

Stream

Sidnei da Silva

Switzerland has been getting more attention than usual, so we came up with the idea of Marathon Gespräche: A conversation series...

Sidnei da Silva

Maslow's hierarchy of SRE needs

In talking to other tech companies about what they consider the scope of their SRE/DevOps roles, I've realized that the scope of SRE organizations differs substantially across the industry. Many SRE organizations limit their potential by hiring teams to do only the work that keeps their service(s) running, not the work that substantially improves them. Their teams seem stuck: too overwhelmed with the basics to get out of the rut and do more meaningful work.

What I'm dubbing 'Maslow's hierarchy of SRE needs' categorizes the state of a team into the following buckets:
+ physiological health - Is the service functioning at all (e.g. not repeatedly hard-down/bleeding revenue)? Is the pager quiet enough to get any other work done? Are we learning from outages and resolving postmortem action items to avoid repeating the same outages?
+ maintain homeostasis - Is it possible to carry out day-to-day operations (e.g. push code, tolerate machine failures) without excessive manual work? Are people automating away manual work?
+ boundaries & objectives - Do we have clear scopes for what we're responsible for (e.g. better to be responsible for one thing solidly than many things diffusely), and an agreed-upon SLA/standard that we aspire to achieve?
+ self-awareness - Do we know, based on metrics, when we deviate from the standards so we can take corrective action? Conversely, this also means we can ignore noise that isn't tied to these metrics, because our monitoring of the things we care about is solid.
+ self-actualization - Do we have the freedom, in terms of time, trust, and ability, to make substantial design improvements to the service (and measure the improvements!)?
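
To make the self-awareness bucket a bit more concrete, here is a minimal sketch of comparing measured availability against an agreed-upon target. Everything in it is an assumption made for illustration and not something from the original post: the names `SloReport` and `evaluate_slo`, the choice of request success rate as the metric, and the 99.9% target used in the example.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    availability: float     # observed fraction of successful requests
    budget_consumed: float  # share of the error budget used (may exceed 1.0)
    within_slo: bool        # True when availability meets the target


def evaluate_slo(total_requests: int, failed_requests: int, slo_target: float) -> SloReport:
    """Compare observed availability with a hypothetical SLO target.

    slo_target is a fraction strictly between 0 and 1, e.g. 0.999.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    availability = 1 - failed_requests / total_requests
    # The error budget is the number of failures the target tolerates.
    allowed_failures = (1 - slo_target) * total_requests
    budget_consumed = failed_requests / allowed_failures
    return SloReport(availability, budget_consumed, availability >= slo_target)


if __name__ == "__main__":
    # Illustrative numbers only: 500 failures out of 1,000,000 requests.
    report = evaluate_slo(1_000_000, 500, slo_target=0.999)
    print(report)
```

A team alerting on `within_slo` or on `budget_consumed` crossing a threshold, rather than on every individual error, is one way to get the "ignore noise" property described in that bullet.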

You don't get to the later stages of the hierarchy of needs without hiring both systems engineers and software engineers - SRE only works at its best if you have people with both skillsets collaborating. If all you're doing is giving people from pure sysadmin backgrounds a shiny devops title and no other support, you're not going to see results that are meaningfully different from the pure operational model of sysadmin work. If you struggle to name the exceptionally strong coders on your team, you're going to have a lot of trouble with the last step of actually getting core service-level improvements delivered (e.g. improving the service components themselves, instead of just rearranging their relationships). If you don't have a solid product dev-SRE relationship with clear boundaries, it's far too easy to slip into the trap of having all the operational work pushed onto SRE without effort put into reducing the total operational burden.

It's fairly easy to spot a well-functioning organization: if it's primarily doing work in the self-actualization category, everything less complex in the hierarchy is likely to be shipshape. If an organization is stuck earlier in the hierarchy, it requires a great deal of support to reach a fulfilled and functional state. That support takes many forms: upper management backing for principled "no"s and for enforcing good boundaries with product dev, hiring to ensure the correct breadth and depth of skillsets on the team, and vision from the team itself to push towards more sophisticated work rather than becoming comfortable just doing operations.

What can you do as the leadership of an engineering organization if you want your SRE team to grow to its full potential? First, hire people who are excited about the scaling/performance/reliability challenges that your product development generalists lack expertise in, not just people to do the grungy work you don't want to be doing. Second, make SRE's goal to change the service based on experience running it, rather than just keeping it running. Third, make sure a majority of your SRE team's time is actually spent developing projects and learning new things. Finally, empower your SRE team to take full ownership of the service, including backing their ability to say no to product development.

If you don't do these things, you'll have trouble attracting new talent[1], and your best site reliability engineers will eventually become bored and leave for somewhere they can enjoy self-actualization.


[1] For a potential external hire who wants to be doing work towards the later steps in the hierarchy, it's a rather risky proposition to join a team that is currently stuck. The root causes of the stuckness are often opaque from outside the organization, and whether there will be organizational support for making the necessary changes is also hard to assess from the outside. There's always a great feeling of accomplishment from being empowered to fix a situation and doing so, but it's best to avoid situations where one is set up to fail from the beginning.

Sidnei da Silva

My team at Rackspace is hiring. It's a US team, but we're open to having people work from the Zurich office! We have three positions open:

Front-End Software Developer: https://uscareers-rackspace.icims.com/jobs/12965/front-end-software-developer/job
Senior Python Developer: https://uscareers-rackspace.icims.com/jobs/12970/senior-python-developer/job
Python Developer: https://uscareers-rackspace.icims.com/jobs/12968/python-developer/job

I work in the DevOps Automation Team, where we build systems that make deployments easier and troubleshooting faster for our support techs. It's #Python in the back-end and #AngularJS in the front-end.

There are a couple of projects in the team. The one I work on and know best collects data from customer systems so we can act proactively when we see problems. The issues we detect range from an outdated OpenSSL (Heartbleed) to a misconfigured Apache that will use too much RAM. We store all the data in #MongoDb; I believe the database is around 0.5 TB these days.
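
As an illustration of the kind of check described above, here is a minimal sketch that flags OpenSSL release versions in the range affected by Heartbleed (1.0.1 through 1.0.1f; fixed in 1.0.1g). It is not the team's actual code: the function names are hypothetical, it handles only plain release strings such as `1.0.1f`, and it assumes the version string reflects the real patch level, which does not hold on distributions that backport fixes without changing the version.

```python
import re

# Matches plain OpenSSL release versions such as "1.0.1", "1.0.1f", "0.9.8y".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)([a-z]?)$")


def parse_openssl_version(version: str) -> tuple[int, int, int, str]:
    """Split a release string into (major, minor, patch, letter)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized OpenSSL version: {version!r}")
    major, minor, patch, letter = match.groups()
    return int(major), int(minor), int(patch), letter


def is_heartbleed_vulnerable(version: str) -> bool:
    """True for releases 1.0.1 through 1.0.1f, judged by version string alone."""
    major, minor, patch, letter = parse_openssl_version(version)
    # An empty letter (plain "1.0.1") sorts before "a", so it is included.
    return (major, minor, patch) == (1, 0, 1) and letter < "g"


if __name__ == "__main__":
    for v in ("0.9.8y", "1.0.1", "1.0.1f", "1.0.1g"):
        print(v, is_heartbleed_vulnerable(v))
```

In practice a check like this would be one signal among several, since the packaged build date or distribution advisory is a more reliable indicator than the upstream version string.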

It's a great team; feel free to ask me for more details if you like! Please reshare this with anybody you think might be interested.

Sidnei da Silva

Sandrinho da Silva: Beautiful...

Sidnei da Silva

3 comments
What beautiful princesses!!!

Sidnei da Silva

Proud to be part of the awesome team that built this baby. Download it and let us know what you think! US only for now.

Sidnei da Silva

Fasnacht (carnival) in Baar
Story
Introduction
You can find more about me on my website.
Education
  • Universidade de Caxias do Sul
Basic Information
Gender
Male
Other names
Sid