We really hope you can make it to the event - but if you're iffy on your availability, we'll still send the recording of the sessions via email.
Crunching the numbers for the 2020 SRE Survey, conducted back in January, felt like looking at a whole 'nother world.
We can't release the 2020 SRE Report without discussing how SRE has evolved this year under crummy circumstances.
Through talks, polling, and panels, 'SRE From Home' will explore how Site Reliability Engineers are adapting to 'all-remote' operations and what we can learn from each other.
We also have some fun planned. So grab your favorite drink - Matcha, Brew, or Fizzy Lifting - and come hang out with your fellow SREs!
The first talk will provide new best practices, frameworks, and roadmaps for practicing SRE successfully when all of your engineers - not just systems - are distributed.
Details coming soon.
Details coming soon.
The Q&A panel will focus on shared experiences and learnings: how have SREs adapted goals, processes, and communication to navigate 'all-remote' operations? What's worked well, what hasn't, and how will the SRE practice and teams change moving forward?
Todd Palino is a Senior Staff Engineer in Site Reliability at LinkedIn on the Capacity Engineering team, where his team is creating a framework for application capacity measurement, analysis, and change intelligence. Prior to that, he was responsible for architecture, day-to-day operations, and tools development for one of the largest Apache Kafka deployments. In his spare time, Todd is the developer of the open source project Burrow, a Kafka consumer monitoring tool, and is the co-author of Kafka: The Definitive Guide, now available from O'Reilly Media.
Holly Allen is the head of reliability at Slack, with SRE, Monitoring, and Resilience Engineering in her portfolio. She is tireless in her efforts to make Slack the software reliable and scalable, and Slack the company a delightful place to work. Prior to Slack, Holly worked at startups and DreamWorks Animation, and was Director of Engineering at 18F, a civic tech startup in the US government.
Lex Neva is interested in all things related to running large, massively multiuser online services. He has years of Systems Engineering, tinkering, and troubleshooting experience and perhaps loves incident response more than he ought to. He’s previously worked for Linden Lab, DeviantArt, and Heroku and currently works as an SRE at Fastly helping to make sure the Internet keeps running.
Jaime began his career as a molecular biologist before following his passion for communications, working at DigitalOcean, Riot Games, and Shopify, where he launched the engineering communications function. He co-founded Incident Labs: check out Ovvy Insights, which helps provide teams with the right data to improve incident response and return hours for planned work. He has spent two years learning about mental health and mindfulness. He is also an avid lover of dumplings.
Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 16+ years of experience. She is an advocate at Honeycomb for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights.
Jonathan is an experienced Senior SRE Manager responsible for building, operating, and managing highly scalable, mission-critical cloud services. He works for a highly prominent multi-cloud services provider and is a thought leader in SaaS transformation, operations, and incident management.