I haven't worked at Slack, so I can't speak with high confidence. A traffic spike is a possible reason, but I'm willing to bet that it's not the reason:
> Doubt anyone releasing big changes Monday morning.
This is definitely an engineering best practice, and by best practice, I mean something that Uber's, I mean Slack's SRE team strongly pushed for, and got politely overruled on. After a code freeze is lifted, it's quite common for lots of promotion-eager engineers to release big changes.
IMO it really doesn't have to be promotion-eager engineers or antsy product managers. I'm fairly satisfied with my role, comp, and type of work given where I am in my career and life. I just did a code release first thing this morning, not because I'm promotion-eager, but because I'm picking back up where I left off, like any normal day. Granted, I work at a much smaller company than Slack, with orders of magnitude less traffic.
I'm not sure about that. I feel like I get more upvotes from sarcasm and jokes than from insight. In this instance, I think it's because when people hear something dumb said seriously in real life, they're not going to readily recognize online that it's a joke.
Why? I had a rewrite of some core logic ready the last day before Christmas that I didn't deploy, as it wasn't time-critical to get out and I didn't want to be disturbed during the holidays. Today was the perfect day to deploy it, as I can watch it the whole week if needed.
Well, I think it probably depends on where you work. At my work, people just took 2-3 weeks of time off. It takes a moment to get your head back in the game.
Everywhere I've worked has had a massive backlog of things that get released after a moratorium or extended holiday week. Those are usually the worst weeks to be on call, since so much is changing at once.
Interesting, I've never worked anywhere where engineers decide when to release changes. That's a product decision, and there is a process of review and approval at both the code level and the functional/end-user-experience level that has to happen first.
Did you mean that literally? E.g. is it common at Uber that engineers can release changes to production on their own?
At Cisco (Webex team), the engineers decide when to release code, and most features are enabled by configs or feature flags independently of the deploys.
The engineering team is responsible for the mess caused by a bad deploy, so it's appropriate that those engineers should also choose the timing.
Our team typically deploys between 10am and 4ish, local time, since that's when we're at our desks and ready to click through the approvals and monitor the changes as they go through our pipelines.
The feature enablement happens through an EFT / beta process, and the final timing of GA enablement is a PM decision. But features are widely used by customers ahead of that time, as part of the rollout process.
Our team usually rolls out non-feature changes to services via dynamic configuration switches, so that we can get new bits in place, and then enable new behavior without a redeploy. This also enables us to roll back the dynamic config quickly if something unexpected happens.
(We generally don't do this for net new functionality; there's lower risk in adding a new REST endpoint etc. than in changing an existing query's behavior or implementation.)
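For anyone who hasn't seen the dynamic-config pattern described above, here's a minimal sketch of the idea. To be clear, this is not Webex's actual setup: `DynamicConfig`, the `use_new_query` key, and the two query functions are all made-up names, and a real system would read the switch from a config service rather than an in-memory dict.

```python
import threading


class DynamicConfig:
    """In-memory stand-in for a dynamic config service (hypothetical)."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._values = dict(initial or {})

    def get(self, key, default=False):
        with self._lock:
            return self._values.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._values[key] = value


# The new code ships "dark": deployed, but switched off.
config = DynamicConfig({"use_new_query": False})


def run_query_old(user_id):
    return f"old:{user_id}"


def run_query_new(user_id):
    return f"new:{user_id}"


def run_query(user_id):
    # Both implementations are deployed. The switch is read on every call,
    # so flipping it changes behavior without a redeploy.
    if config.get("use_new_query"):
        return run_query_new(user_id)
    return run_query_old(user_id)
```

Enabling the new behavior is `config.set("use_new_query", True)`, and rolling back is the same call with `False`. No pipeline run is needed either way, which is the whole point.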
Does Uber/Slack not release in CI/CD? At least in backend?
I don't see any need to deploy a big change at once in the software world today. At worst feature gate the thing you want to do and run it in a beta environment, but still push the actual code down the pipeline.
I'm actually more confused after reading that. I assumed you meant that they tested in production on purpose, but it sounds, at a skim, like they do have non-prod testing environments. In fact, it looks like they've gone to having multiple beta environments of every service?
My understanding is that they have a "tenancy" variable in every service call which can take a different code path. They seem to only have one environment for everything and do tests/experiments at code level based on this variable.
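Roughly, I picture something like the sketch below. This is only my guess at the shape of it, not their real code: `RequestContext`, `charge`, `handle_checkout`, and the tenancy values are all names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    # Hypothetical field: travels with every service call.
    tenancy: str = "production"


def charge(amount_cents, ctx):
    # Same service, same environment; only the code path differs.
    if ctx.tenancy == "production":
        return {"charged": amount_cents, "ledger": "real"}
    # Test traffic is isolated in code rather than in a separate environment.
    return {"charged": 0, "ledger": f"sandbox/{ctx.tenancy}"}


def handle_checkout(amount_cents, ctx):
    # The context is passed downstream unchanged, so every hop
    # sees the same tenancy value.
    return charge(amount_cents, ctx)
```

So a call like `handle_checkout(500, RequestContext(tenancy="test-42"))` would run through the production deployment but take the sandbox branch.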