I feel security programs are all about small nuances, and there is a plethora of generic advice out there. So, though this is good generic advice overall, I would have loved to see some more details.
> With this freedom, teams have the responsibility of securing their services, and ensuring security issues are fixed within an established timeframe.
How does the security team ensure that security bugs are being addressed? Individual engineering teams sometimes do not have visibility into the whole system and may rate an issue as low severity, but combined with another bug in another system it's now sev:high (SSRF-type things). With individualized team dashboards, how does the security team handle this?
> The Security Team is currently rolling out metrics to our service dashboards tracking out of date patches, with an alarm to page the team when security patches are left unapplied. Giving the teams visibility into the risks lets them prioritize and schedule their patching, instead of relying on the security team to monitor patch levels.
How is compliance with security policies ensured? What if a team does not want to patch, or the patch breaks their existing tools? A key element in these fast-paced dev->deploy environments is exceptions. How are those handled and tracked?
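The patch-staleness alarm the quoted passage describes could be sketched roughly like this. Everything here is hypothetical for illustration: the 30-day window, the host inventory shape, and the function names are assumptions, not the article's actual implementation.

```python
from datetime import datetime, timedelta

# Assumed policy: page the owning team when a security patch has been
# available longer than the remediation window. Real windows vary by org.
REMEDIATION_WINDOW = timedelta(days=30)

def overdue_patches(hosts, now=None):
    """Return (host, patch id) pairs whose pending security patches
    have sat unapplied past the remediation window."""
    now = now or datetime.utcnow()
    overdue = []
    for host in hosts:
        for patch in host["pending_security_patches"]:
            if now - patch["released"] > REMEDIATION_WINDOW:
                overdue.append((host["name"], patch["id"]))
    return overdue

# Fabricated inventory, for illustration only:
hosts = [
    {"name": "web-1", "pending_security_patches": [
        {"id": "CVE-2023-0001", "released": datetime(2023, 1, 1)},
    ]},
]
```

In practice the `overdue_patches` output would feed a dashboard metric and a paging alarm, which gives teams the visibility to schedule their own patching, as the article suggests.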
It varies widely from company to company. It depends on how dangerous the bug is as well as its exploitability; this determines the remediation period. Usually, very severe bugs like Heartbleed or front-facing deserialization are fixed within 1-7 days, whereas a CSRF in some random POST call could be 90/180 days, depending on how a company treats less severe bugs.
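A severity-to-deadline policy like the one described above is often just a small lookup table. This is a sketch mirroring the rough numbers in the comment; the tier names and day counts are illustrative assumptions, not a standard.

```python
from datetime import date, timedelta

# Hypothetical remediation windows, loosely matching the comment above.
REMEDIATION_DAYS = {
    "critical": 7,    # e.g. Heartbleed, front-facing deserialization: 1-7 days
    "medium": 90,
    "low": 180,       # e.g. CSRF in some random POST call
}

def remediation_deadline(found_on, severity):
    """Date by which a bug of the given severity must be fixed."""
    return found_on + timedelta(days=REMEDIATION_DAYS[severity])
```

A ticketing system can then compare today's date against `remediation_deadline` to decide when to notify the security engineer tracking the bug.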
You have a security guy assigned to keep track of bugs, who is notified after the remediation window has passed. It's usually assigned to the person who found it, because they can easily understand the issue and then verify the fix. They will confirm it has been fixed on all the machines it was found on, then close the ticket out. However, bugs that are considered unlikely to be exploited often get kicked back, leading to long email chains between the security guys saying "Please fix this bug! I'm just trying to do my job" and developers saying "We have 0 dev cycles and a release coming up, not a high priority! Can't you just close it?"
This email thread goes on for weeks/months, and the security guys push the remediation period to help developers meet their deadline. It gets pushed back so many times that the security guys get sick of reporting it, because they know it's a low-severity issue, but it is a problem nonetheless. They push it up the chain of command and force the PM of the project to sign off on the risk associated with what could happen if it's exploited; at that point it's almost always fixed immediately, then closed.
I wish this post wasn't downvoted because I think it's a valuable one, even if it relies on a perception that I don't think quite holds. In my experience, what everybody calls "devops" mostly encompasses human problems. It's about communication and clear goals and having empathy for the other people around you who rely on you and on whom you rely. You can't "-as-a-service" human problems. And this article, IMO, demonstrates that well--they're talking mostly about people, except where services just have to plug together and wait for that human to come by with a roll of duct tape.
Technical: honestly, meh. You pick a cloud, you use their hosted services where it's safe to and you pull something tested off-the-rack where it's not (i.e., I am all over RDS, but I steer clear of DynamoDB--that sort of thing), you go on with your day. Over time you'll probably (everyone does, and IMO it's at a point of maturity where I am, too) gravitate towards containers, using them to make the "developer" part of the stack taller and leveraging tools like Kubernetes to handle much of the deployment side. Operations' bailiwick is then (fortunately) much more restrained, and in the two-stddev size scale that stuff is just not that complicated.
What does become complicated is people. Having tried to -as-a-service the technical side of things in ways beyond "hosted X", it quickly becomes something where the business processes of your company (the bits that actually matter) inevitably require doing something-or-other in a sufficiently bespoke manner that an "-as-a-service" option for that becomes at best clumsy or at worst hostile. (I'm reminded of stuff like Rightscale or AWS OpsWorks when I say that, though my product was rather more in-depth. Probably to my detriment; it made it even harder for bespoke processes to fit into the system!)
Invariably, the challenges for almost all of my clients are people challenges. Training developers into their roles' expansion, training operations people into a culture and habit of automation, and facilitating frank and open discussions between them. That stuff is hard. IMO, it's also the rewarding part of devops, and it's the part where your team can be a multiplier rather than an adder. But I do think that ownership of the orchestrative bits of the technical side is still (maybe always?) necessary in order to have something to multiply.
(Usual plug: I both advise and assist with implementations on the technical side and I consult, travel and remote, on the human side. Happy to chat and help figure out if and how I can help you and your folks. Contact details are in my profile.)
The closest you will find is a consultancy company which has a set of pre-built frameworks/tools for its customers, and then, based on the budget, allocates a number of "consultants" to work with the client (they might work 3 days a week, "full day").
Is that what you are asking? Or do you want more of a comprehensive solution, like "install some agents, give us access to your GitHub org so we can tell you when there's a vulnerability; we will manage server updates, automatic deployment and whatnot" — basically, we build everything for you and sell that as a solution?
Spot on, mate! I lead the security/operations team for a large telco, and the developers are the first to get stressed out when there is a major production issue... We are the ones leading the complex debugging sessions and solving problems in the middle of the night...
[ed: TL;DR
Is your service more difficult to deploy to production (with all best practices, real security, redundancy and backups) than it is to build? Then you're not doing devops.]
I don't think there's great consensus on what devops means. According to Wikipedia[w] the original definition was:
"DevOps is a set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality."
Which is a bit of a non-definition if you ask me. You can have that with traditional good system administration.
I see it as two things. The first is a change in how a service or product is viewed, toward a more holistic, system-oriented view: the acceptance that "the system" consists of the interaction of code, configuration, infrastructure and users (along with changing/evolving requirements, because: humans).
And that one entity needs to own the entire process: the devops team. That team might change a bit in composition, but the point is that rather than writing some "application" and shipping it to the customer on a floppy disk, most useful software is a process more than it is a product.
The other part (which is connected) is that, as software delivery is a continuous process, it becomes natural to incorporate/inform "development" from the ops side (12-factor apps; plan for dev/stage/prod deploys; make a program/service that is easy to monitor, test, run and manage — not "hex-edit the configuration in this zip of executable and config data, run it with magic_jar_executable, and make sure to restart it after it's written 48 MB to disk"). And vice versa: be more explicit about treating administration as a traditional software project: version control, tests, immutable deployment states with snapshot/rollback support.
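The 12-factor "config in the environment" idea mentioned above can be sketched in a few lines: the same artifact runs in dev, stage, and prod, and only environment variables differ. The variable names and defaults here are illustrative assumptions, not from the article.

```python
import os

def load_config(env=None):
    """Read deployment config from environment variables (12-factor style),
    with safe development defaults when a variable is unset."""
    env = os.environ if env is None else env
    return {
        "db_url": env.get("DATABASE_URL", "sqlite:///dev.db"),  # dev default
        "log_level": env.get("LOG_LEVEL", "INFO"),
        "stage": env.get("DEPLOY_STAGE", "dev"),
    }
```

The same binary deployed to prod would simply be started with `DATABASE_URL`, `LOG_LEVEL`, and `DEPLOY_STAGE` set by the orchestration layer, with no hex-editing of baked-in config required.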
There's another group that seems to think "write software and throw it over the wall to the admins" is a perfectly sound model; all you need is better admins.
And to be clear: bad admins are a bad idea; we/they will break your fool-proof single fat jar / single fat C++/Go executable with resources baked in.
And good sysadmins can work magic with your broken software.
But it's not devops when the two groups are on different teams, doing different things - throwing rocks, insults and build artifacts over the wall at each other.
Good comment - I am a DevOps lead and have spent time thinking about this. Personally, I try to promote Dev/Ops - the slash is important - you still have dev and you still have ops, but there must be a culture of co-operation and flexibility on both sides. It is very much a people problem rather than a technical thing.
This is a great post, and specifically what we preach to our customers. Having data that is exportable, accurate, and respectful of developers' time is immensely important, and specifically what we focus on. We get flak sometimes for focusing too much on developers and not enough on security engineers, but we really do think that's the future.
If you agreed with this post, or if you're interested in chatting about DAST for web applications or APIs (launching next month, actually thorough scanning of APIs that isn't just throwing a webapp scanner against it), feel free to reach out. I'd love to chat.