Ah, my sister endured all this sort of thing during 20 years as a VP in corporate America. She successfully deployed new data systems to 120 plants in 50 regions in one year. Didn't cost $25M.
Her method? Ruthlessly purge the region of the old data system and install the new (web-based API to a central, new data system). Investigate every regional difference and consolidate into one model.
Before deployment day, get all the regional Directors in one room and tell them it was going to happen. Tell them there was no going back, no push-back would be permitted, and have the CEO in the room to confirm this.
It all went well, with her team of a dozen professionals who did a surgical strike on each region: IT equipment installers, trainers. Even one lady whose job was to ferret out the hidden PC somebody was squirrelling away in a closet (there was always that person), rip it out, take it to her car and lock it in the trunk.
All went well, except for one region. That regional Director had missed the big day, and tried (vainly) to push back for all the familiar reasons: special circumstance, retraining was going to keep them down for too long, and on and on.
100 years of data systems gone in a year (including paper systems) and replaced with a web API.
And then the devs moved on to something greater, the support was offshored and nobody could get the software to work as they wanted any more. 5 years later someone has the great idea to empower each plant by having different versions and the cycle continues.
It’s impressive that this worked. In my experience it usually doesn't, because so much "process" is built up around the old tool to handle tons of undocumented edge cases that the new tool can't simply slot in, which makes it extra hard to get it up and running successfully.
The key to being successful at something like this, whether it's internal LOB software or a B2B enterprise product, is to provide a way for users to handle any exceptions out of band. If you try to code for every process exception you will (a) never go live and (b) create a horrendous monstrosity of a system. But if you can find the right spots to put in manual hooks then the users can be empowered to solve their own process exceptions.
An easy example is if you have an application that does a bunch of different calculations based on random rules and data sets. Instead of trying to code every calculation and random exception that the users throw at you, you start with the top 80% (or 50% or whatever) and give them a robust way to add a "manual calculation". Then tell them to handle their own exceptions by doing whatever crazy crap they need to do and just stick the result in the system.
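To make that concrete, here's a minimal sketch of what such a "manual calculation" hook might look like (all rule names, fields, and numbers here are hypothetical, invented for illustration, not taken from any system described above):

    # Hypothetical sketch: a calculation engine with a "manual calculation" escape hatch.
    AUTOMATED_RULES = {
        "standard_freight": lambda order: order["weight_kg"] * 1.25,
        "bulk_discount": lambda order: order["subtotal"] * 0.90,
    }

    def calculate(order, manual_overrides):
        """Use a coded rule when one exists; otherwise fall back to a manually
        entered result, recorded with who entered it and why."""
        rule = AUTOMATED_RULES.get(order["rule_id"])
        if rule is not None:
            return {"amount": rule(order), "source": "automated"}
        override = manual_overrides.get(order["id"])
        if override is None:
            raise ValueError("no rule and no manual entry for " + order["id"])
        return {"amount": override["amount"],
                "source": "manual (" + override["entered_by"] + ")"}

    # The weird 20% gets keyed in by the users themselves, out of band:
    manual = {"ord-17": {"amount": 412.50, "entered_by": "plant accountant",
                         "reason": "grandfathered 1998 contract"}}
    print(calculate({"id": "ord-17", "rule_id": "legacy-special-case"}, manual))

The point is that the exceptional cases get a first-class place to live in the system instead of blocking go-live.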
It's difficult because if you're not familiar with the industry then you won't even be able to imagine the places where exceptions can occur. And the users will complain about having to do stuff manually. If you automate 80% of what they are doing right now, the 20% they still have to do manually, work they are already doing by hand today, suddenly becomes a critical blocker that prevents the whole system from being useful. So you need someone with a strong personality who will not just accept that whatever the users are saying is right.
Some of the worst turmoil happens when suddenly 'business as usual' for customers/users you didn't recognize the full value of disappears, and the front line people have to say things like "the new system doesn't let us do that any more".
You're absolutely right that they'll complain and drag their feet no matter what and that rock-solid leadership is needed to overcome this, but if you at least give them a way to do all the good-for-business stuff they did before in a similar time frame (with an overall faster/better system in the long run, of course), the majority of the sane ones will eventually come around.
> Tell them there was no going back, no push-back would be permitted, and have the CEO in the room to confirm this.
I'm sure this is great when it works, but it's also reminding me of things like the TSB switchover disaster which lost them hundreds of millions of pounds and a lot of customers when their systems were down for a week. Apparently rollback wasn't planned for.
"No pushback" also means "do not tell us how this is going to go wrong, we aren't going to listen".
The hard part is each plant can have hundreds of workers whose entire jobs depend on the way things are set up. It can be hugely expensive, time-consuming and disrupting for months to try to replace processes with new ones, particularly if you expect things to continue running without interruption. It's one of those rebuild-the-airplane-while-flying-it situations.
Maybe this was a uniquely lucky situation, because it doesn't sound like a generalizable magic-bullet to me.
It would definitely be interesting to hear the other side of this anecdote. My guess is that there are a lot of workarounds happening at the plant level that are being hidden.
The plant now tracks each individual product, line, customer, contract in realtime with complete transparency to the parent company. They swipe barcodes and use online tools. Their processes were streamlined and paper removed.
It wasn't painless for everybody. In each region some good ol' boy would resign the week before changeover. When they'd come in to do inventory some room or shed would be locked and nobody would have the key. They'd break the lock and whattaya know, the room would be empty. Supposed to be full of inventory.
>The plant now tracks each individual product, line, customer, contract in realtime with complete transparency to the parent company.
I'm sure that's what the parent company thinks is happening. Whether that's what is actually happening at the plant level is another question. So is whether the improvement came with a hit to productivity or any other hidden cost. Obviously I know nothing specific about the company, but speaking generally I've seen a lot of workarounds and inefficiencies in my day.
I used to work in retail at Circuit City. Our inventory was a mess, so corporate hired a company to fix that. At 4 AM a bus rolled up with people that spent all morning scanning items. We had to monitor them for theft and such (yes we saw several attempts).
They did an awful job. For example they scanned an item and then saw there were 10 items behind it and just multiplied the first item by 10 on the scanner (think of thumbdrives where the first one is a 512 MB drive and towards the back there may be a 256 MB drive). Afterwards, our inventory was worse than before. Corporate of course couldn't know this. I know your story is different, but they could still be getting duped.
I've seen this before with customer satisfaction. In data collection efforts driven by executives you can easily get bad results because they have zero visibility into the data collection process and its abuses and problems. I call it garbage in, gospel out.
>Yes, every single item on every single line is tracked. Nothing (can) get sold without being in the system.
Worker: "Steve from Acme Co. needs a doohickey with a widget today but we don't have a SKU for that"
Boss: "Corporate takes 2 days to issue SKUs so just mark it as a doohickey and charge him 10% more for the widget"
Or maybe the customer has to go somewhere else because their turnaround time can't be met.
>Doubters abound. Even now a decade later, folks meet this story with disbelief and doubt.
Because it's not our first time at the rodeo. If massive IT projects were this easy we'd all be doing it like that. They're generally not. Steamrolling the end user may have worked this time, but other times it's ground the business to a halt because those individual plant-level differences were there for a reason.
They're massive because they're made to be that way. Technology transition takes will and backbone. If that's missing, that's where $25M goes down the drain.
I agree they generally don't go well. Because they're run by weak Directors and VPs who are 2nd-guessing IT and getting in the way.
Plant differences are all well and good. But so is "IT not collapsing under 20 years of cruft". It costs something to bring all the plants in line. But it also pays off, e.g. they're still in business after 10 years.
Here's how the money part went. Once the tracking went online, they realized (region by region) they were losing $1M per month. 2nd month: losing $0.5M. Third month: break even. 4th month: $1M in the green.
Why? Because they now knew (for the first time) what the hell they were selling, how much to whom, and for what price. The VPs quickly became totally addicted to this instant-knowledge web API and used it to turn the business around.
I was a consultant and ran IT security years ago at a company that was bought in a hostile takeover for ~$1B, a significant premium, in order to get some expertise for a special market they didn't have in house. One of the first things the purchaser did was tell the acquired company that they needed to change all their SKUs to match the corporate standard. At least some management at the purchaser knew that the acquired company used SKUs to track batches so they could detect, track and troubleshoot manufacturing and design defects, but apparently thought that standardization was more important. That was far from their only bad decision, but within 5 years they'd run their $1B investment down to just the residual value of the IP, about $100M. Fitting things into a Procrustean bed (standardization) isn't always a good business strategy, and inventory management in particular can be more complicated than it looks.
They had all those problems, and solved them by being prepared. Training happened before changeover with a separate team moving from region to region.
Just because something is difficult doesn't mean it is impossible, or that it shouldn't be done. The critical part is deciding you are going to do it, and not go back.
I did the same thing a few years ago in a healthcare place with faxing. I just issued a decree that we were done with paper fax. One of my team had identified a great product, my team ran with integration, we staged everything, tested, did training, then on rollout day we switched over all 45 locations and terminated the analog lines in each location (unplugged them and took the cords if we had to). A few people bitched but most people loved it. Sometimes you have to be the benevolent dictator.
Call me a pessimist, but all went well for a couple of weeks until they realized they needed reports that didn't exist in the new system, and users individually discovered that it didn't support the thousands of workflows people had adapted the old systems to handle, workflows the managers never knew about.
By this time your sister had hopefully moved sideways with "another successful migration" added to her resume.
Nope. Still running after a decade; still selling tires. Still adapting to requests for reports and workflow issues, as is possible with web-based tools. Because they are so much better than the old C-coded static tools and static reports running on a PC (not to mention paper). That's why we use web tools after all.
It requires you to have complete and unwavering backing from top management, even when some of the guys whose toes you're stepping on have been golf buddies with said top management for 20 years.
It also depends on you having a crack IT team who can get it right the first time. No months of stuff crashing or having a horrible interface that impedes work or being intolerably slow. You have to be confident enough to push back against the exec who says "oh, we'll just hire $massive_consultancy_firm to do it" and takes years to roll out a half baked hackjob on top of their stupidly expensive CMS.
> ...No months of stuff crashing or having a horrible interface that impedes work or being intolerably slow.
Dude, that's not a temporary transition state. That is normal _all_ _the_ _time_ for much of enterprise systems. The crashing might stop, eventually, but the byzantine intractable complexity maintained by a small army of information-hoarding corporate battle-axes... that stays forever, regardless of whether you go to a single "web-api" or not.
That sort of cronyism can be a problem. It's why Amazon's search sucks so bad. In the book 'The Everything Store', it tells the story of when a team at Amazon responsible for something else saw how bad search was and decided to fix it. So they built a prototype with Elasticsearch and whatnot. It worked great. The dude at the head of the search team, good friends with Bezos, got all territorial and angry. Bezos' solution? Have a test between the two solutions to determine which was best. Bezos' idiocy? He let the head of the search team decide who won. So, no surprise, the head of the search team said the search team won. So the improved search was shut down and the garbage search we all have to deal with regularly got to stay.
Once you've got that kind of thing going on, only retirements and deaths can effect good change at an organization.
It is, in the sense that I wouldn't expect it to happen, at least to me. But it's not unrealistic that some CEO would actually listen to you about the matter and put his foot down to make it happen. These are not matters that are difficult to comprehend; my guess is that people tend to take IT stuff for granted and that's it. I've also known guys who didn't give a fuck when they saw upper management screw up tech-wise, so as not to get into arguments. That also cripples healthy feedback.
Of course, now they've lost the people who designed that system, and are hard up to keep it updated. Like most large corporations, they 'outsourced' IT years ago. Which means, they lost their IP for their business processes to a revolving door of contractors.
My sister is retired, but predicts the collapse of the US business sector due to critical process failures within a few years. It's a pretty dismal outlook.
I have spent more time than I'm comfortable with over the last couple of years railing against microservices. But this seems like a classic case where microservices are needed. Each team can handle their own writes/validation and expose them to themselves and other services (including the global service). Of course you run into issues with transactions occurring across multiple databases, but those problems are hard but solvable.
> All incoming writes need to go into a centralized log, such as Kafka, and then from there the various databases can pull what they need, with each team making its own decisions about what it needs from that central log.
This sounds crazy. I don't know any large companies that have successfully implemented it. This is basically arguing for a giant central database across the entire company. Good luck getting the 300 people necessary into a room and agreeing on a schema.
P.S. I don't know if he's using database as a shorthand for a service. If he is then you can ignore everything I've written.
This is why I always say that microservices are a technical fix for a political problem.
Technically they're harder to manage and often come at a high technical cost that you should never attempt to pay if you don't have to, but it's an effective way of managing the often extreme coordination costs that exist in large, dysfunctional companies that simply don't exist elsewhere.
I used to rail against microservices for years (working in startups) until I finally worked at a large dysfunctional corporation trying to run a lot of complex processes that spanned a lot of tightly coupled departments and then I started to see how they made sense.
It also gave me an appreciation of the problems Bezos recognized when he tried to decouple the organizational fiefdoms under his control.
100% agree, but I use the term organizational instead of political because I feel like "political" hints that it's only necessary because of the organization's dysfunction, and I believe microservices are useful even when the organization is functioning well. (Though maybe the point where it makes sense to transition comes later in a well-functioning organization.)
It is somewhat limiting that in our industry, challenges related to social and human factors are either dismissed, denied, or cast in a negative tone. The social sciences and political philosophy could really improve how we treat each other daily, in my opinion. As it stands, it seems that if something is not "technical" it is dysfunctional.
PS: I am not saying that’s my take from your post, just something I observe in general. Heck, even I use the term political to signal dysfunctional work. :/
I'm not sure I'd call it microservices, since they could be monolithic inside each organization, but I had the same thought. Just spend the time to agree on a standardized API that everyone can support, and then the central API just proxies to the appropriate subsidiary. I wonder why they haven't done it that way.
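As a toy sketch of that shape, assuming (purely for illustration) a rental-style reservation call and subsidiary names that are mine, not the article's:

    # Toy sketch: one agreed-upon interface, with a thin central API that only routes.
    from abc import ABC, abstractmethod

    class ReservationBackend(ABC):
        @abstractmethod
        def create_reservation(self, customer_id: str, car_class: str) -> str:
            ...

    class UsBackend(ReservationBackend):
        def create_reservation(self, customer_id, car_class):
            return f"US-{customer_id}-{car_class}"   # would call the US subsidiary's system

    class EuBackend(ReservationBackend):
        def create_reservation(self, customer_id, car_class):
            return f"EU-{customer_id}-{car_class}"   # would call the EU subsidiary's system

    BACKENDS = {"us": UsBackend(), "eu": EuBackend()}

    def central_create_reservation(region: str, customer_id: str, car_class: str) -> str:
        """The 'unified API': route to whichever subsidiary owns the region."""
        return BACKENDS[region].create_reservation(customer_id, car_class)

    print(central_create_reservation("eu", "cust-42", "compact"))

The hard part, of course, is getting every subsidiary to actually implement and keep to the agreed interface, which is exactly the coordination problem the surrounding comments describe.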
Coming from a clunky, overused but underdesigned monolithic structure, the work to standardize inter-team/inter-department communication is really hard.
In general, the easiest thing is to overspecify those APIs and fight anyone who wants to simplify them. Complicated work looks good on a CV, after all.
So now you have hard-to-implement, full-of-cruft API designs. At that point, teams realize that the easiest thing is to work around them.
And off you go into splintered components which ignore the standard APIs as much as possible. Turns out that's much easier, and you deliver results with higher velocity!
From far enough away, you can just see "more standard APIs => higher velocity", so obviously you keep doing that, right?
Developers and product people have a hard time applying the idea of only building the simplest, most minimal "thing" the consumer needs first and gradually iterating on it when the developers and other internal personnel are themselves the consumer.
An internal API is just as much a revenue driving product as an external one.
What might be easiest is "federated" or "decentralised" - since coordination is prohibitively difficult here, build a set of pieces independently. The truth is that they behave as separate companies in a Coase theory-of-the-firm sense. So treat them that way from an IT perspective.
>But this seems like a classic case where microservices are needed.
I don't think that will help. The fundamental problem is the amount of work necessary to begin producing results. Take the POS talking directly to the DB. What you want to build is a unified API, but to do that you first need to write a new POS system and replace a bunch of physical devices. The problem is how do you convince the business folks to spend years and millions on something that will save/produce the company 0 dollars?
The path to do something like what's in the article is to first spend a year or so digging into code and finding every single issue like the POS one. Then you spend years fixing all of those issues. Then you spend more years actually building what you want.
As a consulting firm, if you come in and bid $100 million and 7 years while everyone else does $30 million over 3, well, you lose. So that's what you bid. Who cares if it goes tits up? You collect your fat bonuses for 18 months and then move on to your next firm/position. The inevitable failure can be blamed on the person who takes over.
> Of course you run into issues with transactions occurring across multiple databases, but those problems are hard but solvable.
The only thing you need to do to fix this is run all the services on the same DB.
> This sounds crazy. I don't know any large companies that have successfully implemented it. This is basically arguing for a giant central database across the entire company. Good luck getting the 300 people necessary into a room and agreeing on a schema.
You don't need every service to use the same schema. You only need transactions that span all services. They can use any data schema they want. A single DB is only used for the ACID guarantees.
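A rough illustration of what that buys you, using in-memory SQLite purely as a stand-in for the shared database (the table and service names are made up):

    # Two "services" own their own tables, but because they share one database,
    # a single ACID transaction can span both of them; no sagas or two-phase commit needed.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders_service  (id TEXT PRIMARY KEY, status TEXT)")
    db.execute("CREATE TABLE billing_service (order_id TEXT, amount REAL)")

    try:
        with db:  # one transaction across both services' tables
            db.execute("INSERT INTO orders_service VALUES ('ord-1', 'confirmed')")
            db.execute("INSERT INTO billing_service VALUES ('ord-1', 99.0)")
    except sqlite3.Error:
        pass  # on failure both inserts roll back together; no half-created order

    print(db.execute("SELECT * FROM orders_service").fetchall())

What you give up is independent scaling and hard isolation at the database level, which is the trade-off the parent comments are arguing about.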
I think his paragraph on outsourcing and "core competencies" is spot on. I would take it a bit further, however, since another consequence of this is IT trying to buy every solution from an external software vendor (Oracle, SAP, MuleSoft, etc.). The word "custom" becomes a "bad word" and IT uses it whenever there is an effort to build software internally. Unfortunately, very often external vendor solutions are not up to the task in big corporations. There is also the issue that each vendor implements its own database, and very quickly in IT you spend most of your time "integrating" solutions with each other, especially writes, as the article author points out.
The issue of trust is also real, but I noticed that if you build a strong team with a strong "internal brand" you can drive a lot of the execution and technical decision making from there (the word used for these teams in the enterprise is "CoEs", but I mean something a bit more substantial by that than what you usually read around). The thing is that in reality your CEO doesn't care about your technical implementation details. This is a blessing and a curse. The blessing part is that if you offer a team with a brand and a vision that has collected some internal trust from multiple parties over a couple of years, you might be able to make your own technical decisions independently from IT or other departments around the world.
As for the technical part of this, I am not an expert, but security, RBAC and friends are hard problems, I'd be surprised if you find a vendor that can do this for all your edge cases. I'd be for a fully internally built solution. But again, no expert.
"Core competencies" is a widely misunderstood term. Lots of people equate it to "business model", as in "we sell widgets so therefore selling widgets is our core competence".
A thing is a core competence if, and only if:
* It makes a difference to your customers.
* It is difficult for your competitors to replicate.
* It provides access to a wide range of markets.
Janitorial services, as the OP says, do not tick any of these boxes (unless you are a restaurant or something, in which case it ticks the first). Black and Decker's core competence is not selling cheap hand tools, it is making small electric motors. Think about it. https://en.wikipedia.org/wiki/Core_competency
IT frequently ticks all three: if your IT goes down your customers may know about it before you do. Your IT is going to be difficult for competitors to replicate (as long as you haven't outsourced the whole thing), and you can use that IT in lots of different markets.
However there are too many senior managers who have heard the words "core competence", not read the article, and assumed that since they are not running an IT company it follows that IT is not a core competence.
Check out Wardley mapping - interesting technique to get teams discussing what is a commodity part of their business and what is something custom that really benefits them to do in-house. I agree people often judge this wrong, and it's an important point to raise.
I was thinking of Hertz, but the corporate history doesn’t align. The post said the company had about 100 years of history, but Hertz was founded in 1958.
Isn't the "she" here refering to a potential new CEO? I assumed it was akin to the style of academic writing where you use she instead of he to refer to a person whose gender is unknown in order to fight gender stereotype or something similar (example: "the developer, when she sees fit, can[...]")
Is this really a style though? I always assumed it was due to writers being non-native english speakers, and applying their primary language's noun gendering to english (In my case, in French all nouns are gendered)
The article I think misses reasons for keeping the different business units separate (assuming they are indeed BUs). They can act as the smaller “agile” companies in their own markets that the author seems to say can get a lot more done. Who cares if there’s a lot of duplication. There’s a huge human reason to keep these separate: autonomy. Nothing sucks more than being stuck in org bureaucracy for years never solving a customer problem, watching the smaller poorly funded competition do better. You’ll never retain talent doing that, which frankly is the real issue large companies struggle with. And you’ll never satisfy customers when reuse and not quickly solving customer problems is the priority.
Consolidation and “reuse” is often the problem, not the solution.
The (latest) CEO calls to "unify" (restructure - yet again) the company. That vague call results in the age-old purchase of a "technology" to solve the problem from a vendor - MuleSoft.
An "API" is the answer they are told they need - easy. Millions of dollars later after much time has past, the project is of course in trouble.
The author is asked if he knows anything about APIs. He says sure, they're easy. Just expose the database through an API. How hard can it be.
My prediction (guarantee) is that a lot more time and money will go down the drain before this API effort is finally abandoned. The issue is the chain of naive and poor decisions being made, which are exacerbated by the advice of the author.
> which are exacerbated by the advice of the author.
Really? what advice is being given here that would exacerbate anything? From what I see, the author is carefully enumerating the problems which need to be carefully addressed before this kind of change can be made.
He is deliberately holding up the 'You just need an API, how hard can it be?' perspective as a naive one and looking at why it isn't that simple.
I found this an interesting and useful argument, particularly the element about trust and loss of control.
From experience working on these sorts of sites: no, it's not. How you structure a large worldwide car hire site is a known thing now; it's not rocket science.
In the past I have worked on one of the big car hire sites, and in my opinion Accenture and Hertz were just incompetent.
Probably non-culpable incompetence for Hertz; Accenture, not so much.
Cases like these are either solved by giving someone absolute power to steer the project, resulting in squished toes and hurt feelings and potentially becoming an HBS postmortem, or accepting that it's going to take a while to reconcile all of the different fiefdoms.
There seems to be a lack of translation to "what should actually be built." Management wants a unified API, but my first (and probably incorrect) idea would be to develop an ORM + set of functional libraries which can be implemented independently by each of the involved parties. That has its own risks, but getting everyone speaking the same language is a good place to start.
This is an enlightening article, but I think it's missing an implicit conclusion to accompany the explicit conclusion that "one is constantly fighting against history".
History is absolutely one part of it, but the other part is that there must be a benevolent dictator (person or small group/council) with the authority and courage to make technology decisions. Without that role, bikeshedding can drain valuable momentum, sometimes fatally, from the decision-making process.
My experience is that the actual power wielded by a high level executive is usually inversely proportional to the size of the organization they manage. Such executives are typically more effective at guiding strategic direction, not prescribing tactics.
Large, mature organizations have a lot of inertia, and that inertia is difficult to overcome except by perhaps the most charismatic individuals. I’ve rarely seen a leader dictate or bully their way to effective, positive change. Resistance is extremely likely for various reasons, and it’s not likely to be in the form of vocal mutiny. Instead, it typically appears in the form of excuses or lethargy. A waiting game is played, usually with success, until the musical chairs shift the management once again.
The scale of the story was intriguing, but it feels like a murder mystery novel missing the last chapter. What did the company finally do to solve the problem? :)
Tef's "How do you cut a monolith in half?" is relevant:
I’m starting to be of the opinion that large companies are impossible to ‘rescue’. Unless you count sustained change over 20 years as a rescue.
I swear, there are so many exceptions and special cases embedded in a large company that it's impossible to replace any one product without also overhauling the rest. Oh, and overhauling customer expectations, because if they're used to a static website that takes 10s for each page load, then whatever replaces it had better take at least 10s to load.
You're right. But then everyone who wants their systems worked on starts attributing "business value" to each change request. So now you have a huge backlog of changes with made-up points, and each manager says their change is more important than the other's.
The only way to get any traction is to get VP level people involved to prioritize which systems to work on. But some VP level people might see you reaching out to them as incompetence - "you can't prioritize your own work?! What did I hire you for!", etc.
At the end of the day it's really about the people in the org and how altruistic they are when considering the constraints of the IT dept.
> Just the pieces that are currently bottlenecking the org's mission.
The problem is that all pieces are interdependent. So you cannot just replace only a single part (of course that’s not always true, but often enough that it might as well be the rule).
The fun, I find, is that there are so many touch points that fixing or replacing gets bogged down in just identifying them. Then you are up against each touch point's budget and hours of availability to adjust to the changes you have set forth.
One solution I have seen for something similar to the story's problem was to adopt an EDI-like approach: there was one group whose entire job was to transform data from group to group, and each group in turn had to meet the interchange formats. That can be a political firestorm in itself, because the other truth of big companies is that there are many fiefdoms within them and some have a lot of pull.
Unfortunately getting the actual requirements after the 10 layers of indirection in the enterprise means you just hear the Product Owner say that the requirement is the page load should take 10 seconds.
By that point any justification has been long lost.
Maybe they wrote an Excel spreadsheet that imports data by scraping the webpage using VBA (don't laugh, it can happen) and they relied on that 10 seconds as part of their VBA code?
I've already had the case of clients complaining when the load time was greatly reduced, because they felt the only way it could have been so fast is that the system didn't do as much as before. Which was true in a way (the best way to be fast is to do as little as possible) but the results were correct so nothing was wrong.
I think maybe the problem is that external sites are owned by old-school marketing types who do not have the knowledge and skills required for the 21st century.
For example, the botched SEO changes ASOS made recently actually made the front pages of the financial press in the UK.
This is my personal opinion and does not imply any connection with my employer.
Global changes like this are near impossible, but most of the work is for the local internal customer, which is very possible and rewarding. Large projects are normally won or lost on the strength of the upper management; it is normally not a technical issue that sinks large projects in large companies.
This blog post is a better case study than any HBR or consulting firm study I've read, because the author has a grasp of the technology, governance, and economics of the problem. Purely economic or political models of organizations assume you can just reform one to align with incentives, but tech is a physical limitation; it's the new inescapable geography.
The companies and institutions I've encountered all essentially say, "We want success but without change," and then wonder why their initiatives fail. The point about "Agile" being a euphemism for "trust," is huge. Startups can be farcically naive (or cynical to the point of evil) about how trust in organizations works.
C&C enterprise structures don't scale, and they become immensely vulnerable to challengers as a result. This article reminded me of the opportunity that large companies create, like stored potential energy in the form of opportunity costs. It's like a kingdom spread so widely that a small raiding army can feed itself on what these companies left undefended while planning campaigns.
How much of this article is true for the rest of the Fortune 500?
In some sense, departments in a large company develop an institutional resistance to IT centralization efforts as a defense against the inevitable next reorg. Or rather, departments that don’t have this resistance didn’t survive the last one.
Though I wonder, if large companies are routinely this inefficient (which matches my limited experience as well), how do they survive? Naively put, why isn’t every large company killed by a startup next week?
> if large companies are routinely this inefficient (which matches my limited experience as well), how do they survive?
Because every other large company is equally inefficient. When it's par for the course, it's not seen as an issue (or even seen at all). Efficiency is also usually not the metric that gets optimized towards - other dimensions such as predictability, reliability, longevity, consistency, and stability tend to take precedence. Take the loyalty program vendor mentioned in the article - sure, it's limiting their flexibility for what they want to do now, but it still exists. So depending on your viewpoint, that was a pretty solid choice of vendor.
Also, large companies have the scale to absorb the costs of their inefficiency with fairly minimal impact on their unit economics. And if a startup does pop up and gain traction with a much more efficient operational model, BigCo can write whatever check is necessary to gobble them up before it becomes a risk.
Information is very poorly distributed. It's hard to overstate how bad this is. Yes, that's contrary to what's required for a well functioning market. That markets function even a tiny fraction as well as they're supposed to on paper is practically a miracle, given, you know, reality. That they often wreck everything or are much more awful than we might expect or do weird crap like letting large orgs be astoundingly inefficient is unsurprising, given this. This is relevant because it means it's damn hard to evaluate a vendor aside from "they're big and everyone's using them so they're probably fine?" So big companies get big contracts, even when they suck, especially from other big companies (and governments).
Making a big organization work well probably requires a bunch of highly-paid people to take on personal responsibility and make judgement calls where the buck stops with them. No-one wants to do this. It's personally risky and there's a kind of game everyone plays where they know everyone's avoiding this and it's considered fine as long as you fake Doing Manager Stuff and bring the Big Four in when you can't avoid making a decision, so you can blame them if something goes wrong. The same attitude infects the entire organization, unavoidably. If the big contracts are rolling in anyway (see above) there's little incentive to take personal risks.
It's an example of success breeding success, purely for its own sake.
A ‘tech first’ approach (create an API) will not make this company agile. The question that the CEO of SuperRentalCorp needs to answer is ‘do we want to become a tech company?’ If yes, then start a multi-year transformation, beginning with defining tech career ladders, a strategy for how to train internals and hire externals, and how to part ways with those internals who are unable to follow. Be clear and offer an opportunity for those that want to stay, and be generous to those that can't. If done correctly you will have the necessary basis for beginning a tech transformation in a couple of years.
If the CEO does not want to be a tech company then I'd investigate options for spinning out a service company to serve the rental market, ideally in collaboration with some competitors.
Excellent article. I’ve experienced a few of these issues at two vastly different companies (one had 60k employees, another only had 200). I implore developers to avoid working at companies with this class of problems. Work for a company that is strongly customer focused instead.
But you can have a company that is strongly customer-focussed that still faces this class of problem, surely? It would seem to me that solving these problems at a large company where there actually is trust and esprit de corps to get it solved could be tremendously satisfying.
I used to work with an old IBMer from the MQ team. He'd also worked at DEC and the BBC before joining my team at Chase in the late 90s. He used to joke that it was no accident that IBM almost collapsed at the same time as the Soviet Union as they were very similar organisations!
Interesting article, and it points out real problems that large organizations face. It made me think of the issues I've seen dealing with government contracts in the US. Policies can create perverse incentives and have unintended consequences regardless of their intention, and when dealing with large organizations those sorts of things have to be considered as they end up steering the ship.
Near the end of the article the author mentions that 5 people can look each other in the eye and trust each other, while 11,000 can't. This suggested a question to me: Where does that stop? How many people do you have to have before a 'local culture' isn't enough and trust breaks down? I've read things in the past about Dunbar's Number which suggest this number is around 110. That seems to be about the number of people that individuals can 'keep in mind'. I have often wondered whether organizations might benefit from taking that limit into account and structuring things so that no level or group would be permitted to grow beyond it. To grow larger, build a hierarchy where groups of about 100 choose a representative whose task is to know the concerns of everyone in the group and represent those concerns when discussing with reps from other groups, etc.
Isn't the issue with the "primitive" customer loyalty company a classic case for abstraction? They should build an ORM-like internal API to deal with the loyalty program.
It can send emails or even order a person to make a phone call if necessary. Whatever "primitive" means, the "invented-here" API can abstract that functionality away, so that the internal API devs can access and update the data from the external company.
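Something like this hypothetical adapter is what I have in mind (the vendor's email-only interface, the method names, and the stub are all assumptions made up for the sketch):

    # Hypothetical "invented-here" adapter: internal callers see one loyalty API;
    # the primitive vendor, and its email/phone-call fallbacks, hide behind it.
    class LoyaltyAdapter:
        def __init__(self, email_gateway, ops_queue):
            self.email_gateway = email_gateway  # the only channel the vendor supports
            self.ops_queue = ops_queue          # manual tasks for a human to work through

        def add_points(self, member_id: str, points: int) -> None:
            # No write API on the vendor side, so "writing" means emailing a request.
            self.email_gateway.send(
                to="support@loyalty-vendor.example",
                subject=f"Please add {points} points to member {member_id}",
            )

        def merge_accounts(self, member_a: str, member_b: str) -> None:
            # Some operations the vendor only handles by phone: queue a task for a person.
            self.ops_queue.append(f"Call vendor: merge {member_a} into {member_b}")

    class StubEmail:  # stand-in so the sketch runs on its own
        def send(self, to, subject):
            print(f"EMAIL to {to}: {subject}")

    loyalty = LoyaltyAdapter(StubEmail(), ops_queue=[])
    loyalty.add_points("member-7", 500)

Internal callers code against add_points and merge_accounts; whether the vendor improves or gets replaced later, only the adapter has to change.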
>To a large extent “be agile” is almost synonymous with “trust each other.” If you’re wondering why large companies have trouble being agile, it is partly because it is impossible for 11,000 people to trust each other the way 5 people can.
I wish the author had explained this more. What is it about agile that requires more trust than anything else? Can't you write lies in any format equally easily?
MBAs and other "professional managers": people in positions of power who have either never done any of the company's actual business or tasks, or haven't done so in more than 5 years.
I work for Pivotal and I feel I have some relevant experiences (from which the following personal opinions are derived). I started in Labs, our consulting division, known for being vociferous about XP, Lean product management and User-Centred Design. These days I work in R&D. The latter has been very educational, as we've been involved in building larger and larger systems, with lots of "oh that's what our customers meant when they talked about problem-of-scale XYZ". I know that my own temptation as a Labs Pivot was to imply that customers just weren't agile-ing hard enough; R&D has kicked a bit of that stuffing out of me. As Brooks argued in No Silver Bullet, there is essential complexity and accidental complexity. Enterprises deal with amounts of both that are hard to comprehend. You get some fascinating problems and pathologies at Enterprise scale. As the article implies, nobody ever really intends to create them; they emerge from the system's evolution.
I think the core logic of outsourcing is sound, but it's like many simple ideas: simple, but not easy. It's actually a restatement of economics 101 -- comparative advantage and gains from trade. Even if you are strictly better at doing everything yourself, it still makes sense to trade with others. At Pivotal we often discuss this as "the value line", the point in the stack where, we argue, it is more effective to delegate your technology problems to us (or -- this is how I feel personally -- to a comparable vendor, like Red Hat, over pure DIY) and to turn your own technology efforts towards the things that add value for your customers. Even if you could do a better job than any of the vendors, it might not make sense for you to do so.
That value line isn't static, of course, and it's differently located for each customer. And it remains necessary to retain enough inhouse expertise to sensibly assess vendor solutions. Enterprise software and software development is a complex "experience good". This isn't limited to software, it's true of many complex purchases -- I have a book in my collection called Industrial Megaprojects which contains a fascinating parade of such examples.
All that said, transformation is possible in my experience. The key is, as the article describes, that it takes a long time, a lot of work and a lot of money. It most of all requires enough time for cultural norms to resettle and for trust to solidify. I've worked with companies operated on a fear basis and basically, the best we could do was to create insulated bubbles of agility. We've worked with other companies which have more or less transformed themselves once we brought a spark. Sometimes we will be the true vanguard of a revolution, sometimes we will be the fad-of-the-week that the hardened veterans will shrug and mouth the slogans for while continuing with the actual way things get done. Sometimes, because many of our customers are so vast, we will be heroes to one division and villains in another.
The best that I and my peers can do is to help each client as honestly, empathetically and fully as we can to transform themselves. But, as the article points out, that is hard.
Large old software companies have hilariously deep layers of abstraction and tooling on top of legacy code from people who have long since left the company. Usually getting anything done requires gathering word-of-mouth arcana from multiple people.
Yep, some of the people leaving the company are developers who suffered a burnout crisis inside the 'Big Company'. :p
I'm one of those professional developers leaving 'Big Companies' due to burnout.
I think that 'Big Companies' need to deal with Ethics before trying to deal with Agile/Scrum. xD
> In terms of the best integration architecture, what seems to me the only long-term solution is something like the unified log architecture that Jay Kreps wrote about back in 2013. All incoming writes need to go into a centralized log, such as Kafka, and then from there the various databases can pull what they need, with each team making its own decisions about what it needs from that central log.
Then, from the linked Jay Kreps post, 2013 [0]:
> In order to allow horizontal scaling we chop up our log into partitions. Each partition is a totally ordered log, but there is no global ordering between partitions (other than perhaps some wall-clock time you might include in your messages). The assignment of the messages to a particular partition is controllable by the writer, with most users choosing to partition by some kind of key (e.g. user id). Partitioning allows log appends to occur without co-ordination between shards and allows the throughput of the system to scale linearly with the Kafka cluster size.
In practice, wouldn't the end "centralized log" be essentially a collection of the logs from each original/logical database, now managed by a specialized log team? To read the log, the interested consumer team would have to create the appropriate read-only replica of the original database, using the respective database technology / binaries / schemas. As such, isn't the overall system better described as "database replica as a service", as opposed to the finer-grained implication of "unified log as a service" from the article / Jay Kreps post?
I'm trying to figure out if the implication of such decisions are organizational, "who's responsible for the logs, who's responsible for the replicas", vs. architectural, "where does data bit X go, how does it maintain consistency with data bit Y, how do we read its current state?".
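For what it's worth, here is the shape I read into the Kreps proposal, sketched with an in-memory list standing in for the Kafka topic (so partitions, offsets, brokers, and the event names are all my own simplifications, not anything from the article):

    # In-memory stand-in for the unified-log idea: every write is appended to one
    # log, and each team's consumer pulls only what it needs into its own store.
    unified_log = []  # stand-in for a Kafka topic; real partitioning is ignored here

    def publish(event_type, payload):
        unified_log.append({"type": event_type, "payload": payload})

    class TeamConsumer:
        """Each team decides what it reads from the central log and materializes
        its own local view (its 'read-only replica', in the terms above)."""
        def __init__(self, interesting_types):
            self.interesting_types = interesting_types
            self.local_view = []
            self.offset = 0

        def poll(self):
            for event in unified_log[self.offset:]:
                if event["type"] in self.interesting_types:
                    self.local_view.append(event["payload"])
            self.offset = len(unified_log)

    publish("reservation.created", {"id": "r1", "region": "eu"})
    publish("loyalty.points_added", {"member": "m7", "points": 500})

    reservations_team = TeamConsumer({"reservation.created"})
    reservations_team.poll()
    print(reservations_team.local_view)  # only the reservation events

Whether you then call the materialized side "replica as a service" or the whole thing "unified log as a service" seems to me mostly the organizational question: who owns the log versus who owns each view.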