I worked at a startup in the very early 2000s that was making online games in Java. Think applets with 2D sprite graphics.
They developed their own custom database, a shitty key value store that didn’t even have transactions because “Oracle doesn’t scale and it’s too expensive”. They couldn’t make reporting work so they copied the data hourly to Oracle anyway. (It turned out that Oracle does scale.)
They also invented their own RPC protocol, had their own threading library, used an obscure build system, an even more obscure SCM, etc…
Nothing was standard. It was all bespoke, webscale, and in-house.
When I started there as a junior they had me fix bugs in their code. Instead, I would simply delete thousands of lines of spaghetti and replace it with a call to a standard library. It was like spitting into a volcano to put out the fires of hell itself. There was no hope.
They burned $10M and made $250K revenue and collapsed.
Their competition used Oracle, ordinary tooling, and made tens of millions in profit.
As a counterpoint, I interned at Jane Street about a decade ago—when they were much smaller than now—and they also built everything in-house. I'm not just talking about the trading/finance-specific systems (I didn't interact with those as an intern), but general-purpose internal tools: a code review system, a pub-sub system, custom binary and text protocols, lots of developer tooling, their own build system, their own standard library and async runtime... All with <100 developers total.
And you know what? It worked amazingly well. I'm not just talking about scale, but also about how much any single developer seemed to be able to do. Their tools and APIs were tastefully designed, tailored to their own needs and fit together remarkably well. I remember that they were consistently building things with just one or a handful of people that would have taken multiple teams at other organizations I've seen.
Looking back, that was a remarkably productive engineering environment because of how much they were willing to do themselves. The key insight is that building something for yourself is qualitatively different from building something for external consumption or using something general-purpose that already exists. You naturally tailor the code and tools to what you need, which is much easier than building something for what you think other people might need—and, surprisingly often, easier than bending a different tool to your own ends.
But, perhaps more importantly, you develop a shared mental model of the system you're developing and how it fits in with everything else that you are doing. That shared mental model is, ultimately, worth far more than the code by itself... but it isn't legible to management, so it's hard to foster and preserve without a strong engineering culture backing it up.
The other lesson I took away from that experience (and others!) is that the high-level questions people focus on ("buy vs build", "monorepo vs polyrepo"... etc) are nowhere near determinative. Whether or not any given approach works out comes down far more to the little details and nuances of what you're doing and how you're doing it, as well as a healthy dollop of circumstances beyond your immediate control.
At the end of the day, if everything else was the same, but the startup you're talking about was using perfectly standard and popular tools, would you expect the outcome to be any different? I've seen far too many Enterprise-grade trash-fires to believe that. If the core dynamics that led to a mess don't change, you can have a "not invented here" mess or an "Enterprise Best Practices" mess or a "buy everything, build nothing" mess—but there's no simple strategic direction that won't get you some kind of mess.
Yeah, but it's Jane Street. They've got the money and talent to do it right and they've got a very strong dev culture that esteems correctness.
And they're in a market where a tiny competitive edge can reap large rewards.
They're rather the exception that proves the rule.
I'm sure all their competitors do the same for the same reasons. But for companies that view software primarily as a cost centre, I can see it going horribly wrong.
>I'm sure all their competitors do the same for the same reasons.
Trading firms tend to build a lot of their own tech, but Jane Street is special in that they build almost everything in-house. OCaml is practically their own language. No other competitor comes close to this level of owning the entire ecosystem, including programming tools.
> shared mental model of the system you're developing and how it fits in with everything else that you are doing. That shared mental model is, ultimately, worth far more than the code by itself... but it isn't legible to management, so it's hard to foster and preserve without a strong engineering culture backing it up.
Sounds awesome, did they have engineering leaders with long tenure?
Jane Street started as and remains a proprietary trading firm. They added consulting services at some point as well, but that was after they proved their competence.
For both parent and GP comments, would this not simply be explained by the quality of the engineers and engineering culture working on the tooling? I'm sure that Google today are perfectly happy that they did not build on Oracle, but they also had Jeff Dean etc. to build everything they needed in-house. And meanwhile, there are others who tried to do the same and failed miserably, à la GP.
I imagine it's the quality of the team that matters, not the factors that go into whether or not to build bespoke in-house.
GP here: there's a couple of key distinctions to be made.
Jane Street is operating in a market where there are few off-the-shelf tools, the tools are your competitive edge, and hence it makes sense to design your own. Similar-but-different tools won't work, at all.
The place I worked at was doing the same thing everybody else was. They were making a pre-digital business digital, just like everybody else in the 2000 dot-com boom. All the same off-the-shelf solutions applied. The secret sauce was the niche market, legal loopholes, and first-mover advantage.
The custom DB vs Oracle thing is a great example. The startup saved no money by developing a custom system, because they had to license Oracle anyway. They gained no scalability advantage, because Oracle scaled to their required level just fine. There's no edge there to be gained over the competition. Worse still, they wasted 3+ years developing their own system, which gave their competition a head start.
Goldman Sachs are famously the same. Proprietary language (Slang), proprietary object database (SecDB), and then a million and one proprietary apps on top of that where a commercial or open-source alternative exists. It may have moved on since, but it seemed to be worth the investment in the early 2000s when I worked there.
Yeah. One of my friends and coworkers started his career at GS around that same time and had some (very!) positive stories about both the tools and the people building them. It sounded like the combination of Slang and SecDB was a real force multiplier for building and evaluating the kind of complex mathematical models he was working on. That's the kind of tool development that shows off just how high-leverage engineering work can be.
My (100% outsider's) impression is that the focus, culture and teams at Goldman Sachs are totally different now thanks to market and regulatory changes, and that it is not a good place to work in a primarily technical sort of role any more.
Big surprise, the old adage "it's all about the people" rings true.
I'd guess this is also an area where the Dunning–Kruger effect and misaligned developer incentives are involved. Inexperienced teams are more likely to take on reinventing the wheel. When you're starry-eyed it can sound fun, plus it's a great opportunity to do some resume building.
But as folks become more experienced, they can end up salty and disheartened. If you've been around the block a few times writing your own database might no longer sound so exciting. Plus life has become comfortable, you've mastered the tools and aren't really all that interested in taking risks so it's easier to just recommend IBM or Oracle.
Neither scenario is really all that great.
Maybe Jane Street and their ilk are just better at rewarding risk while simultaneously providing strong mentorship opportunities (I see the intern projects).
> they were consistently building things with just one or a handful of people that would have taken multiple teams at other organizations I've seen.
> you develop a shared mental model of the system you're developing
How much of this was due to using OCaml?
Despite some language-specific quirks, trying Elm has almost convinced me that dynamic typing is a mistake for any parts of a project which don't need to be used from a REPL context.
I like to think their choice of language definitely helped, but I'm also pretty biased :)
In particular, I've found that having an expressive type system makes it easier to develop and communicate this kind of mental model, as well as to maintain a close mapping between the concepts you use to understand the system and the code itself. I'm convinced that this is a powerful approach that can be incredibly effective and is hard to reproduce in "normal" languages, but it's hard to state this with much confidence since it's entirely based on my own qualitative experience with programming.
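As a hypothetical illustration of that concept-to-code mapping (the domain, type, and function names here are invented for this comment, not anything from Jane Street): an OCaml variant type can encode exactly which data is meaningful in which state, so the compiler enforces the team's shared mental model rather than leaving it in people's heads.

```ocaml
(* Hypothetical example: each state of an order carries only the
   fields that make sense in that state, so illegal combinations
   (e.g. a filled order with no fill price) cannot be constructed. *)
type order =
  | Pending   of { symbol : string; qty : int }
  | Filled    of { symbol : string; qty : int; price : float }
  | Cancelled of { symbol : string; reason : string }

let describe = function
  | Pending { symbol; qty } ->
      Printf.sprintf "pending %d %s" qty symbol
  | Filled { symbol; qty; price } ->
      Printf.sprintf "filled %d %s @ %.2f" qty symbol price
  | Cancelled { symbol; reason } ->
      Printf.sprintf "cancelled %s (%s)" symbol reason

let () =
  print_endline (describe (Filled { symbol = "ACME"; qty = 100; price = 12.34 }))
```

If a fourth state is added later, every `match` that needs updating becomes a compile error, which is one concrete way the types keep the code and the shared mental model in sync.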
This tracks with how I've seen "normal" languages converge on similar, flawed imitations of better type systems through tools and repurposed syntax. Thank you for confirming.
Do you have any recommendations or warnings regarding general languages which reach in the opposite direction? Reason[1] and F#[2] are both examples: they attach pre-existing ecosystems and compile-for-$PLATFORM tools to OCaml-like typing.
OCaml itself is also intriguing. However, I'm concerned that suggesting it for non-personal projects will go over poorly. The "GPL" in its standard library's LGPL license may scare people despite both the linking exception and Jane Street's MIT alternative.
This comment and its parent are really great. I see both of these viewpoints all the time, including in the analogous debates you mentioned ("monolith vs microservices" and the like). It really seems like the specific approach doesn't matter as much as the skill and motivation of the team.
> Whether or not any given approach works out comes down far more to the little details and nuances of what you're doing and how you're doing it, as well as a healthy dollop of circumstances beyond your immediate control.
This is the most important thing I've learned working for 15 years in tiny SV startups as well as the most successful company in the history of capitalism.
The main problem is you can't really teach what all those little details and nuances are (the tao that can be told...). You have to live it and experience it. Over time, you meet people who Get It (the founder of the latter company called them "A players"), and people who don't.
All you can do is focus on doing great work with the people who Get It, cutting those who clearly won't/don't, and investing mentorship in young people who seem like they're on a path to it.
The most successful companies I have worked for in my 30+ years career built most of their tools internally. It gave them a huge competitive advantage.
The worst companies I have ever worked for didn’t have the skills to do it.
So yet again it depends on the quality of your developers. Not how you do things.
The last startup I worked at spent a lot of time and money on creating a scalable cloud infrastructure, and thought every single thing we needed had to be written from scratch. And then rewritten to be nicer. But no market fit was found, or even an actual product at the end of the day.
The last one I worked at was like this, due to its founder being 19yo and extremely inexperienced.
Raised 6M, no customers, no revenue, no product, a buggy, non-functional MVP.
19yo founder was terrified to relinquish control to senior devs because he was scared they would implement something he wouldn’t understand, so the company sat there burning cash making tiny incremental improvements and flip flopping on priorities as the new-shiny whims of the founder changed every few days.
I have no idea how he raised 6M; I think it was a combination of lies and FOMO from later investors.
I tried my best to guide him and give advice, but his ego was out of control (backed by only a mediocre skill set), so I had to leave when I realised I couldn’t help anymore.
My company raised a small round a few years in: enough to keep us engineers comfortable, but nowhere near 6 million. Meanwhile, my cofounders have years of experience in financial analysis and running a restaurant, while I had 10 years of experience as an engineer before stepping in here as CTO and first technical hire. We actually do have traction and are getting new customers every day off a product that was self-funded. I can't understand investing 6 million in a 19-year-old with no experience and an ego fueled by insecurity. A startup needs to be run like a phalanx, because there isn't room for redundancy and dead weight, and it definitely can't afford to stick to an ego-fueled "vision." I trust my cofounders to do their job and they in turn trust me. The "vision" should always be about serving the users, and that's pretty easy to quantify if you are willing to humble yourself and engage with them.
The good news is we will have more favorable terms when we do a Series A. The good thing about starting a startup in a recession is that you're more accountable for making your business actually be something people want.
A few years ago, if some angel or 'friend of family' had a few hundred $k sitting around, they were getting 0 return on it. Investing in speculative startup ideas was at least a promise of a return. Now those same startup ideas are going to be competing against 5% no risk return. Some of the ideas like the one referenced above may not see $6m-level funding activity again because of higher rates now.
> A few years ago… a few hundred $k sitting around… getting 0 return on it.
People say this with poorly invested money all the time, but I must be missing something. You could invest it in an index fund, say the S&P 500, and annually yield over 10% on average [1].
Until recently, that was less than inflation. A year ago the inflation rate of the US dollar was more than 8%, so the net result of a 5% interest rate was -3%: less than in most of the period before 2021, even assuming an interest rate of 0% back then. Surely investors are aware of this, aren't they?
Certain schools are charging people north of 100k in tuition just because they can. Even at the ones that don't you see students driving six figure cars somewhat bizarrely regularly. It's not hard to imagine finding six people whose parents can scrounge up a million in financing, especially if you have sold the kid on your idea and they seem passionate about it as their first big investment.
It's better without a plan. Plans have risks and foreseeable problems. Handwavey dreams are all upside and no downside.
Of course you gotta find the guys with more money than sense. They're not exactly thick on the ground but during a boom cycle there seem to be quite a few. (Often the winners of the last boom cycle, convinced that they were smart rather than lucky.)
It also boggles my mind how these investors give money without doing ANY technical due diligence. This company claimed all of its value was in IP (i.e. source code) and it took me 15 minutes of reading the code to see there was nothing unique there, and deep down inside the core was just a bunch of //TODO comments.
The same is true for operations: if your business scales using humans, you should avoid complicated tooling until you have product-market fit. An old CEO curbed my enthusiasm by telling me “businesses are built on Excel”, and he was right.
I've spent a lot of time cleaning up the messes created by Experts like the article's author and I'm simply not convinced. It's easy to say "product market fit is King!!!" when you aren't the one working long nights and weekends reverse engineering code after a bug in pmf-hunting payment processing code made all purchases free for a day.
There is a difference between "table-stakes" complexity and premature complexity. I'd argue that a simple but sane CI / deployment pipeline takes relatively little work to set up and falls under "table-stakes" in that even a pre-pmf team will have a positive ROI in doing it.
On the flip side I have been the one working long nights and weekends reverse engineering code by engineers who prematurely built complexity into the system because they wanted to add a GraphQL api in addition to a rest API. All while in the pre-pmf days, with no value-add to the features that ultimately DID find pmf.
I do generally believe that cleaning-up after the pmf-hunting phase is itself a privilege that many startups do not get to experience, and should be treated as such. I understood the author as arguing that we shouldn't chase shiny things and should ruthlessly avoid complexity in favor of finding pmf. This philosophy is clearly illustrated in the devtools startup he is running. I thought there were some cool ideas there.
I simply reject the premise that all problems a startup needs to solve are original problems. Your customers have lots of ordinary problems too, as do you. Sure, you can't justify spending months on building some custom GraphQL infrastructure or the perfect CI/CD deployment system, but your customers do care about things like "when I download and install this software it's not a corrupt build" and "the software's updater works" and "when I pay these people money for their software I get what I paid for and don't get double-billed". These are all unoriginal problems that are nontrivial - ideally your startup solves them with off-the-shelf solutions to save time, but you still spend engineer hours integrating those solutions.
It's interesting to see how most of the comments here are either "no, this is how lots of startups end up with a mess they have to clean up" or "yes, lots of startups optimize this stuff too early".
And I think that's because both things are true! This is one of the many hard parts about starting a brand new company, figuring out the right balance to strike on this. It's no surprise that companies mostly get it wrong, and in both directions.
There's no single right answer here. It depends on exactly what the company does, the exact path to product market fit, what growth looks like afterwards, and how lucky the guesses about all that stuff were.
This is basically what I was trying to say as well. The trade-off I'm talking about is not "off the shelf and boring" vs. "bespoke and exciting", it's more nuanced than that. The decision companies have to make is more like "what is worth our time and investment and what isn't?". That "time and investment" trade-off may involve custom code vs. off-the-shelf solutions, or it may be between a managed psql that doesn't provide a useful configuration vs. self-hosting and maintaining that configuration yourself, or any number of other build-vs-buy and expedience-vs-longevity decisions.
My point is just that it's very case specific, and you can easily guess wrong in either direction.
Are there companies that die because they produced something people loved and wanted to buy but just couldn't deliver because off the shelf components were just too subpar? I haven't seen that case. I have seen companies die trying to perfect software no one is buying though, quite often.
Of course companies have died because of quality issues. Framing it as "they produce something people love but the quality is subpar" is a false framing.
There are no subpar products that people love. First of all, "quality" is relative. See ChatGPT: it's wrong half the time, and if someone were to release a chatbot of ChatGPT quality in a decade, we would say it's terrible. But today, it's the best we have.
The classic story is how Airbnb and Stripe launched without coding anything; everything was done manually.
Now try launching an Airbnb competitor today using the same strategy. Obviously comparing yourself to Airbnb is dumb, because back then all software was terrible.
The actually successful modern companies of the past few years are OpenAI, TikTok, and Figma. They all launched with complete products and are massively successful. That's what it takes today.
> I have seen companies die trying to perfect software no one is buying though, quite often.
Quality is absolute.
There were multiple LLMs released before ChatGPT, and none made any splash, including GPT-3. Meta released one about 2 weeks before ChatGPT, and had to shut it down for bad output quality. Only GPT-3.5 hit the 'wow factor' needed to go fully viral.
Until a product hits a minimum quality threshold, it's useless. Which is basically what you stated later. Most areas are now mature enough that a copy-pasted solution hits that minimum threshold (e.g., setting up a basic ecommerce site with ordering and payment). But for uncharted territory, that threshold is very real: hit it or die.
I think almost certainly, yes? But I think it would require a whole research project to gather persuasive data on this, in both directions.
But my intuition is that failing to find product market fit (for whatever reason) kills companies earlier, whereas hitting product market fit with a subpar engineering foundation is more likely to slow companies down in later stages where they die more slowly or just underachieve.
I think Twitter is probably the best known example of that second pattern. It may be apocryphal that this was their problem, but either way, I think this is a real phenomenon in general.
Part of the difficulty is navigating two sets of incentives. Yes, you can write fast-and-loose code that only a mother could love and do plenty of things much faster than whatever way is perfectly orthodox and correct. However, to be competitive in the job hunt, people are expected to have a portfolio of clean model code that follows all the best practices. The startup could fire you at any time, and in that all-too-likely situation you'd better hope you did your due diligence and spent part of your time on your own marketability instead of devoting it all to the startup.
I believe the term for this is resume-driven development. This also leads developers to focus on learning new technologies and integrating them into their current project, whereas what the business really needs is stable, boring solutions. Of course, if developers do focus exclusively on serving the needs of the business, they’re not building up their skills for their next job, and the business could still fail for circumstances outside of their control.
There’s a software law called Tesler’s Law [1], which says that complexity cannot be eliminated; the most that can be done is to shift it from one part of the system to another. You can make a similar law about risk. As a developer, if I ship a build system (complete a project successfully), or I build something with a new technology, I have a point to put on a resume and a successful accomplishment to talk about in my next interview. It may be that this is the wrong move for the business. It may be that they need a crappy spaghetti-code abomination to achieve product-market fit in that time. If founders know what’s going on and they stipulate boring solutions, the developer has accepted the risk: they’ve completely invested in the business and are completely tied to the decisions of their founders and leadership for their success.
In a perfect world this would start a conversation about compensation for accepting or balancing risk, but in my experience this absolutely never happens because it gets political. Leaders always win because they’re better politicians than developers. It’s easy for leaders to be willfully ignorant and dismiss these concepts as too detail-oriented, and it devolves into leaders saying “we’re nice people, trust us”. But the risk is there, someone is accepting it, and developers respond by mitigating it on their end, deep in the implementation details.
We have optimized at a local maximum, but are globally suboptimal. I do not believe we’ll ever achieve something more optimal because in business climates, everyone is trying to get something for very little effort. It takes much time to earn trust, but very little time to destroy it.
> I believe the term for this is resume-driven development. This also leads developers to focus on learning new technologies and integrating them into their current project, whereas what the business really needs is stable, boring solutions.
This can also help recruiting; that's why Uber had all those technical blogs about how super-complicated and full of Scala their solutions were, when Uber's business could be run on a PC under a desk.
Also why Palantir is called Palantir. They don't actually do anything cool or spy on people, but the name is intended to help get young tech people to work on boring government work without having to pay them more.
Out of curiosity, how long do people think it takes to set up e2e testing infra and CI/CD? Because for the startup I'm building as a solo founder, it took me 2 weeks, and I really just can't imagine working any other way.
Yeah this startup stuff can feel like “don’t waste time buying a dishwasher for your cafe you just need to get coffee in people’s hands!”. Just buy the dishwasher you know you will need it. Unless the date is 1960 or something.
The trick is: learn CI/CD at your job. When you work on your startup it will be a trivial thing. I mean, you'll probably be using Vercel or similar, where it takes 120s to set up and deploy from nothing. Most of that time is npm install on your machine.
What about other things? k8s for instance?
Simple: if it is second nature to you then you can use it. If you can’t fix 99% of the issues on the spot then don’t. Pick something else.
For my space, it takes months and hundreds of thousands of dollars minimum to build new testing infra from scratch. You get to choose how many of those dollars are spent on developer salaries vs vendor solutions [0].
For my projects I use containers. My rule is that `docker build .` in the project directory must build the project and run all the necessary tests. It has worked for all the technologies I've had to use so far.
Making `docker build .` run on push is not a hard task and could probably be done in a few hours with cloud services. I spent a day writing a custom GitHub workflow to run it on our runner, but that's probably not needed for most people.
After your OCI image is built, you can push it into the registry.
I guess that counts as CI.
Now E2E tests: I don't know anything simple yet. I guess I could hack together some script which runs docker compose, and invoke that script from a GitHub push webhook. That should work for not-so-complex projects.
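To make the `docker build .` rule above concrete, here's a minimal sketch for a hypothetical Node.js project (the base images, file names, and commands are assumptions for illustration, not from the comment above). Because a failing `RUN` step aborts the build, the image only gets produced if the tests pass:

```dockerfile
# Hypothetical multi-stage build: tests run at build time,
# so `docker build .` fails whenever the tests fail.
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .

# A red test suite aborts the build right here.
RUN npm test

# The final stage ships only the app, not dev dependencies.
FROM node:20-slim
WORKDIR /app
COPY --from=build /app .
CMD ["node", "server.js"]
```

Wiring that single command into any hosted CI then amounts to "run `docker build .` on push", which is what keeps the setup cheap.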
I read the article as a general philosophy to spend more time on the product instead of premature scaling optimization. If e2e testing + ci/cd is cheap/easy and helps in your situation, then that's what you should do.
I feel the author is more criticizing the CTO who prematurely decides to rewrite the infra to handle scaling to millions of concurrent customers before they have those customers.
I view it more as a “% of resources” and “required number of nines uptime” kind of thing. If you’re a solo founder pre-PMF or even pre-customers, pausing feature work to build infra for 2 weeks is IMO often going to be excessive.
If you have a couple engineers and you spend one engineer for one sprint to get things running automated for the next quarters push, that could be a good spend.
The MVP of CI/CD is a few hours wiring up GitLab/GitHub Actions. The MVP of e2e tests is more variable, but a walking skeleton can be hours if there's no UI, days if browser testing is included.
However what makes sense for a first pass is very context-dependent.
Maybe an hour for something complex? Five minutes for a web project? Because I'm going to use GitHub Actions and Vercel and not spend time building something.
This was a hard-learned lesson for me, because I enjoy toying with build systems. If I can get a project into production from the start, I'm much more likely to come back to it and keep working on it.
I imagine it would take me that long the first time having not set it up before, but that even if you started over from scratch it’d take you much less time.
I made another comment to this effect, but I think testing and CI (not necessarily CD though) have such quick payoff times in terms of productivity that you almost always want to set them up ASAP.
> Encore automates infrastructure for seamless development, from local to your cloud.
> Develop locally with instant infrastructure, preview PRs in dedicated environments, and skip tedious Terraform with automatic infrastructure setup in your cloud.
If you thought this sounded like a sales pitch for bad engineering practices, it’s because it is.
I find it very funny that a company hawking their mutant CI “solution” talks so much about pmf when what they’re selling is pretty undifferentiated
I agreed with a ton in this article, but just wanted to comment on this paragraph because I've seen similar notions be poorly interpreted:
> For engineers, this means taking shortcuts, making do with ready-made solutions, and focusing on speed of iteration over quality of iteration. You don’t get to spend time perfecting CI/CD pipelines or setting up the latest and greatest infrastructure-as-code versioning tool. You haven’t earned that privilege yet.
I agree, you shouldn't spend much time setting up the perfect CI/CD pipeline. That said, setting up a perfectly capable CI/CD pipeline in something like GitHub Actions took me literally less than a day, and maybe another day to work out some kinks.
I think it's important to focus on the author's primary point: The thing that is most critical is optimizing your speed of iteration. But too often I've seen this devolve into "we don't have time to build a simple deployment system!" or "we don't have time to write tests!" But the problem is that the lack of one-click deployment or any tests can become a huge drag on every single iteration in the future, and I've seen it absolutely sink companies.
Yes, things can definitely be over-engineered, and these days there are so many great, proven solutions for these common problems that it's a definite red flag if you build it yourself. But don't let that be an excuse for just generally shitty "we're a startup, we don't need to do XYZ" engineering practices.
> or any tests can become a huge drag on every single iteration in the future, and I've seen it absolutely sink companies.
And on the other end, treat test performance as part of iteration speed; otherwise what was once a test suite you could run with vim-test on every little code change, in seconds, with sqlite/redislite, now requires Docker, a 10-second startup time, and 5 minutes for a full run.
It's amazing the difference in productivity in projects where the test suite is fast.
Kind of disagree with this; it seems outdated given that it keeps getting easier to implement best practices. Setting up CI/CD for a basic MVP is trivial, and making it super easy to contribute makes iterations faster than just working on "product fit".
I think the overall point of this is fine, but I'd argue that both Node.js and MongoDB have basically crossed over into the "boring" realm at this point. I see Node.js used in lots of megacorporations, and I feel Mongo is more or less creeping its way into megacorps too, for "stuff that needs to be fast where I don't care about ACID"; I think at this point I'd have to use one innovation token to talk some companies out of choosing Node.js.
That said, I do kind of hate this mentality; part of the reason that crap like Java 8 refuses to die is that companies always want "established" tech, no matter how horrible a fit it is.
I’m not a huge fan of this because it’s kind of painting all “original (technical) problems” with a broad brush, while also being a rewording of the typical YAGNI/establish PMF advice lines. I suspect the author is coming from the perspective of primarily working on “applications software” like consumer sites or solving some novel/poorly addressed business need with software.
A lot of these things aren’t binary yes/no decisions. They range from “don’t even do the thing at all” to “develop a custom solution completely in house”, sure, but with a huge spectrum of choices in between: buy off the shelf, get something basic but inflexible working, get something more advanced and flexible with more setup working, establish a process-to-be-automated that engineers perform manually.
Especially on the <easy setup, kinda shitty> to <hard setup, powerful/flexible> gradient you really need to consider your needs and options’ risk/reward. Some things like tests and CI have such quick payoff times you almost always want something that does a decent job of things. Moreover, when selling technical products half the time your competitive advantage is actually being able to efficiently iterate and improve on the product faster than everybody else because competitors are bogged down in tech debt and manual operations.
The article is a rehash of other sources and not particularly original, although it packages it in an easy to consume format.
It does have, though, an important insight that I haven't seen talked about much: when you do some upfront optimization (e.g. for scalability), even if you end up needing it, most often your solution ends up not being fit for purpose.
This is true for startups, but it also applies to software in general. By the time your product is scaling, it will probably have gone through quite a few changes, and it's likely that what you predicted would be scalability bottlenecks are off the mark. Not to mention how damn hard it is to predict what issues a product you haven't yet built will face in the first place.
Indeed; this is the bulk of AWS / Azure / Google Cloud's business model, the so-called "kill zone" adjacent to their infrastructure offerings. They let startups conduct the risky, expensive R&D to find product-market fit, then step in to clone whatever worked, with massively more operational resources and better tooling.
Would love some talks covering where the areas adjacent to the cloud offerings are - are you thinking of AI inference startups or more dev tooling?
The oddest trend I have noticed (may be selection bias) is how many people build dev tools these days... to the point even non-devs are starting to talk about their no-code builder startups... it's getting crazy.
Am convinced there exists web tech that enables you to move fast enough without requiring a full rewrite once you confirm product-market fit. Preferably with first-class type safety throughout (which rules out RoR/Django).
In ye olden days, we referred to this as “skill,” but I’m not sure we’re allowed to talk like that now :)
Why avoid the rewrite? Why require your work to be "skillful" in the first place? Wouldn't it be better to focus all of your time and effort on the hardest problem of all, finding customers, and leave whatever wisps of brainpower you have left to the initial implementation, until you can hire people dedicated to that task?
Engineers think a tech startup is mostly tech, but in reality it's almost no tech at first. In fact, the more tech you introduce early, the worse off you are.
Phoenix framework is what you're looking for. I can build out MVPs as quickly as in Django/RoR, but the runtime perf is on par with Go or Rust (for IO-bound stuff).
We've been in operation for 3 years and scaling has come through incremental improvements. No large-scale rewrites needed.
1. Out of the box, req/res time is fast.
2. Multi-clustered websockets come out of the box, with channels for subscribing to events.
3. It's easy to spin up a "service" as a GenServer.
4. It has the best ecosystem of tooling for building asynchronous and parallel systems.
5. LiveView is THE standard for reactive frontend frameworks where you update in real time from server-side HTML. (No shade thrown on Rails, but Hotwire is a joke by comparison.) LiveView processes are persistent for the life of the user session and can subscribe to server-side events, so you can have a user make a request, push the task into the background, and update the UI when it's completed, all with very little code.
OK, so type safety isn't at OCaml, Rust, or Haskell level, but it's good enough that it's rarely the source of bugs, assuming you perform even minimal testing.
Phoenix is quite nice if you're coming to it from a dynamically-typed environment, but I'd strongly dispute that its type system is "good enough" at large (human, not technological) scales. Phoenix itself makes good decisions in a lot of places, but it relies on things like Ecto for database manipulation, which (as just one example) is absolutely not enough for substantive, human-scalable development. What you call "minimal testing" looks, to even a comparatively easygoing TypeScript-down-the-stack enjoyer like me, like a whole lot of extra stuff.
It also faces an impedance mismatch with a lot of cloud tooling--which is not a disqualifier by itself, and BEAM might not even be wrong, but most Elixir-in-anger systems I've seen end up abandoning much of the benefit of BEAM clustering and running a bunch of horizontally scaled web applications, because they've got containers to manage and it fits the overall get-it-out-the-door plan better. This makes things like "just add another GenServer" not really map to reality all that well, though of course GenServers are a good layer of abstraction and modularization on their own.
LiveView is great, though. I am hopeful for React server components and server actions to continue to mature; they're promising as it is.
> is absolutely not enough for substantive, human-scalable development
You'll have to elaborate there. We have a pretty large codebase at this point that serves 3 different apps. Granted, our engineering staff is small, but that's exactly what makes Elixir so great: it lets you build scalable infrastructure with a small team of humans.
> most Elixir-in-anger systems I've seen end up abandoning much of the benefit of BEAM clustering and running a bunch of horizontally scaled web applications because they've got containers to manage and it fits the overall get-it-out-the-door plan better.
Containers and clustering are not mutually exclusive. We run an Elixir cluster and take advantage of that with shared data and the ability to send messages between each machine in pure Elixir. At the same time, each VM is running inside a Docker container managed with Kubernetes. They work well together: we had one instance where a VM went down and Kubernetes immediately brought it back up, after which it reconnected to its peers.
It's never product-market fit that requires a rewrite. It's bumping into the limits of the tools you chose up front. In my experience, that's mostly a function of how technologically wrong your guesswork was. If you set out to build cloud cost tracking but mostly build AWS accounting, that's pretty close and you'll likely do fine. If you set out to build games and wind up finding fit with an enterprise chat system, you're likely to end up with a full rewrite no matter what web tech you started with.
In other words, I don't think there is any web tech that does this because it's all about what's between the ears of humans.