I just wanted to chime in that we're a YC company as well (S16), and I'm thankful to the HN community for having been supportive through our whole journey.
It's a great idea, but I can't believe that the market for this kind of data is that large, for 2 reasons: 1 - there's certainly a point of diminishing returns; and 2 - having good, clean data that's proprietary is a _huge_ differentiator. If I am the leader in autonomous driving, I doubt I'd want to pay someone else and thereby help them train models that will help my competitors.
The problem I see with wading into other subfields (like my own) that need high-quality training datasets is that the datasets may be proprietary and may not really overlap that much between companies in the same industry. For example, assembly line datasets for companies making almost the same product may be vastly different. I'm really struggling to see how you can possibly achieve the same scale in other industries.
Is it weird sharing the same name as a fashion icon? ;)
And I'm curious about your ML "stack". Particularly the chicken and egg problem. Are you using something like TensorFlow with pre-trained binaries, perhaps from a vendor? Or is it 100% proprietary? Thanks!
Re 1: It has been a bit of an annoyance growing up (for example, Google autocorrects "Alexandr Wang" to "Alexander Wang"), but we run in different circles ;)
Re 2: As with most companies working on ML these days, our stack is not fully proprietary. We don't take too strong an opinion on ML framework and use both TensorFlow and PyTorch currently. We generally use neural network architectures from the literature and then iterate on top of them to suit our unique problem requirements.
The biggest change is your job goes from doing things (which makes sense) to building an incredible team that can do things (which is a more unintuitive job). In the limit, it’s always a people business.
We've overcome many challenges, but per my last answer, building a team of the best people has been the most important and most challenging. That, and learning how to do sales ;)
Too many mentors to list. People in Silicon Valley are incredibly helpful. To name a few: Dan Levine, Mike Volpi, Nat Friedman, Adam D’Angelo, Ilya Sukhar, Jonathan Swanson, Albert Ni, Jeff Arnold, Charlie Cheever, and Drew Houston. I’m very, very lucky.
We have a rule when hiring people: we look for people with an internal locus of control. Roughly speaking, this means people who believe they have control over outcomes in their life, rather than believing those outcomes are determined by external forces beyond their control.
It’s a small thing, but it’s surprisingly easy to spot once you look for it. And it really matters: startups are the business of building something from nothing. You need people who believe they can bend the earth.
Congratulations on your fast growth! It is always great to see examples of companies like yours actually solving real-world problems in AI with original ideas and obtaining large clients that rely on your work.
I'm really looking forward to more of what Scale will do in the future!
Self-driving is one of many applications of AI/ML to the real world, each of which likely requires high-quality labeled data to truly be production-ready. This includes other robotics, self-checkout like Amazon Go, natural language understanding, and more.
Second, self-driving as a problem space will need labels for a very long time. In an application where (1) verifiable model performance is paramount, and (2) the models need to be extremely robust for cars to be safe, the need for labeled data is only magnified.
I see. I saw that you guys were doing a huge amount of value-add with things like segmentation for self-driving, but I didn't know you were differentiating yourselves from competitors in the general labelling space, e.g. Mechanical Turk. Cheers!