I have a close friend working in a core research team there. Based on our chats, the secret seems to be (1) massive compute power, (2) ridiculous pay to attract top talent from established teams, and (3) extremely hard work without big-corp bureaucracy.
Anecdotal, but I've now gotten three recruiting emails from them about joining their iOS team. I got on a call and confirmed they were offering FAANG++ comp, but with the expectation of 50h+ in office (realistically more).
I don't have that dog in me anymore, but there are plenty of engineers who do and will happily work those hours for 500k USD.
So in the end, did he get anything? I don't know how these things work, but did he just walk away with ~50k in pre-tax income and nothing for RSUs, or did Musk pull a Twitter and not even pay him for those months?
It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear power plant might output about 1 gigawatt). It will be interesting to see whether the relationship between power/compute/parameters and performance is exponential, logarithmic, or something more linear.
It's logarithmic. Meaning you scale compute exponentially to get linearly better models.
However, there is a big premium on having the best model because of the low switching costs of workloads, creating all sorts of interesting threshold effects.
It's logarithmic in benchmark scores, not in utility. Linear differences in benchmarks at the margin don't translate to linear differences in utility. A model that's 99% accurate is very different, in utility space, from a model that's 98% accurate.
Yes, it seems like capability is logarithmic with respect to compute, but utility (in different applications) is exponential (or rather s-shaped) with respect to capability.
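To make those shapes concrete, here's a toy sketch (the functional forms follow what's described above, but all constants are invented purely for illustration):

    #include <cmath>
    #include <cstdio>

    // Toy model only: capability grows roughly with the log of compute,
    // while utility is s-shaped (sigmoid) in capability. Constants are made up.
    double capability(double compute) { return 0.5 * std::log10(compute); }
    double utility(double cap) { return 1.0 / (1.0 + std::exp(-8.0 * (cap - 3.0))); }

    int main() {
        for (double c : {1e4, 1e6, 1e8})  // 100x more compute at each step
            std::printf("compute=%.0e  capability=%.1f  utility=%.3f\n",
                        c, capability(c), utility(capability(c)));
    }

Each 100x in compute buys the same fixed capability increment, but utility barely moves until the threshold and then saturates, which is where the threshold effects come from.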
Not really, since both give you wrong output that you need to design a system to account for (or deal with). The only percentage that would change the utility is 100% accuracy.
> It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear power plant might output about 1 gigawatt).
IIRC achieving full AGI requires precisely 1.21 jigawatts of power, since that's when the model begins to learn at a geometric rate. But I think I saw this figure mentioned in a really old TV documentary from the 1980s, it may or may not be fully accurate.
And fun fact: without government subsidies, a nuclear power plant isn't economically feasible, which is why Elon isn't just building such a plant next to the data center.
While I sort of agree with the complaint, personally I think the strongest position for C++ in this ecosystem is still its great backward compatibility plus marginal safety improvements.
I would never expect our 10M+ LOC performance-sensitive C++ code base to be formally memory safe, but so far only C++ has allowed us to maintain it for 15 years with partial refactors and minimal upgrade pain.
I think at least Go and Java have as good backwards compatibility as C++.
Most languages take backwards compatibility very seriously. It was quite a surprise to me when Python broke so much code with the 3.12 release. I think it's the exception.
I don't know about Go, but Java is pathetic. I have 30-year-old C++ programs that work just fine.
However, an application that I had written to be backward compatible with Java 1.4, 15 years ago, cannot be compiled today. And I had to make major changes to get it running on anything past Java 8, ~10 years ago, I believe.
Compared to C++ (or even Erlang), Go is pretty bad.
$DAYJOB got burned badly twice on breaking Go behavioral changes delivered in non-major versions, so management created a group to carefully review Go releases and approve them for use.
All too often, Google's justification for breaking things is "Well, we checked the code in Google, and publicly available on Github, and this change wouldn't affect TOO many people, so we're doing it because it's convenient for us.".
Nope. It has been like five, maybe eight years, so I do not remember. There have been more since then, but after seeing how Google manages the Go project, I pay as little attention to it as I can possibly get away with... so I do not remember any details about them.
Java has had shit backwards compatibility for as long as I have had to deal with it. Maybe it's better now, but I have not forgotten the days of "you have to use exactly Java 1.4.15 or this app won't work"... with four different apps that each need their own different version of the JRE or they break. The only thing that finally made Java apps tolerable to support was the rise of app virtualization solutions. Before that, it was a nightmare and Java was justly known as "the devil's software" to everyone who had to support it.
That was probably 1.4.2_15, because 1.4.15 did not exist. What you describe wasn’t a Java source or binary compatibility problem; it was a shipping problem, and it existed in the C++ world too (and still exists - sharing runtime dependencies is hard). I remember those days too. Java 5 was released 20 years ago, so you’re describing some really ancient stuff.
Today we don’t have those limits on HDD space and can simply ship an embedded copy of the JRE with the desktop app. In server environments I doubt anyone is reusing a JRE between apps at all.
While "Well, just bundle in a copy of the whole-ass JRE" makes packaging Java software easier, it's still true that Java's backwards-compatibility is often really bad.
> ...sharing runtime dependencies [in C or C++] is hard...
Is it? The "foo.so foo.1.so foo.1.2.3.so" mechanism works really well, for libraries whose devs that are capable of failing to ship backwards-incompatible changes in patch versions, and ABI-breaking changes in minor versions.
> Java's backwards-compatibility is often really bad.
“Often” is a huge exaggeration. I always hear about it, but never encountered it myself in 25 years of commercial Java development. It almost feels like some people are doing weird stuff and then blame the technology.
> Is it? The "foo.so foo.1.so foo.1.2.3.so"
Is it “sharing” or having every version of runtime used by at least one app?
> I always hear about it, but never encountered it myself in 25 years of commercial Java development.
Lucky you, I guess?
> Is it “sharing” or having every version of runtime used by at least one app?
I'm not sure what you're asking here? As I'm sure you're aware, software that links against dependent libraries can choose to not care which version it links against, or link against a major, minor, or patch version, depending on how much it does care, and how careful the maintainers of the dependent software are.
So, the number of SOs you end up with depends on how picky your installed software is, and how reasonable the maintainers of the libraries they use are.
> So, the number of SOs you end up with depends on how picky your installed software is, and how reasonable the maintainers of the libraries they use are.
And that is the hard problem, because it’s a people problem, not a technical one, and it’s platform independent. When some Java app required a specific build of the JRE, that wasn’t a limitation or requirement of the platform, but rather a choice by the developers based on their expectations and level of trust. Windows still dominates the desktop space, and it’s not uncommon for C++ programs to install or require a specific version of the runtime, so you eventually have lots of them installed.
I don't see what Microsoft's and Sun's/Oracle's decision to encourage bundling all dependent software (including what would ordinarily be considered system libraries) with your program has to do with long-established practices in the *nix world.
I do agree that the world becomes much easier for a language/runtime maintainer if you get to ignore backwards-compatibility concerns because you've convinced your users to just pack in the entire system they built against with their program.
First of all, *nix is not synonymous with C++ programming, so focusing on it specifically is bringing apples into a discussion about oranges. When Java is brought into a discussion about C++, I do expect the variety of platforms to be taken into account.
Second, you can have shared libraries/runtimes on Windows or in the Java world. Versioning exists there too; *nix is not unique in that. Both are rather agnostic to the way you ship your app. In server-side Java, unless you ship a container, you usually do not ship the JRE. On the desktop it depends; shared JREs were always possible.
Third, DLL hell does exist in *nix environments too. The versioning mechanism you mention is a technical solution to a people problem and it doesn't work perfectly. Things do break if you relax your dependency constraints too much. How much depends on the developers and the amount of trust they put in maintainers. So you inevitably end up with multiple versions of the same library or runtime on the same machine, no matter what OS or cross-platform solution you use. It is not much different from shipping a bundle.
> First of all, *nix is not synonymous with C++ programming...
Agreed. This is obvious. You even mention it below:
> Second, you can have shared libraries/runtimes on Windows or in the Java world. Versioning exists there too; *nix is not unique in that.
As you said, Windows has the same issue (because it's a fundamental problem of using libraries).
> Third, DLL hell does exist in *nix environments too.
IFF the publisher of the library fails to follow the decades-old convention that works really well.
> The versioning mechanism you mention is a technical solution to a people problem and it doesn't work perfectly.
Sure. Few things do. That's what pre-release testing is for.
> Things do break if you relax your dependency constraints too much.
Yep. That's why we test.
> So you inevitably end up with multiple versions of the same library ... on the same machine...
Sure. But they're not copies of the same version. That's the entire point of the symlink-based shared object naming scheme (and of its Windows equivalent; IIRC it used to be called SxS, but consult the second bullet point in [0]).
There are hundreds of thousands of '3D workers' working behind the scenes to create the 3D models for makeshift ads, and as far as I know many of them (including a high school friend of mine) have already been displaced by Midjourney and lost their jobs. This used to be a big industry, but it has now been almost entirely wiped out by AI.
> This used to be a big industry, but it has now been almost entirely wiped out by AI.
To my knowledge, 3D artists weren't that huge of an industry to begin with. One of my friends went to college researching 3D physics models, and never landed a job in the field long before the AI wave hit. Unless you're a freelancer or salaried Pixar employee, being a 3D artist is extremely difficult with extraordinarily low job security, AI or no AI.
I think "almost entirely wiped out by AI" is hyperbole, because the primary employer of these artists will still be hiring and products like Sora are a good decade away from being Toy Story quality. AI will be a substitute product for people that didn't even want 3D art in the first place.
Note that there's a more reader-friendly list, with support for filtering by category, at https://software.nasa.gov/ . I found the DATA SERVERS PROCESSING AND HANDLING category especially helpful for general SWEs. Here's a list of my personal picks:
- Shift: Self-Healing Independent File Transfer
- BASSHFS: Bash-Accessible SSH File System: SSHFS but without the FUSE dependency
- Ballast: Balancing Load Across Systems: load balancing for SSH servers
I must be doing something wrong, but I can't download any of the software I am interested in, like SequenceMiner. After clicking a series of links, I end up on a page with no mention of SequenceMiner.
Nevertheless, interesting website and thanks for sharing. (Sequence/process mining seems especially intriguing to me.)
It depends. -O0 turns off a few trimming optimizations and can potentially cause more information (code or DWARF) to be included in the objects, which may in turn slow down compilation. In our large code base, we found that -O1 works best in terms of compilation speed.
To anybody unfamiliar with recent progress: boost::unordered_flat_map is currently the fastest open-addressing hash map implementation for C++ in this space. Released just several months ago, it outperforms absl::flat_hash_map and the various other implementations benchmarked in https://martin.ankerl.com/2022/08/27/hashmap-bench-01/.
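If you want to try it, it's largely a drop-in replacement for std::unordered_map (a hedged usage sketch; the container shipped with Boost 1.81, IIRC):

    #include <boost/unordered/unordered_flat_map.hpp>
    #include <cstdio>
    #include <string>

    int main() {
        // Largely the same interface as std::unordered_map, but open addressing
        // underneath, so pointers/references to elements are invalidated on rehash.
        boost::unordered_flat_map<std::string, int> counts;
        counts["foo"] += 1;
        counts["bar"] += 2;
        if (auto it = counts.find("foo"); it != counts.end())
            std::printf("%s -> %d\n", it->first.c_str(), it->second);
    }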
It's a little hard to take seriously some of the top performers in that table, like emhash8, when the implementations are only a couple of weeks old and the commits are things like "fix crash", "fix clear", and "fix overflow".
I think I need more context for what you're trying to say here. I understand "web scale" in the marketing-term Mongo sense, but I don't know how that applies to this.
There's also a more blasphemous approach to interop between C++ and Rust if the C++ code already has good Python bindings: C++ <=> Python <=> Rust. It's not as bad as you may think. My company uses it to adapt C++ for Rust without rewriting the 40K LOC of Pybind11 boilerplate. Going through the Python interpreter is definitely slower than a native call, but perhaps <3x, since we only rely on Python being a hosted + GC environment; plus it's much easier to express lifetimes in Rust given that everything is managed on the Python heap.
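As a rough sketch of the C++ <=> Python half (module and function names here are invented for illustration; on the Rust side you'd go through a Python embedding crate, e.g. something like PyO3):

    #include <pybind11/pybind11.h>
    #include <string>

    // Existing C++ logic we don't want to rewrite.
    std::string greet(const std::string& name) { return "hello, " + name; }

    // Pybind11 exposes it as an ordinary Python module ("bridge" is a made-up name);
    // the Rust side then imports and calls bridge.greet() like any other Python code,
    // and everything crossing the boundary lives on the Python heap.
    PYBIND11_MODULE(bridge, m) {
        m.def("greet", &greet, "Call into the legacy C++ code");
    }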
This reminds me of the -Wlifetime proposal, which provides similar checks but requires annotation of ownership at struct level (hence the check only applies to new structs):