Yes, it would be fantastic to have more languages to test with. I picked the base language I did (Mamba) because it was easy to modify and integrate into Python.
Generating the problems: I just thought up a few simple things that the computer might be able to do. In the future, I hope to expand to more complex problems, based upon common business situations: reading CSVs, parsing data, etc. I'll probably add new tests once I get multi-shot and reliability working correctly.
New base programming languages would be great, but what would be even better is some sort of meta-language where many features can be turned on or off, rather than just scrambling the keywords as I do now.
I did some vibe testing with a current frontier model, and it gets quite confused and keeps insisting that there's a control structure that definitely doesn't exist in the TiānshūBench language with seed=1.
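To make the "scrambling the keywords" idea concrete, here is a minimal sketch of what a seeded keyword scrambler might look like. This is purely illustrative: the keyword list, token length, and function names are all assumptions, not the actual TiānshūBench implementation.

```python
import random
import string

# Hypothetical base-language keywords; the real benchmark's set is assumed.
KEYWORDS = ["if", "else", "while", "for", "def", "return"]

def scrambled_keywords(seed):
    """Deterministically map each keyword to a nonsense token for a given seed."""
    rng = random.Random(seed)  # seeded RNG so each seed yields a fixed mapping
    def token():
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(6))
    return {kw: token() for kw in KEYWORDS}

# The same seed always produces the same scrambled language,
# so "seed=1" names one reproducible dialect.
mapping = scrambled_keywords(seed=1)
```

The point of the determinism is that a model can be tested on a dialect it has never seen in training, yet the dialect is reproducible across runs.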
I find that these books have to be read by the right person at the right time. Think and Grow Rich by Napoleon Hill did nothing for me when I was first exposed to it, but later on, it helped me greatly.
BTW, the business book that helped me the most is barely known: Making Money is Killing Your Business by Chuck Blakeman.
> Another one that really irritates me is the kind that presents a series of integers and asks which integer comes next. Any integer will do, you just have to fit the appropriate polynomial.
But surely someone with a strong imagination could come up with a pattern to fit any number as the next in the sequence. I doubt most elementary educators even grasp the issue.
Yes. AutoLisp was available from the early days of AutoCAD. I didn't use it much myself; I just helped some mechanical engineers with it in small ways at a company where I worked, just tinkering, really. I was quite junior at the time, so I didn't really grasp its power, and I never played around with it much.
I've been slowly working my way through this book for the past couple of months. It's been amazingly helpful for learning all of the deep learning terminology, and it gives a good overview of the technology.
I was doing all of the examples and exercises for a while, but gave up on that at some point. My main goal, after all, was to learn about how the technology works in order to separate the wheat from the chaff, not become an AI researcher.
Slop has been around a while. I was researching a topic, and noticed that most of the top search results had the same misunderstanding of some of the definitions. The writers were clearly not familiar with the topic, and I'm sure they were just copying each other. All of the articles pre-dated GPT-3.5.
The kicker is that if you ask GPT-4 about it, it spits out the same incorrect information, meaning that GPT-4 was likely trained on this bad data. FWIW, GPT-4o gives a much more accurate response.
I wonder (as an outsider) why GPT-4o would give more accurate responses. The training data is the same, right? Maybe somebody can explain it like I'm a kid, thank you.