The Midjourney rock band example here is another case of the too-many-fingers problem. I had heard that was solved? When you start looking closely, the faces are all mutated in odd ways, but somehow it’s convincing when you’re quickly scrolling by.
tl;dr - AI researchers have never had to undergo code review, deal with continuous integration, or manage expectations of inexperienced users of their software.
Which is all too true, and it remains true outside of VC-backed scenarios like Hugging Face; they are the exception only because their playbook explicitly optimizes for this.
On the other hand, it’s a great time to be a seasoned software engineer who can help with these sorts of issues. If you’re trying to level up your machine learning experience - consider a strategy like this:
Pick a random lucidrains repository.
Try to get it running on your setup. It probably won’t. (No offense, Phil! You’re doing God’s work and can’t cover every possible configuration.)
Fix the issue.
Write a test for it. (A sketch of what this might look like follows this list.)
Set up basic CI for the test.
Try to get the PR merged!
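To make the "write a test" step concrete, here is a minimal sketch of the kind of smoke test I mean. It assumes a repo with an interface like vit-pytorch's ViT; the class name and parameters are purely illustrative, so swap in whatever repo you actually picked:

    # Minimal smoke test (illustrative; assumes a vit-pytorch-style interface).
    # Checks that a forward pass runs and produces logits of the expected shape.
    import torch
    from vit_pytorch import ViT

    def test_forward_pass_shape():
        model = ViT(image_size=64, patch_size=8, num_classes=10,
                    dim=128, depth=2, heads=4, mlp_dim=256)
        images = torch.randn(2, 3, 64, 64)
        logits = model(images)
        assert logits.shape == (2, 10)

For the CI step, a plain GitHub Actions workflow that installs the package and runs pytest on every push is usually all you need.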
As somebody who was paid for many years to turn AI researchers' code into usable products, I have one piece of advice: if you want to become an AI researcher, don't fix their code. I did this in the hope that I would eventually get to work directly on an AI project. When I finally got to participate in such a project part-time, it turned out that I could finish in a matter of days a task that would take the AI researchers weeks. That didn't please them at all, so soon enough I was switched full-time to my non-AI project, since that one "needed me more".
If you want to become an AI researcher, do AI researcher projects, simple as that.
I'm not sure I follow. They weren't pleased with the speed at which you completed tasks? I feel this might be specific to your company. I find this to be a very desirable trait, considering most professors at my university won't take undergraduate researchers on the basis that they simply don't complete tasks as quickly as a PhD student would. Different contexts, but I feel this example still holds some merit.
Also, we're entering a paradigm in which a lot of research is constrained by compute, and it can be very difficult to just "do AI research projects" when such projects involve training policies in virtual environments or designing the next transformer, for instance.
Professors, of course, don't feel threatened by students, so they want them to be as good as possible. But in a company where I had the same pay and title as the AI researchers, just different fields of work (me: applications, them: research), they realized real quick that if I kept doing 2-3 times more tasks than they were doing, I would get experience with everything they knew. Everybody wants to have their own little field where they are the boss. A newcomer who keeps putting their nose everywhere might not be very desirable. I knew what I was doing and I knew the risk I was running. I was hoping that the project manager would support me. The manager did not, and I was out.
Still, after hearing this story, my takeaway is not "Do not fix AI researchers' code if you want to become an AI researcher."
Rather, "Good news! It turns out a software engineer has much to offer AI researchers! However, as is all too often the case, beware of politics: you need to find an organization where the AI folks won't be threatened by your fast work but will welcome you."
I'd wager that in some part of your analysis you're not dead on the money, but in broad strokes this all rings true. The political acumen necessary to survive as a programmer is intense. It's one advantage the black programmers I've met had a better handle on than the white programmers, who were more often on the autism spectrum and therefore terrible at office politics, except when being oblivious is advantageous, which it can be. Given the racial stereotypes present in American culture, it was ironic that both black coders were employed while the white coders were not (this was in a hacker house), but it was simply because they were better coders. No issues there. We got along great, but in terms of office politics, and without being aggressors either, they just had very good instincts. It isn't talked about in plain terms enough, even on HN. Well, I guess it is discussed a lot, but never like "alright, here's the strategy if you're facing X", the way you do in the face of other commenters' naysaying.
Probably true for any other kind of researcher. Building tools (even hardware) is worthy of a researcher's resume. During my Ph.D. I worked on adding Python bindings to a few simulators. One paper has been pending for over a decade, and the other one was published but never cited: even though people use the Python bindings, they cite the original.
I am no longer in academia, but I have promised myself not to engage with academic open source software. There is simply no incentive for development. Well, when you think about it, most academic work is publish-and-forget: maintenance is not a strict requirement.
As another ex-academic: maintenance is even bad for your career!
You're being evaluated on how many papers you can publish, so the academic process selects for good (well, fast...) writers, not good coders. Papers are selected for novelty, so it's much easier to publish a paper based on a 'novel' algorithm than one based on v2.0 of the algorithm. There might be novelty in the v2.0, but it's risky; reviewers and editors might not agree. There's always novelty in the v1.0 (well, it's academia, so it's more like v0.1).
As someone who isn't in academia, I've heard of this being a problem before, but for research related to computer science it seems like private labs at companies like Microsoft might be a better fit. A lot of interesting research comes from Microsoft, and I don't think they have a problem of over-incentivizing the speed of publication.
That said, I'm not in academia (and have never done research) nor employed by Microsoft; I'm an undergraduate in computer science. Just speculating. Do you think this could be plausible, or is it way off?
As a researcher, I've even had difficulty making sure the code matches the algorithm as described in the paper. Getting old code running and validating that it's correct so you can compare is really non-trivial.
I have a habit of rewriting code because it's hard for me to understand otherwise, and frankly, it isn't uncommon for the algorithm in the code to differ from the one in the paper. But the most common thing is that a small one-liner is a critical component of making the model work. Sometimes it's a one-liner in the paper, sometimes it's a one-liner in the paper they built their work on, or in the paper that one built on, or the one before that.
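To give one concrete, hypothetical example of the kind of one-liner I mean: an exponential moving average of the weights that only gets used at evaluation time. It's a single line in the training loop and easy to drop in a reimplementation. A rough sketch, not taken from any particular paper:

    # Hypothetical "critical one-liner": an exponential moving average (EMA) of
    # the model weights, updated after each training step and used only for
    # evaluation. Missing it can silently cost you most of the reported gains.
    import copy
    import torch

    @torch.no_grad()
    def update_ema(ema_model, model, decay=0.999):
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1 - decay)  # the easy line to miss

    model = torch.nn.Linear(16, 4)
    ema_model = copy.deepcopy(model)
    # ... in the training loop, after optimizer.step():
    update_ema(ema_model, model)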
People building on top of one another's code definitely helps projects get up and running faster, but I do wonder if it makes the research itself faster and/or better. If a secret sauce gets lost in the game of telephone, it can be hard to know how much of the result was the secret sauce, especially when comparing against works that don't use that line. Sometimes the secret ingredient gets dropped and the gains disappear with it. It's also not uncommon for this secret ingredient to go unmentioned or unacknowledged, especially if it is "well known" (it won't be for long).
There is so much "secret sauce" that we're unconsciously relying upon. I learned a lot of secret sauces by reading through various implementations, "logbooks" [0] by various researchers, and so on. It definitely makes AI/ML feel more like an art than a science, and it makes it very difficult to re-implement in another language (e.g., Julia) or framework.
Can you provide an example of where this has been successful?
I've spoken with many researchers and grad students only to find that there was a critical typo in the algorithm or an undescribed setting in the code (e.g., it only converges when a learning rate scheduler is applied). I'll see the same algorithm implemented differently in different repositories. This is even the case for papers with thousands of citations. It can be tremendously difficult to reproduce the results of papers, especially when they require large amounts of compute that small research groups don't have. And I don't know whether it's a bug in my code, in the paper, or in the algorithm.
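To make the learning-rate-scheduler case concrete, the undescribed setting is often something as small as a linear warmup. A hedged sketch of what that usually amounts to (the model, optimizer, and numbers here are placeholders):

    # Illustrative only: a linear learning-rate warmup that a paper might depend
    # on without describing. Without it, training can diverge, and you can't
    # tell whether the bug is in your code, the paper, or the algorithm.
    import torch

    model = torch.nn.Linear(32, 2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
    warmup_steps = 1000  # hypothetical value, rarely reported

    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps))
    # ... in the training loop: loss.backward(); optimizer.step(); scheduler.step()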
I mean, what you describe is an unfortunate and unavoidable issue in academia (and in the world in general). GPT4 doesn't work magic here, of course.
You still have to:
1) Understand the work and the motivation. (GPT4 can help by playing the role of junior PhD if you can play the role of astute advisor.)
2) Sniff out things that are underspecified or seem wrong. (GPT4 also can help here, see above.)
3) Email the authors with questions, compare against shitty published codebases, etc., depending upon how gnarly/rushed the prose is.
With that said, it's also a "research smell" (compare "code smell") if a paper is so hastily written and undercited that you're the first person replicating it. And instead of going for "my new bleeding-edge approach that scored 0.1% better than the boring old model with 50 cites", maybe you should just implement the boring old model.
So, where this has been successful for me is in implementing denoising diffusion for different problem domains. Given that there is a broad literature on denoising diffusion, when some things are underspecified you can start looking at best practices from other researchers.
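For instance, the forward (noising) step is essentially shared across the denoising-diffusion literature, so even when a paper glosses over it you can fall back on the common formulation. A rough sketch of that shared piece, with an illustrative linear beta schedule rather than any specific paper's choice:

    # Standard DDPM-style forward noising step, common to much of the diffusion
    # literature: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)          # illustrative linear schedule
    alpha_bars = torch.cumprod(1.0 - betas, dim=0)

    def q_sample(x0, t, noise):
        ab = alpha_bars[t].view(-1, *([1] * (x0.dim() - 1)))
        return ab.sqrt() * x0 + (1 - ab).sqrt() * noise

    x0 = torch.randn(8, 3, 32, 32)     # a batch of "clean" samples
    t = torch.randint(0, T, (8,))      # random timestep per sample
    x_t = q_sample(x0, t, torch.randn_like(x0))

The underspecified parts tend to be everything around this: the schedule, the loss weighting, the EMA, and so on.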
Alternatively, the same goes for other things, like specific transformer variants.
Basically, what I'm saying is that if you are trying to reimplement something that niche, it's like catching butterflies. A better research agenda involves working within a particular field of study where there is supporting evidence and there are approaches to compare and contrast against. This goes without saying, whether or not an AI is involved.
That’s just a first step. The next step is to fix issues with, for instance, training instability.
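As a hedged example of what "fixing training instability" often looks like in practice (one common tweak, certainly not a universal cure), clipping the global gradient norm:

    # Sketch of one common stabilization tweak: global gradient-norm clipping.
    # Whether this is the right fix depends entirely on the model in question.
    import torch

    model = torch.nn.Linear(32, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

    def training_step(batch, targets):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(batch), targets)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        return loss.item()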
It is definitely free work, though. There are absolutely ways to get paid to do this sort of thing. But if you believe in learning by volunteering, it's a good option. Having contributions to popular ML repos looks good to tech recruiters as well, so there's that.
You can see all the merged pull requests (for PRs created within the last 120 days). And if you look at the "gift icon", you can see how long the contributors have been contributing to the project. For many, it's less than 4 months, which means many new contributors are successfully getting their changes merged.
Hey, thanks! And hopefully your green username doesn't make people think we are one and the same, as your comment, while greatly appreciated, is not some smart (or stupid) marketing strategy on my part.
I honestly believe we can work smarter, and I talk about the paradigm shift here: https://devboard.gitsense.com/ggerganov/llama.cpp?board=gits... Note that the DevBoard repo talked about in the intro hasn't been pushed yet. DevBoard will be open sourced, and I'm currently working on a browser extension (which will also be open sourced) that will convert GitHub's Insights tab into a hackable dashboard, which should make it easy for people to address GitHub's limitations.
lol, this hits close to home. As a PhD student with several years of past experience in dev work, I have become the lab's main reference point for dealing with this stuff. I just spent two days debugging a Dockerfile, shared in the repository published with a paper, to build the environment to run the experiments.
One interesting thing about this is that by the time the experiments are done and the paper is actually published, you are probably several versions (even major ones!) behind on every piece of software involved. If the shared information does not take this into account (i.e., versions are not pinned when creating the supposedly reproducible environment), you're in for some painful debugging :).
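A cheap partial mitigation (a habit of mine, not something from that repo): snapshot the interpreter and key package versions next to the experiment outputs, so at least future readers know what to pin. The package list below is a placeholder:

    # Record the exact versions used for an experiment so the environment can be
    # reconstructed later. Package names are placeholders for your actual deps.
    import json
    import sys
    from importlib.metadata import version, PackageNotFoundError

    PACKAGES = ["torch", "numpy", "transformers"]

    def snapshot_environment(path="environment.json"):
        info = {"python": sys.version}
        for name in PACKAGES:
            try:
                info[name] = version(name)
            except PackageNotFoundError:
                info[name] = "not installed"
        with open(path, "w") as f:
            json.dump(info, f, indent=2)

    if __name__ == "__main__":
        snapshot_environment()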
Honestly, keeping the best ML repos hard to run unironically gives strong job security. I’m happy when something requires a bit of work to get running; it means I get to stay employed longer, and the normies who will ruin it (i.e., the idiots writing shit like “wormGPT”) will be delayed.
We are also in a rapidly ending golden age of the AI research community treating copyright like it doesn’t exist (which is good; Aaron Swartz was and is a saint). Every PR that makes it easier to run their code brings us that much closer to going back to the hell of strong IP regulations and laws.