GPT-4 Week 4. The rise of Agents and the beginning of the Simulation era

QuadrupleA · on April 17, 2023

Does anyone have an example of agents like AutoGPT doing something useful? Everything I've seen seems to be stuff that GPT could do anyway without the agent cruft. And iterating seems to multiply the opportunities for LLM bugs and mistakes.

The hype seems to be that agents are an emerging form of AGI, but the fine print is always "it's not quite ready for production yet, it makes a lot of mistakes, I had to fix 20 things in the output, but it's fun to watch and I'm sure once we work the bugs out..."

jasfi · on April 17, 2023

The intention is for them to do things that are more like projects. E.g. build a website or even a business (e.g. an ecommerce shop with marketing, etc). AutoGPT and BabyAGI are only a few weeks old, so they need time to build. So they are being criticized too early in my opinion.

saurik · on April 17, 2023

They are also being hyped too early. It is 100% fair--hell: I'd even say, more correct--to criticize something for what it is, instead of trying to predict what it might one day become. If the stuff gets better, it can come back around then and soak up its deserved praise.

(Though--and this is an unrelated issue--in the case of some of this AGI work, if it actually works we are probably going to collectively wish it hadn't been built, as no good is going to come from rapidly empowering an AI to become in any way autonomous, even momentarily.)

jasfi · on April 17, 2023

I think a lot of the hype is because it's close, but not quite usable. Nobody's really sure what the weak points or limitations are.

This sort of dev isn't easily slowed or stopped; it's all Open Source except for GPT-4. Even the Open Source LLMs are getting there.

ratg13 · on April 17, 2023

Right but all you have to do is learn to engineer prompts better.

You should be able to start with a fresh session and feed it what you need to get the result you want.

If you’re always relying on the entire history of your sporadic conversations you’re often going to get a mess.

IMO these “improvements” are just crutches for people that don’t want to understand AIs, and want to see them as they are portrayed in movies.

It’s an attempt to make them seem intelligent, but it’s just a gimmick, not a tool.

jondwillis · on April 17, 2023

Also, anyone who has looked at Auto-GPT and used it will notice how FAST features and improvements are being added. It boggles the mind.

jasfi · on April 17, 2023

I think they're being inundated with issues and pull requests. So, the dust really has to settle.

jondwillis · on April 17, 2023

I got a very recent version of Auto-GPT to clone a template repository, launch a nextjs dev server, and badly start iterating on a product last night. It took about 3 hours of prompt and setting tweaking and cost about $35.

derwiki · on April 17, 2023

That is super rad! Are you able to share anything?

jondwillis · on April 17, 2023

of course!

- using redis for long term memory/vector search

- using gpt4only mode w/ 8k context for "fast" and "smart" llms

- allow local command execution

My (admittedly, magical and weird) ai.yaml looks like:

  ai_name: [app]-founder
  ai_role: You are working on an AI SaaS product called [app]. You complete all goals continuously to build [app]. You are a highly autonomous and efficient software engineering agent composed of ChatGPT 4 instances. You are running on macOS via AutoGPT. You rely heavily on memories from your past. You know [app] is an existing app in the current directory. You avoid 'search_files' because it will crash, so use ls instead. You avoid performing the same actions in a loop or repetitively. You manage a fleet of agents that can answer questions up to Sept. 2021.
  ai_goals:
    - [something to effect of - git clone template repo from this url if it isnt already cloned]
    - continuously build an app- add relevant features by questioning the state of the web app that is located at localhost:3000 && localhost:5556, and improving the code.
    - understand and learn about what might make a good [product idea]
    - research and build a business (with moat) around [app/biz name] (limit sources to high quality such as news.ycombinator.com, github.com, reddit.com, etc),
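For anyone wondering how a file like this might turn into an actual prompt, here is a minimal sketch. It is not Auto-GPT's real code: `AgentConfig` and `build_prompt` are hypothetical names, the prompt layout is an assumption, and the config is written as a plain Python object instead of being loaded from YAML so the example needs nothing outside the standard library.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical container mirroring the ai_name / ai_role / ai_goals keys."""
    ai_name: str
    ai_role: str
    ai_goals: list[str] = field(default_factory=list)


def build_prompt(config: AgentConfig) -> str:
    """Join the name, role, and numbered goals into one prompt string.

    The layout here is an assumption for illustration, not a real agent's format.
    """
    lines = [f"You are {config.ai_name}. {config.ai_role}", "", "GOALS:"]
    for number, goal in enumerate(config.ai_goals, start=1):
        lines.append(f"{number}. {goal}")
    return "\n".join(lines)


# Shortened stand-ins for the role and goals listed in the config above.
config = AgentConfig(
    ai_name="[app]-founder",
    ai_role="You are working on an AI SaaS product called [app].",
    ai_goals=[
        "clone the template repo if it isn't already cloned",
        "continuously build the app and improve the code",
    ],
)
prompt = build_prompt(config)
print(prompt)
```

The point is only that the role and goals end up as plain text in front of the model on every step, which is presumably why small wording tweaks in the YAML change the agent's behavior so much.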

ipaddr · on April 16, 2023

A lot of things people are doing. I can't help but think how boring it all is. Is there anything interesting being solved?

lm28469 · on April 17, 2023

The only thing being solved is the reddit OP getting paid. They're overhyping everything and playing on FOMO so that you subscribe to their newsletter.

> I'm kinda sad I wrote about like 3-4 of these stories in detailed in my newsletter on thursday but most won't read it because it's part of the paid sub

cardosof · on April 16, 2023

I also have this feeling of "everyone is doing the same thing, a GUI for chatgpt with some prepared prompts". The thing is, it's too risky and too soon for bigger issues to be tackled. It will take quite some time before LLMs provide medical and financial advice, if they ever will.

og_kalu · on April 17, 2023

There are organizations working behind the scenes doing just that. https://twitter.com/ai__pub/status/1644735555752853504

Take the GPT-4 based law AI, Harvey, for instance. I doubt many people know what they're up to at the moment, but they already have deals signed with some of the biggest law firms on earth, with revenue quickly growing.

Now realize that GPT-4 isn't any worse in medicine than it is in law. This stuff is far closer than people think.

The hard perhaps uncomfortable truth is that GPT-4 is already proficient enough in several fields to bounce ideas off of as a colleague/equal.

euroderf · on April 17, 2023

Will there be a public defender version of Harvey ?

No ?

Well. Just as I suspected.

lucubratory · on April 17, 2023

Why is that something to hold against LLMs? They're making a ton of progress and will probably significantly improve the productivity of e.g. paralegals. Because they're not legal persons, and because of our rules against unauthorised practice of law, they're not going to be in courtrooms any time soon, even if hypothetically they were twice as good as defense attorneys (they aren't, right now). A technology can be really impressive and indeed revolutionary without making literally every job redundant. That seems like an absurdly high bar to hold anything to.

euroderf · on April 18, 2023

Yeah but... the law is already weaponised against the underclass. I'd like to think that A.I. could change this but somehow I doubt it.

"The law, in its majestic equality, forbids the rich as well as the poor to sleep under bridges, to beg in the streets, and to steal bread."

SilverBirch · on April 17, 2023

Yeah, it's kind of ironic that there's all this concern that LLMs will automate content to such an extent that there'll be an unfathomable amount of text on the internet that's impossible to distinguish from the real thing, but in a way... it kind of already happened with crypto. There emerged a class of people who were financially incentivized to just constantly pump out bullshit about crypto: fake projects, fake coins, fake NFTs, fake metaverses, all this crud, and because there was nothing underlying it, it just raised the noise floor incredibly. Well, now crypto is mostly dead and those same people have moved on to noisily bullshitting about AI.

There's genuinely some amazing stuff happening, but it's being damaged by a class of morons who want to run to the front of the crowd and shout "follow me!". The important thing is to just focus on the level-headed reporting of what's really going on. Places like Hard Fork, TechMeme, and Pivot are doing great, thoughtful reporting on what actually matters. It's amazing to see the contrast between All-In's hosts frothily making absurdly overstated claims based on some research paper that was just published, and then you hear about the same research from actual tech reporters and guess what... you get to actually find out what the research was and what they found, rather than hear some know-nothing gleefully spew about how he's going to make the entire middle class unemployed and sit on his pile of money like a fictional dragon.

anony23 · on April 16, 2023

I don't know how you can read this list and find it boring.

ipaddr · on April 17, 2023

What jumps out for you personally?

nosmokewhereiam · on April 17, 2023

A verbal calculator for one. I need to shout equations across the room and hear an answer. My hands are always covered in cannabis oil.

If it spoke like this one, bonus points: +16507299536*

*I believe its accent is set to millennial.

dunefox · on April 17, 2023

That's just a calculator with extra steps. I don't find it terribly exciting in the 'age of ai'.

arroz · on April 18, 2023

I think they were kidding. I found it funny at least.

mwint · on April 17, 2023

I don't know why you're being downvoted, I think this phone service is amazing and I'm saving it to my contacts. I now wish to find a bunch of them with different 'accents'.

ChatGTP · on April 16, 2023

People are already playing around with adding AI bots in games. A preview of what's to come [Link]
