Give an LLM the text of a PDF document. Ask the model to extract values from the document or from its tables. Input the values into a spreadsheet. This is, at a minimum, a task which costs companies around the world hundreds of millions of dollars a year.
Having worked on this directly and used basically every other piece of "automation" software available to do this, I can tell you that the GenAI solution is far superior. It's not close.
If you tell a multimodal LLM to extract and structure the contents of a PDF, it will absolutely be able to do that successfully. Further, these models display a surprising capacity for abductive “reasoning” (acknowledging, of course, that they don’t reason at all), and are thus able to make pretty reasonable assumptions given the semantic context of a request. Unlike traditional extraction tools, which require very specific tuning and are fragile to structural changes in layout, LLMs tend to be very resilient to such things.
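To make that concrete, here's a minimal sketch of the kind of call I mean, using the OpenAI Python SDK against a single page rendered to a PNG. The model name, the filename, and the prompt wording are placeholders, not anyone's production setup:

```python
# Minimal sketch: send one rendered PDF page to a vision-capable chat model
# and ask for structured JSON back. Assumes OPENAI_API_KEY is set and that
# "page1.png" was produced by a separate PDF-to-image step.
import base64
import json

from openai import OpenAI

client = OpenAI()

with open("page1.png", "rb") as f:
    page_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Extract every labeled field and every table from this page. "
                     "Return a single JSON object; use null for anything you cannot read."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{page_b64}"}},
        ],
    }],
    response_format={"type": "json_object"},  # ask for valid JSON back
)

print(json.loads(resp.choices[0].message.content))
```

In practice you loop over pages and validate the JSON against whatever schema your spreadsheet needs, but that's the whole trick.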
> I pay for GPT4 enterprise, try again, only this time answer the question.
When I'm frustrated, I talk to ChatGPT like that.
It works as well for the LLM as it does for the humans in this thread.
What's worse is, I'd been writing some sci-fi set in 2030 since well before Transformer models were invented, and in early drafts I predicted that you'd get better results from AI if you treated it with the same courtesies you'd use for a human, simply because it learns by mimicking us (which turned out to be true). And yet I'm still making this mistake IRL when I talk to the AI…
What step? Do you think we're lying? I literally built an application which takes in PDFs and extracts over two dozen values from them via prompting with Langchain. Do you think I'm a paid OpenAI shill or what?
You also realize that OpenAI has a dedicated Document Assistant which will literally extract information from a document you upload using prompts? Are you just unable to get that to work? I just don't know what you're arguing at this point; it's like watching people walk backwards and then yelling that it's impossible for humans to walk backwards.
Yes, you send a message to the API containing the context you are interested in, along with questions about that context for the LLM to answer. In common parlance that message is called a prompt. I don't know if you're delusional or just completely clueless about how LLMs work.
Honestly, just a skill issue on your part. RAG and one-shot learning can get you incredibly far, and if you can't figure it out you're ngmi. And no one is using ChatGPT for this lol.
Wait, do you not realize what a prompt or a context window is? You literally think GPT just does things on its own?
You understand that if I want to extract the date a letter was sent, who the recipient was, what amount is due on an invoice, etc., I have to send GPT a specific prompt asking for exactly that, with the PDF in the context? Do you just literally not know how this works?
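In other words, something like this sketch, assuming a PDF with a text layer and the OpenAI Python SDK (pypdf, the filename, and the model name are just stand-ins for whatever you actually use):

```python
# Minimal sketch: put the PDF's text in the context window and ask for
# exactly the fields you want, as JSON.
import json

from pypdf import PdfReader   # assumption: text-layer PDF, no OCR step shown
from openai import OpenAI

reader = PdfReader("invoice.pdf")   # placeholder filename
pdf_text = "\n".join(page.extract_text() or "" for page in reader.pages)

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",            # placeholder model
    messages=[
        {"role": "system",
         "content": "You extract fields from documents. Respond with JSON only."},
        {"role": "user",
         "content": "From the document below, return a JSON object with keys "
                    "'date_sent', 'recipient', and 'amount_due' (null if absent).\n\n"
                    + pdf_text},
    ],
    response_format={"type": "json_object"},
)

fields = json.loads(resp.choices[0].message.content)
print(fields)
```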
Yeah, I'd love to see the results... If anything, if there is a multimillion-dollar benefit on the table, one might argue that companies should publish this data in a more useful format. But no, let's bandaid over outdated practices like PDF-only data and burn GPU cycles to do it, with 92% accurate results.
A lot of things don't require 100% accuracy, and raging against the world's outdated practices doesn't solve the problems faced in the immediate present. After spending 30 years of my career ranting at those practices to absolutely no meaningful effect, I'd say a more effective bandaid that has a semantic understanding of the content is probably as good as you'll get. And GPU cycles are meant to be wasted.