
The summary of his world building wasn't huge, so after I trimmed it a bit it was just small enough to fit within the ~8,000-token limit of OpenAI's GPT-4 model. I, too, would like to know how to properly get around these technical limitations.


I know it is something along these lines: install a vector database. Use the API to get vector embeddings for the manuscript, chunk by chunk. (Apparently this is possible with the API even though it isn't with normal ChatGPT.) Then, think of the query you want to ask. Use the API not to answer the query, but to get the vector embedding for the query. Then do a search in the vector database for the vectors that are "near" the query's vector. Finally, send the query plus the text of the chunks behind those "near" vectors to the API, and you'll get your answer.
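
A minimal sketch of that flow, assuming the `openai` Python package (v1 client) and using a plain numpy cosine-similarity search in place of a real vector database. The file name, chunking strategy, and example question are all placeholders:

    # Sketch of retrieval-augmented answering over a manuscript.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def embed(texts):
        """Return one embedding vector per input string."""
        resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
        return np.array([d.embedding for d in resp.data])

    # 1. Chunk the manuscript (naively, by paragraph) and embed each chunk.
    manuscript = open("manuscript.txt").read()
    chunks = [c for c in manuscript.split("\n\n") if c.strip()]
    chunk_vecs = embed(chunks)

    # 2. Embed the query the same way.
    query = "What is the relationship between House Arvel and the river cults?"
    query_vec = embed([query])[0]

    # 3. Cosine similarity stands in for the vector DB's "nearness" search.
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    top_chunks = [chunks[i] for i in sims.argsort()[-5:][::-1]]

    # 4. Send the query plus the *text* of the nearest chunks -- not the
    #    vectors -- to the chat model.
    context = "\n\n".join(top_chunks)
    answer = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer using only the provided excerpts."},
            {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {query}"},
        ],
    )
    print(answer.choices[0].message.content)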

I don't know how to do any of that yet. So far it seems like Milvus might be the easiest vector DB to install locally. One thing that tripped me up: the vectors themselves never go to ChatGPT. They're only used for the similarity search; the final request is plain text (the query plus the retrieved passages), so the token limit applies to that text rather than to the embeddings.
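
For the Milvus part, the Milvus Lite mode bundled with the pymilvus client can run against a local file, which is probably the easiest local install. A sketch assuming pymilvus >= 2.4 and reusing `chunks`, `chunk_vecs`, and `query_vec` from the snippet above (collection name is a placeholder; 1536 is the ada-002 embedding size):

    # Sketch of storing and searching embeddings with Milvus Lite
    # (pip install pymilvus; no separate server needed).
    from pymilvus import MilvusClient

    db = MilvusClient("manuscript.db")  # local file-backed instance
    db.create_collection(collection_name="chunks", dimension=1536)

    # Insert each chunk's text alongside its embedding vector.
    db.insert(
        collection_name="chunks",
        data=[
            {"id": i, "vector": vec.tolist(), "text": text}
            for i, (vec, text) in enumerate(zip(chunk_vecs, chunks))
        ],
    )

    # Nearest-neighbour search on the query embedding; get the text back.
    hits = db.search(
        collection_name="chunks",
        data=[query_vec.tolist()],
        limit=5,
        output_fields=["text"],
    )
    top_chunks = [hit["entity"]["text"] for hit in hits[0]]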

(Ideally this could work against an open model that isn't ChatGPT.)


Ask GPT-4 to compress the prompt in several shots, then do one last shot over the compressed chunks.
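
One way to read that: summarize the manuscript chunk by chunk, then answer the question once against the concatenated summaries. A rough sketch, reusing `client`, `chunks`, and `query` from above (the compression you actually get is model- and prompt-dependent):

    # Sketch of the "compress in several shots" approach.
    def ask_gpt4(prompt):
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Shots 1..N: compress each chunk independently.
    summaries = [
        ask_gpt4(
            "Summarize the following passage in under 100 words, "
            f"keeping names, places, and factions:\n\n{chunk}"
        )
        for chunk in chunks
    ]

    # Final shot: answer against the compressed manuscript.
    compressed = "\n".join(summaries)
    print(ask_gpt4(f"Background notes:\n{compressed}\n\nQuestion: {query}"))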



