
I'm interested in RAG, so I built a benchmarking and optimization tool for RAG pipelines that use LLMs. AutoRAG: https://github.com/Marker-Inc-Korea/AutoRAG

Since it is a Python library, we publish it to PyPI. For my own use, though, I run it on an H100 Linux server inside a PyTorch Docker image with CUDA. Operating it needs only vim and bash. Plus, for running local models I love vLLM: I made my own vLLM Dockerfile and use it to deploy a local model in about 5 minutes.
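For anyone curious, a minimal sketch of that kind of setup using vLLM's official Docker image and OpenAI-compatible server (the image tag, model name, and flags here are my assumptions, not necessarily the exact ones I use):

```shell
# Sketch: serve a local model with vLLM in Docker on a GPU box.
# vllm/vllm-openai is vLLM's published image; the model is just an example.
docker run --gpus all -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-7B-Instruct-v0.2

# Once it's up, any OpenAI-compatible client can hit http://localhost:8000/v1
```

Mounting the Hugging Face cache means the model weights are downloaded once and reused across container restarts, which is what makes repeat deploys fast.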

FYI: renting a whole H100 instance is really expensive, but in my hometown the government provides us the instance to support AI research.
