lmstudio is using a dark pattern I really hate. Don't put a GitHub logo on your webpage if your software is not source available. The link just takes you to some random config repos they have on GitHub. This is a poor choice, in my opinion.
Since I couldn't find it in your list, I'd like to plug my own macOS (and iOS) app: Private LLM. Unlike almost every other app in the space, it isn't based on llama.cpp (we use mlc-llm), and it doesn't use naively RTN-quantized models (we use OmniQuant). The app also has deep integrations with macOS and iOS (Shortcuts, Siri, macOS Services, etc.).
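For anyone wondering what "naive RTN" means here: round-to-nearest quantization scales each weight tensor onto an integer grid and rounds every weight independently, with no calibration data. A minimal sketch of per-tensor RTN, just to illustrate the baseline technique (not Private LLM's or OmniQuant's actual pipeline):

    import numpy as np

    def rtn_quantize(weights: np.ndarray, bits: int = 4):
        """Naive round-to-nearest (RTN) quantization of a weight tensor.

        Each weight is scaled onto the integer grid and rounded
        independently; no calibration data is used.
        """
        qmax = 2 ** (bits - 1) - 1          # e.g. 7 for 4-bit signed
        scale = np.abs(weights).max() / qmax
        q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
        return q.astype(np.int8), scale     # dequantize with q * scale

    w = np.random.randn(4, 8).astype(np.float32)
    q, s = rtn_quantize(w)
    print(np.abs(w - q * s).max())          # per-tensor quantization error

OmniQuant improves on this baseline by learning clipping and scaling parameters from a small calibration set, which reduces the rounding error that plain RTN leaves behind.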
Incidentally, it currently runs the Mixtral 8x7B Instruct[2] and Mistral[3] models faster than any other macOS app. The comparison videos are against Ollama, but the result generalizes to almost every other macOS app I've seen, since they all use llama.cpp for inference. :)
nb: Mixtral 8x7B Instruct requires an Apple Silicon Mac with at least 32GB of RAM.
You can see ms/token in a tiny font at the top of the screen once the text generation completes, in both of the videos I'd linked to. Performance will vary by machine. On my 64GB M2 Max Mac Studio, I get ~47 tokens/s (21.06 ms/token) with Mistral Instruct v0.2 and ~33 tokens/s (30.14 ms/token) with Mixtral Instruct v0.1.
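If you want to convert between the two units, throughput is just the reciprocal of per-token latency:

    def tokens_per_second(ms_per_token: float) -> float:
        """Convert per-token latency (ms/token) into throughput (tokens/s)."""
        return 1000.0 / ms_per_token

    print(tokens_per_second(21.06))  # ~47.5 tokens/s (Mistral Instruct v0.2)
    print(tokens_per_second(30.14))  # ~33.2 tokens/s (Mixtral Instruct v0.1)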
I haven't run any specific low-level benchmarks lately, but chunked prefilling and the TVM auto-tuned Metal kernels from mlc-llm seemed to make a big difference the last time I checked. Also, compared to stock mlc-llm, I use a newer version of Metal (3.0) and have a few modifications that give models a slightly smaller memory and disk footprint, plus slightly faster execution. I can do that because, unlike the mlc-llm folks, I only care about compatibility with Apple platforms; their upstream project supports much more than that.
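To make "chunked prefilling" concrete: instead of pushing the whole prompt through the model in one giant batch, you feed it in fixed-size chunks, which bounds peak activation memory while still building the full KV cache before decoding starts. A toy sketch with a stand-in model (DummyModel, prefill, and CHUNK are hypothetical names for illustration, not the mlc-llm API):

    from typing import List

    CHUNK = 512  # tokens per prefill chunk; real systems tune this

    class DummyModel:
        """Stand-in for a real decoder stack; not the mlc-llm API."""
        def prefill(self, chunk: List[int], kv_cache: List[int]) -> int:
            kv_cache.extend(chunk)  # a real impl appends this chunk's K/V tensors
            return chunk[-1]        # stand-in for the last token's logits

    def chunked_prefill(model: DummyModel, prompt: List[int]):
        kv_cache: List[int] = []
        last = None
        for start in range(0, len(prompt), CHUNK):
            # Peak activation memory scales with CHUNK, not len(prompt);
            # each chunk attends over the cache built by earlier chunks.
            last = model.prefill(prompt[start:start + CHUNK], kv_cache)
        return last, kv_cache

    _, cache = chunked_prefill(DummyModel(), list(range(2000)))
    print(len(cache))  # 2000 -> the full prompt is cached, ready for decoding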
I am the author of the Msty app mentioned here. So humbled to see an app that is just about a month old, which I mostly wrote for my wife and some friends to begin with (who were getting overwhelmed by everything going on in the LLM world), at the top of your list. Thank you!
One bit of feedback: there's nowhere to put system messages. These can be much more influential than user prompts when it comes to shaping the tone and style of the response.
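For context, in the OpenAI-style chat format most local apps mirror, the system message is a separate role that frames every turn of the conversation. A minimal illustration of the payload shape (the model name and wording here are just placeholders):

    # The "system" role sets the tone and style once, for the whole chat,
    # independent of whatever the user types in each turn.
    payload = {
        "model": "mistral-7b-instruct",  # placeholder model name
        "messages": [
            {"role": "system",
             "content": "You are a terse assistant. Answer in plain prose, "
                        "no lists, no preamble."},
            {"role": "user", "content": "Explain quantization in two sentences."},
        ],
    }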
That's at the top of our list. It got pushed back because we want to support creating a character/profile (basically, select a model and apply some defaults, including a system prompt), but I feel like it was a mistake to wait for it. Regardless, it is getting added in the next release (the one after a release that is dropping in a day or two, which is a big release in itself).
1) What are the Mac system requirements? Does it need a specific OS version?
2) If you're privacy-first, many would feel a lot more comfortable if this were released as an app in the App Store so it would be sandboxed. This is important because it's not open source, so we have no idea what is happening in the background. Alternatively, open source it, which many here have requested.
It loads the LLM in the browser using WebGPU, so it works offline after the first load; it's also a PWA you can install. It should work on Chrome > 113 on desktop and Chrome > 121 on mobile.
Oh thanks! I didn't know there were quite a few local ChatGPT alternatives. I was wondering which users they are targeting: engineers or average users? I guess average users will likely choose ChatGPT and Perplexity over local apps for more recent knowledge of the world.
Hi. I'm the author of Msty, the 2nd app on the list above. You are right about average users likely choosing ChatGPT over local models. My wife was the first and the biggest user of my app. She is a software engineer by profession and training, but she prefers not to worry about the LLM world and just use it as a tool that makes her more productive.

As soon as she took Msty for a ride, I realized that some users, regardless of their background, care about online models. This actually led me to add support for online models right away. She really likes the parallel chat feature: she gives the same prompt to both Mistral and ChatGPT models, compares the outputs, and picks the best answer (or sometimes makes a hybrid of the two). She says that being able to compare multiple outputs like that is tremendously helpful. But that's the extent of local LLMs for her.

So far my effort has been to target a bit above the average user while keeping the app approachable for more advanced users as well.
Looks great, though the fact that you have to ignore your antivirus warning during installation, and the fact that it phones home (to insights.msty.app) right after launch despite the FAQ's line about not collecting any data, make me a little skittish.
I’m looking for a ChatGPT client alternative, i.e., some other client where I can use my own OpenAI API key.
Offline isn’t important to me; it’s just that $20 a month is a lot of money when I’d wager most months my usage would cost a lot less. However, I’d still want access to completion, DALL-E, etc.
Give it a try and see how you feel. "Yes, it will" would be a dishonest answer, to be completely honest, at least at this point. The app has been out for just about a month and I am still working on it. I would love for a user like you to give it a try and give me some feedback (please!). I am very active on our Discord if you want to get in touch (just mention your HN username and I will wave).
Thanks for the list. I tried Jan just now as it is both easy to use and open source. It is a bit buggy, I think, but the concept is ace: quick install, it tells you which models work on your machine, one-click download, and then a ChatGPT-style interface. Mistral 7B running on my low-spec laptop at 6 tokens/s while making some damn sense is amazing. The bugs show up at inference time; could be hardware issues, though, I'm not sure. YMMV.
Author of Msty here. Not yet, but I am already working on the design for it, to be added in the very near future. I am happy to chat more with you to understand your needs and what you are looking for in such apps. Please hop on the Discord if you don't mind :)
Some of my use cases would be summarizing a PDF report, analyzing JSON/CSV data, uploading a dev project so it can write a function or feature or build a UI, renaming image files, categorizing images, etc.
Yes, GPT4All has RAG-like features. Basically, you configure some directories and then have it load docs from whatever folders you have enabled for the model you're currently using. I haven't used it a ton, but I have used it to review long documents and it has worked well, depending on the model.
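Under the hood, that kind of directory-based RAG boils down to: chunk the local docs, score chunks against the question, and prepend the best ones to the prompt. A toy sketch of the retrieval step using plain word overlap (GPT4All's actual implementation uses real embeddings; the folder name and chunk size here are made up):

    from collections import Counter
    from pathlib import Path

    def score(query: str, chunk: str) -> int:
        """Toy relevance score: bag-of-words overlap (real RAG uses embeddings)."""
        q, c = Counter(query.lower().split()), Counter(chunk.lower().split())
        return sum(min(q[w], c[w]) for w in q)

    def retrieve(query: str, folder: str, k: int = 3, chunk_size: int = 500):
        """Split every .txt file under `folder` into chunks, return the top k."""
        chunks = []
        for path in Path(folder).glob("**/*.txt"):
            text = path.read_text(errors="ignore")
            chunks += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]

    question = "What were the key findings of the report?"
    context = "\n---\n".join(retrieve(question, "docs"))
    prompt = f"Use this context:\n{context}\n\nQuestion: {question}"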
I was testing apps like this, if anyone is interested:
Best / Easy to use:
- https://lmstudio.ai
- https://msty.app
- https://jan.ai
More complex / Unpolished UI:
- https://gpt4all.io
- https://pinokio.computer
- https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generat...
- https://github.com/LostRuins/koboldcpp
Misc:
- https://faraday.dev (AI Characters)
No UI / Command line (not for me):
- https://ollama.com
- https://privategpt.dev
- https://serge.chat
- https://github.com/Mozilla-Ocho/llamafile
Pending to check:
- https://recurse.chat
Feel free to recommend more!