Yes, I have some experience with TiDB. It is pretty amazing, actually. They came up with a novel way of distributing data across nodes that gives strong consistency while also maintaining great performance. We are recommending it to some of our clients who are looking for an easy scaling option with MySQL (TiDB is MySQL-compatible at the connector level).
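For anyone wondering what "MySQL-compatible at the connector level" means in practice: a stock MySQL driver can talk to TiDB unchanged. A minimal sketch (host and credentials are placeholders; 4000 is TiDB's default SQL port):

```python
# Any MySQL driver works because TiDB speaks the MySQL wire protocol.
# Host/user/password below are placeholders, not a real deployment.
import pymysql

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="root", password="", database="test")
try:
    with conn.cursor() as cur:
        cur.execute("SELECT VERSION()")
        print(cur.fetchone())  # MySQL-compatible version string with a TiDB suffix
finally:
    conn.close()
```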
> I asked the AI to write me some code to get a list of all the objects in an S3 bucket
They didn’t ask for all the objects in the first returned page of the query; they asked for all the objects. The necessary context is there.
LLMs are just on par with devs who don’t read tickets properly / don’t pay attention to the API they’re calling (I’ve had this exact case happen with someone on a previous team, and it was a combination of both).
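For reference, getting all the objects means paginating; a rough boto3 sketch (the bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")
keys = []
# list_objects_v2 caps each response at 1000 keys, so paginate to get them all
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-bucket"):
    for obj in page.get("Contents", []):
        keys.append(obj["Key"])
```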
The binaries are a few KB at the moment. I haven't measured the build times, because it's too early for that to mean much. The set of supported types and functions is very limited, so it will change a lot over time as I add more stuff.
One interesting thing is that `eval()` support will require custom WebAssembly host functions, because you can't do custom code generation in WASM. Thus by default the project will assume "no eval" compilation. In this mode it will be possible to do a lot of optimizations, for example removing unused parts of the language/types, applying certain optimizations knowing exactly which types the script is dealing with, etc. So a simple script that doesn't use a lot of the builtins should eventually result in a fairly small binary.
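To illustrate what a host function looks like in general (this is a generic wasmtime-py sketch, not the project's actual interface; `host_eval` is a made-up name standing in for whatever an eval shim would call):

```python
from wasmtime import Engine, Store, Module, Instance, Func, FuncType, ValType

engine = Engine()
store = Store(engine)

# Guest module that delegates "eval-like" work to an imported host function,
# since it cannot generate and run new code inside the sandbox itself.
module = Module(engine, """
(module
  (import "env" "host_eval" (func $host_eval (param i32) (result i32)))
  (func (export "run") (param i32) (result i32)
    local.get 0
    call $host_eval))
""")

# The host supplies the implementation; here it just doubles the input.
host_eval = Func(store, FuncType([ValType.i32()], [ValType.i32()]), lambda x: x * 2)

instance = Instance(store, module, [host_eval])
print(instance.exports(store)["run"](store, 21))  # -> 42
```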
If you're using an LLM as a compressed version of a search index, you'll be constantly fighting hallucinations. Respectfully, you're not thinking big-picture enough.
There are LLMs today that are amazing at coding, and when you allow them to iterate (e.g. respond to compiler errors), the quality is pretty impressive. If you can run an LLM 3x faster, you can enable a much bigger feedback loop in the same period of time.
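A rough sketch of that kind of loop, with `generate_code` standing in for whatever model/client you use and Go's compiler as the checker:

```python
import subprocess
import tempfile

def generate_code(prompt: str) -> str:
    """Placeholder for an actual LLM call (OpenAI, Ollama, whatever)."""
    raise NotImplementedError

def compile_loop(task: str, max_iterations: int = 3) -> str:
    """Ask the model for code, try to build it, and feed errors back until it compiles."""
    prompt = task
    source = ""
    for _ in range(max_iterations):
        source = generate_code(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".go", delete=False) as f:
            f.write(source)
        result = subprocess.run(["go", "build", f.name], capture_output=True, text=True)
        if result.returncode == 0:
            return source  # it builds; hand off to tests / human review
        # Feed the compiler output back so the model can fix its own mistakes.
        prompt = f"{task}\n\nPrevious attempt:\n{source}\n\nCompiler errors:\n{result.stderr}"
    return source
```

A 3x faster model means roughly 3x more of these round trips in the same wall-clock time.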
There are efforts to enable LLMs to "think" by using chain-of-thought, where the LLM writes out reasoning in a "proof"-style list of steps. Sometimes, like a person, they reach a dead end logic-wise. If you can run 3x faster, you can start to run the "thought chain" as more of a "tree", where the logic is critiqued and adapted and where many different solutions can be tried. This can all happen in parallel (well, each sub-branch can).
Then there are "agent" use cases, where an LLM has to take actions on its own in response to real-world situations. Speed really impacts user-perception of quality.
> There are LLMs today that are amazing at coding, and when you allow them to iterate (e.g. respond to compiler errors), the quality is pretty impressive. If you can run an LLM 3x faster, you can enable a much bigger feedback loop in the same period of time.
Well, now the compiler is the bottleneck, isn't it? And you would still need a human to check for bugs that aren't caught by the compiler.
Still nice to have inference speed improvements tho.
Something will always be the bottleneck, and it probably won’t be the speed of electrons for a while ;)
Some compilers (Go) are faster than others (javac), and some languages are interpreted and can only be checked through tests. Moving the bottleneck from the AI code-gen step to the same bottleneck a person has seems like a win.
And yet it takes a non-zero amount of time. I think an apt comparison is a language like C++ vs Python. Yea, technically you can write the same logic in both, but you can't genuinely say that "spelling out the code" takes the same amount of time in each. It becomes a meaningful difference across weeks of work.
With LLM pair programming, you can basically say "add a button to this widget that calls this callback" or "call this API with the result of this operation", and the LLM will spit out code that does that thing. If your change is entirely within 1-2 files and under 300 LOC, it arrives in a few seconds, right in your IDE, and is probably syntactically correct.
It's human-driven, and the LLM just handles the writing. The LLM isn't doing large refactors, nor is it designing scalable systems on its own. A human is doing that still. But it does speed up the process noticeably.
If the speed is used to get better quality with no more input from the user, then sure, that is great. But that is not the only way to get better quality (though I agree there is some low-hanging fruit in the area).
To be honest, most LLMs are reasonable at coding, but they're not great.
Sure, they can code small stuff.
But they can't refactor large software projects, or upgrade them.
Upgrading large Java projects is exactly what AWS wants you to believe their tooling can do, but the ergonomics aren't great.
I think most of the capability problems with coding agents aren't the AI itself, it's that we haven't cracked how to let them interact with the codebase effectively yet. When I refactor something, I'm not doing it all at once, it's a step by step process. None of the individual steps are that complicated. Translating that over to an agent feels like we just haven't got the right harness yet.
Honestly, most software tasks aren’t refactoring large projects, so it’s probably OK.
As the world gets more internet-connected and more online, we’ll have an ever-expanding list of “small stuff”: glue code that mixes an ever-growing list of data sources/sinks and visualizations together. Many of which are “write once” and left running.
Big companies (e.g. Google) have built complex build systems (e.g. Bazel) to isolate small reusable libraries within a larger repo, which was a necessity to help unbelievably large development teams manage a shared repository. An LLM acting in its small corner of the world seems well suited to this sort of tooling, even if it can’t refactor large projects spanning large changes.
I suspect we’ll develop even more abstractions and layers to isolate LLMs and their knowledge of the world. We already have containers and orchestration enabling “serverless” applications, and embedded webviews for GUIs.
Think about ChatGPT and its Python interpreter, or Claude and its web view. They all come with nice harnesses to support a boilerplate-free playground for short bits of code. That may continue to accelerate and grow in power.
> The biggest time sink for me is validating answers so not sure I agree on that take.
But you're assuming that it'll always be validated by humans. I'd imagine that most validation (and subsequent processing, especially going forward) will be done by machines.
By comparison with reality. The initial LLMs had "reality" be a training set of text; when ChatGPT came out, everyone rapidly expanded into RLHF (reinforcement learning from human feedback), and now that there are vision-and-text models, the training and feedback are grounded in a much broader slice of reality than just text.
That's one way to do it, but it's overkill for this specific thing; self-driving cars or robotics, or natural use of smart-[phone|watch|glass|doorbell|fridge] devices, are likely sufficient.
Total surveillance may be necessary for other reasons, like making sure organised crime can't blackmail anyone because the state already knows it all, but it's overkill for AI.
Not if you source your training data from reality.
Are you treating "the internet" as "reality" with this line of questions?
The internet is the map; don't mistake the map for the territory. It's fine as a bootstrap but not as the final result, just like it's OK for a human to research a topic by reading Wikipedia but not to use it as the only source.
Sooner or later someone is going to figure out how to do active training on AI models. It's the holy grail of AI before AGI. This would allow you to do base training on a small set of very high quality data, and then let the model actively decide what it wants to train on going forward or let it "forget" what it wants to unlearn.
1. AI can do what we can do, in much the same way we can do it, because it's biologically inspired. Not a perfect copy, but close enough for the general case of this argument.
2. AI can't ever be perfect because of the same reasons we can't ever be perfect: it's impossible to become certain of anything in finite time and with finite examples.
3. AI can still reach higher performance in specific things than us — not everything, not yet — because the information processing speedup going from synapses to transistors is of the same order of magnitude as walking is to continental drift, so when there exists sufficient training data to overcome the inefficiency of the model, we can make models absorb approximately all of that information.
Does the AI need to know or the curator of the dataset? If the curator took a camera and walked outside (or let a drone wander around for a while), do you believe this problem would still arise?
This looks really interesting and could be a nice addition to my daily work.
I just downloaded the application but am unable to add OpenAI API keys. It looks like it's probably on my end (I run quite aggressive DNS blocking lists). So my guess here is: I'm unable to add API keys when telemetry is blocked.
Suggestion: please add some error message when this occurs. As in: did the request fail (500), was the key faulty, etc.
Thank you for the direct actionable feedback, we will improve that messaging.
Regarding debugging your specific problem: when an API key is added, the local process attempts a 1-token request to the cheapest model on that platform (in your case, gpt-4o-mini on OpenAI) to verify that the key works. Though if the account has no balance, this request may fail even though it costs a fraction of a fraction of a penny (and anything that fails that request will cause the API key to be considered invalid).
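Conceptually the check is just something like this (a sketch using the OpenAI Python SDK, not the exact implementation):

```python
from openai import OpenAI, AuthenticationError, APIConnectionError, RateLimitError

def key_looks_valid(api_key: str) -> bool:
    """Fire a minimal 1-token request at a cheap model to verify the key."""
    client = OpenAI(api_key=api_key)
    try:
        client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
        )
        return True
    except AuthenticationError:
        return False  # bad key
    except RateLimitError:
        return False  # no balance/quota; still treated as an invalid key here
    except APIConnectionError:
        return False  # DNS/network blocked, as in the parent comment
```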
Built a couple of things with Semantic Kernel: some private test projects, but also two customer-facing applications and one internal one.
It's heavily tilted towards OpenAI and its offerings (either through the OpenAI API or through Azure). However, it works decently enough with other alternatives as well, like Hugging Face or Ollama. Compared to the others (CrewAI etc.), I kind of feel like Semantic Kernel hasn't really solved observability yet. Sure, you can connect whatever logging/metrics solution .NET supports, but it's not as seamless as the others. Semantic Kernel is available in .NET, Java and Python, but it's quite obvious .NET is a lot more polished than the others. Python usually gets new features faster, or at least PoCs or previews.
Some learnings from it all:
- It's quite easy to get started with
- I like the distinction between native plugins and text-based ones (whether a plugin should run code or not); see the sketch below
- There is a feeling of black magic in the background, in the sense of observability
- A bit more manual work to get things in order, compared to the alternatives
- Rapid development; it's quite clear the development team at Microsoft is putting a lot of work into this library
All in all, if you feel comfortable writing C#, then Semantic Kernel is totally a viable option. If you prefer Python over anything else, then I would say LlamaIndex or LangChain is probably a better option (for now).
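For the native-plugin point above, a minimal sketch against the Python SDK (decorator and method names are from my memory of the 1.x API and may differ between Semantic Kernel versions):

```python
from datetime import date
from typing import Annotated

from semantic_kernel import Kernel
from semantic_kernel.functions import kernel_function

class TimePlugin:
    """A 'native' plugin: plain code the model can invoke, as opposed to a prompt-based one."""

    @kernel_function(name="today", description="Returns today's date in ISO 8601 format.")
    def today(self) -> Annotated[str, "The current date"]:
        return date.today().isoformat()

kernel = Kernel()
kernel.add_plugin(TimePlugin(), plugin_name="time")
```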
Thanks. I would have preferred to use Go instead of Python, but somehow the language isn't picking up much in terms of new LLM frameworks.
As of now, I am using very lightweight abstractions over prompts in Python, and that gets the job done. But it is way too early, and I can see how pipelining multiple LLM calls would need a good library that is not too complex and involved. In the end it is just an API call, and you hope for the best result :)
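The kind of lightweight pipelining I mean is barely more than this (with `complete()` standing in for whichever API you actually call):

```python
from typing import Callable

Step = Callable[[str], str]

def complete(prompt: str) -> str:
    """Placeholder for a single LLM API call."""
    raise NotImplementedError

def pipeline(*steps: Step) -> Step:
    """Chain steps so each one's output becomes the next one's input."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run

summarize = lambda text: complete(f"Summarize the following:\n{text}")
translate = lambda text: complete(f"Translate to English:\n{text}")
summarize_then_translate = pipeline(summarize, translate)
```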
Since you prefer Go, you might be interested in one of my pet projects, where I've glued together some libraries that basically let you code all interactions with LLMs through Lua. The project is written in Go.
Currently it only supports Ollama, but I've been thinking about adding support for more providers.
As you can see, it's at a very early stage. I'm not a Go developer, and I use this repository as a way to explore things both within Ollama and with Go.
I'll probably add more things as time goes by, but it isn't something I hack on every day or, for that matter, every week. It's just something to poke around and explore things with.
I don't see it like that at all. Some 0-days can (somewhat) be mitigated by other hardware/software.
I'd rather have as many "known" 0-days out in the open than have it the other way, even if it means I won't see any updates to affected devices or software.
Just want to say thank you for your contribution. A lot of the wonky stuff made it really funny to watch, but at the same time impressive (for what it was).
Do you have any links to share with details about the project? It would be fun to understand more about the process, but also the work in general.
A lot of it is not mine to share. I was basically a CXO hire brought in by Skyler Hartle and Brian Habersberger, who created it*, to work with them on assessing the viability of the product they'd built behind the scenes as a business, its levels of scale, etc. Skyler and Brian are amazing, genius-level dudes (and also HN users!). I know the whole story end to end, pretty much exactly as it unfolded, and it's not really anything like this blog post thinks it is, to the point that I don't think I even agree with the author. Again: I'm sorry, but much of it is not mine to share, for a great many reasons.
*As a side project, over years and years, before it took off "overnight". There is a lot of unmentioned tech they built that's totally unrelated to LLMs etc., much of it still worth not mentioning, because it's still "cutting edge" even today.
It's going to be interesting to see what Wendell finds out about the oddities on Windows.
I think with this CPU it will also be the first time I'll no longer dual boot for gaming, or for that matter have a dedicated Windows machine only for gaming.
Does anyone have any experience with TiDB? I hadn't heard about it before this post.