Hacker News | llogiq's comments

The problem with C++ vs. unsafety is that there is really no boundary: All code is by default unsafe. You will need to go to great lengths to make it all somewhat safe, and then to even greater lengths to ensure any libraries you use won't undermine your safety.

In Rust, if you have unsafe code, the onus is on you to ensure its soundness at the module level. And yes, that's harder than writing the corresponding C++, but it makes the safe code using that abstraction a lot easier to reason about. And if you don't have unsafe code (which is possible for a lot of problems), you won't need to worry about UB at all. Imagine never needing to keep all the object lifetimes in your head because the compiler does it for you.
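To illustrate the point about sound abstractions, here is a minimal sketch (names are made up for illustration) of confining `unsafe` behind a module boundary: the `unsafe` block is justified locally, and the public API makes it impossible for safe callers to trigger UB.

```rust
// A module-level safe abstraction: all `unsafe` lives inside this module,
// and the public API upholds the invariant (the index is always in bounds),
// so code using it can never trigger UB.
mod evens {
    pub struct Evens {
        data: Vec<u32>,
    }

    impl Evens {
        pub fn new(n: usize) -> Self {
            Evens { data: (0..n as u32).map(|i| i * 2).collect() }
        }

        /// Safe wrapper: we check the bound once, then use unchecked
        /// access. Soundness is argued locally, inside this module.
        pub fn get(&self, i: usize) -> Option<u32> {
            if i < self.data.len() {
                // SAFETY: `i < self.data.len()` was just checked above.
                Some(unsafe { *self.data.get_unchecked(i) })
            } else {
                None
            }
        }
    }
}

fn main() {
    let e = evens::Evens::new(4);
    println!("{:?}", e.get(2)); // prints "Some(4)"
    println!("{:?}", e.get(10)); // prints "None"
}
```

Everything outside the module is plain safe Rust; the reviewer only needs to audit the module itself, not every call site.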


So professional work during a worldwide pandemic with all the hardship that entailed suffered...from working remotely? I don't buy that.


That very much depends on the code you're compiling. The factors that come into play are monomorphization (which means the compiler builds one copy of the code per type it is called for), procedural macros (which need to be fully compiled before being able to expand code using them), whether complex type shenanigans are used, etc. etc. Absent that, Rust will compile roughly as fast as C nowadays.
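As a small sketch of what monomorphization means for compile times: a generic function is compiled once per concrete type it is used with, so every extra instantiation is extra work for the compiler and the linker.

```rust
// Monomorphization: the compiler emits one copy of `largest` per concrete
// type it is instantiated with (here: i32 and f64). Heavily generic code
// multiplies this, which is one reason it compiles more slowly than C.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &item in &items[1..] {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    // Two instantiations -> two monomorphized copies in the binary.
    println!("{}", largest(&[3, 7, 2])); // i32 copy, prints "7"
    println!("{}", largest(&[0.5, 1.25, 0.75])); // f64 copy, prints "1.25"
}
```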


There is so much wrong with this article. Throw a little bit of ML pixie dust on everything for more hype? Check. Compare wildly different things as if they were the same? Double check.

digdugdirk has the right idea, and AFAIR, there is some work on that front (https://www.fornjot.app/).

Also the Fiat 500 goes 100km on about 4l of gas, while the Ford F150 uses 7l. No clue where the author gets the idea that the Fiat would get worse mileage, perhaps he's dividing by weight?

The rest, I don't even.


Very nice diagrams, they make the article really easy to follow! This really drives home the point that vector search isn't only a quantitative (as in faster) but a qualitative evolutionary step.

Makes one wonder what other use cases are lurking that would need just another small modification and haven't even been thought of yet because they used to be impossible to implement.
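For readers new to the topic, here is a toy sketch of what vector search does under the hood: embed items as vectors, then rank by cosine similarity instead of exact keyword match. The "embeddings" below are hypothetical 3-dimensional values for illustration only, and real engines like Qdrant use approximate indexes rather than this brute-force scan.

```rust
// Cosine similarity: dot product divided by the product of the norms.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (na * nb)
}

// Brute-force nearest neighbor: return the document whose embedding is
// most similar to the query embedding.
fn nearest<'a>(query: &[f64], docs: &'a [(&'a str, Vec<f64>)]) -> &'a str {
    docs.iter()
        .max_by(|(_, a), (_, b)| {
            cosine(query, a).partial_cmp(&cosine(query, b)).unwrap()
        })
        .map(|(name, _)| *name)
        .unwrap()
}

fn main() {
    let docs = vec![
        ("rust article", vec![0.9, 0.1, 0.0]),
        ("cooking blog", vec![0.0, 0.2, 0.9]),
    ];
    // A query embedding that is "semantically close" to the first document,
    // even though it shares no keywords with it.
    let query = vec![0.8, 0.2, 0.1];
    println!("{}", nearest(&query, &docs)); // prints "rust article"
}
```

The qualitative shift is that the match is by meaning (proximity in embedding space), not by shared tokens.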


I like their documentation in general and learned a lot from it. Especially since - unlike Pinecone (which has good documentation too) - they don't focus primarily on their commercial offerings. I feel it's really written with the intention to inform first and to sell second.


All the diagrams are dead images for me :(


That's weird! A force refresh (CTRL+F5) doesn't work?


I totally agree that the latency of this solution leaves a lot of room for improvement. But that's totally beside the point of the article, which is that people can get no-cost semantic search for their personal website using those services. They can of course also use other solutions.

Also I'm experimenting in further integrating things to reduce latency and most likely will publish another article within the month. Stay tuned.

Finally I somewhat agree that many of the players in the vector DB space try to push their cloud offerings. Which is fine, how else should they make money? And if latency matters that much to you, Qdrant offers custom deployments, too. I believe running Qdrant locally will handily beat your LanceDB solution perf-wise unless you're talking about less than 100k entries. We have both docker containers and release binaries for all major OSes, why not give it a try?


The initial version of this actually used Mighty, but I didn't find any free tier available, so I switched to Cohere to keep the $0 pricetag.


Mighty is free if you're not making money from it. You could have used Mighty and I would have been glad to help you set it up :)


There's a bit of a difference between what you see following the 'purchase' link and what you see if you scroll down to 'pricing' on your site. It confused me at first too - I'm just so used to seeing a 'pricing' link in the top bar, I pretty much always go there first to see if there's a reasonable free tier for me to play with something.


Thanks for the feedback! I'll do my best to make things more clear.


Author here. This was a fun exercise to produce a semantic search using only free-tier services (which in the case of Cohere means non-commercial use only and in the case of AWS Lambda is limited to one year).

It also marks my first foray into using cloud services for a project. I've long been a cloud sceptic, and doing this confirmed some of my suspicions regarding complexity (mostly the administrative part around roles and URLs), while the coding part itself was a blast.


Cool, but I suggest looking into terraform for "infrastructure as code": creating all sorts of aws services/infrastructure and maintaining state.

It seems complex at first, but it is a lot more maintainable and portable than creating aws infrastructure manually in the console. Once you leave your service to run for 6 months you will forget where stuff is; then, at the worst possible moment, if it goes down and you need to make some change, you'll be frantically searching the aws docs... "can I create a synthetic canary and use the lambda I already have, or do I have to delete it and create it from the CloudWatch interface?" These kinds of questions are the bane of the aws ops experience... And once you learn everything, they "bring a new console experience"... So I prefer to learn terraform once and that's it.

Why terraform and not python with boto, cdk, cloudformation, ansible or something else? Because terraform is easy to port between providers (sort of), people who are not that good at python find terraform easier so you don't need "senior" people to maintain your code, and finally it's pretty "opinionated" about how stuff should be done, so it's unlikely you'll open your project in a year and think "why in the world did I do that!?", because all your tf projects will most likely be very similar. Also tf is mainly for infrastructure as code; there is no configuration management like in ansible... It is for one thing and it does it relatively well. (I have no relation to TF beyond being a happy user.)


Thank you for the suggestion! I actually thought about using terraform, but I wanted to keep the experiment somewhat minimal regarding technologies and as I had already added AWS Lambda and Rust to the tech stack, I wanted to stay as close to the metal as possible. Besides, this is not for commercial applications, so I don't think high availability is in scope.


I find serverless to be needlessly complex. I'd rather write an HTTP server and serve it off of a t3.micro instance (also free-tier eligible). So much simpler for side projects.
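To make the comparison concrete, a dependency-free HTTP server of the kind you could run on a t3.micro fits in a few dozen lines of std-only Rust. This is a toy sketch (it serves one hard-coded response and, for demonstration, acts as its own client so the program terminates); a real side project would likely use a framework and bind to port 80 or 443.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Build a minimal HTTP/1.1 response carrying the given body.
fn response(body: &str) -> String {
    format!(
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\n\r\n{}",
        body.len(),
        body
    )
}

fn main() -> std::io::Result<()> {
    // Bind to an OS-assigned local port; on a real instance you'd bind
    // 0.0.0.0:80, likely behind a reverse proxy for TLS.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Serve a single request in a background thread so this demo exits.
    let server = thread::spawn(move || -> std::io::Result<()> {
        let (mut stream, _) = listener.accept()?;
        let mut buf = [0u8; 1024];
        let _ = stream.read(&mut buf)?; // ignore request details
        stream.write_all(response("Hello, world!").as_bytes())
    });

    // Act as our own client for demonstration purposes.
    let mut client = TcpStream::connect(addr)?;
    client.write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")?;
    let mut reply = String::new();
    client.read_to_string(&mut reply)?;
    server.join().unwrap()?;
    println!("{}", reply.lines().last().unwrap()); // prints "Hello, world!"
    Ok(())
}
```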


I find "serverless" is indeed more complex, because it's a higher abstraction layer. Often, I see people deploying containers, lambdas, or pods that are full unix environments, with permissions, network routing, filesystems, etc. And then, because it's "serverless", they use permissions (IAM), networking (VPC, etc.), filesystems (S3, etc.), and other capabilities that they already have at a lower abstraction level (unix) and are sort of also using. So the complexity of a unix server is just a unix server, but the complexity of "serverless" is a unix server plus all the capabilities you duplicated at a higher abstraction level.

Many other commenters replying to https://news.ycombinator.com/item?id=36693471 are interpreting "complex" as "hard for me to set up." I think that's neither here nor there -- no matter what's underneath, you can always rig something to deploy it with the press of a button. The question is: how many layers of stuff did you just deploy? How big of a can of worms did you just dump on future maintainers?


Serverless is too broad a category to say things like "it's too complex". For example, if you already know docker, you can use google cloud run and just deploy the container to it. You then just say "I want to allow this many simultaneous connections, a minimum of N instances, a maximum of M instances, and each instance should have X vcpus and Y gb of ram".


When starting this project I thought the same thing, but having done it I honestly cannot tell that much of a difference. Yes, there are two more steps in setting up the Lambda function, but in the end you still write an HTTP server and have them serve it.


Using a decent IaC framework such as Serverless Framework or the CDK instead of the AWS CLI would make the deployment pretty easy.


I also found - while writing the article, but after I had already done my research - that cargo-lambda has grown some additional functionality that could have removed the need for the AWS CLI. But I wanted to get the article out, so I didn't test-drive that.


When using an EC2 instance, testing, deployment, and adding new endpoints are all simpler.


Easier for you* I've done both for years now and I find developing, deploying, testing lambdas much simpler.


I agree on testing and dev, but for deployment I think stuff like elastic beanstalk or app engine strike a good balance. Almost never run pure EC2.


“Serverless” often has some upfront complexity, but I greatly prefer it because once I have it running I’ve never had scaling issues or even had to think about them. To each their own, and I’m sure that serverless isn’t the answer for everyone, but for my projects (which are very bursty, with long periods of inactivity) it’s a dream.


It's a bit easier in Python if you use tools like https://www.serverless.com/. I'm not sure if Rust has something similar yet.


At the cost of being very specific to Rust, Shuttle is pretty damn simple. https://www.shuttle.rs/


It's kind of unclear to me, can I use shuttle without using shuttle.rs (the platform) to actually run it?

Not that I am against paying for a service, but the idea of writing my app with a specific library against a specific platform makes me uneasy.

They have a github project but I think that is just the CLI + rust libs?


From what I've read you can, but I haven't tried myself or looked into it too deeply.


First login failed with: "Callback handler failed. CAUSE: Missing state cookie from login request (check login URL, callback URL and cookie config)." but after retrying it went to Projects list. API Key copy button doesn't do anything.


Yeah, it seems the premise of serverless is that your code always restarts, which is exactly the same as the cloud. The only difference is who stands in front of the trillion explosive gotchas in the giant 200GB of free middleware called GNU/Linux: their employees in the serverless case, versus you in the cloud case.

UNIX is close to turning 50, and people are fundamentally paying, as well as getting paid, to make a written program loop back to the beginning instead of exiting. I think this is kind of wrong.


It depends what you’re doing. I’ve run many side projects off a single Lambda function with the “public URL” config enabled. I pay $0 because of the free tier and updating the code is as simple as pushing a ZIP file. No SSH, no OS updates, nothing else to worry about. You start to get into trouble when you try to break your app into tons of microservices without using some kind of framework or deployment tooling to keep it straight.


What about serverless do you find to be “needlessly complex”?


There are just too many required parameters to create a single handler. And then you need to do that N times for each handler. Take a look at a complete Terraform example for a lambda: https://github.com/terraform-aws-modules/terraform-aws-lambd...

For a personal project it's just a bit much in my experience, especially since most personal projects can easily be served by a t3.micro.


Thanks for clarifying. That’s a fair critique.


To be fair, it is (mostly) a Rube Goldberg machine designed to keep backend engineers employed.



Nick's articles are always a delight to read, and this one is no exception. Openly discussing failure is one part, but coming up with so many ideas and having the gumption to actually try them out on a compiler is really fascinating.


This is a nice article for people new to async/await in Rust.

