
This discussion is not complete without a mention of Marcus Hutter’s seminal book[0] “Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability”. It provides many of the formalisms upon which metrics of intelligence are based. The gaps in current AI tech are pretty explainable in this context.

[0] https://www.hutter1.net/ai/uaibook.htm


If you do this be sure to buy BPA free receipt paper

Handling receipt paper is what turned out to be the cause of the high BPA numbers when boba tea was tested https://x.com/natfriedman/status/1899641377002025252


The article seems a few months too late. Claude (and others) are already doing this: I've been instructing Claude Code to generate code following certain best practices provided through URLs, or asking it to compare approaches from different URLs. Claude Skills use file "URLs" to provide progressive disclosure: detailed text is only pulled into the context if needed. This helps reduce context size and improves cacheability.

The number of public WiFi networks (including in-flight ones) I've bypassed by running a VPN server on UDP port 53 is honestly insane. Sadly, this is becoming less common: many captive portals don't allow any egress at all aside from the captive portal's IP - but alas - it's still impressive how many are susceptible. It also bypasses traffic shaping (speed limiting) on most publicly accessible networks, even ones that require some kind of authorization before allowing external access.

Highly recommend SoftEther, as it gives you juicy Azure relay capability for free, which is allowed on more "whitelist only" networks than your own VPS server.

Haven't gone so far as to enable iodine for actual two-way DNS communication through a third-party DNS resolver, but that would probably work in more cases than this, albeit slower.
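
A minimal sketch of the first step of this trick: checking whether a network lets arbitrary UDP out on port 53 before you bother pointing a VPN at it. It assumes you control a server at server_ip that answers whatever it receives on that port; none of this comes from the comment above.

    import socket

    def udp53_open(server_ip: str, timeout: float = 2.0) -> bool:
        # Send a junk datagram to our own server's port 53 and see whether
        # anything comes back. A reply means the captive portal let raw
        # UDP/53 egress through, so a VPN listening there is worth a try.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        try:
            sock.sendto(b"ping", (server_ip, 53))
            sock.recvfrom(512)
            return True
        except socket.timeout:
            return False
        finally:
            sock.close()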


The most important question for every cross platform framework is what happens to the UI?

Adobe products (both the Creative Suite, and their Flex Builder environment for Flash app) had their own design system that felt foreign on every platform it shipped on. If you wanted something that felt native, you had to reimplement e.g. Apple Aqua in Flash yourself.

Flutter goes out of its way to do that work for you, aiming for a "Cupertino" theme that looks-and-feels pixel-perfect on iOS.

React Native tries to delegate to platform primitives for complex widgets, so scroll views still feel like Apple's when on Apple's platform.

Just about every top-level comment here is talking about that in one way or another; yet the blog post doesn't mention it at all.

It's possible that Apple/Swift's mindshare among developers will lead to a significant number of apps shipping the Swift version for Android even if it means using Apple's UI, simply because they can't be bothered to make something bespoke for Android. Then again, Apple takes so much pride in its design language that it might not be willing to implement anything that feels good on a platform they don't own. If they were to ship an API-compatible widget toolkit, it might e.g. use intentionally bad spring physics to remind you you aren't on an iPhone.

I wonder how big the community part of this is. Is this an open source project of non-Apple people who are trying to break Apple's platform out of its walled garden? Is a lot of it funded by Apple? Ultimately, that's going to shape a lot of how this plays out.


  In the past, browsers used an algorithm which only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for .co.uk which would be passed onto every website registered under co.uk.

  Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list. This is the aim of the Public Suffix List.
  
  (https://publicsuffix.org/learn/)
So, once they realized web browsers are all inherently flawed, their solution was to maintain a static list of websites.

God I hate the web. The engineering equivalent of a car made of duct tape.
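
For what it's worth, consuming the list is usually one library call away. Here's a sketch using the tldextract Python package (an assumption on my part; it bundles the Public Suffix List) to find the registrable domain, i.e. the boundary below which cookies may be scoped:

    import tldextract

    # tldextract consults the Public Suffix List to split a hostname into
    # subdomain / registrable domain / public suffix.
    ext = tldextract.extract("https://forums.bbc.co.uk/news")
    print(ext.subdomain)          # "forums"
    print(ext.registered_domain)  # "bbc.co.uk" - cookies may be scoped here
    print(ext.suffix)             # "co.uk"     - but not here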


A random plug for https://hamstudy.org/

The US ham test question pools are fully public. Your test will be a mixture of questions from the pool. HamStudy basically lets you churn the question pool, and then will offer explainer text / references to back up each question and correct answer.

I went on a vacation and used their phone app any time I was standing in a line. You can set it to just keep spinning through the questions, with a bias towards ones you're getting wrong.


I look at cross core communication as a 100x latency penalty. Everything follows from there. The dependencies in the workload ultimately determine how it should be spread across the cores (or not!). The real elephant in the room is that oftentimes it's much faster to just do the whole job on a single core even if you have 255 others available. Some workloads do not care what kind of clever scheduler you have in hand. If everything constantly depends on the prior action you will never get any uplift.

You see this most obviously (visually) in places like game engines. In Unity, the difference between non-Burst and Burst-compiled code is extreme. The difference between single and multi core for the job system is often irrelevant by comparison. If the amount of CPU time being spent on each job isn't high enough, the benefit of multicore evaporates. Sending a job to be run on the fleet has a lot of overhead. It has to be worth that one-time 100x latency cost both ways.
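
A toy illustration of that overhead, as a Python sketch rather than anything engine-specific (the data size and chunk size are made-up numbers): summing everything on one core versus farming tiny chunks out to a process pool, where the per-job dispatch cost tends to swamp the actual work.

    import time
    from multiprocessing import Pool

    def chunk_sum(chunk):
        return sum(chunk)

    if __name__ == "__main__":
        data = list(range(2_000_000))

        t0 = time.perf_counter()
        single = sum(data)                     # whole job on one core
        t1 = time.perf_counter()

        # Many small jobs: each hop pays serialization + scheduling latency.
        chunks = [data[i:i + 1_000] for i in range(0, len(data), 1_000)]
        with Pool() as pool:
            parallel = sum(pool.map(chunk_sum, chunks))
        t2 = time.perf_counter()

        assert single == parallel
        print(f"single core: {t1 - t0:.3f}s, pool of tiny jobs: {t2 - t1:.3f}s")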

The GPU is the ultimate example of this. There are some workloads that benefit dramatically from the incredible parallelism. Others are entirely infeasible by comparison. This is at the heart of my problem with the current machine learning research paradigm. Some ML techniques are terrible at running on the GPU, but it seems as if we've convinced ourselves that GPU is a prerequisite for any kind of ML work. It all boils down to the latency of the compute. Getting data in and out of a GPU takes an eternity compared to L1. There are other fundamental problems with GPUs (warp divergence) that preclude clever workarounds.


This is the way.

Reolink cameras are pretty good for what they are. Just don't buy into their NVR solution...

Frigate also has some interesting applications to go along with it, see: https://github.com/mmcc-xx/WhosAtMyFeeder

I also have YOLO on my to-do list for the home cameras.


Jeez, top post on HN and there's a full overlay ad to "download a Mac extension". This deserves a summary post to save others the click. Here's the "what every engineer should know" without the spam:

PoE (Power over Ethernet) sends both DC power and data through the same twisted-pair Ethernet cable, allowing devices like IP cameras, wireless access points, and VoIP phones to run without separate power lines. The power is delivered by Power Sourcing Equipment (PSE) — either an endspan (built-in PoE switch) or a midspan (PoE injector placed between a non-PoE switch and the device). The powered device (PD) negotiates power via detection and classification before voltage is applied, preventing damage to non-PoE gear. IEEE 802.3af (Type 1) provides up to 15.4 W at the source, 802.3at/PoE+ (Type 2) up to 25.5 W delivered, and 802.3bt (Type 3/4) extends that to roughly 60–90 W using all four wire pairs. Engineers need to understand not just wiring, but also cable category limits, pair usage, power losses over distance, and heat dissipation — especially at higher power levels. Modern PoE designs must consider standards compliance, thermal management, and efficiency, as power density rises with new generations of PoE technology.
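
A rough worked example of that power budget, using the commonly cited 802.3af worst-case figures (350 mA maximum current, 20 Ω DC loop resistance; these numbers are background knowledge, not from the article), showing where the gap between the 15.4 W sourced by the PSE and the 12.95 W guaranteed to the PD goes:

    pse_power_w = 15.4           # minimum PSE output, 802.3af Type 1
    max_current_a = 0.35         # maximum continuous current under the spec
    loop_resistance_ohm = 20     # worst-case DC loop resistance of the channel

    cable_loss_w = max_current_a ** 2 * loop_resistance_ohm  # I^2 * R
    pd_power_w = pse_power_w - cable_loss_w

    print(f"lost in the cable: {cable_loss_w:.2f} W")  # 2.45 W
    print(f"left for the PD  : {pd_power_w:.2f} W")    # 12.95 W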


Hey! So I was quite surprised to see my site posted on here so soon after hitting the "go live" button, and thanks for your comment.

I wrote a blog post about why I made the site at https://bret.dk/introducing-sbc-compare/ if anyone's interested, but to TL;DR it, I didn't set out to create a site like this, it was a side quest after creating the automation and database to support my reviews, which do indeed focus on the hobbyist trying to explore Raspberry Pi SBCs and their many alternatives.

I have full specifications and hardware capabilities hidden behind a feature flag at the moment as I'm working my way through adding all of that data (currently at 80 SBCs in the database, and I'm only adding those I own and have run tests on) so there should be something similar to what you're asking for soon. Thanks again!


Text tokens are quantized and represent subword units; vision tokens only exist in the embedding space.

The way text tokenization works in LLMs is that you have a "lookup table" of (small) token ids to (large) vector embeddings. To pass text to the LLM, you split it at token boundaries, convert strings to token ids, and then construct the "context", a matrix where each row is a vector taken from that lookup table.

Transmitting text token sequences can be relatively efficient, you just transmit the token IDs themselves[1]. They're small integers (~100k possible token ids is typical for large models). Transmitting the actual embeddings matrix would be far less efficient, as embeddings often consist of thousands of floating point numbers.

Images are encoded differently. After some basic preprocessing, image data is passed straight to a neural-network-based image encoder. That encoder encodes the image into vectors, which are then appended to the context. There are no token ids, there's no lookup table, we go straight from image data to token embeddings.

This means transmitting image tokens cannot be done as efficiently, as you'd have to transmit the embeddings themselves. Even though an image is encoded in fewer tokens, the most efficient representation of those tokens takes more bytes.
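
A back-of-the-envelope sketch with assumed, illustrative numbers (a roughly 100k-entry vocabulary, so IDs fit in 32 bits, and 4096-dimensional float32 embeddings), comparing the wire size of token IDs against raw embedding vectors:

    import struct

    # 1000 text tokens: each one is just a small integer id.
    text_token_ids = list(range(1000))
    text_bytes = len(struct.pack(f"{len(text_token_ids)}I", *text_token_ids))

    # 256 image "tokens": each one is a full float32 embedding vector.
    image_tokens = 256
    embed_dim = 4096                              # assumed embedding width
    image_bytes = image_tokens * embed_dim * 4    # 4 bytes per float32

    print(f"1000 text token ids : {text_bytes:,} bytes")   # 4,000 bytes
    print(f"256 image embeddings: {image_bytes:,} bytes")  # 4,194,304 bytes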

You can think of a text token as an integer between 0 and n, which we know how to map to a vector. This means you have `n` possible choices of tokens. In contrast, an image token is an array of m floating point numbers (the vector itself), each of which can take on many possible values. This means the "token space" of vision tokens is actually much larger.

There's also the issue of patterns. Text tokens correspond directly to a contiguous span of UTF-8 bytes, and most tokenizers won't create tokens that span word boundaries. This means they can't encode global patterns efficiently. You can't have a "Hamlet's monologue" or "the text that follows is in Spanish" token.


Ublock Origin and Unhook[1]

Lets you remove as much or as little of the "UI/UX" as you want - if you don't want to see Shorts, recommended videos, end cards, etc., or live comments (who even asked for that?), you don't have to.

It collapses YT back to being an intentional thing - I'm looking for a video to watch, I watch it, it suggests nothing, and I go on with my day instead of getting distracted by the Skinner box.

[1] https://addons.mozilla.org/en-GB/firefox/addon/youtube-recom...


Combinatory programming offers functional control flow. (Here is a straightforward explanation: https://blog.zdsmith.com/series/combinatory-programming.html ) I was inspired to write `better-cond` in Janet:

    (defn better-cond
      [& pairs]
      (fn [& arg]
        (label result
          (defn argy [f] (if (> (length arg) 0) (f ;arg) (f arg))) # naming is hard
          (each [pred body] (partition 2 pairs)
            (when (argy pred)
              (return result (if (function? body)
                               (argy body) # calls body on args
                               body)))))))

Most Lisps have `cond` like this:

    (def x 5)
    (cond
      ((odd? x) "odd") ; note wrapping around each test-result pair
      ((even? x) "even"))

Clojure (and children Fennel and Janet) don't require wrapping the pairs:

    (def x 5)
    (cond
      (odd? x) "odd"
      (even? x) "even")

My combinatoresque `better-cond` doesn't require a variable at all and is simply a function call which you can `map` over etc.:

    ((better-cond
       (fn [x] (> 3 x)) "not a number" # just showing that it can accept other structures
       odd?   "odd"
       even?  "even") 5)

Of course, it can work over multiple variables too and have cool function output:

    (defn recombine # 3 train in APL or ϕ combinator
      [f g h]
      (fn [& x] (f (g ;x) (h ;x))))

    (((better-cond
       |(function? (constant ;$&))
       |($ array + -)) recombine) 1 2) # |( ) is Janet's short function syntax with $ as vars


Surprised to see this on HN front-page.

A lot has happened since I proposed / built this.

WebMCP is being incubated in W3C / webmachinelearning, so I highly recommend checking that out, as that's what will eventually become WebMCP in your browser.

https://github.com/webmachinelearning/webmcp


they want to build services on top of the tools

see this from ~3 months ago https://news.ycombinator.com/item?id=44358216


Note: please don't turn your screenshots and digital art into JPG. JPG uses lossy compression tuned for natural images. It works well for photos, but it's the wrong choice where run-length-style lossless encoding does much better (e.g. in screenshots). Black text (or cartoon art) on a white background always looks lousy when converted to JPG.
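
A quick sketch (assuming Pillow is installed) that generates a fake text-on-white screenshot and saves it both ways. Exact byte counts will vary, but the lossless PNG exploits the long runs of identical pixels, while the JPEG spends bytes approximating hard edges and still picks up ringing artifacts around them.

    import io
    from PIL import Image, ImageDraw

    img = Image.new("RGB", (800, 600), "white")
    draw = ImageDraw.Draw(img)
    for y in range(50, 600, 40):                  # fake lines of "text"
        draw.rectangle([40, y, 760, y + 12], fill="black")

    png_buf, jpg_buf = io.BytesIO(), io.BytesIO()
    img.save(png_buf, format="PNG")
    img.save(jpg_buf, format="JPEG", quality=85)

    print(f"PNG : {png_buf.getbuffer().nbytes:,} bytes")
    print(f"JPEG: {jpg_buf.getbuffer().nbytes:,} bytes")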

I love Asianometry! The semiconductor history videos are incredible. The level of quality on that channel just blows my mind. It’s one of the best examples of the revolution happening in high-quality, independently produced content.

I actually first heard about it from the Acquired podcast, which is another great example of that same trend.


yo, i'm the htmx guy

this looks awesome!


Hello, author of hyperflask here. I'm happy to finally announce this project as I've been working on it for quite some time.

I made an announcement post here: https://hyperflask.dev/blog/2025/10/14/launch-annoncement/

I'd love to hear feedback!


If you are OK with another tool to complement it, I found GitButler via Hacker News recently and it looks promising.

https://gitbutler.com


That is why a minimal framework[0] that allows me to understand the core immutable loop, but to quickly experiment with all these imperative concepts, is invaluable.

I was able to try Beads[1] quickly with my framework and decided I like it enough to keep it. If I don't like it, just drop it, they're composable.

[0]: https://github.com/aperoc/toolkami.git
[1]: https://github.com/steveyegge/beads


I see, thanks for channeling the GP! Yeah, like you say, I just don't think getting the tool call template right is really a problem anymore, at least with the big-labs SotA models that most of us use for coding agents. Claude Sonnet, Gemini, GPT-5 and friends have been heavily heavily RL-ed into being really good at tool calls, and it's all built into the providers' apis now so you never even see the magic where the tool call is parsed out of the raw response. To be honest, when I first read about tools calls with LLMs I thought, "that'll never work reliably, it'll mess up the syntax sometimes." But in practice, it does work. (Or, to be more precise, if the LLM ever does mess up the grammar, you never know because it's able to seamlessly retry and correct without it ever being visible at the user-facing api layer.) Claude Code plugged into Sonnet (or even Haiku) might do hundreds of tool calls in an hour of work without missing a beat. One of the many surprises of the last few years.

> Call it what you want, you can write it in 100 lines of Python. I encourage every programmer I talk to who is remotely curious about LLMs to try that. It is a lightbulb moment.

Definitely want to try this out. Any resources / etc. on getting started?


You forgot mcp-everything!

Yes, it's a mess, and there will be a lot of churn, you're not wrong, but there are foundational concepts underneath it all that you can learn and then it's easy to fit insert-new-feature into your mental model. (Or you can just ignore the new features, and roll your own tools. Some people here do that with a lot of success.)

The foundational mental model to get the hang of is really just:

* An LLM

* ...called in a loop

* ...maintaining a history of stuff it's done in the session (the "context")

* ...with access to tool calls to do things. Like, read files, write files, call bash, etc.

Some people call this "the agentic loop." Call it what you want, you can write it in 100 lines of Python. I encourage every programmer I talk to who is remotely curious about LLMs to try that. It is a lightbulb moment.
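
To make that concrete, here is a compressed sketch of such a loop. The call_llm stub stands in for whatever provider API you use, and the message/tool shapes are invented for illustration; the shape of the loop is the point, not the interface.

    import subprocess

    def run_bash(command: str) -> str:
        # The one "tool" in this toy agent: run a shell command, capture output.
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout + result.stderr

    def call_llm(messages):
        # Placeholder for your model provider's API. Canned here so the sketch
        # runs end to end: ask for one tool call, then wrap up.
        if not any(m["role"] == "tool" for m in messages):
            return {"tool": "bash", "args": {"command": "echo hello from the tool"}}
        return {"text": "Done. Tool output was: " + messages[-1]["content"].strip()}

    def agent(task: str, max_steps: int = 20) -> str:
        messages = [{"role": "user", "content": task}]       # the "context"
        for _ in range(max_steps):                           # the loop
            reply = call_llm(messages)                       # the LLM
            if "tool" in reply:                              # it wants to act
                output = run_bash(reply["args"]["command"])  # the tool call
                messages.append({"role": "tool", "content": output})
            else:
                return reply["text"]
        return "ran out of steps"

    print(agent("say hello"))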

Once you've written your own basic agent, if a new tool comes along, you can easily demystify it by thinking about how you'd implement it yourself. For example, Claude Skills are really just:

1) Skills are just a bunch of files with instructions for the LLM in them.

2) Search for the available "skills" on startup and put all the short descriptions into the context so the LLM knows about them.

3) Also tell the LLM how to "use" a skill. Claude just uses the `bash` tool for that.

4) When Claude wants to use a skill, it uses the "call bash" tool to read in the skill files, then does the thing described in them.

and that's more or less it, glossing over a lot of things that are important but not foundational like ensuring granular tool permissions, etc.
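
As a sketch of steps 1 and 2, under an assumed layout where each skill lives at skills/<name>/SKILL.md with a one-line description at the top (this is the shape of the idea, not Claude's actual file format):

    from pathlib import Path

    def skill_index(root: str = "skills") -> str:
        # Collect only the short descriptions; the full skill file is read
        # later (e.g. via the bash tool) if the model decides to use it.
        lines = []
        for skill_file in sorted(Path(root).glob("*/SKILL.md")):
            description = skill_file.read_text().splitlines()[0]
            lines.append(f"- {skill_file.parent.name}: {description} ({skill_file})")
        return ("Available skills - read the listed file before using one:\n"
                + "\n".join(lines))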


The second-to-last post[0] talks about how they decided to migrate their stack from Ruby on Rails to Haskell, and are now in the seventh (!) year of that migration.

[0] https://exploring-better-ways.bellroy.com/designing-for-the-...


Creator of JustSketchMe here! I was very surprised to see this on HackerNews this morning. Very cool to see this doing the rounds 6 years into running this :)

Thanks, it also runs well as a PWA. It's Threejs under the hood, plus some rendering optimisation to ensure it performs well in mobile browsers.

If you liked that video you'll like this one too; it explains the same mechanical-electrical parallel, but in the other direction.

Prof. Malcolm C. Smith had an electrical circuit and made its mechanical equivalent. His invention (the inerter) gave the McLaren F1 team an advantage in 2007.

https://www.youtube.com/watch?v=FhmLb2DhNYM


This is something that I've wondered about when it comes to things like self driving cars, and the difference between good and bad drivers.

When I'm driving I'm constantly making predictions about the future state of the highway and acting on them. For example, before most people change lanes, even without using a signal, they'll look and drift the car slightly in that direction, up to a full second before they actually do it. Or I see two cars that are going to end up in a conflict state (trying to take the same spot on the highway), so I steer away from them and from the recovery maneuver they'll have to make.

Self-driving cars, for all I know, are purely reactive. They can't, at this time, pick up on these cues beforehand and preemptively put themselves in a safer position. Bad/distracted/unaware drivers are not only reactive, they'll also have a much slower reaction time than a self-driving car.

