
Sure, but in this case the speculative scenario is the entire premise behind the existence of the charity in the first place.

The charity was premised on either:

- AGI being cheap to develop, or

- finding funders willing to risk billions for capped returns.

Neither happened. And I'm not sure the public would invest hundreds of billions on the promise of AGI. I'm glad there are investors willing to take that chance. We all benefit either way if it is achieved.


'We all benefit either way'?

I am not sure that making labour obsolete, and putting the replacement in the hands of a handful of investors, will result in everybody benefiting.


That's a different conversation. I believe AGI will be a net benefit.

I feel as though you're ignoring the most important part of that sentence. I assume you meant to write:

I believe that AGI will be a net benefit to whomever controls it.

I would argue that if a profit-driven company rents something valuable out to others, you should expect it to benefit them just as much as, if not more than, those paying for that privilege. Rented things may be useful, but they certainly are not a net benefit to the system as a whole.


No, I believe AGI will have a net benefit for all of humanity. The telephone system was a net benefit for all Americans even though for a time AT&T (Ma Bell) controlled it.

Your pattern matching skills leave a lot of room for improvement.

Information interconnection is meaningfully different from AGI, and the environment AT&T and Bell existed within no longer exists.


AGI is fantasy at this point and your assumption that AGI would give OpenAI unprecedented powers is the Musk/Yudkowsky/Hinton argument that AI will dominate and enslave us.

Drop those assumptions and my point stands that throughout history, monopolistically-controlled transformative technologies (telephones, electricity, vaccines, railroads) have still delivered net benefits to society, even if imperfectly distributed. This is just historical fact.


> AGI is fantasy at this point and your assumption that AGI would give OpenAI unprecedented powers is the Musk/Yudkowsky/Hinton argument that AI will dominate and enslave us.

Yeah, like I said, room for improvement. I find the argument that AGI or sAGI should be feared, or is likely to turn "evil", absurd in the best case. So you're arguing against a strawman I already find stupid.

Telephones increased the speed of information transfer; they couldn't produce value on their own. Electricity allowed transmission of energy from one place to another, and doesn't produce inherent value in isolation. Vaccines are in an entirely different class of advancement (so I have no idea how you mean to apply them to the expected benefits of AGI? I assume you believe AGI will have something to do with reducing disability). Railroads, again, like energy or telephones, involved moving something of value from one place to another.

AGI is supposed to produce a potentially limitless amount of inherent value on its own, right? It will do more than just move around components of value; more like a diamond mine, it will output something valuable as a commodity. Something that can easily be controlled... oh, but it's also not concrete, you can never have your own, it's only available for rental, and you have to agree to the ToS. That sounds just like all previous inventions, right?

You're welcome to cite any historical facts you like, but you're unwilling or unable to draw concrete parallels or form convincing conclusions yourself, and instead hand-wave: "well, most impressive inventions in the past were good, so I feel AGI will be cool too!"

Also, the critical difference (ignoring the environmental differences between then and now) between the inventions you cited and AGI is the difficulty of replicating the technology. Other than "it happened before with most technologies", is there a reason I should believe that AGI would be easy to replicate for any company that wants to compete against the people actively working to increase the size of their moat? Copper wire and train tracks are easy to install. Do you expect AGI will be easy for everyone to train?


You insulted me twice so this conversation is over

Oh, sorry dude... I wasn't expecting the indirect insult to be the only thing you read... my intent was less for you to take offense, and more to point out how you're arguing against something I never said and don't believe. I would have been interested in the reasoning behind the claim, and the parallels you saw, but was unwilling to tolerate the strawman.

Thanks. I'm sorry I jumped to the conclusion that you were making the doomer argument. I see now your argument is much more subtle and raises some interesting points. If I understand it correctly, it's like: what if one company owned the internet? But worse than that, what if one company owned access to intelligence? I'm old, so I remember when AT&T owned the American phone system. We couldn't hook up anything to the phone jack without permission, so intuitively I did understand your argument, but my opposition to doomer arguments (pause research! regulate!) got in the way.

There isn't AGI

Exactly. That's why I called them speculative.

"Neither happened"? I wasn't aware the OpenAI capped-profit corp had a funding problem?

A lot of funding was predicated on them making the transition. Also they would not have been able to IPO without the transition, so there was a funding problem when you look at it that way.

This mental model is also in direct contradiction to the whole purpose of the embedding, which is that the embedding describes the original text in a more interpretable form. If a piece of content in the original can be used for search, comparison, etc., then pretty much by definition it has to be stored in the embedding.

Similarly, this result can be rephrased as "Language Models process text." If the LLM wasn't invertible with regards to a piece of input text, it couldn't attend to this text either.


In principle, it should be possible to identify malign IPs at scale by using a central service and reporting IPs probabilistically. That is, if you report every thousandth page hit with a simple UDP packet, the central tracker gets very low load and still enough data to publish a bloom filter of abusive IPs; say, a million bits gives you a pretty low false-positive rate. (If it's only ~10k malign IPs, tbh you can just keep an LRU counter and enumerate all of them.) A billion hits per hour across the tracked sites would still only correspond to ~50KB/s inflow on the tracker service. Any individual participating site doesn't necessarily get many hits per source IP, but aggregating across a few dozen should highlight the bad actors. Then the clients just pull the bloom filter once an hour (~125KB download) and drop requests that match.

Any halfway modern LLM could probably code the backend for this in a day or two and it'd run on a RasPi. Some org just has to take charge and provide the infra and advertisement.
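
To make this concrete, here's a minimal sketch of what that backend could look like (Python; the port, bloom parameters, and report threshold are illustrative assumptions, not a spec):

    import hashlib
    import socket
    import threading
    import time
    from collections import Counter

    BLOOM_BITS = 1_000_000   # ~1 Mbit filter, matching the estimate above
    NUM_HASHES = 4           # bloom hash functions per IP (assumption)
    REPORT_THRESHOLD = 50    # decayed reports before an IP counts as abusive (assumption)

    counts = Counter()       # decaying per-IP report counter
    lock = threading.Lock()

    def bloom_positions(ip):
        # Derive NUM_HASHES bit positions from a salted SHA-256 of the IP.
        for i in range(NUM_HASHES):
            digest = hashlib.sha256(f"{i}:{ip}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % BLOOM_BITS

    def build_bloom():
        # Fold every IP over the report threshold into a 125KB bit array.
        bits = bytearray(BLOOM_BITS // 8)
        with lock:
            bad = [ip for ip, n in counts.items() if n >= REPORT_THRESHOLD]
        for ip in bad:
            for pos in bloom_positions(ip):
                bits[pos // 8] |= 1 << (pos % 8)
        return bytes(bits)

    def receive_reports(port=9999):
        # Each UDP datagram is one sampled report: the offending IP as ASCII.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("0.0.0.0", port))
        while True:
            data, _ = sock.recvfrom(64)
            with lock:
                counts[data.decode(errors="replace").strip()] += 1

    def decay_loop():
        # Halve all counters hourly so stale reports age out.
        while True:
            time.sleep(3600)
            with lock:
                for ip in list(counts):
                    counts[ip] //= 2
                    if counts[ip] == 0:
                        del counts[ip]

    if __name__ == "__main__":
        threading.Thread(target=receive_reports, daemon=True).start()
        threading.Thread(target=decay_loop, daemon=True).start()
        while True:
            time.sleep(3600)  # in practice, serve build_bloom() over HTTP here

Participating sites report every thousandth hit to the UDP port and pull the output of build_bloom() hourly; checking an incoming request is just testing the same NUM_HASHES bit positions.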


The hard part is the trust, not the technology. Everyone has to trust that everyone else is not putting bogus data into that database to hurt someone else.

It's mathematically similar to the "Shinigami Eyes" browser plug-in and database, which has been found to have unreliable data.


Personally talk to every individual participating company. Provide an endpoint that hands out a per-client hash that rotates every hour, stick it in the UDP packet, whitelist query IPs. If somebody reports spam, no problem, just clear the hash and rebuild, it's not like historic data is important here. You can even (one more hour of vibecoding) track convergence by checking how many bits of reported IPs match the existing (decaying) hash; this lets you spot outlier reporters. If somebody always reports a ton of IPs that nobody else is, they're probably a bad actor. Hell, put a ten dollar monthly fee on it, that'll already exclude 90% of trolls.
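
A rough sketch of that convergence check, reusing the salted-hash scheme from the tracker sketch above (the thresholds here are made-up numbers):

    import hashlib
    from collections import defaultdict

    BLOOM_BITS = 1_000_000
    NUM_HASHES = 4

    def bloom_positions(ip):
        # Same salted-hash bit positions as the tracker sketch above.
        for i in range(NUM_HASHES):
            digest = hashlib.sha256(f"{i}:{ip}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % BLOOM_BITS

    # Per-reporter statistics: total reports, and how many agreed with
    # the current (decaying) bloom filter.
    reporter_stats = defaultdict(lambda: {"reports": 0, "agreed": 0})

    def record_report(reporter_id, ip, bloom):
        # A report "agrees" if every bit for that IP is already set,
        # i.e. other reporters have flagged the IP too.
        agreed = all(bloom[p // 8] & (1 << (p % 8)) for p in bloom_positions(ip))
        stats = reporter_stats[reporter_id]
        stats["reports"] += 1
        stats["agreed"] += int(agreed)

    def suspicious_reporters(min_reports=1000, max_agreement=0.2):
        # High-volume reporters whose IPs almost nobody else reports
        # are the likely trolls worth auditing or dropping.
        for rid, s in reporter_stats.items():
            if s["reports"] >= min_reports and s["agreed"] / s["reports"] < max_agreement:
                yield rid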

I'm pretty pro AI, but these incompetent assholes ruin it for everybody.


> malign IPs at scale

As talked about elsewhere in this thread, residential devices being used as proxies behind CGNAT ruins this. Not getting rid of IPv4 years ago is finally coming back to bite us in the ass in a big way.


IPv6 wouldn't solve this, since IPs would be too cheap to meter.

First, have remote shell.


If usability tickets are closed because the company doesn't want to bother, then maybe these gears deserve to have sand put in them.

I generally approve of subversive actions which are naturally damaging if and exactly if the accusation they are based on is true.

That is, the logic is something like "well, either it gets fixed, in which case it's a victory for good, or they're hypocrites who don't really care, in which case 1. it wastes their time and 2. they deserve to have their time wasted."


Is PaulHoule filing tickets with other browser vendors?

Is the point to target Mozilla, or to actually make a difference in accessibility?


Why would he file tickets with browsers he does not use?

And my whole point is that a strategy can have multiple effects. As I understand it:

- Firefox cares about usability => the issues get fixed or at least considered.

- Firefox doesn't care about usability => sand in the gears.

So it's a hybrid strategy whose purpose depends on the situation.


What, "Joiry" offering "Intellegent Protective Systems" with protection from "Over-circuit" didn't fill you with confidence?


I don't think "you should build your own battery pack" is the sort of advice that will on net reduce house fires.


Was it in the charger?


Yes! One of the lessons I learned from this is that if you're charging it, you're supposed to supervise it or charge it somewhere "safe".


"llms don't actually freak out over seahorses, it's just <explains in detail how and why the llm freaks out over seahorses>"


Informed layman warning.

The tokenizer covers the entire dataset. It's basically just a fixed-size Huffman code, grouping together common fragments of letters; for instance, the 100 most common English words are probably all single tokens.
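
As a quick illustration (using OpenAI's tiktoken library; exact token boundaries vary by tokenizer):

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    # Common English words map to a single token each...
    print(enc.encode("the"))       # one token id
    print(enc.encode(" because"))  # one token id (leading space included)

    # ...while rare strings shatter into several fragments.
    print([enc.decode([t]) for t in enc.encode("q77.bfe")])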

During learning, the model proceeds in roughly the same way a child would: it starts by grouping tokens together, learning the deep regularities of language such as "news[paper]" being more likely than "news[q77.bfe]". Then it incrementally assembles these fragments into larger and larger chains. Similarly, it first learns thematic groupings, such as "word" being more likely somewhere after "dictionary" than after "stop what I was reading to get the dictionary out every time I encountered a banana assault hungry". Then it starts to pick up "patterns": "as a [baby|child|kid] I had no [idea|concept|clue]". At some point in this process it naturally abstracts concepts from languages: "as a child" starts being internally represented by the same neurons as "als ich ein Kind war".

Then some magic happens that we don't understand, and out pops a neural network that you can talk to and that can write programs and use tools. To be clear, this is the case before RL: probably these patterns are now widespread in the training data, so that the model already understands how to "complete the pattern" on its own. RL then does some magic on top of that to bring it from 20% benchmarks to 80% and presto, AI assistant.


> The tokenizer covers the entire dataset.

Well, this is only trivially true. You can feed binary data to the LLM and it probably has tokens that only cover single bytes of that.


Not an expert, but I don't think that this bit:

> At some point in this process it naturally abstracts concepts from languages: "as a child"

is true. I don't know of any way for the model to represent concepts.


https://www.anthropic.com/research/tracing-thoughts-language...

>Claude sometimes thinks in a conceptual space that is shared between languages, suggesting it has a kind of universal “language of thought.” We show this by translating simple sentences into multiple languages and tracing the overlap in how Claude processes them.


I think "concept" here means it is assigned a point or an area in the many-dimensional embedding space. The "concept" has no obvious form, but similar words, synonyms, or words from other languages meaning roughly the same thing are very close together in this space.
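
A hypothetical way to see that closeness yourself, using the sentence-transformers library and one of its multilingual models (the model choice here is an assumption; translated pairs should score near 1.0):

    # pip install sentence-transformers
    import numpy as np
    from sentence_transformers import SentenceTransformer

    # A multilingual model trained so translations land near each other.
    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

    a, b = model.encode([
        "as a child I had no idea",        # English
        "als Kind hatte ich keine Ahnung", # German translation
    ])

    # Cosine similarity: high for translations, much lower for unrelated text.
    print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))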

