More

ghm2199 · 2026-01-21T18:34:10 1769020450

> Building a comparable one from scratch is like building a parallel national railroad..

Not too be pedantic here but I do have a noob question or two here:

1. One is building the index, which is a lot harder without a google offering its own API to boot. If other tech companies really wanted to break this monopoly, why can't they just do it — like they did with LLM training for base models with the infamous "pile" dataset — because the upshot of offering this index for public good would break not just google's own monopoly but also other monopolies like android, which will introduce a breath of fresh air into a myriad of UX(mobile devices, browsers, maps, security). So, why don't they just do this already?

2. The other question is about "control", which the DoJ has provided guidance for but not yet enforced. IANAL, but why can't a state's attorney general enforce this?

oh_fiddlesticks · 2026-01-21T19:59:00 1769025540

> 1. One is building the index, which is a lot harder without a google offering its own API to boot. If other tech companies really wanted to break this monopoly, why can't they just do it?

FTA:

> Context matters: Google built its index by crawling the open web before robots.txt was a widespread norm, often over publishers’ objections. Today, publishers “consent” to Google’s crawling because the alternative - being invisible on a platform with 90% market share - is economically unacceptable. Google now enforces ToS and robots.txt against others from a position of monopoly power it accumulated without those constraints. The rules Google enforces today are not the rules it played by when building its dominance.

creato · 2026-01-21T20:54:56 1769028896

robots.txt was being enforced in court before google even existed, let alone before google got so huge:

> The robots.txt played a role in the 1999 legal case of eBay v. Bidder's Edge,[12] where eBay attempted to block a bot that did not comply with robots.txt, and in May 2000 a court ordered the company operating the bot to stop crawling eBay's servers using any automatic means, by legal injunction on the basis of trespassing.[13][14][12] Bidder's Edge appealed the ruling, but agreed in March 2001 to drop the appeal, pay an undisclosed amount to eBay, and stop accessing eBay's auction information.[15][16]

https://en.wikipedia.org/wiki/Robots.txt

dragonwriter · 2026-01-21T22:28:12 1769034492

Not only was eBay v. Bidder's Edge technically after Google existed, not before, more critically the slippery-slope interpretation of California trespass to chattels law the District Court relied on in it was considered and rejected by the California Supreme Court in Intel v. Hamidi (2003), and similar logic applied to other states trespass to chattels laws have been rejected by other courts since; eBay v. Bidder's Edge was an early aberration in the application of the law, not something that established or reflected a lasting norm.

creato · 2026-01-22T01:05:07 1769043907

The point is, robots.txt was definitely a thing that people expected to be respected before and during google's early existence. This Kagi claim seems to be at least partially false:

> Google built its index by crawling the open web before robots.txt was a widespread norm, often over publishers’ objections.

hattmall · 2026-01-22T05:41:47 1769060507

Perhaps it wasn't a widespread norm though. But I don't really see why that matters as much, is the the issue that sites with robots.txt today only allow Googlebot and not other search engines? Or is Google somehow benefitting from having two decade old content that is now blocked because of robots.txt that the website operators don't want indexed?

ricardo81 · 2026-01-23T12:12:50 1769170370

Agree. It was not standard in the late 90s or early 00s. Most sites were custom built and relied on the _webmaster_ knowing and understanding how robots.txt worked. I'd heard plenty of examples where people had inadvertently blocked crawlers from their site, not knowing the syntax correctly. CMS' probably helped in the widespread adoption e.g. wordpress

embedding-shape · 2026-01-23T10:55:05 1769165705

> robots.txt was definitely a thing that people expected to be respected before and during google's early existence

As someone who was a web developer at that time, robots.txt wasn't a "widespread norm" by a large margin, even if some individuals "expected it to be respected". Google's use of robots.txt + Google's own growth made robots.txt a "widespread norm" but I don't think many people who were active in the web-dev space at that time, would agree that it was a widespread norm before Google.

throw-the-towel · 2026-01-21T21:55:06 1769032506

Nitpick: Google incorporated in 1998, so, before the Bidder's Edge case.

baggachipz · 2026-01-21T20:24:58 1769027098

A classic case of climbing the wall, and pulling the ladder up afterward. Others try to build their own ladder, and Google uses their deep pockets and political influence to knock the ladder over before it reaches the top.

dylan604 · 2026-01-21T22:02:51 1769032971

Why does Google even need to know about your ladder? Build the bot, scale it up, save all the data, then release. You can now remove the ladder and obey robots.txt just like G. Just like G, once you have the data, you have the data.

Why would you tell G that you are doing something? Why tell a competitor your plans at all? Just launch your product when the product is ready. I know that's anathema to SV startup logic, but in this case it's good business

Nextgrid · 2026-01-22T00:22:28 1769041348

Running the bot nowadays is hard, because a lot of sites will now block you - not just by asking nicely via robots.txt, but by checking your actual source IP. Once they see it's not Google, they send you a 403.

eloisius · 2026-01-22T12:01:02 1769083262

Cloudflare’s ubiquity makes bootstrapping a search index via crawler virtually impossible, but what about data sources like Common Crawl?

monooso · 2026-01-22T00:32:30 1769041950

Cost, presumably. From the article:

> Microsoft spent roughly $100 billion over 20 years on Bing and still holds single-digit share. If Microsoft cannot close the gap, no startup can do it alone.

kavalg · 2026-01-22T09:38:37 1769074717

Wouldn't it be nice if Microsoft opened the bing index for all.

direwolf20 · 2026-01-22T11:49:28 1769082568

Don't they? DDG and Kagi use it. I would think you have to pay money but it does seem like they're willing to get partners.

edit: this is wrong

monooso · 2026-01-22T11:58:09 1769083089

This is incorrect. Kagi does not use the Bing index, as detailed in the article:

> Bing: Their terms didn’t work for us from the start. Microsoft’s terms prohibited reordering results or merging them with other sources - restrictions incompatible with Kagi’s approach. In February 2023, they announced price increases of up to 10x on some API tiers. Then in May 2025, they retired the Bing Search APIs entirely, effective August 2025, directing customers toward AI-focused alternatives like Azure AI Agents.

specialist · 2026-01-22T14:14:56 1769091296

Now that you mention it...

It's odd that Microsoft hasn't aggressively pushed for "openness". That's in the usual playbook for attacking a market leader.

(And then pull up the ladder once you've become king of hill.)

Microsoft will probably never topple Google, absent anti-monopolistic enforcement. But they can certainly attack Google's profits.

ricardo81 · 2026-01-23T12:16:25 1769170585

There's one great example of a company that did that and managed to go viral on their release, Cuil. They claimed to have a Google size of search index. Unfortunately for them their search results weren't good and so that visibility quickly disappeared.

Going further back, AlltheWeb was actually pretty decent but was eventually bought by Overture and then Yahoo and ended up in their graveyard.

For everyone else it's the longer grind trying to gain visibility.

baggachipz · 2026-01-23T16:06:19 1769184379

I forgot about Cuil! I really wanted to like it.

ghm2199 · 2026-01-21T20:36:33 1769027793

True. But the thing is if one says "We will make sure your site is in a world wide freely availabled index" which is kept fresh, google's monopoly ship already begins to take on water. Here is a appropriate line from a completely different domain of rare earth metals from The Economist on the chinese govt's weaponization of rare earths[1]:

> Reducing its share from 90% to 80% may not sound like much, but it would imply a doubling in size of alternative sources of supply, giving China’s customers far more room for manoeuvre.

[1] https://archive.ph/POkHZ#selection-1233.117-1233.302

jeromechoo · 2026-01-21T21:16:29 1769030189

Building an index is easy. Building a fresh index is extremely hard.

Ranking an index is hard. It's not just BM25 or cosine similarity. How do you prioritize certain domains over others? How do you rank homepages that typically have no real content in them for navigational queries?

Changing the behavior of 90% of the non-Chinese internet is unraveling 25 years and billions of dollars spent on ensuring Google is the default and sometimes only option.

Historically, it takes a significant technological counter position or anti-trust breakup for a behemoth like Google to lose its footing. Unfortunately for us, Google is currently competing well in the only true technological threat to their existence to appear in decades.

AlienRobot · 2026-01-21T23:48:55 1769039335

Good news! Google doesn't know how to rank pages either!

pas · 2026-01-22T10:15:06 1769076906

yet ... it works "ok" most of the time.

not to mention that people mostly need wikipedia, the news, navigating the infuriating world of websites of big service providers (gov sites, or try to find anything on Microsoft's dark corner of the web), porn and brainrot

but it's awfully hard to make traction on a business that provides this.

walls · 2026-01-21T19:32:45 1769023965

A huge amount of the web is only crawlable with a googlebot user-agent and specific source IPs.

Imustaskforhelp · 2026-01-21T20:22:24 1769026944

> And given you-know-what, the battle to establish a new search crawler will be harder than ever. Crawlers are now presumed guilty of scraping for AI services until proven innocent.

I have always wondered but how does wayback machine work, is there no way that we can use wayback archive and then run a index on top of every wayback archive somehow?

ghm2199 · 2026-01-21T21:03:06 1769029386

You can read https://hackernoon.com/the-long-now-of-the-web-inside-the-in... it was a nice look into their infra structure. One could theoretically build it. A few things stand out:

1. IIUC depends a lot on "Save Page Now" democratization, which could work, but its not like a crawler.

2. In absence of alexa they depend quite heavily on common crawl, which is quite crazy because there literally is no other place to go. I don't think they can use google's syndicated API, cause they would then start showing ads in their database, which is garbage that would strain their tiny storage budget.

3. Minor from a software engineering perspective but important for survival of the company: since they are an artifact of record storage, to convert that to an index would need a good legal team to battle google to argue. They do that the DoJ's recent ruling in their favor.

deepsquirrelnet · 2026-01-21T20:31:33 1769027493

I do not know a lot about this subject, but couldn’t you make a pretty decent index off of common crawl? It seems to me the bar is so low you wouldn’t have to have everything. Especially if your goal was not monetization with ads.

ghm2199 · 2026-01-21T20:52:35 1769028755

I think someone had commented on another thread about SerpAPI the other day that common crawl is quite small. It would be a start, I think the key to a good index people will use is freshness of the results. You need good recall for a search engine, precision tuning/re-ranking is not going to help otherwise.

hattmall · 2026-01-22T05:50:34 1769061034

Are these websites not serving public content? If there's some legal concerns just create a separate scraping LLC that fakes user agent and uses residential IPs or VPN or something. I can't imagine that the companies would follow through with some sort of lawsuit against a scraper that's trying to index their site to get them more visitors, if they allow GoogleBot.

ddtaylor · 2026-01-22T10:44:37 1769078677

Isnt that what SerpAPI was doing?

charcircuit · 2026-01-21T21:16:45 1769030205

If a crawler offered enough money they could be allowed too. It's not like Google has exclusive crawling rights.

Nextgrid · 2026-01-22T00:30:13 1769041813

There is a logistics problem here - even if you had enough money to pay, how would you get in touch with every single site to even let them know you're happy to pay? It's not like site operators routinely scan their error logs to see your failed crawling attempts and your offer in the user-agent.

Even if they see it, it's a classic chicken & egg problem: it's not worth the time of the site operator to engage with your offer until your search engine popular enough to matter, but your search engine will never become popular enough to matter if it doesn't have a critical mass of sites to begin with.

charcircuit · 2026-01-22T03:45:40 1769053540

Realistically you don't need every single site on board before you index becomes valuable. You can get in touch with sites via social media, email, discord, or even visiting them face to face.

stavros · 2026-01-22T07:04:25 1769065465

You really do need every single site, as search is a long tail problem. All the interesting stuff is in the fringes, if you only have a few big sites you'll have a search engine of spam.

charcircuit · 2026-01-22T08:36:02 1769070962

I think that is only needed for a small subset of queries. Seriously think of the last time you did a search and went to a fringe site as opposed to a well known brand or social media. Ranking quality is much more important than coverage over the whole internet.

stavros · 2026-01-22T09:34:45 1769074485

> Seriously think of the last time you did a search and went to a fringe site as opposed to a well known brand or social media.

Oh, almost never. That's exactly why search sucks now.

KellyCriterion · 2026-01-21T19:47:44 1769024864

Scraping is hard. Very good scraping is even harder. And today, being a scraping business is veeery difficult; there are some "open"/public indices, but none of these other indices ever took off

ghm2199 · 2026-01-21T20:00:33 1769025633

Well sure yes, I don't contend with the fact that its hard, but if the top tech companies joined their heads I am sure if for example, Meta, Apple, MS have enough talent between to make an open source index if only to reap gains from the de-monopolization of it all.

zanderz · 2026-01-22T16:03:14 1769097794

I learned on here that this has been happening to a degree with maps. Several big companies have been cooperating to improve open street map data, a rare example of a beneficial commons. This is probably some unique accident of incentives and timing and history but maybe it could happen in other domains.

Nextgrid · 2026-01-22T00:26:30 1769041590

All these companies have the exact same business model as Google (advertising) and have the same mismatched incentives: good search results are not something they want.

Google Search sucks not because Google is incapable of filtering out spam and SEO slop (though they very much love that people believe they can't), but that spam/slop makes the ads on the SERP page more enticing, and some of the spam itself includes Google Ads/analytics and benefits them there too.

There is no incentive for these companies to build a good search engine by themselves to begin with, let alone provide data to allow others to build one.

alex1138 · 2026-01-22T05:57:27 1769061447

I was on the Goog forums for years (before they even fucking ruined the FORMAT of the forums, possibly to 'be more mobile friendly') and it was people absolutely (justifiably) screaming at the product people

No, the customer isn't 'always' right, but these guys like to get big and once big, fuck you, we don't have to listen to you, we're big; what are you going to do, leave?

t_mahmood · 2026-01-23T09:53:03 1769161983

They will prefer to band up with Google, and rip us off.

Imustaskforhelp · 2026-01-21T20:21:40 1769026900

I mean, doesn't microsoft have bing?

ghm2199 · 2026-01-21T20:44:33 1769028273

Yeah but no one uses it. I am not even sure people that are forced to use it like using it because it was productized it pretty poorly. After all who wants another google? They invested 100 Billion dollars, which is a lot of wasted money TBH.

Search indexes are hard, surely, but if you were to strip it to just a good index on the browser, made it free, kept it fresh, it cannot be 100 billion dollars to build. Then you use this DoJ decision and fight against google to not deny a free index to have equal rights on chrome you can have a massive shot at a win for a LOT less money.

Imustaskforhelp · 2026-01-21T20:48:55 1769028535

> Yeah but no one uses it. I am not even sure people like using it because it was productized it pretty poorly. They invested 100 Billion dollars, which is a lot of wasted money TBH.

I mean... Duckduckgo uses bing api iirc and I use duckduckgo and many people use duckduckgo.

I also used bing once because bing used to cache websites which weren't available in wayback archive, I don't know how but It was pretty cool solution for a problem.

I hate bing too and I am kind of interested in ecosia/qwant's future as well (yes there's kagi too and good luck to kagi as well! but I am currently still staying on duckduckgo)

ghm2199 · 2026-01-21T21:10:58 1769029858

Duck duck go is really cool. I am almost fully rooting for them and they are my default mobile and web browser.

The small distributed team grinding it out against the goliath. They are awesome and perhaps the right example of what a path like this would look like. Maybe someone from their team can chime in on the difficulties of building a search engine that works in the face of tremendous odds.

direwolf20 · 2026-01-22T11:50:56 1769082656

DDG is mostly just an anonymizing proxy for Bing. Microsoft encourages it because it increases Bing's market share over Google.

dylan604 · 2026-01-21T22:10:24 1769033424

I would imagine the users of DDG to be closer to a rounding error than an actual percentage of users. I'd imagine theGoog would love and hate to have 100%. They'd love it because all the data, and hate it as it would prove the monopoly. At the end of the day, the % that is not going to them probably doesn't cause theGoog to lose much sleep

Imustaskforhelp · 2026-01-21T22:38:49 1769035129

It's just so wild how great Duckduckgo is & how under-rated it is.

It's available in all major browsers (Here in zen browser, it doesn't even have a default browser but rather on the start page it asks between the three options, google duckduckgo and bing but yes if you press next it starts from google but zen can even start from ddg, its not such a big deal)

Duckduckgo is super amazing. I mean they are so amazing and their duck.ai or ai actually provides concise data instead of Google's AI

DDG is leaps ahead of Google in terms of everything. I found Kagi to be pleasant too but with PPP it might make sense in Europe and America but privacy isn't/ shouldn't be the only who only pays. So DDG is great for me personally and I can't recommend it enough for most cases.

Brave/Startpage is a second but DDG is so good :)

It just works (for most cases, the only use case I use google is for uploading images to then get more images like this or use an image as a search query and I just do !gi and open images.google.com but I only use this function very rarely, bangs are amazing feature by ddg)

dylan604 · 2026-01-21T23:00:20 1769036420

I use DDG myself. I just assumed that I'm not a very sophisticated user as I've never had it not serve my needs based on how other people here say it's not very good.

8bitsrule · 2026-01-22T12:07:51 1769083671

>I've never had it not serve my needs

Same here. It may be 'not very good' for highly specialized or complex technical questions ... but I do research across a broad range of (non-specialized) topics daily. I often need to find 2nd and 3rd points of view on a topic ... or detailed facts about singular events ... and I rarely need to go to the 2nd page. And all ad-free!

It's a remarkable education tool. A curious, explorative kid these days could easily sail WAY beyond their age group using DDG. I can only wish I'd had it.

Their recently added 'Search assistant' consistently provides a couple of CITATIONS to backup its (multi-leveled) responses (Ask for more, get more.) I've seen nothing like it elsewhere. It is even quite good at diggin up useful ... and working ... example code for some languages. Also with citations.

direwolf20 · 2026-01-22T11:51:28 1769082688

DDG is just an anonymizing front-end for Bing. Your DDG results are Bing results.

antiframe · 2026-01-22T16:40:18 1769100018

Then the anonimization is a key component of their goodness. When I compare searches between Bing and DDG I find the DDG ones superior every time.

renegat0x0 · 2026-01-21T21:20:38 1769030438

Scraping is hard, and is not hard that much at the same time. There are many projects about scraping, so with a few lines you can do implement scraper using curl cffi, or playwright.

People complain that user-agent need to be filled. Boo-hoo, are we on hacker news, or what? Can't we just provide cookies, and user-agent? Not a big deal, right?

I myself have implemented a simple solution that is able to go through many hoops, and provide JSON response. Simple and easy [0].

On the other hand it was always an arms race. It will be. Eventually every content will be protected via walled gardens, there is no going around it.

Search engines affect me less, and less every day. I have my own small "index" / "bookmarks" with many domains, github projects, youtube channels [1].

Since the database is so big, the most used by me places is extracted into simple and fast web page using SQLite table [2]. Scraping done right is not a problem.

[0] https://github.com/rumca-js/crawler-buddy

[1] https://github.com/rumca-js/Internet-Places-Database

[2] https://rumca-js.github.io/search

SyneRyder · 2026-01-21T23:15:43 1769037343

+1 so much for this. I have been doing the same, an SQLite database of my "own personal internet" of the sites I actually need. I use it as a tiny supplementary index for a metasearch engine I built for myself - which I actually did to replace Kagi.

Building a metasearch engine is not hard to do (especially with AI now). It's so liberating when you control the ranking algorithm, and can supplement what the big engines provide as results with your own index of sites and pages that are important to you. I admit, my results & speed aren't as good as Kagi, but still good enough that my personal search engine has been my sole search engine for a year now.

If a site doesn't want me to crawl them, that's fine. I probably don't need them. In practice it hasn't gotten in the way as much as I might have thought it would. But I do still rely on Brave / Mojeek / Marginalia to do much of the heavy lifting for me.

I especially appreciate Marginalia for publicly documenting as much about building a search engine as they have: https://www.marginalia.nu/log/

carte_blanche · 2026-01-23T09:22:50 1769160170

Do you have any documentation/blog post for this? I would love to do something similar for my own use.

visarga · 2026-01-22T04:47:28 1769057248

> Search engines affect me less, and less every day. I have my own small "index" / "bookmarks" with many domains, github projects, youtube channels

Exactly, why can't we just hoard our bookmarks and a list of curated sources, say 1M or 10M small search stubs, and have a LLM direct the scraping operation?

The idea is to have starting points for a scraper, such as blogs, awesome lists, specialized search engines, news sites, docs, etc. On a given query the model only needs a few starting points to find fresh information. Hosting a few GB of compact search stubs could go a long way towards search independence.

This could mean replacing Google. You can even go fully local with local LLM + code sandbox + search stub index + scraper.

direwolf20 · 2026-01-22T11:52:00 1769082720

Marginalia Search does something like this

ghm2199 · 2026-01-21T23:33:24 1769038404

When I saw the Internet-Places-Database I thought it was an index on some sort of PoI and I got curious. But the personal internet spiel is pretty cool. One good addition to this could be the Foursquare PoI dataset for places search: https://opensource.foursquare.com/os-places/

hsuduebc2 · 2026-01-21T18:57:18 1769021838

I don’t think it’s comparable to today’s AI race.

Google has a monopoly, an entrenched customer base, and stable revenue from a proven business model. Anyone trying to compete would have to pour massive money into infrastructure and then fight Google for users. In that game, Google already won.

The current AI landscape is different. Multiple players are competing in an emerging field with an uncertain business model. We’re still in the phase of building better products, where companies started from more similar footing and aren’t primarily battling for customers yet. In that context, investing heavily in the core technology can still make financial sense. A better comparison might be the early days of car makers, or the web browser wars before the market settled.

ghm2199 · 2026-01-21T20:12:37 1769026357

> ... stable revenue from a proven business mode... In that game, Google already won.

But if they were to pour that money strategically to capture market share one of two things would happen if google was replaced/lost share:

1. it would be the start of the commoditization of search. i.e. search engine/index would become a commodity and more specialized and people could buy what they want and compete.

2. A new large tech company takes rein. In which case it would be as bad as this time.

Like what I don't get is that if other big tech companies actually broke apart monopoly on search, several google dominos in mobile devices, browser tech, location capabilities would fall. It would be a massive injection of new competition into the economy, lots of people would spend more dollars across the space(and ad driven buying too) money would not accrue in an offshore tax haven in ireland

To play the devils advocate, I think the only reason its not happening is because meta, apple, microsoft have very different moats/business models to profit off. They all have been stung one time or another is small or big ways for trying to build something that could compete but failed. MS with bing, Meta with facebook search, Foursquare — not big tech but still — with Maurauder's Map.

hamdingers · 2026-01-21T19:07:46 1769022466

> If other tech companies really wanted to break this monopoly, why can't they just do it

Google is a verb, nobody can compete with that level of mindshare.

observationist · 2026-01-21T19:23:45 1769023425

A big part of it is about the legal minefield if you presented any sort of real threat to Google. Nobody wants to wager billions in infrastructure and IP against Google or Apple or Microsoft, even if you could whip up a viable competing product in a weekend (for any given product.)

Part of it is also the ecosystem - don't threaten adtech, because the wrong lawsuits, the wrong consumer trend, the wrong innovation that undercuts the entire adtech ecosystem means they lose their goose with the golden eggs.

Even if Kagi or some other company achieves legitimate mindshare in search, they still don't have the infrastructure and ancillary products and cash reserves of Google, etc. The second they become a real "threat" in Google's eyes, they'd start seeing lawsuits over IP and hostile and aggressive resource acquisitions to freeze out their expansion, arbitrary deranking in search results, possible heightened government audits and regulatory interactions, and so on. They have access to a shit ton of legal levers, not to mention the whole endless flood of dirty tricks money can buy (not that Google would ever do that.)

They're institutional at this point; they're only going away if/when government decides to break it up and make things sane again.

wongarsu · 2026-01-21T19:16:53 1769023013

Xerox is a verb, but most copy machines I see are made by their competition

hamdingers · 2026-01-21T19:19:08 1769023148

Wonder why that could be?

https://www.nytimes.com/1975/07/31/archives/xerox-settlement...

eikenberry · 2026-01-21T19:18:07 1769023087

Kleenex isn't the only brand of tissues sold in stores.

iamacyborg · 2026-01-21T23:13:36 1769037216

How’s that working out for Hoover in the UK?

cowsandmilk · 2026-01-21T21:07:56 1769029676

Licensing their index doesn’t change that.

Zyst · 2026-01-21T19:19:01 1769023141

So were AOL, and Skype

dylan604 · 2026-01-21T22:05:41 1769033141

I don't ever recall anyone using AOL as a verb. How would you do that?

terespuwash · 2026-01-22T07:47:46 1769068066

Let me AOL this for you

dylan604 · 2026-01-22T17:04:22 1769101462

said no one ever

mgiampapa · 2026-01-23T11:02:09 1769166129

You clearly did not live in the world of watching two teens on computers in the same room hold two entirely different conversations out-loud and over AIM.

citizenpaul · 2026-01-22T00:22:48 1769041368

>why can't they just do it

Money. Google controls 99% of the adverting market. That's why its called a monopoly. No one else can compete because they can never make enough money to make it worth the costs of doing it themselves.

paxys · 2026-01-21T19:32:54 1769023974

Apple had a chance to break Google's search monopoly, but they chose to take billions from them instead.

Microsoft had a chance (well another chance, after they gave up IE's lead) to break up Google's browser monopoly, but they decided to use Chromium for free instead.

Ultimately all these decisions come down to what's more profitable, not what's in the best interests of the public. We have learned this lesson x1000000. Stop relying on corporations to uphold freedoms (software or otherwise), becuase that simply isn't going to happen.

charcircuit · 2026-01-21T21:26:30 1769030790

>but they chose to take billions from them instead.

They chose to use Google with a revenue sharing agreement. Google is very well monetized. It would be very difficult for Apple to monetize their own search as good as Google can.

>they decided to use Chromium

Windows ships with Microsoft Edge as the browser which Microsoft has full control over.

w10-1 · 2026-01-22T17:50:36 1769104236

Other comments mention difficulty, cost, conditions, etc.

Also, competitive agreements: of the big players like Apple, Microsoft, Facebook/Meta, Amazon, etc., only Google is in the ad business. But it has credible threats of digging into their businesses - GCP, Android, (not to mention software licenses and competitive access to e.g., Samsung), etc. So they agree to cede the ad world to Google, to keep Google out of their businesses.

The injunctions cannot be effective. Google ads are essentially a tax at a fine scale that rational people chose when it didn't change site behavior. But then Google ads changed the nature of the web itself, converting every snippet of information into an opportunity to monetize. Neither would change with a public search.org, and injunctions to license ad-free indexes won't change site behavior or publishers' self-interest in selling access to their content to Google alone.

Google knows the injunctions are unworkable and ultimately ineffective. The only question is what price they have to pay to the Trump judiciary to counter them.

xnx · 2026-01-21T19:24:11 1769023451

> If other tech companies really wanted to break this monopoly, why can't they just do it

Companies would rather sue than try and compete by investing their own money.

ghm2199 · 2026-01-16T00:05:45 1768521945

Litestream.io is amazing. Using sqlite as a DB in a typical relational data model where objects are related mean most read then write transactions would have to one node, but if the using it for blobs as first class objects(e.g. video uploads or sensor data) which are independent probably means you can shard and scale your set up the wazoo right?

ghm2199 · 2026-01-15T23:21:23 1768519283

Does any one know how the size of this compares to archive.today?

textfiles · 2026-01-16T00:17:53 1768522673

We absolutely lap them with many, many more petabytes of material. But archive.today is also not doing speculative or multiple scheduled captures of the amount of sites that archive.org is.

ghm2199 · 2026-01-10T13:39:59 1768052399

Yeah. Human preference are like snowflakes. One mans "clean-enough" is another man's OCD driven nightmare.

spwa4 · 2026-01-10T14:08:51 1768054131

For an employee the cost function is maximum wage for minimum work. Since at minimum wage, you're paid for your time, this means sweeping as badly and slowly as the minimum the manager accepts.

Hell, given that there is a social safety net, and you'll have costs to do the job (food, public transport, ...) you're probably even better off doing worse than that, and getting fired when the manager is "tired of your shit" or whatever.

Then you'll get unemployment, which is slightly less, but you can invest the time in cooking at home, and you'll eat better and have more money left over.

seec · 2026-01-11T18:39:12 1768156752

It's a very cynical view but I kind of agree. In those kinds of jobs, the only rewards for doing the job well and fast is just more mindless jobs of the same type. It would be usefull if you would receive a benefit for doing the job better or you could leave earlier for the same pay but that is rarely the case, since as you said, employers generally pay for your time instead of task completion (which is rather dumb because it offers bad incentives for both sides).

I have talked about this with some business owners who were getting kinda angree that some employes were not putting in a lot of efforts. In all case they were paying the minimum wage with absolutly zero compensation for doing the job better and/or faster.

I can't decide if they are just stupid or simply corrupt but they really should realise that with a strong welfare state and plenty of similar shitty jobs available, the stick really cannot work all that well and they should really use the carrot a lot more. But of course those people generally make at least 3-4 times the minimum wage and they feel they deserve the premium because they deserve it and are so much better. Funnily enough, those that I know consistenly do a worse job than their employees at most things and it's obvious they didn't get there by starting at the bottom.

spwa4 · 2026-01-13T08:44:23 1768293863

> I can't decide if they are just stupid or simply corrupt but they really should realise that with a strong welfare state and plenty of similar shitty jobs available, the stick really cannot work all that well ...

That assumes the manager level above them isn't doing the exact same thing.

seec · 2026-01-13T19:46:01 1768333561

I'm mostly talking about people that don't really have a manager above them or all the freedom they need to organise payroll as they see fit. Restaurants owners, recreation center owner, tradesman, etc.

They really could reward good elements, but treat everyone as interchangeable at the same exact pay rate. They get mad when people don't do as good as a job as they wish and then they get mad when they find another job.

I have many examples, but 2 recent ones where:

- a man who was hired as lifeguard for a small pool was being complained about for not cleaning the thing properly, even though every single similar job do not require this tasks for the same pay level, he left to create his own business with friends and is much happier (even though he doesn't make much more money yet).

- a woman who was hired as waitress ended up doing the vast majority of the work for the other waitress, a woman part owne that was quite a bitch in many ways. The hired woman got paid minimum wage and had to share the tips, even though from what I saw, she is the one who got most of them. The next season, she refused to come back, prefering instead working as a cashier. The woman owner was somehow confused. The hired woman was a strong worker, no reason to slave away for greedy ungratefull owners.

Unsurprisingly, the first business isn't going well (2 year going chronic deficit) and the second one just failed after 4 years. Those people just don't understand the true value of work, because they never had to do as much work themselves. I won't details my connection but they are people I know very well and the link is so obvious to me, but apparently not to them. But they are failing, so I guess karma is a bitch in the end.

ghm2199 · 2026-01-10T13:36:03 1768052163

Actually even when a Roomba(RIP) turns to follow the optimal plan like that, the time taken to pivot 90 to manage planb is pretty large.

My Roomba will take less time when it dead reckons as opposed to avoiding visiting tiles twice. I guess energy and time are still spent poorly by humans and Roomba for following the optimal plan.

ghm2199 · 2026-01-08T16:40:15 1767890415

I wish meta would do the same for its portal devices. The devices are solid hardware. They removed a ton of app support that needed cloud services.

I loved their camera tracking and picture frame along with their speaker quality.

ghm2199 · 2026-01-08T14:04:51 1767881091

I think you can take a bit more directed approach too.

You can build a micro-controller with a circuit that controls a stepper motor. Let the motor do something simple/fun. Connect the fun doohikie to give feedback to the microcontroller — e.g. using some kind of an encoder chip that converts motor's rotation amount to numbers that will tell the microcontroller how much to move, initialize the doohikie's start state.

But you can understand a lot more if you don't use the HBridge chip for the motor. Build the bridge circuit yourself. Build your own power supply for the microcontroller too(if you want to).

You can pick a path to go down on and focus on specific parts:

1. For the h bridge there is lots to learn. Designing the operational amp for D2A conversion as well as amplification/signal modeling — which you will need for the motor for current limiting in the h bridge per motor spec for the doohickie you want to power — That will teach you quite a bit about analog electronics and design. What kind of currents you need to protect and how e.g. using opto electronics. Limiting noise from power supply and parasitic noise so that your circuit does not misfire. You will likely need a set up of an oscilloscope, soldering irons and breadboards to prototype. Learn some basics from a book about design then go back to the circuit and build.

2. If you build your own PCB for this. It is a multi month project. You can learn a out CAD and chip layout. But I think you can do this in parts for example you can design the initial PCB only for the digital components and then connect it to a breadboard where you can prototype the H bridge you want.

3. If you choose to learn digital design and embedded system programming then maybe you can build the tougher analog parts for motor control using store-bought components and chips and focus mostly on the programming the microcontroller. That is a totally legitimate part too. You could use an old MCS-51 microcontroller and learn about data and program memory addressing and interrupt handling from scratch.

ghm2199 · 2025-12-15T23:52:50 1765842770

An analogous audio binding issue used to happen with my Jabra Bt headphones. It was generally connected to my phone and my computer. After finishing a phone call — if previously the computer was playing some music — the music would turn back on but it would be a very poor quality, I suspect the audio "mode" was stuck at "transmitting" phone call audio quality even though the BT software on the headset detected devices being switched from phone -> computer. Toggling the BT sound output on the mac to and fro between Computer and Headphones, fixed it.

I suspect it was probably a vendor — jabra — software issue when sending a signal to apple's BT stack when switching between types of devices? But probably not worth fixing on my own.

ghm2199 · 2025-12-11T23:59:40 1765497580

Serious question: What's the gold standard in blocking certain content for an age group, without tracking ones identity?

My initial thought would it would be just making it super easy for their guardians to distribute and control device content. But let the control end at that echelon of power; Not even the local councils or schools should be given the power to regulate social media for kids to this extent IMO, let alone the govt

ElectroBuffoon · 2025-12-12T13:10:58 1765545058

RTALabel https://www.rtalabel.org/page.php is over 3 decades old. Restricted to Adults, pretty easy to remember.

For some history and related standards, see Wikipedia:

PICS https://en.wikipedia.org/wiki/Platform_for_Internet_Content_...

POWDER https://en.wikipedia.org/wiki/Protocol_for_Web_Description_R...

ASACP/RTA https://en.wikipedia.org/wiki/Association_of_Sites_Advocatin...

The more "digi-ID so we are sure you are old enough, bitte" keeps been pushed, the clearer it's about tracking and not about children. No matter how much they love to frame it the other way around. Unless they want to admit they are total inepts.

rossriley · 2025-12-12T02:35:51 1765506951

I'd have thought an OAuth flow to a government run ID system, to create an account you first must verify your age by redirecting to the ID provider logging in via FaceID/Fingerprint to verify it's you and then you are redirected back to the original site with a verification code.

Admittedly on paper that means the Gov system would know which sites you were approved for, not logging that would require legislation to not store these logs.

ghm2199 · 2025-11-21T19:25:53 1763753153

My hero usecase with these tools is to auto pull investments from Fidelity 401K account + Schwab brokerage + BYOBrokerage.

Then combine them and break them down by country/geography(e.g. ex-US or US or world) and then type(small, mid, large) and then finally by income strategy(growth, value, fixed, defined outcome etc)