
Unfortunately, though this is an excellent story, this is just an example of a bias known as "frequency illusion". It happens with a lot of things, like seeing the clock at 11:11 more than you do at 11:09. Or seeing lots of your make/model car but being blind to the hundreds of other variants on the road.

How many times have you opened up your laptop and not seen a Facebook ad for something you just did, or something you discussed? You'll never notice those occasions.



Most people don’t see completely irrelevant Facebook ads.

Frankly, Facebook and their proxies always speak in meaningless nonsense about everything. If you asked Zuck if he ate kittens, you’d get some reply about Facebook’s mission and why cats are important.

For some mysterious reason, all explanations for the “Facebook is listening” phenomenon are uniquely cogent, clear and dismissive.

Personally, I have zero doubt that a downstream “partner”, data provider, or affiliate is processing audio data of questionable origin for ad insights. Call center companies with tight margins do it, why wouldn’t an ad company?


> Most people don’t see completely irrelevant Facebook ads.

Most people don't NOTICE the ones that are, either.


I understand you have used Occam's razor to come to this conclusion, and it's a perfectly valid point. But this particular story has been repeated so many times around me that I am genuinely suspicious. Alas, the only way to know would be to look at the code. And even then we might not understand it, because it's a black-box system that is poorly understood even by its designers.


It's actually possible to evoke some interesting responses from the algorithms by reducing the amount of data input.

For example: When I set up a new Facebook account for my mother (at her explicit wish), she had no friends or interests marked yet. Facebook showed her some random ads and posts.

During the setup I was scrolling through her timeline and my phone beeped so I stopped scrolling for about 2 seconds. The post shown was a random post about some fish.

When I picked it up, I saw it quickly replacing the next random post with something about the same kind of fish. So evidently it even looks at how long you look at certain content to determine your interests.

I suppose it is possible to derive other algorithmic determinations using similar methods.


> So evidently it even looks at how long you look at certain content

It does. Instagram constantly sends back telemetry including your scroll position, which can then be used to determine what you were looking at and for how long. Scroll right past an ad? You probably won't see it again; the algorithm knows it didn't have an impact on you. Meanwhile, spend a few seconds reading what it says, and this teaches the algorithm that you are interested in similar content.
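
As a rough illustration of how dwell time could be derived from that kind of telemetry, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample format (timestamp, scroll offset), the feed layout, and the name `dwell_times` are hypothetical and not Instagram's actual pipeline.

```python
from collections import defaultdict

def dwell_times(samples, items, viewport_height):
    """Estimate how long each feed item sat under the centre of the screen.

    samples: list of (timestamp_seconds, scroll_y_px), sorted by time (hypothetical format).
    items:   list of (item_id, top_px, bottom_px) in feed coordinates (hypothetical layout).
    """
    totals = defaultdict(float)
    # Each interval between two samples is credited to the item that was
    # under the viewport centre at the start of the interval.
    for (t0, y0), (t1, _) in zip(samples, samples[1:]):
        centre = y0 + viewport_height / 2
        for item_id, top, bottom in items:
            if top <= centre < bottom:
                totals[item_id] += t1 - t0
                break
    return dict(totals)

# Made-up trace: the user scrolls, pauses for about two seconds, then moves on.
items = [("fish_post", 0, 600), ("car_ad", 600, 1200), ("news", 1200, 1800)]
samples = [(0.0, 0), (0.5, 400), (2.5, 400), (2.7, 1000), (3.0, 1000)]
result = dwell_times(samples, items, viewport_height=800)
most_viewed = max(result, key=result.get)
```

In this made-up trace the pause lands on `car_ad` for roughly 2.2 seconds, so a ranking system keyed on dwell time would presumably treat that item as the one that caught the user's attention.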


How is it that no such scandal has been uncovered? Surely by now some hacker would have been able to prove that a phone is recording, sending to a server, processing, and returning a relevant ad. Or surely someone would have come forward or blown the whistle by now. So I'll quote Hitchens's razor for you:

"What can be asserted without evidence can also be dismissed without evidence."


Many have tried. Steve Gibson (of GRC fame) did some wiresharking around one of his Amazon devices and found no abnormal networking traffic when he was talking to it, vs not.


Alexa devices have been extensively and repeatedly shown to not be "passively listening".

The same cannot be said for phone apps.


Untrue.

I am pushing the boat out because I rely on my memory. But there were reports from Apple contractors about what they heard on Siri. It is always on, always listening.


> But there were reports from Apple contractors about what they heard on Siri. It is always on, always listening.

These 2 sentences may not necessarily need to both be true. As I recall, one could opt in (or was it opt-out?) to an Apple program to upload bits and pieces of spoken word for its people to parse "humanly" for it to improve its speech to text.

I'm no Apple fan, but I'm not sure one implies the other, here.


I have acknowledged that my reasoning is less sound than Occam's razor, and I didn't really assert data impropriety. So calm down.

But about your first point. I don't think even the designers and maintainers of this blackbox understand the system. Looking at it from that point of view, the chances of a hacker finding proof for this is pretty low.


The thing is, you can disprove one piece without understanding the whole system.

It would be pretty easy to show that a) sound is not being continually recorded and streamed over the internet and b) the device is not using enough processing power to decode speech. Both have been done, so this is veering into conspiracy theory territory.


FWIW, streaming voice over the Internet isn't required for this attack - all the software needs is to send a tag a few bytes long indicating the topic of an overheard conversation.

The processing power required for this isn't big either - remember that 12+ years ago Microsoft Windows shipped with a speech recognition system that was in many ways better than what the phones currently offer, and worked off-line with an almost unnoticeable performance penalty. And if you're interested in probabilistic reporting ("there's an 86% chance I've heard a word matching this tag in the last hour..."), you can relax performance requirements even further.

So, out of the things you mention, the only somewhat convincing piece of evidence would be that the apps in question are not accessing microphone in the background.


My dude! We are talking past each other. I am not asserting data handling impropriety. That is not what concerns me. What concerns me is they are letting these black box systems emotionally manipulate me.


Why would Google be recording your mic and using it for ads where they would just be caught for doing it? I mean it's completely possible. But more likely just confirmation bias. Speaking of Occam's razor, we should just dump modern "technology" (smart phones, smart TVs, the web, IoT, even feature phones were no good).

There's actually nothing hard about the concept of a mobile phone, it's just a computer (or could even be a simple PCB) with a mic and speaker. No need for "secret sauce" standards such that nobody can tell if it's secure (I mean it isn't, the bugs just get patched every week, day, nanosecond, whatever). Hell, you can even make a completely open and simple (even more important than open) phone communication standard and charge 1 billion people tens of dollars per month to use your network and become the richest person on earth.

edit: I mean facebook, or whatever (also facebook would have to gain access to the mic [maybe facebook has mic permission i guess, i am unfamiliar with smart phones])


Because they have "voice assistants" that have to be always on, always transmitting, because the software that recognises your words on your mobile phone needs help.

Facebook has access to your mic if you ever use it for its voice com functions (do not do that) and do not explicitly remove the permissions to access the mic (do do that).

They have been caught several times. Thing is people give them permission to record through the mic so it is legal.

Do not confuse legal with good, it is evil.


Well, another approach would be to do some controlled experiments: Pick a selection of somewhat-uncommon products. Get some volunteers to set up Facebook accounts on clean computers and phones with no adblockers. Monitor their incoming advertising messages for 2 weeks.

Then randomly assign the products from the first step to the volunteers, give them information about the product on paper and ask them to hold verbal conversations about those products.

If they start getting adverts that happen to match the subject of those verbal conversations, something is going on.
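
A minimal Python sketch of the assignment and comparison steps, under stated assumptions: the function names, the product list and the `ads_seen` data are all made up for illustration, and a real study would also need the two-week baseline and a proper significance test. The idea is to compare how often a volunteer sees ads for the product they talked about against how often they see ads for the products they did not talk about.

```python
import random

def assign_products(volunteers, products, seed=0):
    """Randomly give each volunteer one product to talk about (hypothetical helper)."""
    rng = random.Random(seed)
    return {v: rng.choice(products) for v in volunteers}

def hit_rates(assignment, ads_seen, products):
    """Return (assigned_rate, control_rate).

    assigned_rate: share of volunteers who saw an ad for their assigned product.
    control_rate:  share of (volunteer, other product) pairs where an ad for a
                   product they never discussed showed up anyway.
    """
    assigned_hits = control_hits = control_trials = 0
    for volunteer, product in assignment.items():
        seen = set(ads_seen.get(volunteer, []))
        assigned_hits += product in seen
        for other in products:
            if other != product:
                control_hits += other in seen
                control_trials += 1
    return assigned_hits / len(assignment), control_hits / control_trials

# Toy data, invented purely to show the calculation.
products = ["kayak", "theremin", "beekeeping suit"]
assignment = {"a": "kayak", "b": "theremin"}
ads_seen = {"a": ["kayak", "theremin"], "b": ["shoes"]}
assigned_rate, control_rate = hit_rates(assignment, ads_seen, products)
```

If the assigned rate is not clearly higher than the control rate across many volunteers, the matches are what you would expect from ordinary targeting and chance.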


The "11:11 on the clock" story has also been repeated by millions of people for decades. That many people fall prey to a cognitive bias does not make it any less of a bias.


You need to default to uncertainty. It’s not proven that it was coincidence, but it’s also not proven that it isn’t.

Sometimes you never do find out what happened.


Yes, it is a common cognitive bias.


> But, this particular story has been repeated so many times around me that I am genuinely suspicious

So, one meta-step up in abstraction? People "notice" these things which they talk about and now you're especially sensitive to hearing them?


This probably could explain most people's accounts of this, but I've been approached by companies who offered large lump sums to include their SDK which required microphone access in our mobile app, in order to fingerprint what our users were watching on TV while it was open, nominally to see what ads they were seeing. I wouldn't be at all surprised if they or others were going a good bit further than that and trying to run speech recognition on overheard conversations, unless it's illegal somehow.

AdTech is frankly a revolting industry.


I don't buy the speech recognition, nor have I seen it offered.

But the tv 'recognition' is a big part of selling ads on connected tvs, vizio, roku etc.


Bullshit, Facebook was found around 2015-2016 to be draining iPhone batteries with background audio sessions. While they may have gotten more efficient with their methods it wouldn't be surprising if they were still recording audio.

There's a moral hazard that incentivizes any company that can do so to bug users' homes for advertising purposes. IMO it should be illegal.


I doubt it though, on Android 11 it now tells you when your phone is recording audio (I see it during a whatsapp call for example) and as far as I know iOS has something similar (an orange dot IIRC).

So they would be caught out pretty quickly if they did this.

I'm sure they did it before though, ultrasonic identifications during TV ads etc were really a thing.


My issue with the whole idea of background recording for advertising is that it would be incredibly costly to store this data, transcribe the audio and turn it into anything even remotely meaningful for advertisers. I also don’t know a lot on this subject so if anyone has better info that’d be great.


You don't need to store the data, just transcribe it. That's basically the business model for Siri, Alexa et al. If you're worried about cost, just offload the work to the cell phone and accept the less-than-100% transcription.

The only reason I don't think the big players are doing this _is_ the potential for scandal. Random apps on the app store that ask for a million permissions, on the other hand, are probably doing this.

It only takes one clever hacker looking to make a name for themselves. With that said, there are plenty of cases where companies _were_ caught spying, so maybe it's not so cut and dry.


You can easily process the audio on the fly and reduce it to a probabilistic estimate of whether a tag from a predefined topic set was present in the conversation. Doesn't need to be 100% accurate. You don't need to store the audio - just stream it through the recognizer. The output of such recognizer will be something on the order of 8-32 bytes (an int for tag, a float for probability, an int64 for timestamp), possibly less if one's clever - and it only needs to be stored until the next opportunity to send it out.
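
To put a number on that record size, here is a minimal Python sketch assuming one possible layout: a 32-bit tag id, a 32-bit float probability and a 64-bit timestamp. The layout and the names `RECORD`, `encode` and `decode` are my own illustration, not anything observed in a real app.

```python
import struct

# "<" = little-endian with no padding; i = int32 tag id,
# f = float32 probability, q = int64 unix timestamp.
RECORD = struct.Struct("<ifq")

def encode(tag_id, probability, timestamp):
    """Pack one hypothetical 'topic overheard' record into raw bytes."""
    return RECORD.pack(tag_id, probability, timestamp)

def decode(blob):
    """Unpack a record back into (tag_id, probability, timestamp)."""
    return RECORD.unpack(blob)

blob = encode(42, 0.86, 1_600_000_000)
tag_id, probability, timestamp = decode(blob)
size_in_bytes = len(blob)  # 16 bytes with this layout
```

At 16 bytes per record, even a few hundred of these per day would be lost in the noise of ordinary app traffic, which is why watching bandwidth alone cannot rule this design out.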

Also: people seem to be looking at modern speech recognizers on their phones and wrongly concluding that speech recognition in general is very compute-intensive. It isn't, if you're willing to make some sacrifices on accuracy and generality, and to do it locally instead of sending voice data off to a cloud somewhere. A proper benchmark here isn't Siri or Google Assistant - it's Microsoft Speech API, as shipped with Windows 12+ years ago.


> store this data, transcribe the audio and turn it into anything even remotely meaningful for advertisers

I disagree - even shitty, low CPU on-device transcription could give a signal to advertising algos.

I doubt this is being done, but it is definitely within the range of possibility and wouldn't even drain your phone battery that much.


All it needs to be is a list of keywords associated with your advertising profile.


Purchased a hard kombucha at Whole Foods last week.

Ever since, about 1/3 of my instagram ads are for it. Never had an instagram ad for it before.


Maybe the subtle influences that led you to buying such a drink in the first place are directly related to increased advertising for them. Also Amazon has your Whole Foods purchasing data, so that probably trickled down somewhere.


Technically my girlfriend purchased it with her Whole Foods/Amazon account, and she has not seen any ads.


The first part of your comment is gaslighting.

The second "trickled down somewhere" is the point.


A few potential reasons:

- You fit the demographic of Kombucha drinkers in your locality

- You visited a Kombucha blog/website recently that used retargeting to deliver an ad to your Instagram

- An initial ad that caught your attention and Instagram used “dwell time” to determine that the ad is relevant to you


Good points. I do fit the demographic, I have looked up pages related to Kombucha in the past (though not in the last 6 months), and I did dwell on the first ad in amazement.

Still, I got home from the store and started seeing the ads immediately.


Have you spoken with someone about it? iOS and Android can easily analyze what you say and send relevant keywords home. This is evil genius in a way: they send neither your voice feed nor your full sentences, only keywords that the law describes as "metatags", which courts have found not to be an abuse of your 4A rights in the past.

In fact, you don't even have to be on the phone. You could just come home and tell your wife what you bought. That would be enough to send keywords and know what you may be interested in. I know for a fact my cable box (Spectrum) is listening to and analyzing my conversations. My wife and I used to talk about the craziest stuff, and in less than 48 hours Spectrum TV, Sling and YouTube would inject relevant ads. Some were extremely homemade and amateurish but always spot on.

Do an experiment at home. Talk about something you don't have or that is irrelevant to you. For example, if you have no kids, start talking about them. Use keywords like "our first child", "baby sitting", "hospital", "giving birth", "baby shower". I bet that less than 48 hours later your TV will be interrupting you with ads related to baby products; ads you have never seen before.


Do you have any evidence of iOS analyzing and transmitting keywords? It's one thing to say it's technically possible and another to say it's happening.


Honestly speaking, this is more scary than them actually stealing data.


Sure. Except in the last three years that "frequency illusion" has been happening to me with a... growing frequency. About every 2-3 months, Facebook shows my wife an ad for some completely random shit we're sure neither of us searched for before or mentioned to anyone else.

I would agree with you last decade. This decade, I have my doubts.


There is great statistical power in these ML models. In many cases the "random shit" will become the topic of conversation due to shared social factors that can be predicted; you simply neglect to recognize all the times the modeling failed.


While this is indeed a possibility, your certitude is unwarranted.


I don't know how many times I internally chuckle when I glance at the clock some time before I go to bed and it's "21:12"; a meaningful number to me as the "2112" album and the band Rush was a big part of my youth.

That, and I tend to go to bed between 21:00 and 22:00. But I don't attribute it to anything but me being in a position to look at the clock around that time, and I haven't wondered if I see it any more than 21:09 or 21:30. Would be an interesting histogram, if nothing else.
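
For what it's worth, that histogram is easy to mock up. Below is a minimal Python sketch under two loud assumptions of mine: glances are spread uniformly over 21:00 to 21:59, and only a personally "meaningful" time gets remembered. The function names are made up and no real data is involved.

```python
import random
from collections import Counter

def simulate_glances(n, seed=0):
    """Histogram of n simulated clock glances, assumed uniform over 21:00-21:59."""
    rng = random.Random(seed)
    return Counter(f"21:{rng.randrange(60):02d}" for _ in range(n))

def remembered(counts, memorable=("21:12",)):
    """Keep only the times assumed to stick in memory; the rest are forgotten."""
    return {t: c for t, c in counts.items() if t in memorable}

counts = simulate_glances(6000)
total_glances = sum(counts.values())
recalled = remembered(counts)
```

Under those assumptions every minute shows up about equally often in `counts`, yet `recalled` contains only 21:12, which is the frequency illusion in miniature.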


I have the same for Rush 21:12.

Recommended. https://www.youtube.com/watch?v=AZm1_jtY1SQ


No it's not. There has been more than one instance when me or friends talked about topics and a day later we get weird ads for it on Instagram.

Adtech is creepy and dystopian.


As what % of ads that are unrelated to anything they talked about?


For me it's 9:41



