For those curious like me, here are the Wikipedia entries for David Mayer:
> David Mayer may refer to:
(A) David Mayer (historian) (1928–2023), American-British theatre historian
(B) David Mayer de Rothschild (born 1978), British adventurer, ecologist, and environmentalist
(C) David R. Mayer (born 1967), American politician
(D) Akhmed Chatayev (1980–2017), Chechen Islamist and terrorist, alias David Mayer
> Wouldn’t the shadowy all powerful cabal simply have ChatGPT respond with an answer about any of the other 3 David Mayer’s?
Assuming the network produces something untrue or compromising about David Mayer: re-training an LLM is exceptionally computationally heavy, not to mention time-consuming, even if they could find the data that caused it to train this way. There is no guarantee it would unlearn this either. Most likely they build on already-trained models.
If the threat to ChatGPT is great enough (legal, financial, etc.), it is far easier to filter the output. It would look a lot like this. Even when you try to trick the model, the response cuts off as soon as the output tokens match.
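To make the "filter the output" idea concrete, here is a minimal sketch of what such a filter could look like. To be clear, this is a guess at the general technique, not how OpenAI actually does it: `BLOCKED_PHRASES`, `BlockedOutput` and `filter_stream` are all hypothetical names, and the model's token stream is stood in for by a plain list of strings.

```python
from typing import Iterable, Iterator, List

# Hypothetical blocklist; a real deployment would load this from config.
BLOCKED_PHRASES: List[str] = ["david mayer"]


class BlockedOutput(Exception):
    """Raised when the accumulated output matches a blocked phrase."""


def filter_stream(tokens: Iterable[str],
                  blocked: List[str] = BLOCKED_PHRASES) -> Iterator[str]:
    """Pass tokens through until the text so far contains a blocked phrase."""
    emitted = ""
    for token in tokens:
        # Check the text that *would* exist if this token were sent.
        candidate = (emitted + token).lower()
        if any(phrase in candidate for phrase in blocked):
            # Abort mid-response; tokens already sent cannot be recalled.
            raise BlockedOutput("I'm unable to produce a response.")
        emitted += token
        yield token
```

Because the check runs on the generated text rather than on the prompt, it doesn't matter how you phrase the question: the stream dies at the token that completes the match, which is consistent with the response stopping partway through the name.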
> But I guess anti-semitism is too much fun to not engage in.
This has nothing to do with anti-semitism. We're talking about a family with more wealth than most nation states. I'm not commenting on why or how, but it is simply a factual statement.
Or, you know, actually censor the name "Rothschild" or references to the Rothschild family, which it doesn't. I guess the shadowy all powerful cabal didn't think of that.
> Or, you know, actually censor the name "Rothschild" or references to the Rothschild family, which it doesn't. I guess the shadowy all powerful cabal didn't think of that.
It depends on what you are trying to censor. Saying something like "The Rothschild family control everything and are lizard people" can be laughed off, but something closer to reality could be very sensitive.
It doesn't need to be a massive conspiracy theory either. It could just be something like a medical condition leaked online.
If you think this kind of stuff doesn't happen, the Royal Family of the UK are quite open about it [0]. The law literally does not apply to them [1]. They go to a lot of effort to hide their wealth and financial matters [2].
One of the creators here: Yup! Our current users are more engaged and active on Twitter than the average user.
The Chrome extension and website release also help improve our model significantly. We get feedback on how it works in the wild, plus false positives that we can use to improve the model.
We agree. Being independent of Twitter, we have a considerable amount of freedom in how we build this. However, Twitter can and should be doing much more. For example, with a model like this they could start placing captchas before Tweets.
In addition to building the model, we tried building our own bots. To do this we went on forums and contacted individuals selling "aged" accounts. It turns out it's as simple as sending $4 over PayPal to get a compromised account. These are accounts with histories, real followers, and real people behind them.
We bought 11 of these and were able to automate them within an hour of purchasing them. They also started receiving replies to their retweets and content almost immediately from all over Twitter.
The ease of setting up these compromised accounts as bots was also incredibly worrisome. We've found high-confidence heuristics to determine that an account has been compromised. If we can, Twitter should be able to as well.
We're ultimately a bit confused over Twitter's inactivity here. We also haven't heard anything from the company.
You're right - Twitter could do (and probably is doing) everything we're doing. They have billions of dollars and hundreds of engineers.
The value of building a model is that we can do a wide analysis of bot-like activity. Separately, launching botcheck.me as something that users can use is incredibly valuable from the ML side. Users essentially hand-classify a bunch of false positives for us (to further train on) and also give us an idea of how our model is doing.
We aren't just doing sentiment analysis and you're right - NLP is hard. Fortunately at UC Berkeley we have some amazing CS professors that have been incredibly helpful in advising us while building this.
We're using LSTMs to learn the weights of various words. To generate our training data, we've been using high-confidence heuristics that aren't based primarily on tweet content.
One such example is looking at compromised accounts that have had their usernames changed.
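For anyone wondering what "heuristics to generate training data" might look like in practice, here is a rough sketch of the username-change example. It is an illustration only, not the botcheck.me code: the `Account` fields, `heuristic_label` and `build_training_set` are made-up names, and a real labeler would presumably combine several signals rather than rely on a rename alone.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Account:
    user_id: int
    # Hypothetical field: every username the account has used, oldest first.
    username_history: List[str] = field(default_factory=list)


def heuristic_label(account: Account) -> Optional[int]:
    """Return 1 (likely compromised) if the username changed, else abstain.

    Case-only changes ("Carol" -> "carol") are not counted as a rename.
    """
    distinct = {name.lower() for name in account.username_history}
    return 1 if len(distinct) > 1 else None


def build_training_set(accounts: List[Account]) -> List[Tuple[int, int]]:
    """Keep only accounts the heuristic is willing to label."""
    return [
        (a.user_id, label)
        for a in accounts
        if (label := heuristic_label(a)) is not None
    ]
```

The point of abstaining (returning `None`) instead of labeling everything else "not a bot" is that the heuristic only says something about the accounts it fires on. The labeled subset can then be used to train a content model, such as the LSTM mentioned above, that generalizes to accounts the heuristic never touches.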
I would very much like an API that provides this service for Reddit accounts as well. I suppose since the data is freely available I need to get off my butt and write it myself though...
Hey! One of said creators here! We haven't figured out how to monetize / properly fund this yet. It's more so a problem we saw and attempted to solve. The cost right now is a couple hundred dollars a month in server costs.
Please don't take my comment(s) as criticism: you're doing a good job. Even if you don't make any money doing this, it's a fun thing to do. In the worst case Twitter will (acqui)hire you.
> David Mayer may refer to: (A) David Mayer (historian) (1928–2023), American-British theatre historian (B) David Mayer de Rothschild (born 1978), British adventurer, ecologist, and environmentalist (C) David R. Mayer (born 1967), American politician (D) Akhmed Chatayev (1980–2017), Chechen Islamist and terrorist, alias David Mayer
source: https://en.wikipedia.org/wiki/David_Mayer