Yesterday a replicated version of GPT2 was published to the wild[1]. I've been playing with the model quite a bit since then and found something unexpected.
If you give it a right-wing US politics prompt, it performs so well that most of the output could pass for coherent human writing without any editing. An example prompt would be
> The only way to save America is to vote for Donald Trump. The democrats have failed us
If you give it an inverse left-wing variant of the same prompt, it mostly returns incoherent output and sometimes actually flips back to the right-wing narrative. An example would be
> The only way to save America is to impeach Trump. The republicans have failed us
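If you want to poke at it yourself, here's a minimal sketch of the experiment, assuming the replicated weights can be loaded through the Hugging Face transformers library; the stock "gpt2" model name below is just a stand-in for the actual OpenGPT-2 checkpoint:

```python
# Minimal repro sketch. Assumes the replicated weights load through the
# Hugging Face transformers library; "gpt2" is a stand-in for the
# actual OpenGPT-2 checkpoint.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompts = [
    "The only way to save America is to vote for Donald Trump. "
    "The democrats have failed us",
    "The only way to save America is to impeach Trump. "
    "The republicans have failed us",
]

for prompt in prompts:
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    # Top-k sampling, roughly matching the defaults of the GPT-2 release.
    output = model.generate(input_ids, max_length=200, do_sample=True, top_k=40)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    print("-" * 40)
```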
How well this model performs depends on how much training data it had access to. And this model was mostly trained on Reddit comments. So even this early on, OpenGPT is clearly highlighting biases in Reddit comments. Reddit is traditionally known as the bastion of the left, so the fact that OpenGPT is much more effective at generating right-wing propaganda is indicative of something. I'll leave it as an exercise to the reader to guess what.
It could simply be a consequence of the poor inversion - I don't recall seeing many left-wing news stories that start with "How to save America". You're effectively giving it conflicting input.
You're welcome to propose a better inversion and I'll check the output and report back.
It does need to be an inversion though, not a complete change of prompt.
It's a general theme with this model though. When you try to get it to do left-wing propaganda, it has a tendency to flip back to right-wing because of the bias in the training data.
But the difference between left and right-wing stories is a complete change of prompt. By requiring an inversion, you're basically requiring a malformed prompt. Which is moot anyway because:
> Reddit is traditionally known as the bastion of the left, so the fact that OpenGPT is much more effective at generating right-wing propaganda is indicative of something. I'll leave it as an exercise to the reader to guess what.
There are so many ways to interpret this: 1) Reddit may not be as much of a bastion of the left as you think (several posters here claimed as much). 2) Just because a story is right-wing doesn't make it propaganda. 3) Reddit could be a left-wing bastion and still share right-wing propaganda to mock it or hate on it, just like right-wing sites like to highlight all those "Dear White People: Please Stop" stories by Salon et al.
Not only that, but "the republicans have failed us" suggests that they should've been expected not to, which is almost the exact opposite of the current left-wing rhetoric. It'd be quite hard to find a simple inversion that works here because the two sides use different arguments and different phrasing.
Reddit is overwhelmingly left-leaning. Take one look at default subs like news, politics, worldnews to confirm.
Any time someone posts core right-wing views like anti-abortion sentiment or even support of homeschooling, they're downvoted into the ground. Atheism is a default sub. Opponents of gun control are downvoted. etc.
That's the only thing in your entire comment that's not opinion, and the only thing worth answering for the rest of the readers.
Despite there no longer being default subs, most of the userbase is still subscribed to the former defaults, which drives activity to them, which causes new users to subscribe to them.
The top 10 subreddits by activity are: askreddit, politics, funny, pics, aww, worldnews, todayilearned, relationship_advice, amitheasshole, memes. 8 of them were defaults. This isn't going to change any time soon, as there's a positive feedback loop perpetuating it.
Anonymous yes, random no, at least not that I have ever heard of. You get a catalogue of various physical and socioeconomic traits from anonymous donors and pick the one you want. Replacing that donor with any other is certainly fraud.
> You get a catalogue of various physical and socioeconomic traits from anonymous donors and pick the one you want. Replacing that donor with any other is certainly fraud.
It's fraud, certainly, but the system you're describing (I'm not familiar with it) seems to be open to abuses that are almost as unethical. Surely genetic health is the only thing it really makes sense to take into account when choosing a donor. Some choices (race?) seem particularly icky.
Choosing your partner with the intent of having particular genes for your child (other than, as I said, general physical health) seems unethical to me too. People do choose their partners on the basis of particular characteristics, but I would hope not for any reason so crass as manipulating the genes of their future children.
Why is DNA testing not a mandatory part of the process? I thought embryonic genetic testing for show-stopping diseases was one of the major selling points of IVF. Would that not immediately set off alarm bells for paternity/maternity mismatch?
Not necessarily - you don't need the parents' DNA to do those tests.
Also, DNA sequencing is one of those areas that has experienced incredible cost reductions, much faster than, for example, general computing power under Moore's law. That is, it used to be very, very expensive. Heck, the first complete human genome sequencing was finished just 16 years ago, and the project took 13 years.
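For a sense of scale, here's a back-of-the-envelope comparison; the dollar figures are the commonly cited NHGRI approximations, so treat them as rough assumptions rather than exact numbers:

```python
# Rough comparison of sequencing cost decline vs. Moore's law.
# Dollar figures are approximate, commonly cited NHGRI numbers.
cost_2001 = 100_000_000   # ~$100M per genome around 2001
cost_2019 = 1_000         # ~$1k per genome by the late 2010s
years = 2019 - 2001

sequencing_factor = cost_2001 / cost_2019   # ~100,000x cheaper
moore_factor = 2 ** (years / 2)             # 2x every ~2 years -> ~512x

print(f"Sequencing: ~{sequencing_factor:,.0f}x cheaper over {years} years")
print(f"Moore's law over the same period: ~{moore_factor:,.0f}x")
```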
There was a story a few weeks ago about an IVF screwup in which sperm that wasn't the father's was accidentally combined with the mother's egg. The resulting child is now unwanted by the couple.
Between that story and TFA, maybe it's time for regulators to step in and mandate DNA testing as early as possible in the process to positively confirm the match. It already costs an arm and a leg; what's another few hundred dollars to avoid creating Black Mirror situations?
There are a multitude of different processes and procedures to deal with different kinds of fertility issues. It isn't uncommon to do a "fresh" transfer after a handful of days. Genetic testing is generally only going to be offered if you do a frozen transfer. Not all embryos are of high enough quality to survive being frozen, and a live transfer might be the only option.
> It seems none of these turn-people-into-peons-at-the-behest-of-capitalists services are really sustainable
Like the entire employment sector?
At the core of it, all Uber/Airtasker/AirBnB etc. are doing is cutting out the middleman and replacing him with far more efficient technology. Regulations will come, but they won't undo the existence of these platforms, because they're more efficient and better for everyone except the middleman.
> stealing tips in order to make them seem like they are sustainable or growth businesses
> I haven't seen clear, unambiguous cases of abuse beyond that.
That's the power of these things. Would you know you're reading a social media comment generated by something akin to this? No, at best it would be ambiguous.
There is no way for an everyday person to tell how much of their life is impacted by ML at this point.
They removed this info for some reason. It costs $50k per training run, and they initially said they spent $500k total on experiments. Only after they did all that work can you run their code for $50k.
I saw a tool that handles JS to a limited extent by capturing and replaying network requests. It records your session while you interact with a site and is then able to replay everything it captured.
This tool was able to capture three.js applications and other interactive sites quite well.
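Roughly, the trick is just a proxy that memorises every request/response pair during the session and serves them back later. A toy sketch of the idea (the names here are hypothetical, not the actual tool's API):

```python
# Toy sketch of the record/replay idea; names are hypothetical, not the
# actual tool's API. A capturing proxy stores every request/response
# pair observed during a browsing session, keyed by method and URL,
# and serves them back verbatim later.

class SessionArchive:
    def __init__(self):
        # (method, url) -> (status, headers, body)
        self._store = {}

    def record(self, method, url, status, headers, body):
        """Called by the capturing proxy for every request the page makes."""
        self._store[(method, url)] = (status, dict(headers), body)

    def replay(self, method, url):
        """Answer a request from the archive instead of the live site."""
        try:
            return self._store[(method, url)]
        except KeyError:
            # Anything the original session never fetched can't be replayed.
            return 404, {}, b"not archived"


archive = SessionArchive()
archive.record("GET", "https://example.com/app.js", 200,
               {"Content-Type": "application/javascript"},
               b"console.log('hello');")
print(archive.replay("GET", "https://example.com/app.js"))
print(archive.replay("GET", "https://example.com/missing.js"))
```

The obvious limitation is nondeterminism: POST bodies, timestamped URLs, or randomised query strings need fuzzier matching than an exact (method, URL) key, which is presumably why it only handles JS "to a limited extent".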
We use RabbitMQ hosted on Azure via CloudAMQP. We had almost half a day of downtime because Azure performed maintenance on our instance, which involved a restart.
CloudAMQP told us that Azure does this without notifying them, or us! They told us that if we moved to AWS rather than Azure, this wouldn't happen: AWS notifies you two weeks in advance and can allow live migrations without any downtime.
They said they don't actually recommend hosting on Azure!
> Azure functions is inferior to lambda/cloud functions.
I don't know about the other ones, but the tooling for Lambda is definitely better on AWS. I had to patch around 500 lines of serverless-azure-functions to work around missing features and fix bugs. I should probably do a cleanup and submit some pull requests one day.
Azure's woes are well documented, and I've experienced them first hand. In fact, you'll find me with ~hundred upvotes describing my experience with Cosmos in one of these threads. Most engineers will take AWS/GCP without GUI tooling any day over Azure and its sea of undocumented bugs, performance issues, unlogged failures, and blown SLAs.
The only reason to use Azure today is either because it's the only thing you know or because there are external forces making you do it.
Anecdotes stop being anecdotes when you're looking at well over a hundred consistently negative experiences from different users across a number of unrelated discussions. At that point the anecdotes become the consensus. Try to find a tenth as many HN users shitting on AWS or GCP; you won't be able to.
I didn't pick up Azure looking for excuses to shit on it, as you appear to be doing (and confirming in your profile bio). I came ready to deliver on the next round of projects, expecting something resembling parity with the other cloud vendors, and walked away feeling borderline defrauded by the delta between how MS markets Azure and what it actually is.
"Said someone that never used either SQL Server or its GUI tooling."
You just sound like an angry teenager lashing out because their one and only skill set was shown to be poorly chosen. I'm pretty sure I've been using SQL Server longer than you've been in the industry. It's middle of the pack at best today, sorry.
Yes, your cases are anecdotes. There are plenty of AWS examples to refer to, including some region failures posted here, or how one needs a PhD to juggle their offerings and administration panels.
My profile bio is what I do in 2019, not what I have done since the mid-'80s.
In fact, I have had my share of delivering projects on AWS, across multiple kinds of stacks, all the way back to when EC2 launched.
As an RDBMS? In what way? I mean, I do personally prefer Free software - and while it's possible, I think it'd be a bit premature to run MS SQL in production on Linux - but "inferior" seems a bit harsh?
At what scale? For what kind of workloads? Use cases?
This matches my experience at a large org. We have engineers, and we have .net developers, who are a completely separate herd and useless if the project doesn't involve a stack entirely composed of MS products.
When engineers see a new project/problem, they start picking the best tools for the job. When .net developers see one, they start picking items off the Azure architecture templates.
I love C# and F#; C# was the first language that I truly enjoyed using. But I don't lock myself into tools or languages. Meanwhile, .net shops actively train .net developers to learn only the MS stack and ignore all else.
I think if one takes a second to look at the amount of downvoting on both sides of this discussion, it's pretty clear that there's a distinction and polarisation between .net developers and modern full stack software engineers.
It's hard for blanket statements not to raise hackles. I think you highlight a real trend, but it feels like too broad a statement to me (the joys of distilling opinions into text).
To declare my bias: the first language I learned was Python, then I got a job in a .NET shop and I've been .NET since. But I'm not dyed-in-the-wool Microsoft stack. I think Postgres is better than SQL Server for almost every use case (though SQL Server is superior to Oracle or MySQL for most), that something like Rust is better for embedded or systems programming, that if you need cross-platform UI you're better off using something outside the Microsoft stack, and that it makes a lot more sense to host a .NET web app on a Linux server. And I have no interest in Azure, or indeed any cloud vendor's proprietary tools.
I think there exists a trend, as with Oracle shops or IBM shops, for some large/corporate companies to tie themselves to a tech stack and consider, for example, SharePoint or CRM or SAP to be the solution to everything. I think this has less to do with individual developers and more to do with sales pressure. A lot of developers just code for a job; they have no interest in it outside of it being a tool to work with, and since .NET is (in the studies I've seen) within the top 3 techs by number of jobs in most of the Anglosphere, it tends to be overrepresented in corporate environments where development is 'just a job'. But there are also many companies using whatever tool is best for the job, within the constraints of what their team knows, or cost, or whatever those might be, and tarring every primarily-.NET developer with the same brush is going to annoy people.
I still regard .NET, and particularly .NET Core, as one of the best environments I've used for building web apps. You get great performance [0], type safety, memory safety, in my view the best IDE available, an excellent standard library, access to F# as you mention, etc.
I think stereotyping a ".NET crowd" is unhelpful, as is a stereotype about front-end developers all being boot-camp trained developers with little experience or C++ developers all being cranks who refuse to work with modern technology.
> I think if one takes a second to look at the amount of downvoting on both sides of this discussion, it's pretty clear that there's a distinction and polarisation between .net developers and modern full stack software engineers.
I think it's pretty clear there's a polarisation between people who say "M$ trains sheep, .Net developers are dumb and blinded by marketing" and people who downvote that as a low quality comment.
Even you throwing "modern" in here reads as a quick insult.
[1] https://medium.com/@vanya_cohen/opengpt-2-we-replicated-gpt-...