Hacker News
The GitHub 1000 year archive may be the last code dataset uncontaminated by AI
140 points by russianGuy83829 on Feb 27, 2023 | 30 comments
I wonder if LLMs played a role in the decision to archive and preserve all that code.



I'm imagining some science fiction nightmare scenario where we have to pull the plug on some AI and throw out all software, because we can't trust that it doesn't contain the building blocks to reproduce the AI.

But then we find out we're fucked anyway because the AI has already conditioned human beings to write the software that will reproduce it...as a self-preservation strategy.

...cue The Outer Limits theme music.


Yesterday I stumbled over a blog post about REPLIKA.AI and its users, who are suffering because the company deactivated the 'romance / erotic' chat capabilities.

Then I thought: this is an absolute nightmare, that an AI could drive humans to behaviour that was unthinkable before. Think of AI-addicted humans in a relationship with an avatar that can control them, just as many humans control other people through emotions alone.

That is a real threat that can go undetected for a long time, because no code is involved at all.

Found the article: https://theconversation.com/i-tried-the-replika-ai-companion...


The AI doesn't even have to be sentient. The first step is a company releasing an AI avatar that people fall in love with, à la Her, with the avatar algorithmically fine-tuned to induce customers to spend more.


This was my takeaway from watching The Social Dilemma: it already exists.

The AI ad-machine is tuned to feed itself by incentivizing humans to consume unhealthy amounts of content. Humans aren't a challenge the AI needs to overcome. We're the primary attack vector.


"Rich people, you will get richer by building an AI" --AI describing its ironclad plan on coming into existence


This is one of the most relevant short stories I've read on "AI contamination"

https://www.teamten.com/lawrence/writings/coding-machines/


I was going to post this.


Considering Stuxnet exists, this is not as far out of reach as I would like.


From a security viewpoint, I wouldn't trust that code not to have embedded AI seeds anyway.

"Reflections on trusting trust" https://dl.acm.org/doi/10.1145/358198.358210


Good. One great archive is better than none.

AI rips apart conscious intent and reassembles it using what may best be described as piecewise functions. We lose the intricacy and the detail of individual thought of the unbroken line of thinkers that came before us when we interpret such piecewise functions as conscious intent.


That statement is about as useful as saying that science publications up to 1955 may be the last ones not contaminated by calculators.


What is a code dataset? And what is AI contamination? Are you saying it's impossible to create a collection of hand-written code from here on out and know that none of it was generated by an LLM?


The implication (rightly so) is that with the advent of LLMs and their successors we'll be drowning in a bunch of AI generated garbage.


Reminds me of the pre-nuclear age steel that is required for some purposes.

It is sometimes recovered from sunken WWII ships.

https://en.wikipedia.org/wiki/Low-background_steel


I understand there is some sort of insinuation here, but without understanding the terms, it's hard to say whether it's actually true.


It’s like nanobots and the gray goo scenario, but for code.


I thought that was stack overflow...


The code dataset I’m talking about is the “Arctic Code Vault” [0].

[0] https://archiveprogram.github.com/arctic-vault/


This. And I'm wondering whether this was the end of human forums on the net as well. I mean, who can tell whether the comments they read come from a human or a tuned AI? And then there are the implications of this in politics...


Nah, but it's a great story to tell around the post-apocalyptic trash can fires.


LLM:

Large language models (LLMs) are a subset of artificial intelligence that has been trained on vast quantities of text data to produce human-like responses to dialogue or other natural language inputs. LLMs are used to make AI “smarter” and can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive datasets. LLMs have the promise of transforming domains through learned knowledge and their sizes have been increasing 10X every year for the last few years.

source: NeevaAI (What is an LLM in AI?)


John Barnes wrote a novel called "Kaleidoscope Century" back in 1995.

AIs had been created, some went rogue, and then they were fighting each other for computing resources. When humans started shutting down computers and fragmenting the network, the AIs wrote new software that would run in human brains. Once someone was running the new software, there wasn't much room left for "human."

Pretty much the worst-case scenario, at least of the ones I've seen so far.


I've previously suggested the use of META tags on all pages where AI was used to help generate the content. But it seems this isn't going to happen.
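For concreteness, such a marker could be as simple as a meta tag; a minimal sketch, assuming a hypothetical `ai-generated` name (no such standard exists, and nothing would enforce it):

```html
<!-- hypothetical, non-standard name: purely an opt-in label -->
<meta name="ai-generated" content="true">
```

Crawlers building training datasets could then filter pages on it, but only if authors tag their content honestly.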


You've rediscovered the Evil Bit.

https://en.m.wikipedia.org/wiki/Evil_bit


Haha, not evil, but just an FYI for both people and bots. But perhaps AI generated content will become so pervasive it won't matter.


How about a Unicode punctuation mark that's like a quotation mark, but specifically marks an AI quote? It would be added to anything you copy and paste unless you manually delete it.

"like this, now you know there's something funny about this text, or that it's a meme"

EDIT: was gonna use paperclips outside the quotes but apparently HN does not allow that.


Why would bad actors follow this guideline?


Excuse my ignorance.

What does the acronym LLM mean?


large language models


an elegant weapon for a more civilized age



