kapitanjakc's comments | Hacker News

I've had that thing bookmarked in my machine for quite some time now.

Whenever I open it, it gives me a good laugh.


I don't have watch history on.

I don't see anything on my home page.

I specifically open YouTube when I want to look at something, and I have to search for it.

I still watch related videos, but it's way better compared to what I got on the home page.


An episode of The Big Bang Theory was what introduced me to Brian Wilson, through "Darlin'".

RIP


I've read stories about him on folklore.org.

He was a good man and a great engineer.

RIP


I don't have a personal site yet, but when I do, I plan to make it with HTML+CSS+JS/jQuery only.

Maybe Apache or nginx as the web server,

and host it on shared hosting or the AWS free tier.

I just need to figure out how to center a div, and then I'll be in business.


AWS free tier. S3 + CloudFront has cost me $0.00 for the last year, which is incidentally the best price.

My (single-page) personal site is HTML+CSS (no JS), based on a template generated by ChatGPT because I don't give a crap. Making something that works on both a mobile device and a desktop is beyond my meagre skills; this worked fine.


> AWS free tier. S3 + CloudFront has cost me $0.00 for the last year, which is incidentally the best price.

I haven't tried this setup, but I'm using Cloudflare to serve my static sites for $0.00 as well. My mini Rails apps I've gotten down to a $6/month VPS, which I'm happy enough with for anything a bit spicier.


I would do that, but I dislike Cloudflare because they wanted my DNS as well. I keep my DNS and CDN separate; too many eggs in one basket otherwise.


Within minutes you could start at https://neocities.org/


I've never understood the whole centering a div meme.

    width: 60%;     /* define your width as desired */
    margin: 0 auto;
Now go start your blog!


I'm not sure if you are being serious about not understanding "the whole centering a div meme". Your example handles a trivial case, but does not address the whole of the problem.

As others have pointed out, vertical centering is often the problem being discussed (although difficulties with horizontal centering do happen). Everyone I know who has written a non-trivial web application has run into a situation where they spent far more time than they expected getting some element centered on the page the way they wanted it.

This article is a good example of the complexity, I think:

https://css-tricks.com/centering-css-complete-guide/

The author makes a decision tree, which illustrates the complexity fairly well, and then there's a conversation in the comments between the author and a reader about whether parts of the decision tree are correct.

CSS is extremely complicated. It's easy to get lost in the complexity, and it can be very frustrating when you know how you want something to look but can't quite figure out how to make it happen.

That's why the meme is so popular. LOTS of people who deal with CSS can relate.


That's the old hacky way of doing it; place-content makes it even easier.
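
Roughly, assuming a grid container (the selector here is just for illustration):

    .wrapper {
      display: grid;           /* place-content applies to grid containers */
      place-content: center;   /* centers the content both horizontally and vertically */
      min-height: 100vh;       /* give the container some height to center within */
    }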


Now center a div with unknown height vertically :-)

And no cheating by using flexbox!
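
(For reference, one non-flexbox approach that copes with an unknown height, sketched here with made-up class names, is absolute positioning plus a transform:)

    .parent {
      position: relative;
      height: 100vh;                    /* the parent needs some explicit height to center within */
    }
    .child {
      position: absolute;
      top: 50%;                         /* push the child's top edge to the parent's midpoint */
      left: 50%;
      transform: translate(-50%, -50%); /* pull back by half the child's own (unknown) size */
    }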


^ Comment flagged for sadomasochism.


How do you center something on an axis with no limits placed to form a segment? That's mathematically impossible unless you place the limits first.


Powerful the Force is, young Padawan, as is the strength of your doubts. Release them you must.


<center> </center>

It's been working since the last century.


I'll still bust this out if it's some quick page that isn't going to last long (like a "service down for maintenance" page that's only going to be visible for a few minutes, or something).

It's "bad" but you know what? It fucking works, it's concise, and I can remember it no matter how long I go between writing HTML/CSS.

Hell, I wouldn't be surprised if the paths it takes through a typical browser engine also make it burn 5% as many cycles as CSS centering methods, or fewer.


I did the same: https://domi.work/

And it's also ugly :)


I love this:

- Most of it is CSS, which when removed still leaves a pretty functional website.

- Most of the CSS is just one (commented-out) background image.

- There are about 5 lines of JavaScript, which seem to exist just to obfuscate your email.


Wow, I completely forgot about that image! Thank you for reminding me (it is now gone).

It was an experiment a while back, and it was inline in order to keep everything in one file. Actually, that made me realize my site is dynamic: because I edit this one HTML file live on the server to make changes, anyone who reloads my website while I'm doing that will see the changes live.


GitHub has free hosting.


GitHub has poor backward compatibility with older browsers. Considering it's owned by Microsoft, we should probably start counting the days until it ends up behind a login wall like LinkedIn.


If your budget isn't literally zero, avoid AWS and get a cheap VPS from Digital Ocean, Linode, Vultr, OVH, or Hetzner Cloud, IMO.

The problem with AWS is their extortionate egress fees, which are about 50-100 times the market price.


What I'm doing for my site is similar; I just sprinkle 11ty on top for static generation, then publish on Netlify.


GitHub Copilot was doing this earlier as well.

I am not talking about giving your token to Claude, GPT, or GitHub Copilot.

It has been reading private repos for a while now.

The reason I know about this is a project we received to create an LMS.

I usually go for Open edX, as that's my area of expertise. The ask was to create a very specific XBlock (think of XBlocks as plugins).

Now, your Open edX code is usually public, but XBlocks created specifically for clients can be private.

The ask was similar to an integration of a third-party content provider I had done earlier (mind you, the content is also in a very specific format).

I know that no one else in the whole world had done this, because when I did it originally I looked for it, and all I found was the content provider's marketing material. Nothing else.

So I built it from scratch, put the code in the client's private repos, and that was it.

Then recently a new client asked for a similar integration; as I had already done that sort of thing, I was happy to take it on.

They said they already had the core part ready and wanted help finishing it.

I was happy and curious: happy that someone else had gone through the process, and curious about their approach.

They mentioned it was done by interns on their in-house team. I was shocked. I am no genius myself, but this was not something a junior engineer, let alone an intern, could do.

So I asked for access to the code and was shocked again. It was the same code I had written earlier, with the comments intact. The variable names were spelled differently, but the rest of it was identical.


> I know that no one else in the whole world did this because when I did it originally I looked for it.

Not convincing, but plausible. Not many things that humans do are unique, even when humans are certain that they are.

Humans who are certain that the things they themselves do are unique are likely overlooking that prior.


Agreed. Ask it for the cutoff date. I did: June 2024...


It seems you're implying GitHub Copilot trained on your private repo. That's a completely separate concern from the one raised in this post.


In GitHub Copilot, if we select the "don't use my code for training" option, does it still leak your private code?


Read the privacy policy and terms of use:

https://docs.github.com/en/site-policy/privacy-policies/gith...

IMO, you'd have to be naive to think Microsoft makes GitHub basically free for vibes.


GitHub Copilot is most definitely not free for GitHub Enterprise customers.


I didn't realize we were talking about that.


Yes. Opt-outs like that are almost never actually respected in practice.

And as the OP shows, Microsoft is intentionally giving away private repo access to outside actors for the purpose of training LLMs.


You're completely leaving out the possibility that the client gave others the code.


Lol, I never thought about that; it's highly possible.


Which provider is immune to this? GitLab? Bitbucket?

Or is it better to self-host?


Self-hosted GitLab with a self-hosted LLM provider connected to GitLab, powering GitLab Duo. This should ensure that the data never leaves your network and is never used as training data, while still allowing you and your staff to use LLMs. If you don't want to self-host an LLM, you could use something like Amazon Q, but then you're trusting Amazon to do right by you.

https://docs.gitlab.com/administration/gitlab_duo_self_hoste...


GitHub won’t use private repos for training data. You’d have to believe that they were lying about their policies and coordinating a lot of engineers into a conspiracy where not a single one of them would whistleblow about it.

Copilot won’t send your data down a path that incorporates it into training data. Not unless you do something like Bring Your Own Key and then point it at one of the “free” public APIs that are only free because they use your inputs as training data. (EDIT: Or if you explicitly opt-in to the option to include your data in their training set, as pointed out below, though this shouldn’t be surprising)

It’s somewhere between myth and conspiracy theory that using Copilot, Claude, ChatGPT, etc. subscriptions will take your data and put it into their training set.


“GitHub Copilot for Individual users, however, can opt in and explicitly provide consent for their code to be used as training data. User engagement data is used to improve the performance of the Copilot Service; specifically, it’s used to fine-tune ranking, sort algorithms, and craft prompts.”

- https://github.blog/news-insights/policy-news-and-insights/h...

So it's a "myth" that GitHub explicitly says is true…


> can opt in and explicitly provide consent for their code to be used as training data.

I guess if you count users explicitly opting in, then that part is true.

I also covered the case where someone opts-in to a “free” LLM provider that uses prompts as training data above.

There are definitely ways to get your private data into training sets if you opt-in to it, but that shouldn’t surprise anyone.


In another comment you say "It would involve thousands or tens of thousands of engineers to execute. All of them would have to keep the conspiracy quiet." Yet if the pathway exists, it seems to me there is ample opportunity for un-opted-in data to take that pathway, with plausible deniability of "whoops, that's a bug!" No need for thousands of engineers to be involved.


Or instead of a big conspiracy, maybe this code which was written for a client was later used by someone at the client who triggered the pathway volunteering the code for training?

Or the more likely explanation: That this vague internet anecdote from an anonymous person is talking about some simple and obvious code snippets that anyone or any LLM would have generated in the same function?

I think people like arguing conspiracy theories because you can jump through enough hoops to claim that it might be possible if enough of the right people coordinated to pull something off and keep it secret from everyone else.


My point is less "it's all a big conspiracy" and more that this can fall into Hanlon's razor territory. All it takes is not actually giving a shit about un-opted-in code leaking into the training set for this to happen.

The existence of the AI-generated Studio Ghibli memes proves AI models were trained on copyrighted data. Yet nobody has been fired or sued. If nobody cares about that, why would anybody care about some random nobody's code?

https://www.forbes.com/sites/torconstantino/2025/05/06/the-s...


Companies lie all the time; I don't know why you have such faith in them.


Anonymous Internet comment section stories are confused and/or lie a lot, too. I’m not sure why you have so much faith in them.

Also, this conspiracy requires coordination across two separate companies (GitHub for the repos and the LLM providers requesting private repos to integrate into training data). It would involve thousands or tens of thousands of engineers to execute. All of them would have to keep the conspiracy quiet.

It would also permanently taint their frontier models, opening them up to millions of lawsuits (across all GitHub users) and making them untouchable in the future, guaranteeing their demise as soon as a single person involved decided to leak the fact that it was happening.

I know some people will never trust any corporation for anything and assume the worst, but this is the type of conspiracy that requires a lot of people from multiple companies to implement and keep quiet. It also has very low payoff for company-destroying levels of risk.

So if you don’t trust any companies (or you make decisions based on vague HN anecdotes claiming conspiracy theories) then I guess the only acceptable provider is to self-host on your own hardware.


Another thing that would permanently taint models and open their creators to lawsuits is training on many terabytes' worth of pirated ebooks. Yet that didn't seem to stop Meta with Llama [0]. This industry is rife with such cases; OpenAI's CTO famously could not answer a simple question about whether Sora was trained on YouTube data or not. And now it seems models might be trained on video game content [1], which opens up another lawsuit avenue.

The key question from the perspective of the company is not whether there will be lawsuits, but whether the company will get away with it. And so far, the answer seems to be: "yes".

The only exception that is likely is private repos owned by enterprise customer. It's unlikely that GitHub would train LLMs on that, as the customer might walk away if they found out. And Fortune 500 companies have way more legal resources to sue them than random internet activists. But if you are not a paying customer, well, the cliche is that you are the product.

[0]: https://cybernews.com/tech/meta-leeched-82-terabytes-of-pira...

[1]: https://techcrunch.com/2024/12/11/it-sure-looks-like-openai-...


With the current admin I don't think they really have any legal exposure here. If they ever do get caught, it's easy enough to just issue some flimsy excuse about ACLs being "accidentally" omitted and then maybe they stop doing it for a little while.

This is going to be the same disruption as Airbnb or Uber. Move fast and break things. Why would you expect otherwise?


I really don't see how tens of thousands of engineers would be required.


I work for <company>. We lie; in fact, many of us in our industry lie, to each other, but most importantly to regulators. I lie for them because I get paid to. I recommend you vote for any representative who is hostile towards the marketing industry.

And companies are conspirators by nature; plenty of large movie/game production companies manage to keep pretty quiet about game details and release dates (and they often don't even pay well!).

I genuinely don't understand why you would legitimately "trust" a corporation at all, especially when it relates to them not generating revenue or market share where they otherwise could.


If you found your exact code in another client’s hands then it’s almost certainly because it was shared between them by a person. (EDIT: Or if you’re claiming you used Copilot to generate a section of code for you, it shouldn’t be surprising when another team asking Copilot to solve the same problem gets similar output)

For your story to be true, it would require your GitHub Copilot LLM provider to use your code as training data. That’s technically possible if you went out of your way to use a Bring Your Own Key API, then used a “free” public API that was free because it used prompts as training data, then you used GitHub Copilot on that exact code, then that underlying public API data was used in a new training cycle, then your other client happened to choose that exact same LLM for their code. On top of that, getting verbatim identical output based on a single training fragment is extremely hard, let alone enough times to verbatim duplicate large sections of code with comment idiosyncrasies intact.

Standard GitHub Copilot or paid LLMs don’t even have a path where user data is incorporated into the training set. You have to go out of your way to use a “free” public API which is only free to collect training data. It’s a common misconception that merely using Claude or ChatGPT subscriptions will incorporate your prompts into the training data set, but companies have been very careful not to do this. I know many will doubt it and believe the companies are doing it anyway, but that would be a massive scandal in itself (which you’d have to believe nobody has whistleblown)


Indeed. In light of that, it seems this might (!) just be a real instance of "I'm obsolete because interns can get an LLM to output the same code I can."


Hmm, it could very well be. But with the comments intact?

Anyway, one thing I did not consider, which another comment pointed out, is that the original client could have provided the same code, since they are also the actual owners.


I believe the issue here is with the tooling provided to the LLM. It looks like GitHub is providing tools to the LLM that give it the ability to search GitHub repositories. I wouldn't be shocked if this were a bug in some crappy MCP implementation someone whipped up under serious time pressure.

I don't want to let Microsoft off the hook on this, but is it really that surprising?

Update: I found the company's blog post on this issue:

https://invariantlabs.ai/blog/mcp-github-vulnerability


No, what you're seeing here is that the underlying model was trained with private repo data from GitHub en masse, which would only have happened if MS had provided it in the first place.

MS also never respected this in the first place; exposing closed-source and dubiously licensed code used in training Copilot was one of the first things that happened when it was initially made available.


Or, as the other comment points out, the original client might have used it on the code. So my conspiracy theory just came crashing down.


Thinking a non-enterprise GitHub repo is out of reach of Microsoft is like giving your phone number to Facebook for authentication and thinking they won't add it to their social graph matching.


“With comments intact”

… SCO Unix Lawyers have entered the chat


I haven't tried the Frigate solution.

Actually, we are web developers with expertise in Python, Flask, Django, etc., so we aren't familiar with it.

We developed some sites for said client, and they said, "You are experts in Python, so figure out a solution for this."

I'm thinking it'd be better to go with an existing solution, such as Hikvision or something, instead of creating a custom thing.


From my personal experience; nothing scientific or proven here.

I've been sitting in a small office for the last few years. A year or so ago I started to become less mentally active, as in things were running on autopilot.

I did not feel good in general, and a friend who practices yoga advised me to do breathing exercises.

15-30 minutes of deep breaths in open space early in the morning, after a shower and before breakfast. Followed by 3-5 minutes of rapid breathing. And finishing by taking in as much air as I can and holding it for 30 seconds to a minute, repeated 2-3 times.

I do feel active after that; I wonder if it's related to these studies.


I encourage you and everyone else interested to attend a Holotropic Breathwork session to truly grasp the profound impact your breath can have on your mind. This is nothing like your regular five-minute yoga breathing exercise, box breathing, or even Wim Hof breathing. It's a completely different level. These sessions typically last 3 to 5 hours and take place in a safe, supportive setting with a dedicated sitter and experienced facilitators.

And please don’t try this stuff alone at home.


Did you check the CO2 levels in your office? That could be one reason.


For several years in a row, I was living every day suffering from severe sleep deprivation. I was not merely homeless, but living on the streets, and I became really intent on walking around all night, rather than trespass or sit down for a rest, in someplace where I didn't belong. Or I would sit in the IHOP, and drink 2 pots of coffee and stare, zombie-like, until the Sun rose. So I lost a lot of sleep and I dozed whenever possible, and not in a bed, but often seated at a table, with my arms folded, and my face buried in those folded arms, while others made chit-chat and the music played around me.

Well, I'd get into an enclosed space with lots of people, and I'd begin to pass out. It happened a lot in church. We'd be singing and standing and sitting and kneeling, and I'd be just ready to conk out and go to sleep. And I would do crazy things like, lunging for the thermostat because it felt so warm and close in there. I thought everyone was feeling the same stale, stuffy air as I was. I don't know. It would also happen in the coffeehouses, but sleep was guaranteed to overcome me during liturgies.

But I came to believe that it was a CO2 buildup sort of situation. With a lot of human bodies in a closed space, and we were all vocalizing for an hour or so, and it was winter so perhaps the heat was on, or the air conditioning was turned off. And so CO2 buildups were the most likely thing.

Once I was housed, and able to catch up on sleep, it doesn't happen anymore. I did complain to my doctor and I asked him if I may have COPD. He insisted that I breathed better than he did. He brought in two young Medical Assistant ladies to do this breathing exercise so that he could prove there's nothing wrong with me. Of course we didn't get to that point of discussing sleep deprivation, because you can't medicate that. Well, a psychiatrist could try, with extra-drowsy meds. And they did try. I resented that.


This is really valuable information. Thank you.


Do they have vulnerabilities and/or security loopholes in an organization as big as Twitter too?

Or was the DDoS so big that it simply overwhelmed every protection that was in place?


They've shut down redundant data centers and fired countless engineers. They've slimmed down into a more efficient, cheaper service that can't handle curveballs, but at least the website is overrun with neo-fascist morons.


Don't forget the data showing that as many as 1 out of every 3 users on X is now a bot.


They’ve had countless failed livestreams since Musk took over too.


Isn't interbreeding bad for their health?

Genuine question.

Is it interbreeding in the sense that all the bison present now share the same ancestors, or is it more like a single family of 6k bison now?


The notable discovery is that, where the evidence a few decades ago was that the bison were breeding in their historical herds (so multiple, smaller genetic pools), they now appear to be breeding between herds (so a single larger, more diverse genetic pool).

AIUI with small populations, more variation in breeding between groups is a good thing, because it spreads genetic diversity across the whole population.


You probably mean inbreeding. Interbreeding is good. It is good that the bison herds mingle and interbreed.

Anyway, for mammals an initial population of a couple dozen individuals (assuming they're reasonably genetically diverse in the first place) is plenty enough to produce a population of any size without problems.


There's a general guideline called the 50-500 rule. You need at least 50 animals to avoid immediate inbreeding (and also stochastic extinction from a fire or flood or disease etc), and about 500 to have a genetically healthy population. That varies some after a bottleneck event since your genetic population will be functionally less than your actual physical one, but it's a decent way to approach the problem.


Having a single breeding population across the park creates more genetic diversity than would be present in isolated herds.


> Bison like those in Yellowstone once suffered a population crisis that conservationists call the "population bottleneck" of the 19th century. By the early 1900s, American bison numbers had been reduced by 99.9% across North America and only 23 wild bison were known to have survived poaching in Yellowstone.

So at their worst, this particular population had only 23 individuals left. Inbreeding is bad insofar as it increases the chances of passing harmful recessive genes to younger generations.


Compared to the alternative of the species not surviving at all, it seems like the better option :)

Besides, it seems like they think it's genetically healthy, so doesn't seem like a problem. I'm assuming they've verified this somehow.

> Today, the Texas A&M researchers report that the Yellowstone bison population appears to be functioning as a single and genetically healthy population that fluctuates between 4,000 and 6,000 individuals.


> Compared to the alternative of the species not surviving at all

How about compared to two distinct herds?


Getting that "living on top of a volcano" risk feeling :-)


Biology is not my area of expertise, but: inbreeding is bad when it's a small population inbreeding for a long, long time. From the article, it sounds like they aren't worried about the genetic diversity of this 6k bison herd. I'm sure it would be better to have more diversity, but that's hard to achieve with animals brought back from near extinction.


There are a lot of private herds, but many of them have been bred with domesticated cattle and do not have pure bison DNA. They could be used as a last resort. The solution here would be to slowly start separating herds into more locations away from Yellowstone; over generations, the genetic makeup will diverge enough to be considered separate populations.


At least two groups are now breeding as a single population, so the genetic diversity may be more spread out over the population. As I understand the article, there were two functionally separate groups as late as 20 years ago (already 100 years after the introduction of the Texas bison to the original Montana herd), and now they are recorded as being a single population.

