Hacker News | htormey's comments

This has been down for multiple hours now. During this time we have migrated off of fly to coolify/digitalocean.

This is the second time this has happened in recent months. The last straw was one of our customers reaching out to let us know our site was down. We had paid for two machines with fly to have redundancy.

Pretty sad as apart from the outages we really liked fly. Hopefully they fix things and learn from this experience.


Pretty handy collection of prompts to do basic things with LLMs. I’ve had good results with using Claude to explain code, tag sentiment and extract emails or other specific content from free form text.

If you plan on using any of these at scale I recommend investing in a good evaluation test harness to check for regressions when you tweak prompts.
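A minimal sketch of what such a harness could look like, assuming a sentiment-tagging prompt. Everything here is hypothetical: `fake_model` is a stand-in for whatever LLM client you actually call, and the template, cases and baseline are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    text: str      # input passed into the prompt template
    expected: str  # label we expect the model to return


# Hypothetical prompt template under test.
TEMPLATE = "Classify the sentiment as positive or negative.\nText: {text}"

# Hypothetical labelled cases; a real suite would be much larger.
CASES = [
    Case("Our site was down for hours.", "negative"),
    Case("We really liked the product.", "positive"),
    Case("Pretty sad about the outage.", "negative"),
]


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): a crude keyword rule."""
    body = prompt.split("Text:", 1)[-1].lower()
    return "negative" if ("down" in body or "sad" in body) else "positive"


def run_eval(model: Callable[[str], str], template: str, cases: List[Case]) -> float:
    """Return the fraction of cases where the model output matches the label."""
    passed = 0
    for case in cases:
        output = model(template.format(text=case.text))
        if output.strip().lower() == case.expected:
            passed += 1
    return passed / len(cases)


def check_regression(model: Callable[[str], str], template: str,
                     cases: List[Case], baseline: float) -> bool:
    """True if the template scores at least as well as the recorded baseline."""
    return run_eval(model, template, cases) >= baseline


if __name__ == "__main__":
    score = run_eval(fake_model, TEMPLATE, CASES)
    print(f"accuracy: {score:.2f}")
```

The idea is to record the score of the current prompt as a baseline, then rerun `check_regression` whenever you tweak the template and block the change if the score drops.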


I don’t trust anecdotes on Twitter, because every time I’ve tried an agent that’s been hyped up it’s been more expensive and time-consuming than just using GitHub Copilot with Claude/ChatGPT and putting up a PR myself.

Hence I’m skeptical of people making claims about a product I can’t try out myself. It’s unclear whether the tasks they are doing and the way they are using agents are relevant to the work I do, which is usually working on a team of engineers shipping code on a complex code base.

For AI I tend to put a lot more weight on benchmarks, such as SWE-bench, which is why I wrote an article about it:

https://www.stepchange.work/blog/why-do-ai-software-engineer...

SWE-bench is mostly small Python tasks, evaluated solely by unit tests, which require changes of fewer than 15 lines to a single file. The agent fails at most of those, and on the ones it gets right it ignores all sorts of libraries and conventions used in the rest of the code base.

I’m optimistic that agents will improve dramatically in a few years, but today Devin is not good at making larger changes that build on one another, like features.


AI software engineers like Devin and SWE-agent are frequently compared to human software engineers. However, SWE-bench, the benchmark on which this comparison is based, covers only Python tasks, most of which involve single-file changes of 15 lines or fewer, and it relies solely on unit tests to evaluate correctness. My aim is to give you a framework to assess whether AI's progress against this benchmark is relevant to your organization's work.


This article is about Cognition Labs' Devin and what’s in the benchmarks used to assess it, and AI in general, on coding tasks.


To summarize what I think the author is trying to say with this article:

1) The stock market is in a bubble due to a decade of low interest rates and tax slashing by “right wing” governments.

2) Big tech in particular has been doing well but this is not sustainable.

3) AI is in a bubble. People are pinning their hopes on it to keep tech and I presume big tech growing.

4) A bunch of references to academic papers from 2000 about why AI is hard.

5) Gen AI requires a lot of compute which generates a lot of carbon and is bad for the environment.

Thus his statement: “I think I’m probably going to lose quite a lot of money in the next year or two. It’s partly AI’s fault, but not mostly.”

I disagree with this, because A) I think in the long term (5+ years) the investment in AI will have a positive ROI, B) if the stock market crashes in the short term it’s likely going to be for non-AI reasons, and C) his arguments as to why AI isn’t going to pan out long term are a bit weak.

Having lived in the Bay Area for over 13 years, I’ve seen a few cycles: social, mobile, cloud, gig economy, etc.

The cycle pattern is always the same: a) a big new exciting tech idea comes along. b) investors pile in money. c) 95% or more of the companies they invest in go bust and if the space has legs some companies do really well.

How is this any different with the current wave of AI companies?

Today the big winners in AI are the incumbents, some examples:

Microsoft: making money as the hyperscaler of choice for AI companies (on-prem ChatGPT, Mistral, etc.), and from its Copilot lines and enterprise subscription products.

Nvidia: making bank as the current standard on which all of these companies run their models. They have some recent competition from Groq but are still likely to be crushing it for the next year or two, mainly due to precommits from the hyperscalers.

Meta: seems to have leveraged AI to improve ad targeting and claw back the advertising revenue it lost to Apple's crackdown.

As someone who has raised venture capital to do an AI startup, I’d say yes, there is a lot of hype in this space. Yes, a lot of these startups are going to go out of business, but it’s also early days.

I also think working AI into this poorly written article about how the stock market is going to crash is a bit of a stretch.

I’m concerned about a market crash myself, but I am more worried about it being caused by a combo of a) the upcoming US election, b) the war in Ukraine, c) conflict with Iran, and d) high interest rates in the USA.


Personally I don’t believe there’s a conspiracy regarding this, but just to play devil’s advocate:

Clearly, the heads of HR and other people who define corporate compensation talk to one another, “hey what are you guys doing to manage pay cuts, reductions in staff, etc in this economy at company x/y/z”.

It’s a pretty obvious benefit of having a strong professional network, i.e., you have people you can ask for mentorship and advice. Every startup board was asking companies to tighten their belts and reduce costs because of the economy earlier this year and last year.

A relatively small number of companies and startups in tech define top of market for compensation. Clearly the people at those companies know one another and talk about what they are doing.


Yeah, on point observation. I worked at both companies. I left Apple to work at Facebook because I wanted to be able to participate in open source projects and talk about my work with my coworkers openly.


You're banned from working on your own open source projects at Apple? Had no idea, disappointing if true.

I could understand not talking about current work, but if you have an open source project that's unrelated, you can't publicly do anything?


I don’t think comparing the timelines of vastly different technologies like this is helpful.

Prior to the web, in the 1960s/1970s we had packet-switching networks, such as ARPANET which were the basis of modern computer networking.

The original ARPANET (precursor to the internet) was just used to connect computers at research institutions, i.e., it wasn’t used by that many people, relatively speaking.

It took another 20 years for the web to come along and more for it to gain widespread adoption.

Is Bitcoin, a very low level protocol, more analogous to ARPANET or the web? Even if you dislike crypto, is this comparison really helpful?

All technology is built on the shoulders of previous giants. Building a secure, scalable, sufficiently decentralized distributed computer system is hard, i.e., it’s going to take a long-ass time. Hence I’m not surprised at how far we have come since BTC was released.


>Is Bitcoin, a very low level protocol, more analogous to ARPANET or the web? Even if you dislike crypto, is this comparison really helpful?

ARPANET knew who it was for and what it was for: for researchers to share data and compute resources that would otherwise be expensive to do across vast distances. Audience and use case. What's the equivalent for cryptocurrency?

>Building a secure, scalable, sufficiently decentralized distributed computer system is hard.

Who actually wants that, and for what practical purpose?


Whilst it's certainly a valid point that different technologies don't have exactly the same adoption curves, we're not talking about decades of behind-closed-doors innovation to make microcomputers people would want to use packet-switching technology to share stuff on. We're talking about the adoption curve of the most-hyped thing since the "information superhighway" of the 1990s, something which involves stadium sponsorship deals, Sand Hill Road and more electricity consumption than most countries. And generally the "it's like the early days of the web" stuff is coming from pro-crypto people, usually those who are pretty keen on stressing that their portfolio is not an experiment as doomed to obsolescence as a 1960s comms network (and that PoS and faster transaction throughput on different blockchains isn't "better"). The issues the present crypto industry suffers from, for the most part, aren't scaling or a lack of technology; they're philosophical.


I disagree, I don’t think this is just about adoption curves and hype cycles.

I think fundamentally the infrastructure required to build decentralized applications is hard and has pushed the limits of computer science (zero knowledge proofs etc).

I think people have inflated expectations about how long it will take this technology to mature. Today it’s still very technically hard to build a scalable dapp that’s easy to use. Assuming this is something consumers actually want as opposed to a solution in search of a problem, this will take more time to solve.

Sometimes greed makes people think a technology is a lot further along than it actually is.


> Building a secure, scalable, sufficiently decentralized distributed computer system is hard.

Yep. This blockchain/PoW/PoS way they’ve come up with to do it, with a built in cryptocurrency, doesn’t really cut it.

The trade-offs needed to be “sufficiently decentralized” make it slow, expensive and unscalable, which makes it effectively useless for real-world applications. The whole of Ethereum has as much power as a Raspberry Pi.


> Even if you dislike crypto, is this comparison really helpful?

AFAICT the comparison is only made because, faced with a decade and a bit of failure to deliver, cryptocurrency and blockchain proponents need an argument as to why it's still going to be HUGE, and is still a case of "just you wait, you'll see", rather than admitting it's not looking very hopeful.


So if Bitcoin was the "ARPANET", what does that make NFTs? The most utterly useless offshoot of "blockchain" I've seen yet.


Email was a massive hit from day 1 on the ARPANET.

Whereas crypto still struggles to do anything novel or get popular with anyone outside of speculation.


> Building a secure, scalable, sufficiently decentralized distributed computer system is hard.

Have you heard of git?


After Microsoft swapped out the CEO. New guy's better than Ballmer.

