Hacker News | nattaylor's comments

My read is that no customers will leave since they are much more interested in news coverage -- and this helps the AP focus more on news.

This is a tangent, but I wonder if they feel that they are just creating LLM training data and that few readers (even of Sunday papers) will actually read their reviews.


The base model is Qwen2.5-VL-3B and the announcement says a limitation is "Model can suffer from hallucination"


Seems a bit scary that the "source" text from the PDFs could actually be hallucinated.


Given that the input is an image and not the raw PDF, it's not completely unexpected.


I wish there were some explain plans in either post, since I don't get what's going on.

If the query uses the index, then the on-the-fly tsvector rechecks apply only to the matches, and the benchmark queries have LIMIT 10, so there should be few rechecks, right?

Edit: yes, but the query predicates have conditions on two GIN indexes, so I guess the planner chooses to recheck all the matches for one index first, even though it could avoid a lot of work by rechecking row-wise.


Is pre-training in FP8 new?

Also, 10M input token context is insane!

EDIT: https://huggingface.co/meta-llama/Llama-3.1-405B is BF16 so yes, it seems training in FP8 is new.


DeepSeek-V3 was FP8


S3 Tables is designed for storing and optimizing tabular data in S3 using Apache Iceberg, offering features like automatic optimization and fast query performance. SimpleDB is a NoSQL database service focused on providing simple indexing and querying capabilities without requiring a schema.


This is very cool. Kuzu has a ton of great blog content on all the ways they make Kuzu light and fast. WebLLM (or in the future chrome.ai.* etc.) + embedded graph could make for some great UXes.

At one time I thought I read that there was a project to embed Kuzu into DuckDB, but bringing a vector store natively into Kuzu sounds even better.


Great point! Several years ago there was a project GRainDB, which along with GraphflowDB (a purely in-memory graph database) formed the ideas of what is now Kuzu :)

https://graindb.github.io/ https://github.com/graphflow/graphflow-columnar-techniques


500 errors for me


Doesn't compression make any minification gains negligible?


Depends on what you’re serving up. Blog? Yes. Video game? No.


If you like brittle things, elements with an id attribute are already exposed as properties on window for legacy reasons.

Edit: My tone may have indicated that parent's solution was brittle. It's not!


The id attribute can take on values that are already present or reserved in window: "fetch", "opener", etc.

The reason to have a separate system that correctly calls getElementById() is to avoid this issue.

So, it's actually a _less_ brittle mechanism that doesn't rely on legacy mechanisms and lacks the surprises that come with that.
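
To make the difference concrete, here is a minimal sketch of what such a separate lookup could look like. The helper name `idLookup`, the `ElementLookup` interface, and the `fakeDoc` stand-in are hypothetical and not taken from the thread; the only real API assumed is `getElementById()`. The idea is that every property access goes through `getElementById()`, so an id like "fetch" resolves to the element rather than colliding with an existing window property.

```typescript
// Minimal structural type so the sketch runs without a browser;
// a real Document also satisfies it.
interface ElementLookup<T> {
  getElementById(id: string): T | null;
}

// Hypothetical helper: property access on the returned object is forwarded
// to getElementById, so `ids.fetch` means "the element with id 'fetch'",
// never the global fetch function.
function idLookup<T>(doc: ElementLookup<T>): Record<string, T | null> {
  return new Proxy({} as Record<string, T | null>, {
    get: (_target, prop) =>
      typeof prop === "string" ? doc.getElementById(prop) : undefined,
  });
}

// Stand-in for a page containing <div id="fetch"> and <div id="menu">.
const fakeDoc: ElementLookup<{ id: string }> = {
  getElementById: (id) => (id === "fetch" || id === "menu" ? { id } : null),
};

const ids = idLookup(fakeDoc);
```

In a browser you would pass `document` instead of `fakeDoc`. An id with no matching element comes back as `null`, which is easier to handle than whatever unrelated value window happens to hold under that name.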


Sorry, I didn't mean to suggest your solution was brittle -- I actually quite like it and want to adopt it!

But I do think the legacy browser behavior of exposing ID attributes as window properties is very brittle, for the reasons you suggest.


My fault, I tend to "rapid fire" sometimes. Yours was next to another reply on the identical subject and that caused me to mistake the meaning of the comma in your sentence.

Reading it more slowly, I see it now, and, thank you!


Nah, I'd rather skip brittle :) But what's not brittle in JS? Elm? (TS to some extent)


https://we.phorge.it/ is a community fork that appears pretty active.

I was also very fond of Phabricator (although my team preferred GitHub-style pull requests), but I haven't had a need for it recently, so I haven't tried Phorge myself.

