There is no reason to believe Gemini Image is not a diffusion model. In fact, the generated results suggest it at least has a VAE and is very likely a diffusion-model variant (most likely a Transfusion-style model).
As a startup, they pivoted and focused on image models (they are model providers, and image models often have more use cases than video models; not to mention their data moat is bigger for images than for video).
If they have so much data, then why do Flux model outputs look so God-awful bad?
They have plastic skin, weird chins, and have that "AI" aura. Not the good AI aura, mind you. The cheap automated YouTube video kind that you immediately skip.
Flux 2 seems to suffer from the exact same problems.
Midjourney is ancient. Their CEO is off trying to build a 3D volume and dating companion or some such nonsense, leaving the product without guidance or much change. It almost feels abandoned. But even so, Midjourney has 10,000x better aesthetics despite terrible prompt adherence and control. Midjourney images are dripping with magazine-spread or Pulitzer-grade aesthetics. It's why Zuckerberg went to them to license their model instead of to the quasi-"open source" BFL.
Even SDXL looks better, and that's a literal dinosaur.
Most of the amazing things you see on social media either come from Midjourney or SDXL. To this day.
>Even SDXL looks better, and that's a literal dinosaur.
I’m not saying you are wrong in effect, but for reference, SDXL was released just slightly over two years ago, and it took about a year for great fine-tunes to appear.
10 is OK if you remove the ads and stop the random updates. I’ll never use 11 and beyond. I already switched to Linux for my dev box, and now that I play fewer and fewer games (haven’t played in weeks) I’ll switch to Linux for my personal box too, once the current one breaks down.
Yeah, the version history as perceived by the vendor and as perceived by ordinary users are somewhat out of sync. To me, Windows 10 is basically new, yet it is already considered out of date.
Windows 10 Pro is actually a pretty decent OS. It brought quite a few major improvements over Windows 7 and I can't really think of any notable downsides.
LuaTorch is eager-execution. The problem with LuaTorch is the GC. You cannot rely on a traditional tracing GC for this workload: each tensor was megabytes large at the time (gigabytes now), so they need to be collected aggressively rather than at intervals. Python's reference-counting system solves this issue. And by "collecting" I don't mean freeing the memory itself; PyTorch has a simple slab allocator to manage CUDA memory.
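The refcounting point can be illustrated in plain Python (a toy stand-in, not PyTorch internals; `FakeTensor` and its `__del__` hook are hypothetical): CPython frees an object the instant its refcount drops to zero, while anything caught in a reference cycle lingers until the tracing collector runs.

```python
import gc

freed = []

class FakeTensor:
    """Toy stand-in for a tensor; __del__ models returning memory to an allocator."""
    def __init__(self, name):
        self.name = name
    def __del__(self):
        freed.append(self.name)

t = FakeTensor("activations")
del t                       # refcount hits zero -> __del__ runs immediately
assert freed == ["activations"]

# A reference cycle defeats refcounting; it survives until the cyclic GC runs.
a, b = FakeTensor("a"), FakeTensor("b")
a.other, b.other = b, a
del a, b
assert "a" not in freed     # still alive: only the cycle collector can free it
gc.collect()                # a tracing GC pass, at some arbitrary later time
assert "a" in freed and "b" in freed
```

For multi-megabyte tensors, the gap between "refcount hit zero" and "the next GC pass" is exactly the memory pressure the comment is describing.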
With LuaTorch, model execution was eager, but you still had to construct the model graph beforehand; it wasn't "define-by-run" like PyTorch.
Back in the day, having completed Andrew Ng's ML course, I built my own C++ NN framework copying this graph-mode LuaTorch API. One of the nice things about explicitly building a graph was that my framework could have the model generate a GraphViz DOT representation of itself so I could visualize it.
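A sketch of that DOT-emitting trick (in Python rather than the original C++; the `Node` class and layer names are made up for illustration). With an explicit graph you can just walk the node structure and print edges:

```python
class Node:
    """A hypothetical graph node: a named op with a list of input nodes."""
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

def to_dot(output):
    """Walk the graph backwards from the output and emit GraphViz DOT."""
    lines = ["digraph model {"]
    seen = set()
    def walk(n):
        if n.name in seen:
            return
        seen.add(n.name)
        for i in n.inputs:
            walk(i)
            lines.append(f'  "{i.name}" -> "{n.name}";')
    walk(output)
    lines.append("}")
    return "\n".join(lines)

# Build a tiny explicit graph, then render it.
x = Node("input")
h = Node("linear1", [x])
a = Node("relu", [h])
y = Node("linear2", [a])
dot = to_dot(y)
print(dot)  # paste into `dot -Tpng` to visualize
```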
Ah, I get what you mean now. I was mixing up the nn module and the tensor-execution bits. (To be fair, the PyTorch nn module carries over many of these quirks!)
That's wrong. llama.cpp / Candle don't bring anything to the table that PyTorch cannot do (design-wise). What they offer is a smaller deployment footprint.
What's modern about LLMs is the training infrastructure and the single-coordinator pattern, which PyTorch has only just started on and which is still inferior to many internal implementations: https://pytorch.org/blog/integration-idea-monarch/
Note that busy_timeout does not help SQLite in this case (SQLITE_BUSY is returned immediately; there is no waiting involved).
Also, this is because of WAL mode (and I believe it applies only to WAL mode, since there are no truly concurrent reads in the other modes).
The reason is that in WAL mode pages are appended to a single log file. Hence, if you read something inside a BEGIN transaction and later want to mutate something else, another page could already have been appended in between, potentially violating WAL mode's strict-serializability guarantee. SQLite therefore has to fail at the point of the lock upgrade.
Immediate mode solves this problem because the write lock is acquired at BEGIN time, so no page can be appended between the read and the write, unlike in deferred mode.
How do you do site-to-site traffic over Tailscale / WireGuard encryption? From preliminary testing, it seems to have difficulty saturating a 10 Gbps connection, while plain HTTP (nginx) traffic manages that fine. Of course it will vary from CPU to CPU, but any tips on how to improve it? Ideally I would love to run everything over encrypted links; everything is public, but it would be one less thing to be careful about (in case I need to transport non-public data over them in the future).
Yeah, luckily, you can unit-test these and fix them. They are not concurrency bugs (again, luckily).
BTW, numeric differentiation can only be tested in a very limited way (due to the algorithmic cost once you are working with big matrices). It is much easier, and more effective, to test against multiple implementations.
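For reference, the limited finite-difference check looks something like this (a toy scalar example with a made-up function; real frameworks run the same idea over small random slices of large tensors, because two function evaluations per input don't scale):

```python
import math

def f(x, y):
    # Toy function with a hand-derivable gradient.
    return x * y + math.sin(x)

def analytic_grad(x, y):
    # d f/dx = y + cos(x),  d f/dy = x
    return y + math.cos(x), x

def numeric_grad(fn, x, y, eps=1e-6):
    # Central differences: O(eps^2) error, but 2 evaluations per input,
    # which is exactly what makes this impractical for big matrices.
    gx = (fn(x + eps, y) - fn(x - eps, y)) / (2 * eps)
    gy = (fn(x, y + eps) - fn(x, y - eps)) / (2 * eps)
    return gx, gy

ax, ay = analytic_grad(0.5, 2.0)
nx, ny = numeric_grad(f, 0.5, 2.0)
assert abs(ax - nx) < 1e-4 and abs(ay - ny) < 1e-4
```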
And it has always felt to me that it has lineage from the neural Turing machine line of work as a prior. The transformative part was: 1. finding a good task (machine translation) and a reasonable way to stack layers (the encoder-decoder architecture); 2. running the experiment; 3. ditching the external-KV-store idea and just using self-projected KV.
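What "self-projected KV" means, in a pure-Python toy (weights and shapes are illustrative, not a real implementation): K and V are linear projections of the input sequence itself, rather than reads from an external memory as in a neural Turing machine.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def self_attention(X, Wq, Wk, Wv):
    # K and V come from projecting the sequence X itself -- no external KV store.
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d = len(Q[0])
    scores = [[sum(q * k for q, k in zip(qr, kr)) / math.sqrt(d) for kr in K]
              for qr in Q]
    A = [softmax(r) for r in scores]   # each row: attention weights over positions
    return matmul(A, V)

# Toy 2-token sequence with identity projection weights.
X = [[1.0, 0.0], [0.0, 1.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
out = self_attention(X, I, I, I)
```

Each output row is a convex combination of the value rows, so with these toy inputs every output row sums to 1, and each token attends most strongly to itself.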