Hacker News | past | comments | ask | show | jobs | submit | denfromufa's comments

My wife hit a wall trying to upload a hefty PDF - every “shrink” tool we tried barely reduced the file size, and some even made it larger! Frustrated by the state of PDF compressors (looking at you, Adobe), I turned to LLMs - Claude, Deepseek, and Gemini came up short, but OpenAI’s o4-mini saved the day with a perfect solution. That inspired me to build pdfmini: a tiny, open-source, client-side HTML app that crushes PDF sizes right in your browser! No installs, no fees, zero privacy worries - all your data stays on your machine.

Try pdfmini now:

https://den-run-ai.github.io/pdfmini/

Source code for pdfmini:

https://github.com/den-run-ai/pdfmini


This gave me an idea. You seem to be the right person to talk to.

Here is my workflow. Have a bunch of PDFs and images I need to combine.

I go to tools.pdf24.org, merge the PDFs, then compress them, then compress them again because of size limits, then add or remove pages. Then add page numbers.

These are multiple steps.

Could we have a way of defining these steps up front, either textual or no-code-like, where we could specify something like:

Take input, merge > compress with greyscale, Max size 1MB, add page numbers on bottom right

Or

Convert input to jpg with image size 8cm by 8cm
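A sketch of what those recipes could compile down to under the hood, assuming common open-source tools (pdfunite, Ghostscript) as the backends - the tool names, flags, and the build_pipeline function are my assumptions, not part of any existing product:

```python
def build_pipeline(steps, out_pdf):
    """Translate a list of (action, options) steps into shell command lines."""
    cmds, cur = [], None
    for i, (action, opts) in enumerate(steps):
        # Intermediate stages write to temp files; the last step writes out_pdf.
        nxt = out_pdf if i == len(steps) - 1 else f"stage{i}.pdf"
        if action == "merge":
            cmds.append(["pdfunite", *opts["files"], nxt])
        elif action == "compress":
            cmd = ["gs", "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH",
                   "-dQUIET", f"-sOutputFile={nxt}"]
            if opts.get("greyscale"):
                # Standard Ghostscript options for converting color to grayscale
                cmd += ["-sColorConversionStrategy=Gray",
                        "-sProcessColorModel=DeviceGray"]
            cmd.append(cur)
            cmds.append(cmd)
        cur = nxt
    return cmds
```

Each command list could then be run with subprocess.run; max-size and page-number steps would extend the same pattern, and a no-code flow builder would just be a front end producing this list.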

I know many people who simply fail at such stuff. They just throw their hands up in defeat.

I'm not saying we should have LLMs do the job, but if we could chain multiple actions, people could tell the software what they have in mind.

People don't just compress PDFs; they often merge and then compress.

I recently saw pdfux.com, but it is not as featureful as PDF24, and PDF24 crashes a lot.


#!/bin/bash

# Convert images to PDF
img2pdf *.jpg -o images.pdf

# Merge PDFs
pdfunite file1.pdf file2.pdf images.pdf merged.pdf

# Compress
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH -sOutputFile=compressed.pdf merged.pdf

# Remove unwanted pages (e.g., page 3)
pdftk compressed.pdf cat 1-2 4-end output final.pdf

# Add page numbers (an empty --pagecommand restores LaTeX's plain
# page style, which prints a number at the bottom of each page)
pdfjam final.pdf --outfile final_numbered.pdf --pagecommand '{}'


You know what. I will share my script in the morning.

I used ScanTailor to scan a book. That gave me TIFF files.

So I built a script to convert them to JPEG, then merge into a PDF. Then OCR and add the text layer to the PDF. Then compress.
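Not the author's script, but a sketch of what such a pipeline might look like, assuming ImageMagick for the TIFF-to-JPEG step, img2pdf for the merge, and ocrmypdf for the OCR text layer and compression (all assumptions on my part):

```python
def scan_pipeline(tif_files, out_pdf):
    """Return the commands to run, in order, for a scan -> searchable PDF flow."""
    cmds, jpgs = [], []
    for t in tif_files:
        jpg = t.rsplit(".", 1)[0] + ".jpg"
        # ImageMagick: re-encode each scanned TIFF page as a quality-85 JPEG
        cmds.append(["convert", t, "-quality", "85", jpg])
        jpgs.append(jpg)
    # img2pdf losslessly wraps the JPEGs into a single PDF
    cmds.append(["img2pdf", *jpgs, "-o", "merged.pdf"])
    # ocrmypdf adds a searchable text layer and optimizes the output size
    cmds.append(["ocrmypdf", "--optimize", "3", "merged.pdf", out_pdf])
    return cmds
```

Each list is ready to pass to subprocess.run; keeping the function pure (no file I/O) makes the step ordering easy to inspect.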

I know this is a niche automation. On the web, OTOH, where normies reside and are scared of the terminal, it won't work.

I've been using pdftk for years now, but I'm the only person in my office who can use it.


I'll be adding compression support to BreezePDF, so this can be done in one click.


Merge / compress with max size / color or greyscale / remove pages / multi-format import (PDF and images as input) / export options / export into multiple files if the file size exceeds a certain limit.

And, as in my earlier comment, a way to define these steps in a flow, so that people can run multiple steps on a single file without having to learn the command line.
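The "export into multiple files if the size exceeds a limit" step is basically a greedy grouping over per-page sizes. A minimal sketch - the byte counts would come from whatever PDF library does the export, and the function name is mine:

```python
def split_by_size(page_sizes, max_bytes):
    """Group consecutive page indices so each chunk stays under max_bytes.

    A single page larger than max_bytes still gets its own chunk, since
    it can't be split further at this level.
    """
    chunks, current, total = [], [], 0
    for i, size in enumerate(page_sizes):
        if current and total + size > max_bytes:
            chunks.append(current)
            current, total = [], 0
        current.append(i)
        total += size
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk of page indices would then be handed to the merge/export step as one output file.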


This is very cool, are all these command-line tools open-source?


Yes


If you can define this as a feature request for pdfmini, please submit it on GitHub, e.g. a drag-and-drop flow builder.


Well, so I glanced at what that project does.

Congratulations, you've managed to "compress" PDF files by rasterizing every page to JPEG, while destroying all the vector and textual information in it.

The resulting PDF is nothing like the input -- it's just a bunch of blurry JPEG images wrapped in a PDF format.

You can't search or copy the text, and trying to print it will just make a blurry mess of the text.
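For contrast, Ghostscript can shrink a PDF by downsampling only the embedded images, leaving the text and vectors searchable and crisp. A hedged sketch: the flags are standard Ghostscript options, but the wrapper function is my own illustration:

```python
def gs_downsample_cmd(inp, out, dpi=150):
    """Build a Ghostscript command that downsamples images, not text."""
    return ["gs", "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dQUIET",
            # Re-encode embedded raster images at a lower resolution,
            # while text and vector content pass through untouched.
            "-dDownsampleColorImages=true",
            f"-dColorImageResolution={dpi}",
            f"-sOutputFile={out}", inp]
```

Running the returned command with subprocess.run would produce a smaller PDF whose text still selects, searches, and prints cleanly.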


Nailed it. I requested 50% compression for a 200MB PDF file that contained pictures, and the tool made it an illegible mess. I can't imagine using this tool for anything serious, like tax returns, that requires a machine-readable file.


I would appreciate it if Stack Overflow integrated something like a REPL or Replit into their Q&A to reproduce examples easily (maybe even CI?). For Python it would actually be very easy with backends such as Google Colab or even the built-in ChatGPT Code Interpreter.


The highlight of this event was running with Jeff at Rice University before his talk:

https://x.com/JeffDean/status/1756319820482592838?s=20


Can you please tell us more about your ML stack?


Can you please tell us more about the ML stack?


When are you expecting the applied scientist position to open?


Does Rosetta still work in virtualized macOS when using Apple’s virtualization framework?


Yes: Just now I checked TextEdit's "Open with Rosetta" box in Get Info, launched it, and saw it come up as an Intel process in Activity Monitor.

This was in a Ventura beta 2 VM run with Apple's virtualization sample project: https://developer.apple.com/documentation/virtualization/run...


One limitation I have observed is that this VM can't host another VM itself.


Reportedly M2 supports nested virtualization though.


But why would you ever want to?


Docker Desktop for macOS requires a Linux VM on the macOS host, so nested virtualization is required if you want to use Docker Desktop inside a macOS guest.

Other tools like Multipass, kind, and minikube on the guest will not work either.


It’s helpful when spinning up labs or testing infrastructure deployments.


Works in Linux VMs too (on Ventura).



This looks like a better open-source option:

https://github.com/KhaosT/MacVM


UTM should support most of the same features (aside from ease of use when installing all macOS versions). It also now supports paravirtualisation using the hypervisor framework.

https://github.com/utmapp/UTM


> This looks like a better open-source option

Comparing the code between the two, VirtualBuddy seems like the better option to me (albeit not by a lot). They are both lightweight wrappers around macOS’s built-in hypervisor, so I’m really not sure what you’re going on about.


Since you have an opinion about this, can you explain why one option is better than the other?


VirtualBuddy is still experimental


> VirtualBuddy is still experimental

Really? You’ve gotta be trolling, right?


Sounds like you're the one trolling when the site says this upfront.

WARNING: This project is experimental. Things might break or not work as expected.


Look at the code for the one he’s suggesting. It’s essentially the same. One is just more upfront about giving its visitors a clear understanding. Taken out of context I can maybe see what you mean, but come on, it’s not that hard to keep up, is it?


Anyone using Parallels to virtualize macOS on M1 Macs?


Yes, I am. It's quite smooth. It has its problems, but for the most part it's alright. What did you want to explore further about this?

Edit: many people download the default Parallels build; you need to download the one from this page https://www.parallels.com/blogs/parallels-desktop-apple-sili... to have access to M1 virtualization.


Does Rosetta still work in virtualized macOS in Parallels?


Yes.


Parallels runs a similar thin wrapper on top of the OS-provided VM API, which looks somewhat like: vm = createVM([device list]); vmWindow = createVMWindow(vm); vm.run();


Yes, and it's terrible.

You can't sign into iCloud, and you can't maximize a VM to 4K resolution. It's usable, but for $100 they could do much, much better.


Those are well-known limitations of Apple virtualized OSes. Solving those issues would require using a different virtualization framework and a lot of reverse engineering.


Yes, I use macOS as a development VM on a maxed-out 16-inch M1 MacBook Pro. It all works as expected, except you don’t have any VM settings (e.g., how much RAM/CPU you want to give the VM) and Docker doesn’t run inside the VM.


You can change some of the settings by editing an ini file.

https://kb.parallels.com/en/128842


Oh, I didn't know that. Thanks a lot!


I’ve been using it on Monterey. It’s not nearly as optimized as virtualized Windows or Linux on the same hardware (most Parallels features, like auto-scaling, aren't available yet), but I think the situation should improve with Ventura.


It works! I can even run x86 binaries for Windows in a VM. Don't ask me how that works, though.


Microsoft has their own x86 emulator.


One of the most important things not mentioned in the article is sleep. Not enough sleep and your diet will change. Not enough sleep and your body will not finish recovering overnight. Read the book “Why we sleep”. It will change your life.

P.S. I do marathon training, and storing glycogen as well as activating fat burning are essential!

