
Befuddling that this happened again. It’s not the first time:

- Paul Manafort court filing (U.S., 2019) Manafort’s lawyers filed a PDF where the “redacted” parts were basically black highlighting/boxes over live text. Reporters could recover the hidden text (e.g., via copy/paste).

- TSA “Standard Operating Procedures” manual (U.S., 2009) A publicly posted TSA screening document used black rectangles that did not remove the underlying text; the concealed content could be extracted. This led to extensive discussion and an Inspector General review.

- UK Ministry of Defence submarine security document (UK, 2011) A MoD report had “redacted” sections that could be revealed by copying/pasting the “blacked out” text—because the text was still present, just visually obscured.

- Apple v. Samsung ruling (U.S., 2011) A federal judge’s opinion attempted to redact passages, but the content was still recoverable due to the way the PDF was formatted; copying text out revealed the “redacted” parts.

- Associated Press + Facebook valuation estimate in court transcript (U.S., 2009) The AP reported it could read “redacted” portions of a court transcript by cut-and-paste (classic overlay-style failure). Secondary coverage notes the mechanism explicitly.

- A broader “history of failures” compilation (multiple orgs / years): The PDF Association collected multiple incidents (including several above) and describes the common failure mode: black shapes drawn over text without deleting/sanitizing the underlying content. https://pdfa.org/wp-content/uploads/2020/06/High-Security-PD...
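For anyone who wants to check a document themselves, here is a minimal sketch of the failure mode, assuming the open-source PyMuPDF library and a hypothetical file name; any text extractor (or plain copy/paste) shows the same thing:

    import fitz  # PyMuPDF ("pip install pymupdf"); pdftotext or copy/paste work too

    doc = fitz.open("redacted_filing.pdf")   # hypothetical file name
    for page in doc:
        # get_text() walks the page's content and returns every text object it
        # finds, including text sitting underneath a drawn black rectangle
        print(page.get_text())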


Never trust a lawyer with a redact tool any more complicated than a marker.

I've seen lawyers at major, high-priced law firms make this same mistake. Once it was a huge list of individuals' names and bank account balances. Fortunately I was able to intervene just before the uploaded documents were made public.

Folks around here blame incompetence, but I say the frequency of this kind of cock-up is crystal clear telemetry telling you the software tools suck.

If the software is going to leverage the familiarity of using a blackout marker to give you a simple mechanism to redact text, it should honour that analogy and work the way any regular user would expect, by killing off the underlying text you're obscuring, and any other corresponding, hidden bits. Or it should surface those hidden bits so you can see what could come back to bite you later. E.g., it wouldn't be hard to make the redact tool simultaneously act as a highlighter that temporarily turns proximate text in the OCR layer a vibrant yellow as you use it.
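As a rough sketch of that idea, assuming PyMuPDF and a hypothetical file and rectangle, flagging the text that would survive a cosmetic-only redaction is just a bounding-box intersection:

    import fitz  # PyMuPDF

    doc = fitz.open("draft.pdf")            # hypothetical input
    page = doc[0]
    box = fitz.Rect(72, 500, 300, 520)      # the rectangle the user just drew
    # every word whose bounding box touches the drawn box is text that would
    # survive a cosmetic-only "redaction" and could be highlighted for the user
    leftovers = [w[4] for w in page.get_text("words")
                 if fitz.Rect(w[:4]).intersects(box)]
    print("Still recoverable under the box:", leftovers)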


It often comes down to not using the right software, plus training issues. They have to use Acrobat, which has a real redaction tool. It's expensive, so some places cheap out and buy other tools that don't have a real redaction feature. They highlight with black and think it does the same thing, whereas the redaction tool completely removes the content and any associated metadata from the document.

This was basically the only reason we were willing to cough up like $400 for each Acrobat license for a few hundred people. One redaction fuckup could cost you whatever you saved by buying something else.

I would like to believe that the DOJ lacking the proper software might have something to do with DOGE. That would be sweet irony.


If my law firm can't afford the $20/month for a copy of Acrobat Pro, I'd be very concerned what else they are cutting corners on.

Law firms are notoriously behind in tech. I’ve seen some shit. A small firm running on the owner’s personal Dropbox account with client matter files stored alongside his porn collection, ancient, unsupported software, unpatched systems, basically zero information security, servers in a bathroom and network switches in a shower, a literal hoarder with garbage and shit in the office, etc. The Dropbox guy was basically a giant in his practice area. Very successful. You have no idea how bad things are behind the scenes.

I think it's usually a bit more complicated, i.e. the people who were expected to follow the processes don't, and someone else shows the people asking for access that there's a faster, cheaper, cooler tool.

This is to be expected from an effort like DOGE simply because the E is for Efficiency. That is, how well a system is performing: the ratio of output to input.

Unfortunately the E in DOGE should have been for Effectiveness. That is, is the system shooting at the right target, and how close is it to hitting that target.

You can be very efficient but if you’re doing the wrong thing(s) you’re ultimately wasting resources.

The irony is, DOGE got the E wrong. It’s efficient but not effective


Or it's a scam run by someone who wants to get access to the Social Security info on Americans. We're in trouble if you think the acronym is the biggest issue.

I was speaking to the difference between efficiency and effectiveness. DOGE is simply the current best example.

Putting the obvious aside, sure, it’s Trump’s fault the system was so mismanaged that he’s been able to get elected. Twice. You’d think that after the first term the system would have gotten the message. It did not.

My recommendation to you is ask: How did we get here? And who is accountable for this?

There’s a very good chance those giving you your current narrative marching orders are on that list. Funny, right? Why own their failure when they can convince fools to blame a symptom?


Not even; anyone still left at DOJ working to protect the president is immensely corrupt, and this is just the careless stupidity that typically goes along with deeply corrupt people.

I feel like the number of incidents related to "fully public S3 buckets" has gone down after AWS made it nearly impossible to miss the notice.

I think someone just got free marketing materials to promote the redaction tools.

Now many more people will be aware of the issue.


Are you saying that only Adobe PDF has proper redaction tools? I did a quick search and found several open source PDF tools claiming to do redaction. Are they all faulty? I would honestly be surprised if there aren't any free tools that do it right.

No that's not what GP is saying. GP is saying that there is software that does not have a redaction feature (perhaps because the developer didn't implement it), but users of the software worked around it by adding a black rectangle to the PDF in such software, falsely believing it to be equivalent to redaction.

Properly implementing redaction is a complicated task. The redaction can be applied to text, so the software needs to find out which text is covered by the rectangle and remove it. The redaction can be applied to images, so the software needs to edit a dizzying array of image formats supported by PDF (including some formats frequently used by PDFs but used basically nowhere else, like JBIG2). The redaction can be applied to invisible text (such as OCR text of a scanned document). The redaction can be applied to vector shapes, so some moderately complicated geometry calculations are needed to break the vector shapes and partially delete them.

It's very easy to imagine having a basic PDF editor that does not have a redaction feature because implementing the feature is hard.

For the same reason, a basic PDF editor does not have a real crop feature. Such an editor adds a cropbox and keeps all the content outside the cropbox.
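To make the difference concrete, here is a minimal sketch of what a real redaction pass does, assuming PyMuPDF (the file name and search string are hypothetical): it removes the covered content instead of drawing over it.

    import fitz  # PyMuPDF

    doc = fitz.open("filing.pdf")                        # hypothetical input
    for page in doc:
        for rect in page.search_for("Jane Doe"):         # hypothetical string
            page.add_redact_annot(rect, fill=(0, 0, 0))  # mark the area
        # apply_redactions() actually deletes the text, image pixels and vector
        # content under each marked rectangle, rather than just covering it
        page.apply_redactions()
    doc.save("filing_redacted.pdf")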


> Folks around here blame incompetence, but I say the frequency of this kind of cock-up is crystal clear telemetry telling you the software tools suck.

Absolutely. They know this is confusing, and they're bound and determined not to fix it. At the least, they need a pop-up to let you know that it's not doing what you might think it's doing.


Apple’s Preview app does exactly that. I discovered this while trying to make a blanked copy of kid #2’s homework worksheet for kid #1 who left his at school after kid #2 already wrote on her copy.

I’m optimistic that, because LLMs have brought down the cost of the mere act of typing out code, we will see a shift in focus toward certification and verification. Preferably with some legal protections for customers, which are sorely lacking today.

Apple’s Preview app (which has a very thorough PDF markup tool) does this right: it has an explicit “redact” tool which deletes the content it’s used on.

> Never trust a lawyer with a redact tool any more complicated than a marker.

there's white-out on my monitor.

> ...frequency of this kind of ...

Sometimes I wonder if it is plausible deniability. Like people don't WANT to cover this up, and so do it in a certain way.


Always worth remembering that PDF is basically a graphic-design/page-description format whose roots go back to the page-description languages of the late '70s and '80s, like PostScript. It was never intended for securely redacting documents, and while it can be done, that’s not the default behaviour.

No surprise non-experts muck it up and I don’t see that changing until they move to special-purpose tools.


Of course we can blame incompetence. It's incompetent not to realise your own incompetencies, also known as overconfidence.

Any lawyer should be like "I don't know what I'm doing here I'll get an expert to help" just like as a software developer I'd ask a lawyer for their help with law stuff...because IANAL uwu


I think it's part laziness here.

Placing a black rectangle on a PDF is easier than modifying an image or removing text from that same PDF.


The consequences of fucking it up are low, too.

If they get caught, they just take the document down and deny it ever got posted. Claim whatever people can show is a fake.

Since they control the levers of government, there's few with the resources and appetite for holding them accountable. So far, we haven't un-redacted anything too damning, so push hasn't come to shove yet.

That might only change if there's a "blue wave" in the midterms, but even then I wouldn't count on it.


The tool in Acrobat is exactly placing black rectangles on stuff. There's a second step you are supposed to do when you finish marking the redactions, which edits out the content underneath them and offers to sanitize other hidden data:

https://www.adobe.com/acrobat/resources/how-to-redact-a-pdf....

That failed redactions happen over and over and over is kind of amazing.


I hope you're not blaming the users. It's understandable they would be confused. The software needs to clarify it for the user. Perhaps, when you try to save it, it should warn you that it looks like you tried to redact text, and that text is still embedded in the document and could be extracted. And then direct you to more information on how to complete the redaction.

We have 30 years of direct evidence that the users would ignore that warning, complain about the computer warning them too much, insist that the warning is entirely unnecessary, and then release a document with important information unredacted.

The problem is that the user generally doesn't have a functioning mental model of what's actually going on. They don't think of a PDF as a set of rendering instructions that can overlap. They think it's paper. Because that's what it pretends to be.

The best fix for this in almost any organization is the one that untrained humans will understand: after you redact, you print out and scan back in. You have a policy that redacted documents must be scanned in from physical paper.


> The problem is that the user generally doesn't have a functioning mental model of what's actually going on

Sorry, but a professional user not having an operational understanding of the tools they're working with is called culpable negligence in any other profession. A home user not knowing how MS Word works is fine, but we're talking desk clerks whose primary task is document management, and lawyers who were explicitly tasked with data redaction for digital publication. I don't think we should excuse or normalize this level of incompetence.


I don't expect radiologists to have a good understanding of the software involved in the control loops for the equipment they operate. Why should a lawyer have to have a mental model or even understand how the pdf rendering engine works?

Have you ever had to actually redact a document in Acrobat Pro? It's way more fiddly and easy to screw up than one would expect. I'm not saying professionals shouldn't learn how to use their tools, but the UI in Acrobat is so incredibly poor that I completely understand when redaction gets screwed up. Upthread there's an incomplete but very extensive list of this exact thing happening over and over. Clearly there's a tools problem here. Actual life-critical systems aren't developed this way; if a plane keeps crashing due to the same failure we don't blame the pilot. Boeing tried to do that with the MAX, but they weren't able to successfully convince the industry that that was OK.


> if a plane keeps crashing due to the same failure we don't blame the pilot

That's true, we blame the manufacturer and demand that they fix their product under threat of withdrawing the airworthiness certification. So where's the demand for Adobe to fix its software, under pain of losing their cash cow?

Yet, people here are arguing that it is perfectly OK that professionals keep working with tools that are apparently widely known to be inappropriate for their task. Why should we not blame the lawyers that authorized the use of inappropriate tooling for such a sensitive task as legal redaction of documents?


The link in the comment you are replying to has a screenshot of exactly this. It’s a prompt with a checkbox asking you to delete the metadata and hidden info involved with the redaction. You’d have to blaze past that and not read it to make this mistake. It is user error.

I guess if you really want to defend users here you could say people are so desensitized by popup spam that a popup prompt is gonna be clicked through so fast the user barely registers it, but that’s not the software’s problem. For whatever reason some users would rather just put black boxes over the text they mean to obscure, so here we are.


> I hope you're not blaming the users.

If software developers designed hammers, you'd have to twist the handle before each swing to switch from tack to nail mode. And the two heads would be indistinguishable from each other.

If business MBAs designed them, you'd wind up with the SaaSy Claw 9000, free for the first month then $9.95 in recurring subscription fees, and compatible only with on-brand nails that each have a different little ad imprinted on the head.

But it doesn't matter, because by the end of the year all construction will be vibe-built from a single prompt to Clawde.ai, which will pound non-stop, burning through $1T of investor funds, and confidently hallucinate 70% of the nails until the roof collapses on the datacenter destroying the machine and civilization along with it, and a post-singularity survivor picks up a rock and looks calculatingly at a pointy shard of metal...


Professional users doing more than 1 document? Yes, I'm absolutely blaming them.

I agree that affordances are good, but tools are tools, they can have rough edges, it's okay that it occasionally takes more than zero knowledge and attention to use them.


The software could do better, sure, but in this case the accountability clearly falls on the lawyers. It's their job - and it's a job that can profoundly impact people's lives, so they need to take it seriously - to redact information properly.

Adobe's contempt for users strikes again.

I’ve not looked too deeply, but based on other discussion, I wonder if this was malicious noncompliance meant to reveal what the higher-ups were ordering hidden. If victims’ names are properly redacted that would be strong evidence.

It is more likely they have no conceptual understanding that the PDF is a file format. They likely assume that whatever is shown in the interface is what is exported.

I want to believe this is malicious compliance.

Lots of loyalists have replaced people there. It's for sure incompetence.

There are hundreds of thousands of documents being reviewed by probably a thousand or more FBI agents. There is zero chance they are all loyalists.

The pool of competence was still diluted

Indeed, incompetence is basically guaranteed if the organization selects for allegiance rather than competence. But I prefer to think that at least part of this was malicious compliance, because that suggests that at least some people at the FBI still have their soul.

I would imagine it would be possible to track down who was responsible for redacting X set of documents, so this seems rather risky?

Since hundreds of people were involved the most likely explanation is incompetence

Once I worked for a company that got a quote in the form of a Word document. Turned out it had history turned on and quotes to competitors could be recovered.

There is a lot of incompetence when it comes to file formats.


For one of my first jobs I negotiated a better offer because "strings" on the document revealed the previous offer they'd sent out, and made me confident I could ask for more.

Though, makes me wonder if someone has intentionally sent out offers like that with lower numbers to make people think they're outsmarting them.
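For the curious, the "strings" trick amounts to nothing more than scanning the raw bytes for printable runs. A tiny Python equivalent, with a made-up keyword and the file path taken from the command line:

    import re
    import sys

    def printable_runs(path):
        data = open(path, "rb").read()
        # runs of 4+ printable ASCII bytes, the same idea as the Unix `strings` tool
        for m in re.finditer(rb"[ -~]{4,}", data):
            yield m.group().decode("ascii")

    for s in printable_runs(sys.argv[1]):
        if "salary" in s.lower():   # hypothetical keyword to look for
            print(s)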


Never match wits with a stringscillian!

You don’t even need a digital format for this. When I was a consultant I waited in a room with a flip chart for a negotiation. I flipped through the “old slides” of the flip chart and found one where they did budget planning for the project. This was very good background info for the negotiations.

Similarly, I’ve been sent PDF proposal letters by my customers with redacted pricing from my competitors so I can compare the scope against mine. A simple unflatten reveals the price along with the scope.

To be fair handling Word documents is much more complex than redacting a PDF properly.

Yup. I’ve gotten documents from an Asian country’s government. One known for its good governance and meritocracy, that had hidden sheets with competitor data.

Don’t underestimate work shoved onto a university intern.


The most likely explanation when hundreds of people make the same fuckup is the tooling and/or process sucks. Not that hundreds of people lack basic competence.

The most recent analogy I have is field techs in IT work. A company sends out "truck roll" tickets, and then complains when there is a 40% failure/re-work rate on said truck rolls.

A single or handful of techs with said failure rate? Yep, perhaps incompetence.

A global failure rate across dozens of cities/countries and 40+ technicians total? No longer incompetence. At least at the field tech level. That's a documentation, process, and standards problem 100% guaranteed.

That some above average highly competent "hero" technicians are able to compensate for it is irrelevant.


I think it's more that the analogy is broken.

If I have a sheet of paper and I color a section black. That's it. It's black. No going back.

So I can see people thinking the same for PDFs. I drew the black box. It's black. Done. They don't realize they aren't dealing with a 2D sheet of paper, but with effectively a 3D stack of papers. That they didn't draw a black box on the page, they drew a black box above the page over the area they wanted to obscure.

The fact that this happens a lot is an indication that the software is wrong in this case. It doesn't conform to user expectations.


Having lots of people involved means that it's more likely to be malicious compliance or deniable sabotage. It only needs one person who disagrees with the redactions to start doing things that they know will allow info to leak.

Doesn’t having lots of people involved also raise the chance of incompetence?

You’re more likely to get at least one inept agent in a random sample of 1000 than a sample of 10.


Yep - I think they're both likely.

I'm sure not all those hundreds have been involved with every document.

I'm kinda surprised (and disappointed) nobody has done a Snowden on it though.


> Since hundreds of people were involved the most likely explanation is incompetence

Hundreds of people might be involved, but the only key factor required for a single point of failure to propagate to the deliverable is lack of verification.

And God knows how the Trump administration is packed with inexperienced incompetents assigned to positions where they are way, way over their heads, and routinely commit the most basic mistakes.


And here we are again rediscovering Hanlon's Razor.

It wouldn't be malicious though. Well, it's malicious towards the Trump administration, but not towards the people. Quite the opposite.

The maliciousness is always towards the compliance.

Never attribute to malice that which is adequately explained by stupidity

https://en.wikipedia.org/wiki/Hanlon%27s_razor


See also “weaponized incompetence”, which usually has to do with getting out of work but in this case could easily be used to get away with “bad” work for longer.

https://www.psychologytoday.com/us/basics/weaponized-incompe...


The other side of that same coin is to never admit to malice if your actions can be adequately excused by stupidity.

In 2025, never attribute to incompetence what you could to a conspiracy. [sarcasm]

They fired/drove away/reassigned most of those who were competent in the executive branch generally, so it is pretty easy to believe that none of those managing the document release, and few of those working on it, are actually experienced or skilled in how you do omissions in a document release correctly. Those people are gone.


> - Associated Press + Facebook valuation estimate in court transcript (U.S., 2009) The AP reported it could read “redacted” portions of a court transcript by cut-and-paste (classic overlay-style failure). Secondary coverage notes the mechanism explicitly.

What happens in a court case when this occurs? Does the receiving party get to review and use the redacted information (assuming it’s not gagged by other means) or do they have to immediately report the error and clean room it?

Edit: after reading up on this it looks like attorneys have strict ethical standards to not use the information (for what little that may be worth), but the Associated Press was a third party who unredacted public court documents in a separate Facebook case.


> What happens in a court case when this occurs? Does the receiving party get to review and use the redacted information (assuming it’s not gagged by other means) or do they have to immediately report the error and clean room it?

Typically, two copies of a redacted document are submitted via ECF. One is an unredacted but sealed copy that is visible to the judge and all parties to the case. The other is a redacted copy that is visible to the general public.

So, to answer what I believe to be your question: the opposing party in a case would typically have an unredacted copy regardless of whether information is leaked to the general public via improper redaction, so the issue you raise is moot.


My guess would be that if the benefiting legal party didn't need to declare that they had also benefited from this (because they legally can't be caught, etc.), they wouldn't.

I know and am friends with a lot of lawyers. They're pretty ruthless when it comes to this kind of thing.

Legally, I would think both parties get copies of everything. I don't know if that was the case here.


> strict ethical standards to not use the information (for what little that may be worth)

If it's worth so little to your eyes/comprehension you will have no problem citing a huge count of cases where lawyers do not respect their obligations towards the courts and their clients...

That snide remark is used to discredit a profession in passing, but the reason you won't find a lot of examples of this happening is that the trust clients have to put in lawyers and the legal system in general is what makes it work, and betraying that trust is professional suicide (suspension, disbarment, reputational ruin, and often civil liability) for any lawyer... that's why "strict" doesn't mean anything "little" in this case.


> you will have no problem citing a huge count of cases where lawyers do not respect their obligations towards the courts and their clients...

There are almost 2000 disbarments annually in the US.

The California bar receives 1 complaint for every 10 law licenses in the state every year.

There's a wikipedia page on notable disbarments.

Legal malpractice suits are on the rise.

If you are going to assert that legal malpractice is not legitimate concern, I think the burden of evidence is on you.


I’m not a lawyer, but I did watch every episode of Better Call Saul and I’d point out that a lawyer who generates one complaint likely generates multiple complaints so that 1 complaint/10 law licenses number is misleading about the scope of the issue. Similarly, 2000 disbarments sounds high until you realize that there are roughly 1.3 million lawyers. What’s more, when I was checking to see what reasons for disbarment might be, I found an article (https://law.usnews.com/law-firms/advice/articles/what-does-i...) which cited a number much lower (less than 500) and that pointed out that reasons other than professional misconduct can lead to disbarment including DUI and domestic violence. The following gives some reasons for disbarment:

> … disbarment is the presumptive form of discipline for an attorney who steals clients’ money, Best says.

> Disbarment is more likely when the attorney committed fraud or serious dishonesty, particularly in front of a tribunal or to a client. Similarly, priority may be given to cases where an attorney is convicted of a crime of moral turpitude, Levin says.

> Priorities also change in response to society’s changing values and when there’s a belief that tightening down on types of cases will help the profession as a whole, Best says.

> For example, in Massachusetts, there has been an increased focus on violations relating to the administration of justice, such as when prosecutors engage in racist behavior.

> And while, in the past, an attorney’s drunk driving or domestic violence would probably not have led to sanctions (because they were seen as unrelated to the attorney’s legal work), they now might result in discipline, Best says.


Well, also the lawyer would have to really badly fuck up for it to become public news that they had actually used the information.

> Edit: after reading up on this it looks like attorneys have strict ethical standards to not use the information (for what little that may be worth), but the Associated Press was a third party who unredacted public court documents in a separate Facebook case.

Curious. I am not a litigator but this is surprising if you found support for it. My gut was that the general obligation to be a zealous advocate for your client would require a litigant to use inadvertently disclosed information unless it was somehow barred by the court. Confidentiality obligations would remain owed to the client, and there might be some tension there but it would be resolvable.


My recollection is that it varies quite a bit between jurisdictions. The ABA's model rules require you to notify the other party when they accidentally send you something but leave unspecified what else, if anything, you might have to do.

A famous case where this came into play was one of the Infowars defamation suits. Alex Jones’s lawyer accidentally sent the families’ lawyer the full contents of a phone backup. They notified Jones’s lawyer, and gave him some time to reply. After that time elapsed, the whole dump was considered fair game.

This is the moment when that mistake was revealed in court: https://youtu.be/pgxZSBfGXUM and this is the hearing for the emergency motion to suppress that data: https://youtu.be/dKbAmNwbiMk


I’m unclear why this is downvoted given the below. While it would theoretically be jurisdiction-specific, if the ABA model rules don’t provide some specific guidance, it’s clear that the lawyers would be ethically obligated to use whatever info they obtained if it helped their client and as otherwise consistent with their ethical obligations in the jurisdictions that follow those. I’m admitted in New York, and I don’t recall any kind of bar on the usage of this type of info there. Seems like in a lot of jurisdictions they’d have a duty to notify, but that may not even be the case in all.

Here in NL if confidential information about offenders leaks from court documents, it usually leads to a reduction in sentencing because the leak of classified information is weighed as part of the punishment. If the leak was proven to be intentional, it might lead to a mistrial or even acquittal. Leaking of victims' information usually only results in a groveling public apology from the Minister/Secretary of Justice du jour.

What a joke…

Given the context and the baldly political direction behind the redactions, it's not at all unlikely that this is the result of deliberate sabotage or malicious compliance. Bondi isn't blacking these things out herself, she's ordering people to do it who aren't true believers. Purges take time (and often blood). She's stuck with the staff trained under previous administrations.

Or it is just the result of firing people who were competent and giving insufficient training to people who had never done this before.

This has happened so many times I feel like the DoJ must have some sort of standardised redaction pipeline to prevent it by now. Assuming they do, why wasn't it used?

I am happy with their lack of expertise and hope it stays that way, because I cannot remember a single case where redactions put the citizenry at a better place for it.

Of course if it's in the middle of an investigation it can spoil the investigation, allow criminals to cover their tracks, allow escape.

In such case the document should be vetted by competent and honest officials to judge whether it is timely to release it, or whether suppressing it just ensures that investigation is never concluded, extending a forever renewed cover to the criminals.


Of course there is a process.

There was also a process for how to communicate top secret information, but these idiots preferred to use Signal.

I'm completely lost on how you can be surprised by this at all? Trump is in there, tells some FBI faboon to black everything out, they collect a group of people they can find and start going through these files as fast as they can.

"When a clown moves into a palace, he doesn't become a king; the palace instead becomes a circus."


Here in the UK we have a thing called the civil service. They are not immune from government meddling but if they are good at anything it's writing and following processes, even under duress.

> some FBI faboon

I'm not familiar with the term, and Urban Dictionary only has "fake babboon" which I suspect is not the intent here - what does it mean?


I wanted to write FBI Baboon

Secure systems are not exactly the right environment for quick release and handling. So documents invariably get onto regular desktops with off the shelf software used by untrained personnel.

there are FOIA lawsuits seeking the redaction training videos, one by https://bsky.app/profile/muellershewrote.com so maybe one day we will know more.

They probably fired that department.

DOGE

Not to mention when the White House published Obama's birth certificate as a PDF. I remember being able to open it and turn the different layers off and on.

Are you trying to suggest that indicated it was fraudulent? That has very much been debunked -- it's just an artifact of OCR and compression, something that many scanners do automatically [1].

You can still open it with Illustrator if you want to see: https://obamawhitehouse.archives.gov/sites/default/files/rss...

[1] https://www.snopes.com/fact-check/birth-certificate/


Yeah, the idea that it proves it's fraudulent has been debunked, but the alternative hasn't been proven, either. Nobody has named the specific OCR software that does this destructive replacement. It's a case of "well, there's an alternative theory, and that's good enough" debunking.

I just took a look at the layers. In some cases, e.g. the 2nd letter in the Local Registrar's signature, a single letter is partially in the background layer, and partially in the upper layer.

This is easily explained by the character separation software being not 100% accurate.

It's not at all explained if someone is fraudulently adding text. Why would someone put half of the character in 1 layer and half of the character in a different layer?


Not sure what you mean by destructive replacement, since nothing is destroyed.

So I just looked into this, and it's specifically Mixed Raster Content pipeline (ISO/IEC 16485) used in lots of different scanners. There's no need to find which specific software generated it because it's used by lots of them.

It's a technique used to attempt to isolate font characters of the same size and style as separate layers before OCR-ing to make OCR more accurate.

ABBYY FineReader, for example, is mentioned as producing the exact same type of results. But there's no guarantee that was the actual software because lots of scanning software does it -- it's a general technique. Plus it won't even be deterministically reproducible if it was e.g. scanned and OCR'd at higher resolution and then saved at a lower resolution, as is generally considered best practice for maximizing accuracy while keeping file sizes lower.

https://www.obamaconspiracy.org/2013/01/heres-the-birth-cert...

So this is very much a nothingburger. It's not an "alternative theory", it's a complete and total explanation.
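For anyone who wants to look for themselves, the MRC "layers" are just separate image objects inside the PDF. A quick way to list them, assuming PyMuPDF and a locally saved copy of the file:

    import fitz  # PyMuPDF

    doc = fitz.open("birth-certificate-long-form.pdf")   # hypothetical local copy
    page = doc[0]
    # an MRC-style scan typically shows up as one background image plus several
    # small 1-bit "text" images, often carrying soft masks (smask != 0)
    for xref, smask, width, height, bpc, cspace, *rest in page.get_images(full=True):
        print(f"xref={xref} {width}x{height} {bpc}-bit {cspace} smask={smask}")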


So this program really doesn't keep the original image of the document as a raster layer? That's kind of surprising, especially if it's used in the legal world. Personally, I'd always want to be able to recover the original document from the OCR layers. Or, are you saying you can? Then you should tell snopes, because it'll make the snopes article a lot shorter if they can just lead with that.

I think you are misunderstanding. The pipeline is e.g.:

Scan (600 dpi) > MRC (600 dpi) > OCR (600 dpi) > Downsample (150 dpi) > Save to PDF (150 dpi)

The image is saved in raster format at 150 dpi. That's the document, but not at the original scanning resolution. If you performed MRC and OCR at the 150 dpi level, you'd get different/worse results than were originally gotten at 600 dpi. Which is why you always OCR before downsampling, and you downsample for smaller files.

This isn't changing anything about the Snopes article. It just explains why if you run MRC/OCR at the PDF resolution, you won't deterministically reproduce it because it's not the resolution it was originally run at.

You do understand that this OCR is only for being able to search and highlight text? It's not changing what's displayed. That's still the pixels.


I didn't see the original pixels in the document at any resolution though. That's the point.

You don't see the pixels when you zoom in? Try again:

https://obamawhitehouse.archives.gov/sites/default/files/rss...

If you don't see jaggy pixel edges to the letters and form elements, what do you see?


Are you saying that I'm saying that there are no pixels in the document? Like, do you think that I think that scanners have come to operate on pure platonic forms and no longer use the concept of pixels? That would be really cool, wouldn't it. But no, I don't believe that. Hm. Where did this conversation go wrong. I think I was unclear in my last statement. I have yet to see someone show the original scribbles or ink marks that these OCR layers were generated based on. That's what I meant by "destructive". Now, I'm no expert on documents, so you might want to just cut your losses and stop trying to educate me and let me be uneducated in this matter. I'll accept that I don't know what I'm talking about, and reduce my criticisms of this whole thing to pointing out that the explanations don't make sense to me.

> Are you saying that I'm saying that there are no pixels in the document?

I genuinely don't know what you're saying.

> Like, do you think that I think that scanners have come to operate on pure platonic forms and no longer use the concept of pixels? That would be really cool, wouldn't it.

Yes, because that is absolutely a thing. That's what Adobe ClearScan does, converting pixels to smooth vector outlines. Zoom in, and zero pixels in OCR'd text. That's not the case in this file though.

> I have yet to see someone show the original scribbles or ink marks that these OCR layers were generated based on. That's what I meant by "destructive".

I still genuinely don't know what you mean. The original scribbles and ink marks are a physical piece of paper. The MRC layers are generated from the scan and don't destroy anything, they only separate. The resulting layered bitmap is identical, pixel-for-pixel. My best interpretation of what you're saying is you want a higher-resolution scan? But why? Again, nothing "destructive" has happened except maybe reducing the resolution. But "destructive" is not a word people usually use for that.

> so you might want to just cut your losses and stop trying to educate me and let me be uneducated in this matter

I can't tell if you're being sarcastic or not. I am genuinely happy to help you understand, but if you really don't want to then obviously I won't spend any more time replying. But if you're going to publicly throw suspicion on the validity of the birth certificate, I feel it's important to correct the record here on HN simply for other people who might read this exchange.


I'm starting to understand where this conversation diverged. I'm coming from a place of having read the Snopes page and watched the videos linked there. I think understanding where I'm at, is a good place to start trying to explain it to me. To put it more clearly, at this point I've seen a video that seems to show that the PDF has a collection of layers, some contain text and one contains the page below. Now, it seems like you were saying that the text layers are just the pixels from the page moved up to a new layer. I said that I think that's surprising. Then we got caught up on the meaning of the words "original pixels". I probably should have said: a full buffer of pixels from the CCD sensor, perhaps with resolution reduction or compression, but nothing moved to new layers (whether that's normally considered "destructive" or not is another issue).

OK, well hopefully you understand now! This part is key:

> Now, it seems like you were saying that the text layers are just the pixels from the page moved up to a new layer. I said that I think that's surprising.

That's indeed all it is. It may be surprising, but that's how it works. Absolutely nothing about the pixels is changed as part of that text layer separation process. Later, when saving the final PDF, there's normal lossy compression, same as any JPEG, but the layer process actually preserves the text edges better than JPEG.

So I do hope you're satisfied now that everything about this is just normal image processing, and that nothing has been destroyed, there's no "missing evidence" or anything. And that the whole idea that the layers imply some kind of manipulation or forgery is false. What you're seeing is just the scan itself, saved (presumably) at a lower resolution and with the normal image compression scanners produce. The layer separation process is completely non-destructive and doesn't manipulate pixels at all.


Yes, signed by “U.K.L. Lee” himself. Did you actually look at it? These FBI goombas aren’t even trying.

If you're going to promote conspiracy theories here, you should at least explain them. The rest of us can't really be bothered to look them up.

"There are major differences between the Trump 1.0 and 2.0 administrations. In the Trump 1.0 administration, many of the most important officials were very competent men. One example would be then-Attorney General William Barr. Barr is contemptible, yes, but smart AF. When Barr’s DOJ released a redacted version of the Mueller Report, they printed the whole thing, made their redactions with actual ink, and then re-scanned every page to generate a new PDF with absolutely no digital trace of the original PDF file. There are ways to properly redact a PDF digitally, but going analog is foolproof.

The Trump 2.0 administration, in contrast, is staffed top to bottom with fools."

https://daringfireball.net/linked/2025/12/23/trump-doj-pdf-r...


> made their redactions with actual ink, and then re-scanned every page

That's not very competent.

> going analog is foolproof

Absolutely not. There are many ways to f this up. Just the smallest variation in places that have been inked twice will reveal the clear text.


> Just the smallest variation in places that have been inked twice will reveal the clear text

Sure. But anyone can visually examine this. That means everyone with situational context can directly examine the quality of the redaction.

Contrast that with a digital redation. You have to trust the tool works. Or you have to separate the folks with context from the folks with techical competence. (There is the third option of training everyone in the DoJ how to examine the inner workings of a PDF. That seems wasteful.)


> But anyone can visually examine this.

Can they? In principle it could be the difference between RGB 0.0,0.0,0.0 and RGB 0.004,0.0,0.0, that could be very difficult to visually see, but an algorithm could unmask the data with some correlation.

If you do it digitally and then map the material to a pure black-and-white bitmap, then that you actually can examine visually.

> Contrast that with a digital redation. You have to trust the tool works.

While true, I think the key problem is that the tools used were not made for digital redaction. If they were I would be quite a bit more confident that they would also work properly.

Seems like there could be a product for this domain.. And after some googling, it appears there is.


> While true, I think the key problem is that the tools used were not made for digital redaction. If they were I would be quite a bit more confident that they would also work properly.

Adobe Acrobat's redaction tools regularly feature in this sort of fuck-up, and they are (at least marketed as being) designed for such use


Just scan it with a black/white setting.

It's probably fine, but certainly better than what's being discussed ITT.

The larger point is that the "usual" redaction involves a tape pen or paint-style ink (dries opaque), IIRC, then photocopy, because the blocked out area is opaque. Scanner is probably no different than photocopy for these purposes.


> anyone can visually examine this.

They can't, if the variations are subtle enough. For example, many people are oblivious to the fact that one can extract audio from objects captured on mute video, due to tiny vibrations.

Analog is the worse option here. Simple screenshot of 100% black bar would be what a smart lazy person would do.



I suppose the best process would be this, and then after rescanning putting a black bar over each redacted text with image editing.

Or if the document is just text, simply scan it in black and white (as in, binary, not grayscale).
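A sketch of that binarisation step, assuming Pillow and made-up file names; a hard threshold collapses any faint near-black residue into solid black:

    from PIL import Image

    img = Image.open("redacted_scan.png").convert("L")   # hypothetical scan
    # hard threshold: anything darker than mid-gray becomes pure black, so subtle
    # "almost black on black" differences can no longer leak the covered text
    bw = img.point(lambda p: 0 if p < 128 else 255).convert("1")
    bw.save("redacted_bw.png")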

Perhaps an imagemagick pipeline dumping each page out as a png then blanking areas associated with a list of words (a pixel level concordance of the coordinates of all the words having been compiled from a text dump? Hand-waving here).

I'm probably overthinking this one but the various lengths of the redaction bars would provide some information perhaps? So three conspirators with names like Stonk, Hephalump and Pragma-Sasquatch would be sort of easy to distinguish between if the public had a limited list of people who might be involved?
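Something like that hand-waved pipeline, sketched with PyMuPDF rather than ImageMagick and using the commenter's hypothetical names: draw the boxes on the page, rasterize, and keep only the pixels.

    import fitz  # PyMuPDF

    SENSITIVE = {"Stonk", "Hephalump", "Pragma-Sasquatch"}   # hypothetical names

    src = fitz.open("original.pdf")
    out = fitz.open()
    for page in src:
        # paint an opaque box over every occurrence of a sensitive word
        for x0, y0, x1, y1, word, *_ in page.get_text("words"):
            if word.strip(".,;:") in SENSITIVE:
                page.draw_rect(fitz.Rect(x0, y0, x1, y1),
                               color=(0, 0, 0), fill=(0, 0, 0))
        # rasterize the page so no text or hidden layer survives, then re-embed
        pix = page.get_pixmap(dpi=200)
        new_page = out.new_page(width=page.rect.width, height=page.rect.height)
        new_page.insert_image(new_page.rect, pixmap=pix)
    out.save("redacted_raster.pdf")

Even then, as the replies note, the bar lengths can still leak information, which is one reason real-world redactions often blank whole lines or paragraphs.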


> I'm probably overthinking this one but the various lengths of the redaction bars would provide some information perhaps?

You're definitely not overthinking this. Fitting words by length is the attack vector if the blanking itself has been done correctly.


> the various lengths of the redaction bars would provide some information perhaps?

Absolutely. It’s why officially-redacted documents typically take out entire sentences and paragraphs, wiping just names only sparingly.


It's like Russian spies being caught in the Netherlands with taxi receipts showing they took a taxi from their Moscow HQ to the airport: corrupt organizations attract/can only hire incompetent people...

https://www.vice.com/en/article/russian-spies-chemical-weapo...

Anyone remember how the Trump I regime had staff who couldn't figure out the lighting in the White House, or mistitled Australia's Prime Minister as President?


Yes I remember that incident. It was big over here.

However I'm 100% sure that that was not a real spy incident. But rather just a 'message' to be sent from the Russian govt. The same way they have infiltrated our airspace with TU-95 bombers nearly every month for decades. Just a message "Hey we are still watching you".

When you see how ridiculously incompetent they were, not just their phone history but also the gear they had with them. It amounts to nothing more than a scriptkiddy's pineapple. There's no way they would have been able to do any serious infiltration into any kind of even remotely competent organisation.

Also the visible fumbling about in a carpark with overly complex antennas instead of something more hidden (e.g. an apartment across the street, a cabling tent or something). IMO the objective here was to get caught and stir a fuss.


Reminds me of the time Russian security services showed copies of The Sims as evidence of a Ukrainian Nazi plot.

Or the passports discovered intact after a particularly heinous terrorist attack.

This wasn't a fuck-up though was it?

Knowing they would die in the attack, the terrorists just didn't care if their identities were known.


> with taxi receipts

Please tell me they were saving them for expensing.


Europol is nothing next to Natasha in accounting

Yes.


Nothing is. The point is it’s highly precedented, surprisingly robust and far more competent than half the armchair suggestions being raised in this thread.

I would just do the digital version of that: add 100% black bars then screenshot page by page and probably increase the contrast too.

The bigger difference from my perspective is that they have competent people doing the strategy this time. The last Trump administration failed to use the obvious levers available to accomplish fascism, while this one has been wildly successful on that end. In a few years they will have realigned the whole power dynamic in the country, and unfortunately more and more competent people will choose to work for them in order to receive the benefits of doing so.

His last administration was filled with traditional Republicans.

I may have disagreed with them on virtually every policy point, but they seemed to disagree with the most harmful Trump policies as well.

We would have never agreed on the right policy, but we definitely agreed that his policy was not the right one.


> but they seemed to disagree with the most harmful Trump policies as well.

I imagine Republicans such as this still populate a majority of the house and Senate. If they disagree, they are sure making an effort to do so silently.


The number of things Trump did circumventing Congressional approval might suggest that he does not get a clean pass even though Republicans have a majority in both the House and the Senate.

They have (had?) the power to impeach the president for a lot less than he's already done. Yet they don't.

That's on Congress for allowing Trump to repeatedly circumvent their approval.

>In a few years they will have realigned the whole power dynamic in the country

I disagree. It felt that way for the first few months, but the wheels are coming off. Trump is too old and unpopular to steal a 3rd term. Therefore everyone around him has to worry about what will happen in 3 years, and plan for post-Trump rather than forever-Trump.


> they have competent people doing the strategy this time

They had a great playbook in Project 2025. I'm not convinced Trump ever had the smartest people executing it.


You don't need to be the smartest person when you're pointing a big gun at someone.

[flagged]


> Had exactly did Barr and Co. accomplish in terms of moving forward the agenda people voted for? These guys were so eager to win accolades from liberals they couldn’t even pick the lowest hanging fruit.

Are you talking about the same Bill Barr? "Eager to win accolades from liberals" is a hilariously Trump-after-he-fired-someone thing to say.

Have you read his Wikipedia page? Do you know who he actually is?


I'm not talking about paper credentials, I'm talking about accomplishments. 90% of lawyers in DC are liberals. Conservative lawyers can get credit for being "one of the good ones" so long as they don't attack the core tenets of liberal universalism or advance conservative social change in any meaningful way.[1]

Obama's DOJ did stuff like go after Catholic nuns to make them offer birth control, to vindicate liberal principles like supremacy of secular values over religious values. Guys like Barr never did anything like that. Trump and his merry band of chuckleheads have achieved more legal wins for conservatism in a year than anyone in the Bush administration did in eight years.

[1] It's not necessarily apparent from the outside where those lines are drawn. Bush's $8 trillion effort to blow up the Middle East was far less controversial among D.C. lawyers than Trump's effort to restrict immigration from the Middle East. Liberal universalists agreed with Bush's fundamental premise, if not his approach. Both believed that Iraq was the way it is due to external factors like Saddam, not internal factors like Iraqi culture. Even if liberals thought it was a terrible idea to go to war to topple Saddam, they didn't disagree with the core premise that Saddam was the barrier to Iraq becoming just like Iowa.


So you haven't read his Wikipedia page then, and you are too young, I guess, to remember Iran-Contra. You apparently don't even remember how Barr got the job from Trump.

Iran-Contra is a perfect example. How did that advance conservative principles? Whether Nicaragua is communist doesn’t affect anyone in America. Precisely because it has no consequences domestically, you won’t get disinvited from Georgetown parties for trying to overthrow Latin American governments. In fact, there’s an upside for liberals in such policies. The resulting chaos facilitated mass migration and cultural transplantation to the U.S. from places where socialism and communism found fertile soil.

This is… an insane argument. But just the idea that Bill Barr gives a shit what liberals think makes me laugh.

DC is full of trad cons who are sensitive to what liberals think. More specifically, the type of liberal that dominates the professional class in DC—folks who will happily represent Phillip Morris but consider immigration and affirmative action to be moral imperatives.

It’s just the math of the city. The DOJ is more democrat-leaning than most college campuses: https://admin.govexec.com/media/general/2024/11/110124donati.... Trump got three times the level of support in AOC’s district than his share of donations from DOJ employees.


> William Barr. Barr is contemptible, yes, but smart AF

You mean the guy who covered up for Epstein's 'suicide' and expected us morons to believe it?


> You mean the guy who covered up for Epstein's 'suicide' and expected us morons to believe it?

Let's assume that's true. How does it clash with him being "contemptible...but smart AF"?


Yeah I mean, orchestrating an assassination in a federal prison of a guy the whole world is watching, and never even so much as a whiff of a leak? Because how do you contain that without whacking everyone involved (which we would know about)? You don't. Not without teleportation, time-travel, or at the very least post-hypnotic suggestion.

Oh he's smart AF, all right.


There were some leaks. For some of the events I would need to check my offline documents for more specific references, but one example is the Reddit leak from a person who was working at the jail, saying that he had to let in someone associated with the military and that they left shortly after Epstein was killed.

> but smart AF. When Barr’s DOJ released a redacted version of the Mueller Report, they printed the whole thing, made their redactions with actual ink, and then re-scanned every page to generate a new PDF with absolutely no digital trace of the original PDF file.

This is a dumb way of doing that, exactly what "stupid" people do when they are somewhat aware of the limits of their competence or only as smart as the tech they grew up with. Also, this type of redaction eliminates the possibility of changing text length, which is a very common leak, especially for various names/official positions. And it doesn't eliminate the risk of non-redaction, since you can't simply search & replace with machine precision but have to do the manual conversion step to the printed page.


> exactly what "stupid" people do when they are somewhat aware of the limits of their competence

Being aware of one's limitations is the strongest hallmark of intelligence I've come across...


I'm not so sure it's about knowing his own limitations, rather it's about building a reliable process and trusting that process more than either technology or people.

Any process that relies on 100% accuracy from either people or technology will eventually fail. It's just a basic matter of statistics. However, there are processes that CAN, at least in theory, be 100% effective.


So following that strange logic if a dumb person knows he's dumb, he's suddenly become intelligent? Or is that impossible by your peculiar definition of intelligence?

Knowing your limits has to be a sign of intelligence.

"Dumb" people (FTR the description actually refers to something rather than that which you think it does...) run around on the internet getting mad because they haven't thought things through...


It's an interesting question though. I know quite some "smart" people who lack self awareness to an almost fatal degree yet can outdo the vast majority of the population at solving logic puzzles. It tends to be a rather frustrating condition to deal with.

I think that you are conflating things.

Knowing your limits is a sign of intelligence, but it's not the only one, and it's not a requirement. Meaning that not having that understanding doesn't exclude you from being intelligent.


Yeah that sounds like wisdom, not intelligence.

Wisdom would be knowing not to try and exceed those limits

Intelligence would be knowing they exist (I know that I cannot fly by flapping my arms, it took intelligence to deduce that, wisdom tells me not to try and jump from a height and flap my arms to fly. Further intelligence can be applied, deducing that there are artificial means by which I can attain flight)


Not at all. It's a procedure that's very difficult to unintentionally screw up. Sometimes that's what you want.

> you can't simply search&replace with machine precision

Sure you can. Search and somehow mark the text (underline or similar) to make keywords hard to miss. Then proceed with the manual print, expunge, scan process.


>Sure you can. Search and somehow mark the text (underline or similar) to make keywords hard to miss. Then proceed with the manual print, expunge, scan process.

I suppose a global search/replace to mark text for redaction as an initial step might not be a bad idea, but if one needs to make sure it's correct, that's not enough.

Don't bother with soft copy at all. Print a copy and have multiple individuals manually make redactions to the same copy with different color inks.

Once that initial phase is complete, partner up people who didn't do the initial redactions and have them review the paper text with the extant redactions, going through the documents together (each with their own copy of the same redactions), verbally and in ink noting redactions as well as text that should be redacted but isn't.

That process could then be repeated with different people to ensure nothing was missed.

We used to call this "proofreading" in the context of reports and other documents provided as work product to clients. It looks really bad when the product for which you're charging five to six figures isn't correct.

The use case was different, but the efficacy of such a process is perfect for something like redactions as well.

And yes, we had word processing and layout software which included search and replace. But if correctness is required, that's not good enough -- a word could be misspelled and missed by the search/replace, and/or a half dozen other ways an automated process could go wrong and either miss a redaction or redact something that shouldn't be.

As for the time and attention required, I suppose that depends upon how important it is to get right.

Is such a process necessary for all documents? No.

That said, if correctness is a priority, four (or more) text processing engines (human brains, in this case) with a set of engines working in tandem and other sets of engines working serially and independently to verify/correct any errors or omissions is an excellent process for ensuring the correctness of text.

I'd point out that the above process is one that's proven reliable over decades, even centuries -- and doesn't require exact strings or regular expressions.

Edit: Fixed prose ("other documents be provided" --> "other documents provided").


If the word you need to redact is also an English verb, there is a risk that you accidentally mark it in a context where it is clearly being used as a verb, and that over-redaction can be used as proof that the term was swept up by a large-scale search & mark, revealing what the redacted name is.

According to a random dictionary I found:

To trump. Verb. Surpass (something) by saying or doing something better.


Your process doesn't make sense; why wouldn't you just black-box redact right away and print and scan? What does underline-then-ink give you? But it's also not the process described in the blog.

> that's very difficult to unintentionally screw up.

You've already screwed up by leaking length and risking errors in manual search&replace


> why wouldn't you just black box redact right away and print and scan? What does underline then ink give you?

These are roughly equivalent. The point is having a hard copy in between the digital ones.


Absolutely. The other comments replying to your original comment that are nitpicking over implementation details miss the purpose and importance of this step.

The fact that this release process is missing this key step is significant too imho. It makes it really clear that the people running this didn't understand all of the dimensions involved in releasing a redacted document like this and/or that they weren't able to get expert opinions on how to do this the right way, which just seems fantastical to me given who we're talking about.

In other threads people are discussing the possibility of this being intentional, done by disaffected subordinates who were poorly vetted and rushed in to work on this against their will. And that's certainly plausible for subordinates, but I have a hard time believing it's the case for the people running this, who, if they understood what they were tasked with, would have prevented an entire category of errors by simply tasking subordinates to do what you described, regardless of how they felt about the task.

So to me that leaves the only possibility that the people running this particular operation are incompetent, and given the importance of redacting that is dismaying.

Regardless of how you feel about the action of redacting these documents, the extent to which it's done and the motives behind doing it, the idea that the people in charge of this aren't competent to do it is not good at all.


This is one of the biggest document collections ever released to the public (...or will be when it's finally done) and the redactions were done in a hurry by a government agency with limited resources which would usually be doing more useful things.

So it's likely there simply isn't the time to do extended multi-step redactions.

What's happening is a mix of malicious compliance, incompetence, and time pressure.

It's very on-brand for it to be confused, chaotic, and self-harming.


Why would I settle for a rough equivalence? The point was about the chance of making mistakes in redaction, so sure, if you ignore the difference in the chance of making mistakes (which the underline process increases), everything becomes equivalent!

> Why would I settle for a rough equivalence?

They're equivalent in security. The digital method is more convenient (albeit more error prone). What confers the security is the print-scan step. Whether one is redacting in between or before doesn't change much.

You'd still want to do a tabula rasa and manual post-pass with both methods.

> point was about the chance of making mistakes in redaction

Best practice is humans redacting in multiple passes for good reason. It's less error prone than relying on a "smart" redactor, which is mostly corporate CYA kit.


> They're equivalent in security

They aren't; security is defined by the amount of information you leak. If you have an inferior process where you're substituting an incorrect manual match for the correct digital match, you're reducing security.

> albeit more error prone

The opposite: you can't find all 925 cases of the word Xyz as efficiently on paper as with a digital text search. My guess is you've just made up a different comparison (e.g., a human spending 100 hrs reading paper vs. some "smart" app doing 1 min of redactions) instead of the actual process quoted and criticized in my original comment.

> Whether one is redacting in between or before doesn't change much

It does; the chance to make a mistake differs in these cases! Printing & scanning can't help you here, it's a totally different set of mistakes.

> Best practice

But this conversation is about a specific blogged-about reality, not your best practice theory!


The blog has no relevance to your claim that the print and scan procedure somehow fundamentally precludes automated search and replace. I refuted that. You remain free to perform automated search and replace prior to printing the document. You also have the flexibility to perform manual redactions both digitally as well as physically with ink.

It's clearly a superior process that provides ease of use, ease of understanding, and is exceedingly difficult to screw up. Barr's DoJ should be commended for having selected a procedure that minimizes the risk of systemic failure when carried out by a collection of people with such diverse technical backgrounds and competence levels.

Notably, had the same procedure been followed for the Epstein files then the headline we are currently commenting under presumably wouldn't exist.


> The blog has no relevance to your claim that the print and scan procedure somehow fundamentally precludes automated search and replace.

It has direct relevance since it describes the process as lacking the automated search and replace

> I refuted that

You didn't; you created a meaningless process of underlining text digitally only to waste time redacting it on paper, for no reason but to add more mistakes, and you also replaced the quoted reality with your made-up situation to "refute".

> and is exceedingly difficult to screw up.

It's trivial, and I've told you how in the previous comment

> Notably, had the same procedure been followed for the Epstein files then the headline we are currently commenting under presumably wouldn't exist.

Nope, this is a generic "hack" headline, so guessing a redacted name from the length of the blacked-out text would fit the headline just as well as a copy & paste hack.


It gets you the non-existence of a PDF full of reversible black boxes.

Can't leak a file that doesn't exist.


But you can leak the content of a file that you printed out and couldn't redact properly by using an inferior method

But such a document is obviously unredacted. A black boxed PDF appears to be redacted, but isn't. Accidents happen.

Now that you've shifted the goalposts back closer to the original discussion, what's your point? Yes, you can leak the "nonexisting" file in multiple ways, including the printed one, and yes, "accidents" happen. So are they more likely to happen if you ban digital search and force paper and ink redaction instead? Are they more likely to happen if you black out digitally before printing or underline digitally and ink out physically?

And the "obvious word needle in a haystack of many thousands of pages" isn't as self-healing as you appear to think it is.


> This is a dumb way of doing that, exactly what "stupid" people do when they are only somewhat aware of the limits of their competence or only as smart as the tech they grew up with.

No, this is an example of someone understanding the limits of the people they delegate to, and putting in a process so that delegation to even a very dumb person still has successful outcomes.

"Smart" people like to believe that knowing enough minutiae is enough to result in a successful outcome.

Actual smart people know that the process is more important than the minutiae, and proceed accordingly.


> someone understanding the limits of the people they delegate to, and putting in a process so that delegation to even a very dumb person still has successful outcomes

Oh, man, is he the only smart person in the whole department of >100k employees and >x contractors??? What other fantasy do you need to believe in to excuse the flaws? Also, if he's so smart, why didn't he, you know, hire someone smart for the job?

> even a very dumb person still has successful outcomes

Except it's easier for both smart and dumb people to make mistakes following his process, not to be successful!

> Actual smart people know that the process is more important

So he's not actually smart according to your own definition because the process he has set up was bad, so he apparently did not know it was important to set it up better?

> important than the minutiae

Demanding paper-only redactions is exactly that kind of minutiae.


> this type of redaction eliminates the possibility to change text length

This is the only weakness of Barr's method.

> it doesn't eliminate the risk of non-redaction since you can't simply search&replace with machine precision

Anyone relying on automated tools to redact is doing so performatively. At the end of the day, you need people who understand the context to sit down, read through the documents, and strike out anything that reveals too much, whether directly or indirectly, spelled correctly or incorrectly.


> This is the only weakness of Barr's method.

Of course it isn't; the other weakness you just dismiss is the higher risk of failed searches. People already fail with digital search; it's even harder to do in print, or to translate digital matches to print (something a machine can do with 100% precision; now you've introduced human error).

> At the end of the day, you need people who understand the context

Before the end of the day there is also the whole day, and if you have to waste the attention of such people on doing ink redactions instead of dedicating all of their time to focused reading, you're just adding mistakes for no benefit


> something a machine can do with 100% precision

Forget about typos. Until recent LLMs, machines couldn't detect oblique or identifying references. (And with LLMs, you still have the problem of hallucinations. To say nothing of where you're running the model.)

> if you have to waste the attention of such people on doing ink redactions instead of dedicating all of their time to focused reading

You've never read a text with a highlighter or pen?

Out of curiosity, have you worked with sensitive information that needed to be shared across security barriers?


Reading through material in context and actively removing the telling bits seems very focused to me.

Furthermore, reading through long winded, dry legalese (or the like) and then occasionally marking it up seems like an excellent way to give the brain short breaks to continue on rather than to let the mind wander in a sea of text.

I am for automating all the things but I can see pros and cons for both digital and manual approaches.


The reading is focused, but that focus is wasted on menial work, which makes it easier to miss something more important

> give the brain short breaks

Set a timer if you feel that's of any use? Why does the break have to depend on the random frequency of terms to be redacted? What if there is nothing to redact for pages, why let the mind wander?

> I am for automating

But you're arguing against it. What's the pro of manually replacing all 1746 occurrences of "Trump" instead of spending 0.01% of that time on a digital search & replace, then spending another 1% digitally searching for variants with typos, and then spending the last 99% in focused reading, trying to catch the "the owner of Mar-a-Lago Club" reference you've missed, or something more complicated (and then also replacing that variant digitally rather than hoping you'd notice it every single time you wade through walls of legalese)?
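
To make the digital pass concrete: a minimal sketch assuming PyMuPDF, with the search term and file names made up. Unlike drawing a black box on top, a redaction annotation actually deletes the underlying text when applied:

    # Rough sketch of a true digital redaction pass, assuming PyMuPDF (`fitz`).
    # TERM and the file names are placeholders. apply_redactions() removes the
    # text under each marked rectangle instead of merely covering it.
    import fitz  # PyMuPDF

    TERM = "Example Name"  # hypothetical string to redact

    doc = fitz.open("input.pdf")
    for page in doc:
        for rect in page.search_for(TERM):
            page.add_redact_annot(rect, fill=(0, 0, 0))  # queue this hit for redaction
        page.apply_redactions()                          # strip the underlying text
    doc.save("redacted-first-pass.pdf", garbage=4, deflate=True)

The focused human reading then only has to hunt for the oblique references no search will catch.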


> What's the pro of manually replacing all 1746 occurrences of "Trump" instead of spending 0.01% of that time with a digital search & replace and then spending the other 1% digitally searching for variants with typos

Because none of this involves a focussed reading. It's the same reason why Level 3 can be less safe than Level 4. If you're skimming, you're less engaged than if you're reading in detail. (And if you're skipping around, you're missing context. You may catch Trump and Trup, but will you catch POTUD? Alternatively, if you just redact every mention of the President, you may wind up creating a President ***, thereby confirming what you were trying to redact.)

If it doesn't matter, automate it. If you care, have a team do a proper redaction.


> Because none of this involves a focussed reading

That's not a pro, that's an incorrect rejection of the pro of the alternative. The thing is that it does not preclude focused reading; I don't understand why you make up the "skipping around/skimming" alternative when I've explicitly said that 99% of the time is spent on focused reading. It's just that when you do that focused reading, you don't have to waste time on trivial redactions you've already done automatically, and instead you can dedicate that time to catching POTUDs. So in total, you can spend more time on focused reading.

> may wind up creating a President **

How? Even if your 1% on digital alternatives somehow doesn't include this obvious combination, why would the word "President" in "President **" be harder to redact during focused reading?

> If it doesn't matter, automate it. If you care, have a team do a proper redaction.

I don't get it; where is the actual option that I've described - do both while saving time???


> this type of redaction eliminates the possibility to change text length, which is a very common leak when especially for various names/official positions

Increasing the size of the redaction box to include enough of the surrounding text makes that very difficult.


You'd need to increase it a lot, lest the surrounding text be inferred from context.

But that's a destructive operation!

I mean, sure, you can make the whole paragraph/page blank, but presumably the goal is to share the report removing only the necessary minimum?


Also the pedophile that tried to obscure his face in pictures with a swirl effect that they were able to reverse enough to identify him:

https://www.minnpost.com/politics-policy/2007/11/you-can-swi...

IIRC there was a Slashdot discussion about it that went "Oh yeah, obviously you need to black out the face entirely, or use a randomized Gaussian blur." "Yeah, or just not molest kids."


Typically these folks use standard redaction software. Has anyone explored the fact that the software is just a buggy, silly mess?

Follow the letter of the law, but not the spirit.

It already seems that they blacked out more than the law allowed, so following neither.

Not that it matters much what the law says if the goal is to protect the man who hands out pardons...



It's befuddling you think there are mechanisms to incentivize competency over loyalty in some of these organizations.

Based on the prose style, I'm assuming you copy-pasted a ChatGPT "deep research" answer?

The prose style and the fact that it was super repetitive. Every bullet re-described the copy-pasting. Definitely LLM slop.

Similar to pressing delete or emptying the recycle bin, in that all that happens is the operating system is told that section of the hard drive is now blank, but the underlying files are still there and available to recover.

Befuddling you are befuddled by non-tech obsessed people failing to grasp tech.

The COVID origins Slack messages discovery material (Anderson & Holmes) was a set of famously poorly redacted PDFs, allowing Gilles Demaneuf to un-redact them, benefiting all of us.

[flagged]


You mean the layers that were, in fact, just side effects of scanning the (non-authoritative) short form certificate?

Not proven yet.

It wasn't his actual birth certificate! It was the short form!

For me it's the opposite. I have a good feel for what I want to achieve, but translating this into program code and testing it has always caused me outright physical pain (and in the case of C++ I really hate it). I've been programming since age 10. Almost 40 years. And it feels like liberation.

It brings the “what to build” question front and center, while “how to build it” has become much, much easier and more productive.


Completely failed for me running the code it changed in a Docker container I keep running. Claude did it flawlessly. It absolutely rocks at code reviews but it's terrible in comparison at generating code.


It really depends on what kind of code. I've found it incredible for frontend dev, and for scripts. It falls apart in more complex projects and monorepos


$$$$$


Yes, that is one of the main incentives for working.


Whenever I find someone at my company who has worked there over 30 years, it's usually because the company doesn't pay enough to retire.


I suspect that, unless you have a gambling problem, "34 years at Apple/NeXT" would be enough to retire on.


34 years? I have 10 years at a couple of FAANGs, and got $3M in stock, with maxed out 401k, etc. I am having thoughts about retiring early, maybe in 5 years. Long time Apple employees could definitely retire after 10 years. He most likely stayed there because he liked the job.


When did Apple start issuing stock/RSUs? Probably only in 2004 when Google got rich



$0.37 is the split-adjusted price, it was never actually quoted that low at the time (for anyone wondering if Apple really used to be a penny stock in the early 2000s).


Nit: share price was not $0.37, there's been a couple stock splits since.


In case anyone is curious, AAPL has split a combined 56-to-1 since 2003.
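
For anyone checking the arithmetic: that 56 is presumably the product of the 2-for-1 split in 2005, the 7-for-1 split in 2014, and the 4-for-1 split in 2020, i.e. 2 × 7 × 4 = 56.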


Jesus! Folks in the 30 years prior must be fuming


You could also have bought $1,000 worth of stock at the time and it would be worth one million today (since 1995 with reinvested dividends, source ChatGPT). Up to you whether the 32 years spent in the office makes the money more worthwhile to you.


Exactly how does one purchase $1,000 in stock of a company never listed on a stock exchange? NeXT was never public.

For the love of God, use the right tool. Portfolio back testers are a dime a dozen and easy to use and get 100% accurate answers. LLMs are the wrong tool to get investment expertise from.
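
For example, a rough back-of-the-envelope check is only a few lines with a market-data library instead of an LLM; a sketch assuming yfinance, with the ticker and start date taken from the comment above (the resulting figure depends entirely on the data provider):

    # Rough sketch: split- and dividend-adjusted growth of a $1,000 AAPL position
    # since 1995, assuming the third-party yfinance library. Not investment advice.
    import yfinance as yf

    prices = yf.download("AAPL", start="1995-01-01", auto_adjust=True)["Close"].squeeze()
    multiple = prices.iloc[-1] / prices.iloc[0]  # total-return multiple over the period
    print(f"$1,000 in 1995 is roughly ${1000 * multiple:,.0f} today")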


Apple was, though. They went public in 1980. The IPO price, adjusted for splits, was a little under $0.10 per share. Ignoring any dividends, etc. your $1000 today is the equivalent of over 10,000 shares of Apple, worth almost $2 million today.


In this case it’s the model. There’s an insane amount of computation that should happen in milliseconds but given today’s hardware might run 10 times too slow. Mind you these models take in lots of sensor data and spit out trajectories in a tight feedback loop.


It's not under-explored at all. Millions of people work in architecture and city planning. Every city has several departments that deal with planning and construction.

It's just completely dysfunctional. Architecture professors have focused on “innovation” for 100 years and have achieved little. We still flock to the old, 19th century (or older) city centers and love it. We spend thousands to spend a week or two there on holidays.

Very few modern places exist where this is the case.

In survey after survey, 80% of the people prefer traditional over modern(ist) designs.

So the whole profession has failed, since about the introduction of the Bauhaus.


This is somewhat strange, no? It looks like you are talking about city planning more than architecture when you talk about city centers.

Modern designs are affected by supply and demand, while modernist designs have been supplanted by many other schools.

Innovation has ranged from tiny homes, to livable homes, to new materials, to shipping containers, building heights, concrete types, designs and more.

I’ve seen architectural styles emerge and evolve from different countries, so it’s hard to read this and find the source of your opinion.

The creation of public spaces is highly dependent on the governance of those localities.

I was bemoaning the growth of self sufficient enclaves as a real estate solution in Mumbai, but I acknowledge that this is the market providing for its consumers what the government is yet to provide.

Is this primarily an attack on academia, under the assumption that everyone hates the combination of “innovation” “modernism” and “professors”?


Academia here is highly dysfunctional.

For one, you don't need academics to build houses.

It's an idea of the 20th century that you would.

Previously architecture schools were part of the art departments. A bit of engineering, maybe, but that's it.

Now that you have academics, they need to be innovative.

The old doesn't count. Architecture becomes like fashion. Students are scolded if they want to produce anything traditional.

This is true for 99% of architecture schools worldwide. Notre Dame is a notable exception, as are several summer schools in Europe (by INTbau, for example).

There is zero reason for neglecting or denying traditional architecture. The Romans already knew how to live well. Without artificial air conditioning. Perfectly climate-adapted. Natural materials.

Second, architecture school is not about education; it's about becoming part of a cult. It's about telling a story, about winning competitions, and about convincing investors. Not so much about pleasing the users of a building.


Who the heck denies traditional architecture? My friends were studying architecture alongside me in college, and they had tons of studies on classical architecture.

Heck I know about classical building methods, styles and the economics behind them and I’m not even an architect.

And I was not even born in the west.

Like, I don’t expect a neophyte architect to use methods that can build houses only up to 3 stories based on stonework.

They need to use drywall and construction methods unique to their locality.

But there are thousands of rank-and-file architects, and thousands more who make insane and wonderful things, along with professors who …

Dear heavens, what happened on your journey?

Besides - there’s always place for new and interesting, if rebellion is your go to motif - have at it.

Modernism itself was a rebellion against older forms and thinking.


It is a well-known fact that 99% of architecture schools are about making students follow a particular set of tastes we call “modernism” - an ideology that states that buildings need to be innovative, that traditional methods should be neglected. It favors minimalist forms, neglects art forms like ornament, idealizes the “genius architect”, and favors “modern” materials like concrete.

If you don’t follow this stream, more often than not you will get bad grades or fail tests.

Some people said “architecture is not about education, it is about entering a cult”.

Some more details here: https://youtu.be/syQMTZyzqcg?si=NTz362TrktrIEBhr


There's a bit of survivorship bias in your reasoning. The extremely wealthy built beautiful things that we still enjoy. But the horrible places that common people had to call home in the 19th century are not treasured in the same way.


Agreed. I think people prefer what’s familiar, and what’s familiar is what we can afford. Compare traditional to tasteful billionaire penthouse and most people will choose penthouse


yes, and add to that the 20th century.

And yet so many claim all we need is moooore housing.

I would love to see quality of life become more of a ubiquitous focus and feature of what is built.


Agreed. We do need more housing, but it can also be more quality housing. This is the part that most of the "YIMBY" folks miss out on.

We have a new, modern, but as cheap as possible building with the smallest legal unit sizes that went in around the corner from us less than 10 years ago. It's now nearly empty because every unit leaks, the appliances and cabinetry already need to be replaced, and it'll have to be half rebuilt to fix several structural problems.

The developer, fortunately, failed in getting a second building started because of community pushback. In response, the community has been attacked at the local, and recently international, level for being NIMBYs and "stopping necessary progress".

The same folks who promote more sustainable and people focused city design are fighting for these worthless buildings. Their intentions are right in the bigger picture sense, but they leave no nuance for what's actually happening on the ground.


Aesthetically pleasing it is, but also way less practical and way more costly to build. Nice stone facades can't have any thermal insulation on them (and having it inside is less than ideal); in Europe this would be a big problem apart from the very southern regions. I think McMansions are trying to find some middle ground, but they don't seem to receive much love (they are not so common in Europe, so I'm just judging from afar).

Medieval castles can be very pretty to visit too, I wouldn't want to live in one if given modern choice regardless of wealth, even if ignoring all the red tape for any sort of change or even repair.


Those places are nice for vacations, but without denser apartment buildings the city tends to expand a lot horizontally, and after a while it's very expensive to have a decent public transportation system. It's mostly impractical for large cities to be built this way. I've seen very few cities that managed to make this work.


Not true, there are many places with traditional courtyard blocks and 3-5 stories that reach the population density of Manhattan


> Architecture professors have focused on “innovation” for 100 years and have achieved little.

The issue is that architecture is not a science. It has nothing to do with the past 100 years. There is simply, to date, no solid theoretical foundation that can inform design. Corbu made a lame attempt in his early phase to establish a set of axioms, and that didn't work out.

So the search in the past 100 years wasn't entirely based on "innovation". The field is searching for something resembling a theoretical framework.

> So the whole profession has failed, since about the introduction of the Bauhaus.

This is a reactionary statement. There are numerous amazing works of architecture from the 20th century. And your dragging in the Bauhaus indicates you actually are not well read in the history of modern architecture. (This negative fascination with the Bauhaus carries a strong whiff of National Socialist Germany, btw...)

> In survey after survey, 80% of the people prefer traditional over modern(ist) designs.

Well, Architecture (contra building design) is high art. It is not for the unwashed. 80% of the people also prefer drivel for their cultural fare.


Your reasoning follows the exact playbook repeated by architecture professionals around the world. It is exactly this kind of statement that led me to comment in the first place.

No, architecture is not high art. It is the most public of arts and hence needs to serve the people. And people know very well which environments they like and which ones they don't, and where they find emotional well-being. That is not a political question at all. The studies are consistent.

We also don't need a theory. Architecture is a bunch of patterns and insights into human nature that have been known for thousands of years.


> Well, Architecture (contra building design) is high art. It is not for the unwashed. 80% of the people also prefer drivel for their cultural fare.

The people should get what the people like, not what the elite likes. Nobody cares what you consider "high art". The term in itself is pretentious.

Your "high art" is too desperately trying to make a name by standing out through being weird instead of better.


People do get what they like, and they should. We are discussing Architecture with a capital A. It has always, since day 1, been an elite concern. No one took polls of, e.g., Greeks to see if they approved of the Parthenon. The English common man was not consulted by Christopher Wren. The list goes on and on. What is ironic is that this "traditional architecture" that reactionaries like you keep raving about is nothing but recycling the "high art" of their ancestors.

Materials change. Scale requirements have changed. Techniques have evolved. The forms reflect that. It is entirely correct to note that many such efforts (mostly copycat rehashing of masterpieces of modernist architects by lesser talent) have proven ineffective, but that is just the nature of the field. Architecture is not software. It takes generations to iterate through the possible solutions.

> Your "high art" is too desperately trying to make a name by standing out through being weird instead of better.

You have zero idea of what I consider high art in architecture. You are tilting at your own windmills buddy.

+There is nothing pretentious about distinguishing high and low cultural efforts. Let's consider our own field: should we all be forced to code in JavaScript and disavow more powerful constructs such as, e.g., Haskell since the "common man" is incapable of grokking it??


Well, just that it's none of these things, as expressed by countless people, online and offline. It was voted the world's ugliest building more than once, etc. etc…


My prediction is that the world will stabilize somewhere around 80% renewables and 20% nuclear. Maybe less. Prove me wrong


I'd think more nuclear would be better for the environment, since the sun does not shine around the clock and other renewables have a larger negative environmental impact. Batteries for storage are not good either with today's technology.

Pure nuclear, or even nuclear as the majority production method, would be a fool's errand, though. Unless someone manages to invent small enough reactors that can be started and stopped at will to follow a day's power demand. I doubt that's even possible, though.


In the end it is a matter of economics. Right now there is no path towards nuclear becoming competitive again against wind, solar, battery.

Plus, storage and cleanup costs in case of failures are not even priced in; they are left to taxpayers.

This leaves nuclear to government actors influenced by lobbyists.


Given they don't complement each other, I predict that equilibrium to be unstable: either nuclear or renewables will grow to mostly replace the other, due to economic forces.


But they do complement each other... Nuclear provides the base generation that's online 24/7 while renewables are unstable and able to provide the peak demand.


No, nuclear wants a dispatchable generation source to provide the peaking power. It only makes sense as a complement when the dispatchable generation is expensive to run somehow; if it's cheaper than the nuclear, then you should just not bother with nuclear. The two things which complement it are gas turbine generators (cheap to build, dispatchable, expensive fuel) and storage (very expensive to build ATM, needs to buy power when there's excess, but otherwise cheap). Renewables are not this: the energy they produce is cheap but not at all dispatchable (curtailable, yes, but you can't just get more wind blowing on demand). What this means is that sometimes they fail to provide the peak and sometimes they can provide the whole peak and more, which both fails to provide a reliable grid and eats into the economic justification for nuclear. So you want to pair them with dispatchable generation to fill in the gaps, which sounds familiar, no? In fact the only difference is that with nuclear your gaps are more periodic and don't span such a large range.

That's why they don't complement each other: they both want the same third thing to complement them: something which can fill in the gaps in the power that they can economically provide. And renewables are a heck of a lot cheaper than nuclear at the moment.


No, they are not typically complementary. The optimal solutions for powering a grid tend to either be all-nuclear or all-renewable (usually the latter now), depending on cost assumptions. Optimal solutions with a mixture are uncommon.


At planetary scale it seems to be quite important not to have to replace the whole fleet every few years, don't you think? Just from a resource perspective, this planet shouldn't drown in defunct solar panels.


Luckily, they are made of just a few basic, very recyclable chemical elements. Aluminium is fully recyclable. Glass is fully recyclable. Silicon gets purified as part of its manufacturing process. And that just leaves the plastic backing.


Just a thought, but recycling glass is extremely energy intensive.

https://www.nrel.gov/docs/legosti/old/5703.pdf

"Recycling of glass does not save much energy or valuable raw material and does not reduce air or water pollution significantly."


Yes, that's why I said the stability of perovskites is the only real issue.


Weird how you can operate such a service anonymously. Even the whois entry goes through whoisproxy.

In the EU you have to put an address on every website so that you're reachable by courts.

