IRS’ 60-Year-Old IT System Failed on Tax Day Due to New Hardware (nextgov.com)
156 points by artsandsci on April 30, 2018 | 201 comments



The question for the industry remains: how do we deal with systems that should be up for 10 years? 25 yrs? 50 yrs? 100 yrs? We didn't even have computers as they exist today 50 years ago.

How do we separate business logic in a self-contained way that can be independent of hardware and (from a certain level) software?


>The question for the industry remains: how do we deal with systems that should be up for 10 years? 25 yrs? 50 yrs? 100 yrs? We didn't even have computers as they exist today 50 years ago.

Is this a question the industry actually cares about? If anything, I'd say it got much worse at creating long-living software. It sucks to run a mainframe-based system from 50 years ago, but running a 10-year-old Web Forms application can be borderline impossible. I'm scared to think of what will happen with all the cloud-based stuff in the next 10 years.

>How do we separate business logic in a self-contained way that can be independent of hardware and (from a certain level) software?

I often say that there is no such thing as "business" logic. There is just logic. The least maintainable systems I've worked with were the ones with the most sophisticated attempts at abstraction.

In my opinion, the key to software longevity is making systems that allow for local reasoning, i.e. analyzing parts without knowing the whole. This can be achieved in a lot of different ways, but it needs to be a deliberate goal in system design.
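As a toy illustration of what "local reasoning" means here (the names are invented): the first version can only be understood by knowing how the rest of the system populates its hidden state, while the second can be analyzed entirely on its own.

  # Hard to reason about locally: depends on global state set somewhere else.
  CONFIG = {}

  def penalty_v1(amount):
      return amount * CONFIG["penalty_rate"] if CONFIG.get("late") else 0

  # Local reasoning: everything the function depends on is in its signature.
  def penalty_v2(amount, late, penalty_rate):
      return amount * penalty_rate if late else 0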


I agree it's a tough cookie. I actually don't think layers and layers of abstraction is a good idea.

But something where you can describe something like: "Read this value from this field, do some math based on fields in another table, return the result."

That's why I raised the question, how do we do that over long periods of time without rewriting it in several different ways? (ASM to flat files, COBOL and JAVA to different SQL flavours, what else is next?)

And if you migrate, let's say from Oracle to MSSQL it will work for 90% of the things, for the last 10% it will be a PITA.
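A minimal sketch of that kind of rule expressed as plain logic over plain records (the field and bracket names are made up for illustration), so the calculation itself doesn't care whether the data came from flat files, DB2, or a different SQL flavour:

  # Hedged sketch: "read a value from a field, do some math based on another
  # table, return the result" as a pure function. All names are hypothetical.
  def compute_liability(record, rate_table):
      income = record["taxable_income"]
      bracket = next(b for b in rate_table if b["lower"] <= income <= b["upper"])
      return round(income * bracket["rate"], 2)

  # The storage layer (flat file, Oracle, MSSQL, ...) stays outside this
  # function, so migrating the store doesn't have to touch the rule itself.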


> The least maintainable systems I've worked with were the ones with the most sophisticated attempts at abstraction.

Seconded. Enterprise patterns mandating layers and layers of abstraction, making it incredibly hard to reason about the far flung pieces of code that actually do anything.


> Is this a question the industry actually cares about? If anything, I'd say it got much worse at creating long-living software. It sucks to run a mainframe-based system from 50 years ago, but running a 10-year-old Web Forms application can be borderline impossible. I'm scared to think of what will happen with all the cloud-based stuff in the next 10 years.

Cloud deployments, especially modular systems (and of course microservices) are often sold to companies as the best way to reduce technical debt. I don't think they've been around for long enough to truly test it but would be curious if a 30yr old system of microservices running in containers is easier to update than COBOL software running on a mainframe.


We have shown pretty clearly that any computer we build at day 1 can be emulated perfectly by a computer built day 1 + 50 years. However, building systems that can evolve is a larger issue.

The nut here is that these systems evolve in a very haphazard way due to funding availability, scaling stresses, and changing requirements.

For the IRS, I think it would be useful to engage the US Digital Service to see if they could design a system based on a cluster of non-specific hardware with only network interfaces, that could host a 'tax computation engine' and a growable data store. While it is a 'big' problem, it is not as big as some of the problems I have seen hosted on 25,000 machines as a 'small' cluster. That is 800,000 cores (assuming 32-core machines), so for 140M taxpayers that is 175 payers per core. Something that could no doubt process every return in a single day, and run fraud analysis as well.
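Back-of-the-envelope version of that sizing (the 25,000 machines and 32 cores per machine are the assumptions above, not anything the IRS has published):

  machines = 25_000
  cores_per_machine = 32
  taxpayers = 140_000_000

  total_cores = machines * cores_per_machine    # 800,000
  print(taxpayers / total_cores)                # 175.0 payers per core
  print(taxpayers / machines)                   # 5,600.0 payers per machine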

Clearly the government needs warehouse scale computers that their vendor (IBM) is not in a position (yet?) to deliver.


> We have shown pretty clearly that any computer we build at day 1 can be emulated perfectly by a computer built day 1 + 50 years.

We can emulate pretty much any old system, but emulated "perfectly" is at best not demonstrated and at worst already proven wrong. Perfect emulation implies emulating the undocumented bugs that end up working--and there are a lot of those. Getting the correct behavior in the case of parallel components (note that this applies even to single-core systems, since dedicated hardware constructs can race with the CPU) is particularly tricky. Even without parallel issues, you can have cases where software blatantly violates the API but it works reliably and consistently, necessitating complex emulation to make it work correctly (there are some games that don't work on WINE for this reason).


"Getting the correct behavior in the case of parallel components (note that this applies even to single-core systems, since dedicated hardware constructs can race with the CPU) is particularly tricky."

Heck, even if you emulate the documented performance perfectly, you can still suffer regression behavior if the existing program just happens to rely on misbehavior of the original hardware due to less tolerant exception handling, stricter memory alignment protections, etc. in the later deployment environment.

I've seen this a lot in memory misuse / misalignment when porting between various UNIX implementations. If your development environment just happens to forgive a lot of sloppiness (cough Solaris cough), misbehavior may only show up later on other platforms.


As for Solaris, it is probably the only UNIX where ignoring the ISO C rule that anything ending with "_t" is reserved for the implementation will bite you hard, as Solaris' libc headers typedef stuff like lock_t and are probably the main reason that rule exists.

As for the rest, doing cycle-exact emulation of a 50-year-old system is, and probably always will be, doable and from some PoV trivial: you can just simulate the hardware logic of the original implementation at the RTL or even gate level, and it will still be faster than realtime.

[Edit: "reserved for implementation" was missing from the first paragraph]


It occurred to me just now that the US Digital Service will become increasingly powerful as it helps digitize the government, and 30 years from now it could be synonymous with the government.


> How do we separate business logic in a self-contained way that can be independent of hardware and (from a certain level) software?

As strange as this may sound, documentation. Software is at its core rather brittle; it's designed to operate inside a very specific system, and trying to make it portable is only reasonable when you know the target ahead of time.


Better yet, test cases. That way you can throw your new implementations against them.
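For instance, a handful of characterization tests pinned to outputs the old system is known to produce, so any reimplementation has to reproduce them (the figures below are placeholders, not real tax numbers):

  # Hedged sketch: cases recorded from the legacy system's observed behaviour.
  LEGACY_CASES = [
      # (taxable_income, filing_status, tax_the_old_system_computed)
      (30_000, "single", 3_409.50),
      (75_000, "married", 8_619.00),
  ]

  def check_against_legacy(compute_tax):
      # compute_tax is the candidate new implementation.
      for income, status, expected in LEGACY_CASES:
          assert abs(compute_tax(income, status) - expected) < 0.01, (income, status)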


How do you create incentive structures for new employees who have to spend months (years?) coming up to speed with giant state machines before they feel remotely confident of the effect of their changes? For better or for worse, currently the glory is in writing new systems rather than improving existing ones.


"state machines" - nice pun in this context.


You get them doing interesting work on the system as often as you can.

Fixing bugs and babysitting all day isn't going to convince people to stick around. Best you can do is get them something they can try to work with.


Unfortunately, it's not just about "glory". If you work in "obsolete" languages/environments, you likely will be perceived as less valuable to your next employer.


This right here is a huge problem with our industry. If you're not jumping from employer to employer every 1.7 years, you're labeled as out of date. It takes some really talented developers to truly understand some of these old systems. They should be given the proper respect they deserve.


In reality it often means these systems are built in a bad, ineffective way. Most likely in bad, ineffective languages/technologies. You don't need 17 years to be able to contribute to Apache Spark. You would not call that a small project, would you?


No, you can stick around.

I wouldn't consider someone who spent 5 years on AWS, especially if they moved internally, out of date.

Whereas I've done nothing remotely close to where I want to be in about 3 years. I feel very scared about my skills, especially since I'm not even 10 years into my career.


I spent 9 years at my first FT job out of college, at a hedge fund known at the time for culling the bottom 10% of performers annually. That long a tenure at a place with that reputation has opened more doors than it has closed, especially within the finance industry. Lately, I've been getting 4-6 emails from local recruiters per week.

As an aside, even when you spend that long in a single position, you do need to keep learning new and relevant technology; I attended every in-house training seminar I could, which helped a lot, and spent a good portion of every day reading. I spent 7 years at that company in the same effective role, and never changed job titles, but saw a >200% increase in take-home pay in that same period.


I heard that COBOL programmers do pretty well because they are a rare commodity serving expensive industries.


If you're happy with working with something that destroys your brain and makes you useless to any other software technology developed in the last 20 years be my guest.


COBOL doesn't destroy your brain, that I recall. In fact, if it bores you, it leaves a lot of brain capacity available for funner stuff, like MVS exploits via Resolve.

To me, it's the cool-app-of-the-month-that-harvests-private-data-off-peoples-devices that is brain-killing.


>How do we separate business logic in a self-contained way that can be independent of hardware and (from a certain level) software?

Come up with programming languages with limited domain specific functionality that don't allow you to do funny machine specific things. (or even ordinary things like pointers)

Use the restricted languages for business logic and have a second language to do the plumbing.

Or just target a generic x86_64 virtual machine.
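One way to picture the restricted-language idea: rules are just data with a tiny fixed set of operations, interpreted by a small engine, so nothing machine-specific (pointers, I/O) can leak into the business logic. A made-up sketch:

  # Hedged sketch: the only operations the restricted "language" allows.
  RULES = [
      ("sub", 12_000),   # e.g. subtract a standard deduction
      ("max", 0),        # never go below zero
  ]

  def run_rules(value, rules=RULES):
      for op, amount in rules:
          if op == "sub":
              value -= amount
          elif op == "max":
              value = max(value, amount)
          else:
              raise ValueError("operation not in the restricted set: %r" % op)
      return value

  print(run_rules(9_000))   # 0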


So all critical software in Python or JS? Not so sure about that. I get the idea but think having strict code guidelines might be more effective than choosing a more restrictive language.


No no, restricted, a programming language that can't do a lot of things. A language that can only do business logic kinds of things.


IBM mainframe assembler is actually somewhat abstracted from the hardware. The original assembler is very CISC-like, but today it's running on RISC-like cores.

As for the hardware, I/O is done through channel programs, which can have any device as a backend - it's a generic way to do DMA.

There is a lot of legacy stuff (like CCKD, VTAM) but more or less it's all emulated today by the hardware/software.


I'm sure academics have thought about such information representation problems. As for industry, why would they be interested in that? It's like killing the goose that keeps laying golden rewriting eggs.


IBM has already shown that the machine layer and the OS layer can be separate and that progression in one does not necessarily require a full change in the other.

The method by which it works is that code is compiled into a new form which can be translated on the fly to match the hardware level below it.

Now, what is probably trapping the IRS, besides a code base that not everyone fully understands, is the sheer complexity of what they have to deal with: the tax code and the whims of those in Congress.


This is a relatively solved problem. Abstractions.

50 years ago the IRS had to build a webserver, the servlet application code, it had to build in fault tolerance without SQS or Kafka.

If built today, the IRS' entire system would probably run on Rails with JRuby, deployed with Docker and K8s, and be 1000x more maintainable.


50 years ago the WWW didn't exist and there was no such thing as a servlet.

The system ran on a mainframe not 50 but 60 years ago. The only high level language at the time was FORTRAN, but this system was apparently written in assembly language.

Does anyone know the type of computer it ran on? The article it linked to says IBM mainframe but which model?


> The only high level language at the time was FORTRAN, but this system was apparently written in assembly language.

No direct insight to the actual system used, but FWIW IBM mainframe assembly IMHO seems 'surprisingly' user friendly due to the large amount of standard macros/routines provided, clear/direct integration with JCL, and the minimalist ethos of the OS. Take a look at some youtube videos for some examples, it actually looks like a fairly interesting/productive, if arcane, environment. Writing clean/well architected applications in it doesn't actually look like it would necessarily be a ridiculous notion.

Of course, how well any given production system was constructed and what cruft has accumulated in it over 50y is probably another story..


They use IBM/360[1] mainframes. Bloomberg had a good story [2] some time ago that was re-run last April 17th on some of the history.

[1] https://www.washingtonexaminer.com/tech-timebomb-the-irs-is-...

[2] https://www.bloomberg.com/view/articles/2018-04-17/the-irs-c...


COBOL, ALGOL, PL/1


Not in 1958.

The first COBOL compiler was available 2 years later in 1960. Algol 58 wasn't available until 1960 either.

The ALGO (Algol 58) manual for the Bendix: http://www.piercefuller.com/collect/bendix/algo6008.pdf

The first compiler for PL/1 was delivered much later, in 1966.

You could have suggested LISP 1.0 but that appeared in 1960 too (as an interpreter, the first compiler was written in 1962).

LISP, COBOL, ALGOL: 1960 was a remarkable year in the history of programming languages.


Oh yes, 1000x maintainable with 1000x more lines in the codebase.

Speak softly into my ear, these little lies you like to tell me.


Make sure you're including all the lines, including those in libraries/frameworks you reference. Because they all count towards your MTBF.


Having not worked on large scale systems like what I imagine they must be working with, is this really true? I would have thought that a modern framework would make things easier. I've certainly had life made easier by modern web frameworks like React and Angular, although the support timeline is more on the order of 2-3 years, rather than 20-30.


Not sure, what if you have an enterprise system that uses a few hundred docker containers and at some point in 10-20 years there's a major vulnerability that can only be fixed by breaking compatibility for one function? Securing that system could be significantly harder than fixing a monolithic system.


From my experience the government is excited about kubernetes and openshift and are slowly starting to use it for prod (security approval is tricky)


If the hardware is unsupported and no longer maintainable you could run the software on an emulator. That way you don't need to know how the software works. The hardware's instruction set would be a lot less complex and almost certainly better documented.


Whenever this comes up, I’m reminded of two of my favourite calculators: the HP48 and the HP49g+. HP used a processor called the Saturn for a long time, and when they finally could no longer source it, they switched to ARM running a Saturn emulator for the ‘49. The ‘48 had battle-hardened firmware, and this was probably a great decision.

https://en.m.wikipedia.org/wiki/HP_Saturn


At least this article acknowledges a period of "parallel validation" prior to retiring the old system.

Too many legacy migration projects fail to account for the active-active nature of migration.


Code all long term business logic in magnetic core memory running on hardened valves with Nixie tube displays.


I feel like the authors of this article are being very misleading by using broader terms where they could be more specific, like using system where they could instead say software. The IRS isn't running 60 year old hardware, and the failure was in hardware that was just 18 months old. But because the IRS is running software that is 60 years old, the authors can claim the "system" dates back to the 1960s.

It's also clear the authors don't entirely understand what they're talking about. In the second article linked to, which is written by an author of this article, he states "[t]he Individual Master File, a massive application written in the antiquated and low-level Assembly programming language" as if to imply there is but one assembly language.

A GAO report I found chasing the last link in the article indicates that the two oldest systems in the Government are the IMF and Business Master File (BMF) which it says both run on IBM Hardware. My guess would be that the IRS is running modern z/Architecture mainframes in System 360 Compatibility mode. The System 360 dates back to the early 60s and IBM has maintained backward compatibility with it all the way up through z/Architecture systems. This would also explain why they're so far over budget and behind schedule.


FWIW, the IRS.gov IT docs make reference to "Assembler Language" as if it were a proper noun:

https://www.irs.gov/irm/part2/irm_02-005-003

> The scope of this directive is Servicewide. This includes software developed by contractors. Where the guidelines apply to Assembler Language, COBOL, C Language, C++ programming, and Java programming, these guidelines shall be followed respectively.


The two references in that document use it correctly, one is a document title and the other is contextual to a domain. The usage in the article is similar to saying "the foreign language" vs "foreign language". The former implies that there is only one, whereas the latter does not.


The oldest parts are not System 360, they run IBM 7074 assembly. Just FYI. The caching system is much newer, they have some Java proxy, there were posts about it the last year.


The manual for the 7070, which was similar, is here: http://www.bitsavers.org/pdf/ibm/7070/A22-7003-01_7070_Refer...

The instruction set documentation starts at page 42.


The manual... and it's less than 300 pages... oh boy, I remember the System 3090 manual, it was several metres thick :D (in several volumes, of course)


I assume it runs under an emulator by now.

Once I did manual disassembly of compiled code back into a high-level (for the time) language. It wasn't very fun, but, thankfully, the compilers of the day didn't optimize much.

It'd be a fun project.


  the compilers of the day didn't optimize much
I worked in the IBM mainframe segment to start my career, and we actually had an awesome Capex optimizing COBOL compiler. We also had a core dump formatting system (not sure if it was part of the Capex set) that made dumps much easier to read and debug, plus they broke out the generated assembler code to make it quite intuitive. I got so good with reading their dumps that going to the first realtime debuggers was slower.


In this case, it was a C compiler for MS-DOS.


> they run IBM 7074 assembly

That potentially puts them in the late 50s. That might be some of the oldest running code in the world. That's pretty cool.

I wonder if the person who wrote it is still alive?


https://archive.org/stream/irshistoricalfac00unit/irshistori... a bit broken up but it says clearly:

> January 1962 Automated data processing was officially put into operation in the IRS.

https://www.irs.gov/pub/irs-soi/62dbfullar.pdf says

> On September 19, 1961, less than a year following its establishment as a separate division, the Automatic Data Processing Division...

So the division was likely established in 1960. This https://books.google.ca/books?id=x1YESXanrgQC&pg=PA119&lpg=P... history corroborates by saying the treasury authorized the IRS to computerize fully in 1959.

They were 7074; we know that from Popular Science, http://blog.modernmechanix.com/big-brother-7074-is-watching-... , but also from some IRS docs from 2016, https://www.irs.gov/pub/irs-utl/scap-pia.pdf which say

> SCAP downloads Corporate Files On-Line (CFOL) data from the IBM mainframe at the Enterprise Computing Center, Martinsburg. The CFOL data resides in a variety of formats (packed decimal, 7074, DB2, etc.).

> That might be some of the oldest running code in the world.

https://www-03.ibm.com/ibm/history/exhibits/dpd50/dpd50_chro... says the IBM 7090 SABRE airline reservation system begins to be installed by American Airlines in 1961. That makes SABRE contemporaneous with the IRS Master Files or slightly older. I do not know, however, how much of that code is running today.


Feel free to not read articles written by non-tech people. Or, acknowledge the fact that a journalism major can only put in words what he understands of a deeply technical matter that's outside his expertise


Or, suggest that this is a low quality article, and if journalistic enterprises want to report on subject matter, they should have their articles approved by people knowledgeable in that subject matter.


Unfortunately that's true for most journalism. It's just that we techies are not usually subject-matter experts in the many other areas where they make mistakes.

Consistently high-quality in-depth reporting is the exception and is not a threshold most media will accept as a publication bottleneck.


My Dad used to say: "The Economist (the magazine) sounds authoritative until you read an article about something you are an expert in."


I still find they generally do a good job breaking down topics in a way most people can understand it. Often, being accurate also means being unnecessarily complex. If your goal is to explain research in an easy way, simplifying it will often introduce inaccuracy. That doesn't mean the writer isn't aware of it, they just try to get as many people as possible to read to the end (as this gets tracked online).


True for all journalism.


You're assuming the target audience comprises only subject matter experts. This article is perfectly OK for the 90% of the tax-paying population that works outside tech.


And maybe even better for those 90%. Introducing more technical terms (e.g. specifying the Assembler used) makes it less readable for anyone without the domain knowledge. Often, when trying to explain something to the broad public, making it inaccurate is a necessity.


I wasn't suggesting the article needed to be more technical in nature or lacked details, but rather pointing out that the improper use of a grammatical article demonstrated a lack of editorial oversight by a technical resource.

In other words, the authors are speaking with authority about something they demonstrated ignorance of by improper use of an article and they went out of their way to use language that would lead a non-technical reader to conclude the failure that occurred was a result of 60 year old code when in reality it was due to 18 month old hardware.


Or, fund education and set higher standards in education so you can rationally expect typical journalists are capable of diving into technical issues.


Yes, and many places have technical editors that help with these sorts of things.


It's general reporting by general reporters using general language for the general public. It's not a tech publication.

Joe Lunchbucket doesn't care if it's hardware that failed or software. The system failed. That's what his tax dollars pay for.


  Joe Lunchbucket doesn't care if it's hardware that failed or software
Unfortunately, Joe and Jane Doordash in Procurement are the ones making the big choices, and it would be nice to have them informed in detail by something other than the salesdroids that have been perpetuating the problems for decades.


There's no reason not to include the details. People who don't care can ignore them.


But how many details do you include? How much ram? Length of the power cords?

It is the job of the writer to write for the audience, not include every single detail, no?


Enough so that the tax payers understand the IRS isn't actually running on an IT system from 1958. That's a pretty hilariously low bar to set for a journalist to step over.


> There's no reason not to include the details. People who don't care can ignore them.

Based on my experience regularly communicating technical information to non-technical people, the above suggestion works out very poorly. Each technical detail is a hurdle for the listener, causing misunderstanding, frustration, feelings of inadequacy, and doubt about your competence as an advisor and communicator. Generally, they respond like most people do when confronted with something unpleasant; they tune out your message. My advice is to include the minimum possible technical details and to express them in the clearest possible non-technical language.


the z/whatever to host legacy mainframe applications is basically a bare metal platform, custom hypervisor OS, running system 360 VMs, right?


I'm not familiar with all of IBM's z offerings but it was my understanding that older systems were completely emulated and that there is no hardware compatibility. The emulation is completely independent of the underlying hardware platform and it allows them to retain ridiculous uptimes because of all of the hotswappable redundant systems that are able to run the systems concurrently.


>>Still, the crash forced the IRS to extend the tax filing deadline one day, delaying some 14 million submissions.

For a system that is supposedly so brittle, they were able to get enough of it back up and running to at least accept submissions within one day?

That's pretty amazing for 60 year old technology.


You just have to get the stuck card unjammed, then repunch it and recollate...


...and it was the re-collation step that took a dedicated team of professionals a full day.


To be fair, 140 million taxpayers account for a lot of punchcards.


(assuming it is not /sarcasm)

The data is not stored in punch cards, the programs are/were. The data is/was stored in tapes or in tape emulators.

Storing data on cards would make random access wayyyy too slow (the operator would have to retrieve and load the card and keep them sorted manually, which is impossible to do at scale)

On tape, however, you can predict the distance you have to rewind by computing the physical length each record takes, provided they have auto-incremented IDs. A tape can contain a lot of records, especially if it's a modern one or an array-backed emulated one. At least that's what some people I went to school with told me was how they did it.
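A rough sketch of that seek arithmetic for fixed-length records with sequential IDs (the per-record length is an invented number):

  # Hedged sketch: with fixed-length records and auto-incremented IDs, the
  # physical position of a record on tape is predictable.
  RECORD_LENGTH_INCHES = 0.5   # hypothetical: one record plus its inter-record gap

  def winding_distance(current_id, target_id, per_record=RECORD_LENGTH_INCHES):
      # positive = wind forward, negative = rewind
      return (target_id - current_id) * per_record

  print(winding_distance(current_id=120_000, target_id=80_000))   # -20000.0 inches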


I did a bit of stuff with data coming in on punchcards. You get a batch of punchcards, run it through a program (which can load data from tape or other punchcards), and a new batch of punchcards with updated data comes out on the other side, along with an impressive pile of printed invoices.

Wide adoption of tape was a lifesaver.


I doubt that physical media or tape are used beyond initial load onto disk, often directly into a database, with a single IEBGENER step. Heck, that was the case 30 years ago.


It's more like it took them an entire day to figure out how to reboot a VM instance.


A delivery date for CADE 2, the IRS’ subsequent modernization effort, has slipped several years even as contractors working on the project have earned as much as $290 million.

They call them "Beltway Bandits" for a reason.


As someone working on a modernization contract, the problems are more complicated than just contractors standing on shovels.

Building an app to a complex and ever shifting spec is a much different process than the way that startups work. You can't just release a half-done app to friends and family. You need to do the whole thing at once, with 100% of the functionality, and it has to be ready for everyone in the country to use on day 1. And the federal government has fairly onerous requirements for getting stuff through security; you can't just google for random shit from GitHub and expect to put it straight into production.

I can only speak to my particular project, but everyone on it is working hard at getting it done on time and on budget, using modern deployment and development practices, but there’s limits to how much you’re allowed to just move fast and break things.


I'm always amazed that so many are outraged by Investment Bankers and Lawyers making millions when IT consultants get so little attention. Maybe it's because few people understand what IT consultants actually do and have no alternative but to hire them, but cost overruns due to escalating IT consultancy appear much more prevalent than escalating legal costs.


Fun (scary) fact – significant portions of the IRS tax system is still written in assembly. It's basically the original codebase plus 40 years of additions and patches as the tax code changed.

They were working on projects to "elevate" it to Java, because it's pretty difficult to maintain it as-is.


> Fun (scary) fact – significant portions of the IRS tax system is still written in assembly.

Why is that scary? It has decades of real-life testing applied to it; if there were any obvious bugs they would have been found by now.

Assembly is just another language, it is slower to develop in but there are many examples of high level language issues that are at least as scary as what you could do in assembly. In the end it is all about the processes around your development as much or more than it is about what language you pick. The best way to reduce your chance of bugs is to reduce the size of your project.

https://www.mayerdan.com/ruby/2012/11/11/bugs-per-line-of-co...


But it's not something that was written once and has been used as-is for 40 years – it has constantly changed to match the tax code. The IRS has claimed that it is difficult for them to find engineers who can maintain it. After all, they didn't start a major project to port it to Java for no reason.

I also disagree with Mayer's assertions about LOC. In my professional experience, the best ways to avoid bugs are optimizing for clarity as well as continuously grooming your codebase (including modernizing when possible). Obviously the latter is not an option for safety-critical software, and probably not suitable for the tax system either. But in general, those practices have worked better for me than have others.


> <snip> if there were any obvious bugs they would have been found by now.

Really, that's what they said about CPUs - at least if you're not in the fab industry.


> Fun (scary) fact – significant portions of the IRS tax system is still written in assembly. It's basically the original codebase plus 40 years of additions and patches as the tax code changed.

> They were working on projects to "elevate" it to Java, because it's pretty difficult to maintain it as-is.

Yeah, apparently an eight person IRS development team nearly developed a replacement, but the main developers were hired on a special program to pay them at above-GSA rates, and that program expired before they could put it into production:

https://federalnewsradio.com/tom-temin-commentary/2018/01/ir...


Couldn't they open-source it? I assume finishing it would be relatively quick and a lot of comp-sci graduates would love to have the chance to play with IRS systems.


I'm guessing that would scare the crap out of the masses. Security by obscurity may not work but if there ever were a glitch or breach it sure does look bad to say "and they knew about the bug because we made our codebase public"


I don't think you could spin it that way. "The popular media can only handle ideas expressible in proto-language, not ideas requiring nested phrase-structure syntax for their exposition." Though maybe a better reason is that there doesn't seem to have been any such spin on big open source issues (such as Heartbleed) of the recent past. Why would they spin it as an open source problem rather than say a government/contractor competence problem (or something else) in the IRS case?


I mean, maintainability aside they are a giant calculator right? You don't need to be running the latest, hippest web platform to do a good job at doing a lot of math. And, will any of the current en-vogue tech still be working in 50 years?

If you had to pick, Java does make a lot of sense. Java & C are safe bets for "languages that will stick around for decades".


Having worked on financial code that is brittle and subject to far less minutiae than the multitude of IRS code, I can be very sympathetic on the complexity involved. It is hard to handle all of the complexity in a generic maintainable manner. I once ran into a 56-page printed function to calculate the margin required for a portfolio at a prime broker. Was it poorly written? Yes. Was it also representative of the complexity of the problem? Yes. Not an excuse for poor design, though.


I imagine processing the data, deciding whom to audit, and tracking the process is more complicated. Heck; they're probably using ML for audit red flags


How are they retaining staff with the skill-set to maintain that code?


They don't retain staff. But the ones who retire or leave can generally get a very large contract when things like this pop up.

Until, of course, they are all gone. At which point the system will need to be rewritten by IBM, Oracle or some other useless, gigantic consulting firm.


It's tough to recruit for, so you really end up with folks who have job security for the foreseeable future and others who become retired-in-place because of the rarity of cleared folks eligible to take on the post.


Most big banks run on COBOL, new Fintechs run on micro-services powered by Go, JS or whatever's trendy at the moment. I trust software with a proven track record of several decades with no considerable downtime more than architecture that's currently considered to be "best practice".

NB: I don't object to any of the new languages, and I've never written code in COBOL or for a mainframe. But I've also never developed software with those kinds of requirements.


Should it be written in Go, or Rust?


Because that always works so well ...

See the Chrysler Comprehensive Compensation fiasco: https://en.wikipedia.org/wiki/Chrysler_Comprehensive_Compens...

These kinds of systems are all about the exceptions and how you enable humans to deal with those, not the rules and how the computer deals with those.

The problem is that the system first needs a comprehensive test suite written. THEN you could actually think about rewriting pieces.


I have sympathy for Rust but will it be easier to find Rust developers in 2040 than COBOL developers?


Depends on the decade you ask it.


Haskell. Obviously.


Something that the kids of IT today will never learn:

Never replace anything unless the thing doing the replacing is better than the thing being replaced.

The IRS' IT "upgrade" is just a huge money grab. Nobody has the slightest interest or desire to do anything properly - they simply want in on that lucrative IRS money.

It's like the saying: "Anyone who is capable of getting themselves made President should on no account be allowed to do the job." Likewise, the kind of people who are bidding on this project are certainly not the kind of people that should ever be allowed anywhere near the IRS' computer systems.


> Never replace anything unless the thing doing the replacing is better than the thing being replaced.

The problem is 'better' is subjective. Plenty of mechanics will tell you older cars are better because they're easier to repair, don't require specialist machine diagnostics, have manual buttons / levers / knobs that are easier to troubleshoot than digital versions.

To most drivers however, newer cars are better because they're more comfortable and convenient to drive around in.

'Better' means different things to different people.


Modern cars in general get much better mileage and are much safer to drive in. (Yes, airbag shrapnel has happened.)


Only if you're driving forwards. The crash test rating arms race gave us vehicles we can't see out of; then we backed over so many kids that the federal government mandated a rear camera in everything.


I want to think that’s just consumers’ preferences shifting towards SUV and Crossover cars that necessarily have poor rear visibility. Nothing about having a good rear crumple-zone leads to restricting the size of the rear window or placement of the C-pillars.


It's the deep seating position that really kills visibility. The higher (relative to the bottom of the windows) seating position in crossovers helps drive their sales a lot.

Crossovers typically have equal or better rear visibility to the car they're based on because the higher seating position means you can look down at a steeper angle greatly reducing the maximum distance at which a kid height object is not visible to the driver.


Is it true that older cars had fewer toddlers-crushed-per-trip?


> The IRS' IT "upgrade" is just a huge money grab. Nobody has the slightest interest or desire to do anything properly - they simply want in on that lucrative IRS money.

This is factually wrong (at least for the most current effort): https://federalnewsradio.com/tom-temin-commentary/2018/01/ir...

An eight person IRS development team nearly finished building a replacement, but the main developers were hired on a special program to pay them at above-GSA rates, and that program expired before they could put it into production.


Open-source it and we'll help with the fixes.


And then we'll see how vicious the incumbent companies get. There's no FUD greater than government contractor FUD.


I say bring them out into the light. It's an excellent disinfectant.


That would be a huge exploit/privacy risk.


Right because that's what we say about the Linux kernel too, right? If IRS is the maintainer, they still review changes. And so do all of us. If someone slips in an exploit, we'll catch it the same way every other open source project does.


The kernel is an entirely different beast. The systems in question have been around three times longer, and have been closed from the start. A hardware caching issue bringing the whole thing down probably means it's closely tied into the hardware, and difficult to contribute to. Obscurity isn't security, but it may not be the best idea to pop the lid off everything at once in such a case.


Unless it's an exploitable weakness that is discovered only because the source is open to be both read and tested against, like Heartbleed.


I sincerely hope that machine is not accessible from the open internet.


How would it be a privacy risk, unless, the data files were included/embedded in the source code repo?


Heartbleed[0] is a perfect demonstration of such a real-world exploit risk.

[0] https://en.wikipedia.org/wiki/Heartbleed


Is that why Google's and Amazon's linux kernels are constantly being exploited?


Heartbleed[0], in fact, was exploitable in the wild well before Neel Mehta of Google discovered and diagnosed it.

[0] https://en.wikipedia.org/wiki/Heartbleed


Or you can continuously keep your code / systems up to date and never get wedged into a nearly impossible situation of having to rewrite 60 year old code (which likely has nothing resembling a test suite nor even a spare piece of compatible hardware to test on).


Most people would like that but think about how that sounds from a budget perspective: a bunch of pricey employees for something which seems to already be working fine, with a constant stream of political grandstanding lies about waste and overpaid staff. Most big organizations have trouble funding maintenance & operations versus new features and the government is especially subject to that because there are so many outside voices who benefit from criticism and have no vested interest in success.

The .gov pay scale has not kept pace with either inflation or IT wages and the benefits have been cut by Congress at various points so that’s less of a balance than it used to be. The GS-15 position mentioned in the IRS staffing article is the top of the non-executive scale and technical positions tend to be rare and specialized; managers are overseeing a fair amount of staff and budget. In DC, that starts at $134k and tops at $164k - not bad but DC isn’t a cheap place to live either: https://www.opm.gov/policy-data-oversight/pay-leave/salaries...


It sounds like literally any other piece of equipment you buy.


Yes, and there are plenty of examples of people asking for maintenance funding, being told to wait for a better budget year, and eventually having to pay considerably more when something blows up.

Software is especially prone to this as many people think of it as a project with a defined end date rather than an ongoing endeavor.


The people in the trenches most likely prefer that (I know I did). The problem is convincing the people higher up to spend part of a tight budget (money and manhours) every year without an easily-defined return.


This. A common answer to criticism of systems in need of replacing is that they are simply too big, too complex, and depend on too many old legacy components to allow any large changes. The real solution is to avoid getting into these scenarios in the first place, i.e. not to go decades without doing any kind of maintenance work.


> Or you can continuously keep your code / systems up to date

Easy to say as a grunt, but businesses (and governments) are run by managers and accountants.

What you describe is upgrading for upgrading's sake. If a system isn't broken, there's no point or financial reason to replace it.

Imagine trying to upgrade a system with incredibly complex data on a BILLION users that has to work 24/7, then deal with massive data influx four times a year plus catastrophic data ingestion one time a year.

And not only take in and sort those records, but validate the data and perform calculations based on rules that change almost monthly, and that sometimes must be applied to years-old data and sometimes not.

Plus, you're dealing with people's money. You don't fuck around with people's money.

It would be easier to merge Facebook and Google's databases.


> Imagine trying to upgrade a system with incredibly complex data on a BILLION users that has to work 24/7, then deal with massive data influx four times a year plus catastrophic data ingestion one time a year.

I think you're over-estimating the amount of data involved. There were about 150M individual tax returns and ~5.8M corporate returns in 2015. Even if we assume 50 years of storage in the 'active' dataset that is < 8B returns. Also, most returns don't contain a particularly large amount of data in them. Pretty sure we're talking about an in-memory scale dataset here, certainly for the active set in any given year. It actually would not surprise me if we (current co) ingest more data in a week than the entirety of the US tax return history.
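Quick sanity check on that estimate (the return counts are the ones quoted above):

  individual_returns = 150_000_000
  corporate_returns = 5_800_000
  years_retained = 50
  print((individual_returns + corporate_returns) * years_retained)  # 7,790,000,000 -- under 8B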

> And not only take in and sort those records, but validate the data and perform calculations based on rules that change almost monthly, and that sometimes must be applied to years-old data and sometimes not.

I might believe that is an issue if the IRS actually calculated everyones taxes for them and sent out a 'do you agree' form, but they don't. It's more like they randomly sample some returns based on some hand picked 'warning' flags and have someone manually check them, maybe with the help of some excel formulae.


>I think you're over-estimating the amount of data involved.

The one billion figure was from the article.


Regarding the latter point: one of the goals of modernizing the system is that it should give the IRS the capability to implement a modern fraud detection system, or calculate your taxes for you (if Intuit ever lets that happen.)


Managers and accountants can handle the idea that a car's oil and tires need to be continuously replaced to keep the car in working order. They can handle the idea that airplanes need mechanics to go over them before every flight. They can handle the idea that certain equipment has a lifespan of X years before it needs to be replaced. There is nothing in management or accounting that is fundamentally opposed to the idea that something needs to be continuously maintained. It is simple incompetence by those in charge.


Imagine if you were an airline that didn't bother to maintain planes.

When you build software you should plan to maintain it. Bug fixes, features, upgrades, etc...


Imagine if you were an airline that paid your mechanics absolute bottom dollar and gave them incredibly silly restrictions based on the fears of non-tech people. Imagine if you made them sit in countless meetings while people with more years at the airline made decisions that the mechanics KNEW were wrong. That's how the government does IT.


>Imagine if you were an airline that didn't bother to maintain planes.

A better analogy would be if you were an airline and kept upgrading airplanes before they were needed.

Airplanes (especially private ones) fly for decades and decades.

No one is suggesting that the IRS systems weren't maintained. Obviously they were since they exist on virtualized hardware.


Airplanes get upgraded all the time - fresh seats, newer gps tech, etc.

Software systems like this get "maintenance" in the bare minimum sense - think a new battery after the original is dead.


Exactly. The government also ensures that if an issue is found with an airplane The Shit Hits the Fan and multiple parties are heavily incentivized to fix it. We need a similar approach to critical IT infrastructure.


Small development teams have a hard enough time with that. I can't imagine that is even remotely possible within the bureaucratic framework of a government agency.


Is the 60-year old assembly code updated as the tax laws change? I guess it must be.

Before embarking on a re-write, how about we create a simplified tax code (laws)? Not sure which task is tougher.


>"Anyone who is capable of getting themselves made President should on no account be allowed to do the job."

... which is a very good example of Catch 22 logic:

https://en.wikipedia.org/wiki/Catch-22_(logic)

I personally prefer Groucho Marx :

https://quoteinvestigator.com/2011/04/18/groucho-resigns/

“I don’t want to belong to any club that would accept me as one of its members.”


Having this stuff digitized is actually pretty nice. The German ELSTER is basically PDF forms on steroids; same tax forms you presumably used to fill out by hand but with input formatting, validation, basic sanity checks and it estimates what you'll pay / get back at the end of it all. Then you can even submit it online if you have registered for a private key.

Of course the tax forms themselves are still insane.


All programs by the government are ways to employ people. They can be effective or ineffective, but it still makes whoever is employed happy to vote for whoever made the job up.


> Something that the kids of IT today will never learn

Why, because you're busy being condescending to teach?

Software should be considered a system that needs ongoing care and maintenance, just like anything else.


Why wouldn't "the kids of IT today" ever learn that?


Some of them will, by which point they won't be kids.


Yes, because nobody who's old now was ever a dumb kid.


What part is complicated when transitioning such a system?

Legacy data which must be transformed?

Old business logic which must be maintained? I do not know if there is a need to replay old rules vs. just keeping a snapshot.

Continuity? Maybe there are so many transactions happening, linked to existing data that an interruption is complicated?

Risk of accessing old data which is better left undisturbed?


'The Individual Master File contains data from 1 billion taxpayer accounts dating back several decades and is the chief IRS application responsible for receiving 100 million Americans’ individual taxpayer data and dispensing refunds. '

I hope they bought a new hard drive to back it up...


don't know about you, but I'm hoping they didn't.


What kind of hardware was it running on? The article is not very specific about that.

(At first I thought that they upgraded IBM mainframe to z14, and that caused the crash, but it's not yet 60 years old.)


While the code is 60 years old, the hardware is modern. According to this article, the component that failed is about 18 months old. When they talk about the software and hardware "no longer being supported", I think they really mean that the original target platforms are only present in the form of emulated software. IBM still supports execution of System/360 software on the System z. [1]

"Overall IRS maintains over 20 million lines of assembly code. These millions of lines of archaic software, and hardware, that is no longer supported become more difficult and costly to maintain each year, and poses significant cybersecurity risks. To IRS’ credit it keeps these old systems running during filing season, but relying on these antiquated systems for our nation’s primary source of revenue is highly risky, meaning that the chance of having a failure during the filing season is continually increasing." - David Powner, director of IT management issues for the Government Accountability Office [0]

"[IRS Chief Information Officer Gina Garza] explained the code for the IMF was written in the 1960s, but the hardware it runs on is modern." [0]

[0] https://federalnewsradio.com/technology-main/2017/10/a-sense...

[1] https://arstechnica.com/information-technology/2014/04/50-ye...


They use very broad terms to imply it was 60 year old hardware that failed but early on in the article state it was an 18 month old piece of hardware in a "system" that dates back to the 60s. If you dig into some of the other links and a GAO report you learn that there's two programs, the IMF and BMF, that are partially written in assembly that dates back to the 60s.

They're most likely running some z/Architecture in System 360 virtualization mode.


So they need something proven, scalable, highly concurrent, fault tolerant, and runs in a VM so it's not tied to hardware.

Sounds familiar.


Be thankful we're not getting all the government we're paying for. --Will Rogers

Source: https://www.brainyquote.com/quotes/will_rogers_128073


Can the USDS assist here?


The last time I talked to somebody at USDS, maybe two years ago, they were already helping. I think the biggest task was trying to figure out getting off tape and moving to normal hard drive storage. Apparently this is hard because the really old programs are the OS and file system. Without the details, it sounds to me like this is what broke, or something around that.


It's probably easier to just scrap the income tax than to fix the system. Maybe go to a land tax instead.


>[...] date back to 1960, when John F. Kennedy was president

No he wasn't, so my confidence starts flagging.


When they can write AI to understand and implement the entire US tax system, then I'll start worrying about the incoming AI apocalypse they're always talking about.


More than likely, it will shoot back a question: "Why the hell did you make this so complex?"


"A strange game. The only winning move is not to file"


If only it were that simple...

Not filing or, worse, making an error in your filing makes it a really, really nasty game. Not recommended.

I'll even break my own rule: "Not filing considered harmful"


So that every congressman can give their favorite d̶o̶n̶o̶r̶s̶ constituents their very own tax loophole.


Answer: "42."


That's assuming the poor thing doesn't choose to immediately self terminate.


Next time they do this they might want to try testing out the new system before putting it into production.


> Since Republicans gained control of the House of Representatives in 2010, their partisan attacks have left the IRS with nearly 10,000 fewer customer service representatives to assist taxpayers and a patchwork of IT systems, some dating back to the Kennedy Administration, which is ultimately harming all taxpayers,” Rep. Gerry Connolly, D-Va., told Nextgov.

So the patchwork 1960s-era systems are the fault of Republicans in the current decade?

Maybe "starve the beast" really is a better approach. Nothing else has worked.


I worked next to a team in the late 80's where one service center prototyped a wage reporting program that brought in >$150m the first year. They asked for $2M to expand it nationwide. Refused until Clinton came in.

It's easy to throw stones but I doubt anything built today by anyone on HN will rival problems as complex as DEWS, ATC, IMF/BMF or Sabre and be around in ten years, let alone 60.


>So the patchwork 1960s-era systems

You have (purposely?) mis-stated the previous statement quoted above. A patchwork of systems that includes some systems dating back to the Kennedy Administration is not the same as a patchwork of 1960s-era systems. Could it be that the "starve the beast" mentality hindered or blocked a number of planned migrations/upgrades?


If you look again, you will see the charge is that there are now 10k fewer first-line support workers who previously could help citizens, plus (for some reason) the legacy hardware.



Their customer service reps sucked long before 2010 and have never remotely "assisted" me.


My experience differs from yours. I have used IRS customer service five different times over the last two decades and each time they have been courteous, efficient, and helpful.


The real question is why we have a tax system that would need 10,000 additional customer support reps. The problem isn’t a lack of reps, it’s a system that requires that many reps.

The flat tax, file-on-a-postcard system proposed by 2016 presidential candidate Ted Cruz was mocked by the establishment because it seemed that politicians on both sides have a vested interest in a complex tax code — it is the means by which politicians derive their power. See Milton Friedman on the subject:

https://youtu.be/TruCIPy79w8


You can stay away from regressive schemes like a flat tax, and have a tax system with a rubberstamp level of UX. That's how it works in most of Europe and could work here. The IRS already has all of the information needed outside of gambling and stocks, and could simply send you forms already filled out for you to sign. H&R Block and TurboTax simply heavily lobby to keep this from happening to keep their business model around.


What's interesting about flat taxes is that they are hard to avoid at the highest levels, where complex tax structures otherwise produce very variable effective rates. Much like how corporate taxation is nominally 35%, but Google or Facebook pay only 11%.

Milton Friedman used to say that a flat tax of 15% would be enough to replace the entire scheme of income tax. There is also one major boon: lots of capital is spent on tax compliance, from accountants to lawyers to governmental organizations, etc.

An effective 15% rate without exemptions would fly very well with the right (simple flat taxes) and the left (no exemptions), but as MF said, the political stances on this topic don't trust each other, and they are both right: even if we moved to such a system, the right would still advocate for exemptions and the left for increases in taxation.

It's a noticeable deadlock.

I also want to mention that flat taxes are not regressive by definition. The progressiveness or regressiveness in that case requires looking at where the taxes are spent.


First off, the difference between regressive and progressive says nothing about how the taxes are spent. A flat tax is regressive by definition.

Secondly, I'm not a big fan of how you (and quite a few economists) conflate personal and corporate income taxes. It really muddies the water as even questions like "what even is income" is different between the two.

Additionally, flat tax doesn't go well with the left. On the corporate end, exemptions in a lot of ways are the carrot to the sticks of fines and penalties that we use to regulate business (encouraging positive behavior). On the personal end, you greatly increase the tax burden on those least able to bear it, with an increase in benefits being totally orthogonal (most flat tax proposals intend to be revenue neutral). Someone making $25k nearly doubles their tax burden with a 15% flat tax.


> Secondly, I'm not a big fan of how you (and quite a few economists) conflate personal and corporate income taxes. It really muddies the water as even questions like "what even is income" is different between the two.

It was an example of how the nominal tax rate is not the same as the effective tax rate. For example, mortgage interest deductions are a regressive benefit that lowers the effective tax rate. Unfortunately, we cannot get good data on what the effective tax rate is because it is protected information.

> On the personal end, you greatly increase the tax burden on those least able to bear it, with an increase in benefits being totally orthogonal (most flat tax proposals intend to be revenue neutral)

That depends on what you cut. And that's why I say that you need to look at both the revenue and the expenditures to know if taxes are regressive or not.

For example, Federal income tax is nominally progressive, but California has plenty of regressive taxes: sales taxes and property taxes that favor older tenants. In general, state and city taxes are regressive, because those are the easiest to levy.


Once again, progressive vs regressive has nothing to do with how much you get back in services. That's literally not a component of the definition. You might be confused because deductions are a component of how regressive a tax is.


It is not economics to look only at how a tax is levied and not at how it is spent. The combination is a must to actually make a meaningful argument about how taxes should be structured.


It is absolutely a component of economics, just not the only piece needed for a comprehensive tax policy.

You're literally arguing against the accepted definition of a term.


Technically, a flat tax by definition is neither regressive nor progressive; it is proportional. Federal income taxes, and I presume corporate taxes, are currently progressive (with a million and one deductions). Moving to a flat tax for either of those would be a move in the regressive direction, but it's still not technically regressive.

Personally, I'm fine with taxes having progressive rates but not the countless deductions.


Technically it is regressive. The flat amount comes out of rent and food on the low end of your tax base, and luxury on the high end.


That's sales taxes, but a flat income tax would not be like that.


Uhhh, yes it's exactly like that, regardless of what is getting taxed.

Let's say today I make $24,900/yr. That puts me at the edge of the 20th percentile for 2017. The average effective tax rate for people of my income was -4.7% (or a tax of -$1170)[0]. You're suggesting increasing my taxes to 15%, taking $4905 out of my pocket as compared to today. Now, since I make next to nothing, pretty much everything I make is going to rent, food, and just generally getting by, and I have nearly $5k less real money (or ~20% of my income!!!) in my bank account to make that happen.

[0] - http://www.taxpolicycenter.org/model-estimates/baseline-aver...
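
A minimal sketch of the arithmetic in the comment above, assuming the -4.7% effective rate cited from the Tax Policy Center and the proposed 15% flat rate (illustrative figures from the comment, not official numbers):

    # Rough comparison for the scenario above: a -4.7% effective rate
    # today vs. a hypothetical 15% flat tax on $24,900 of income.
    income = 24_900
    current_effective_rate = -0.047   # effective rate cited in the comment above
    flat_rate = 0.15                  # the proposed flat rate

    current_tax = income * current_effective_rate   # about -$1,170 (a net credit)
    flat_tax = income * flat_rate                    # $3,735

    delta = flat_tax - current_tax                   # about $4,905 more paid
    print(f"current: {current_tax:,.0f}  flat: {flat_tax:,.0f}  "
          f"difference: {delta:,.0f} ({delta / income:.1%} of income)")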


Hah. You are making the case that a flat tax is regressive because current income taxes are progressive. So moving from progressive to flat is regressive.

It's a tautological affirmation that progressive taxes are progressive. A flat tax would make someone earning $24,900 pay $3,735, and if you made $249,000 you would pay $37,350; your taxes increase in exact proportion to your income. It's just... flat.


No, I'm making the argument that a flat tax is regressive because the utilitarian value of a dollar isn't anywhere near linear with total income.

With a flat tax:

* On the high end you're taking from ability to have luxuries.

* On the low end you're taking from the ability to pay rent and put enough food on your table for your children.


After criticizing my argument that looking at how taxes are spent is necessary to decide whether they are progressive or regressive, you now propose an argument that would make progressive taxes also regressive, unless they are enough to compensate for the utilitarian difference in how the poorer and the richer spend their own wealth.

The utilitarian argument is, for me, one of the most practical and also one of the scariest. Are you ready to exploit the poor if it were proven economically profitable to do so? I digress. My point is that seeing flat taxes as regressive is either false or inconclusive, regardless of any desire for flat, regressive, or progressive taxation.


Flat taxes are regressive. Period. The reason is that a bigger proportion of the income of the poor must be spent on necessities, so a flat tax rate is going to impact their ability to pay for necessities much more than the same tax rate paid by a rich person.


You are confusing flat taxes with sales taxes. A flat income tax would not have that effect.


Regressive doesn't refer to how the taxes are spent. Regressive refers to the fact that 15 cents out of every dollar matters a whole lot more to someone on minimum wage living paycheck to paycheck. Regressive refers to the fact that a true "no exemptions" tax hits the poor harder than the rich. So the left has to set aside exemptions for the poor, and that leaves room open for lobbyists to drive trucks full of money into adding more and more exemptions.


A 5% tax levied on everyone to fund a Lamborghini subsidy policy is regressive. It is important to look at both sides of the equation to reach a conclusion.

That's why you can have a flat tax and a progressive government, by spending on the poorer while levying flat. In fact, a flat tax is probably progressive in effect, simply because the rich don't use the multitude of public services the average poor person does, like schooling or healthcare.


Regressive is not the opposite of Progressive in tax politics. You may be confusing "conservative" here?

The commonly accepted denotation:

> (of a tax) taking a proportionally greater amount from those on lower incomes

Has absolutely nothing to do with spending, and everything to do with how much you are proportionally taking.
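
To make that denotation concrete, here is a small sketch (an informal formalization, not an official definition) that classifies a schedule by whether the effective rate falls, rises, or stays constant as income grows:

    # Classify a tax schedule by how its effective rate moves with income.
    # "tax" is any function mapping income to tax owed; purely illustrative.
    def classify(tax, low_income, high_income):
        low_rate = tax(low_income) / low_income
        high_rate = tax(high_income) / high_income
        if high_rate < low_rate:
            return "regressive"      # takes proportionally more from lower incomes
        if high_rate > low_rate:
            return "progressive"
        return "proportional (flat)"

    flat_15 = lambda income: 0.15 * income
    print(classify(flat_15, 25_000, 250_000))   # -> proportional (flat)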


Again, it is not economics to ignore how it is spent, because the spending can be progressive or regressive.

You can have super progressive taxation with wealth distribution from the poor to the rich, and super regressive taxation that distributes from the rich to the poor.


> The IRS already has all of the information needed outside of gambling and stocks,

With the reporting changes of a few years back, for any stocks acquired after the changes took effect, they already have all the information needed for stocks as well. All that is missing is the cost-basis method you wish to use (FIFO, LIFO, average cost, etc.) when you sell the stock, if you don't sell all of it off at once.
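
A minimal sketch of why the cost-basis method still matters for a partial sale; the lots, share counts, and prices here are made up purely for illustration:

    # Illustrative only: two purchase lots, then a partial sale of 10 shares at $50.
    # The cost-basis method (FIFO vs. average cost) changes the taxable gain.
    lots = [(10, 20.0), (10, 40.0)]   # (shares, purchase price per share)
    shares_sold, sale_price = 10, 50.0

    # FIFO: the oldest shares are treated as sold first.
    fifo_basis = 0.0
    remaining = shares_sold
    for shares, price in lots:
        take = min(shares, remaining)
        fifo_basis += take * price
        remaining -= take
        if remaining == 0:
            break

    # Average cost: basis is the average purchase price across all lots.
    total_shares = sum(s for s, _ in lots)
    avg_basis = shares_sold * sum(s * p for s, p in lots) / total_shares

    proceeds = shares_sold * sale_price
    print("FIFO gain:", proceeds - fifo_basis)         # 500 - 200 = 300
    print("Average-cost gain:", proceeds - avg_basis)  # 500 - 300 = 200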


File on a postcard. Sure. This year, I could have used the 1040EZ (the entirety of which could fit on a postcard if you removed the embedded instructions), but that would have left some money on the table, so I chose the more complicated route. Last year, I just used the 1040EZ. Flat tax is a completely different issue.


It is weird to see the argument that making tax filing hard builds power. For it to work, you have to dismiss the intuitive idea that making tax filing simpler will make citizens happy and less likely to complain, since they have been spared some hassle.



