According to the paper, the dataset goes up to 1978 because that's when copyright law was updated to automatically apply to newswires. It's unfortunate that we got into the situation where academia has to play by the rules wrt copyright while big private labs flout it.
This is, unfortunately, how our (US) justice system works.
The entire concept of an "NDA" has been bastardized in this manner, if you think about it. Conceptually you may think of an NDA as protecting sensitive data from disclosure, a sort of intellectual property right. However, it's been co-opted by folks to do nothing more than cover up inconvenient truths because they realize most people cannot afford to either (a) give up money they are promised in the future or (b) bankrupt themselves in their own defense.
So it's basically a game where whoever has the most money can ensure their narrative wins out in the end, because competing narratives can simply be "bought out".