
It's not about storing XML, it's (as far as I understand the patent) about a specific representation of XML that can be more efficient to read.

The patent is about representing marked-up documents (XML or otherwise) not with the tags embedded in the text, but with the tags stripped out and maintained as a separate list of (tag, position) pairs, so that the document itself contains only the raw text.

I'm only surprised that Microsoft couldn't find prior art, because having a (content-type, address) index at the beginning of a file is not exactly an unusual representation. It also reminds me that the USPTO's idiosyncratic usage of non-obviousness doesn't really match my intuition.



This is a huge issue with the patent world in general. There's just so much prior art out there, and you have to be really clear about showing that it applies. This isn't a patent case, but I have a great Google Maps case involving Wi-Fi where a judge completely borked it. As for this particular patent, I'm not enough of an XML expert to say whether the court got it right here. But it is worth noting that Microsoft tried to invalidate the patent several times with the USPTO and failed to do so there as well. So perhaps there's something more to the patent than meets the eye, or it was novel at that time even if it doesn't look novel next to modern XML. Remember, the actual i4i patent at issue was filed in 1994, and it only matters if there was prior art from before 1994. It might have been novel at the time.


> Remember, the actual i4i patent at issue was filed in 1994, and it only matters if there was prior art from before 1994. It might have been novel at the time.

I am aware of the date of the "invention". I was programming on 8- and 16-bit computers in the 1980s and I was using this and similar kinds of formats for non-textual data, simply because it was easier to do this in assembler than writing a parser, paired with the difficulty of finding unused special bytes in binary data to separate meta-information from the data proper.
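For illustration, a minimal Python sketch of that kind of layout, not the original assembler. The field widths and the names ENTRY, pack, and unpack are assumptions made up here: an index of (content-type, offset, length) entries sits at the start of the file, so the payload can contain any byte value without needing a separator.

```python
import struct

# Hypothetical index entry: content type (1 byte), offset and length (2 bytes each).
ENTRY = struct.Struct("<BHH")

def pack(chunks):
    # chunks: list of (content_type, payload_bytes)
    header_size = 1 + ENTRY.size * len(chunks)  # 1 byte for the entry count
    index = bytearray([len(chunks)])
    body = bytearray()
    for ctype, payload in chunks:
        # Offsets are absolute positions in the final blob.
        index += ENTRY.pack(ctype, header_size + len(body), len(payload))
        body += payload
    return bytes(index + body)

def unpack(blob):
    # Read the index, then slice each payload out by offset and length.
    chunks = []
    for n in range(blob[0]):
        ctype, offset, length = ENTRY.unpack_from(blob, 1 + n * ENTRY.size)
        chunks.append((ctype, blob[offset:offset + length]))
    return chunks
```

No parser and no escape bytes are needed: the reader just follows the offsets.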

And I was also talking about non-obviousness, not novelty.


Fair enough. I haven’t seen the invalidation proceedings and am clearly less of an expert than you. So don’t know whether they got it right. Non-obviousness is, erm, non-obvious.


Am I right to understand that it would be the equivalent of Visual Studio's WPF designer [1], where you have the WYSIWYG editor side by side with an XML editor, and a change made in either one is translated into the other?

If it is, it would have been really really cool.

[1] https://i.stack.imgur.com/8pJnn.png


No. It's more like what the following piece of code produces:

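  import re

  def convert(xml):
      # Split on tags; the capturing group keeps each tag in the result,
      # so text sits at even indices and tags at odd indices.
      parsed = re.split(r"(<.+?>)", xml)
      output = parsed[0]
      tags_with_pos = []
      for i in range(1, len(parsed), 2):
          # Record each tag with its offset into the tag-free text.
          tags_with_pos.append((parsed[i], len(output)))
          output += parsed[i+1]
      return tags_with_pos, output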
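
Going the other way is just as short. A rough sketch of the inverse, assuming the (tag, position) list is sorted by position the way convert above produces it; restore is a name made up here for illustration, not something taken from the patent:

```python
def restore(tags_with_pos, text):
    # Hypothetical inverse of convert(): splice each tag back into the
    # raw text at its recorded offset. Assumes the list is sorted by position.
    pieces = []
    last = 0
    for tag, pos in tags_with_pos:
        pieces.append(text[last:pos])  # raw text up to this tag
        pieces.append(tag)
        last = pos
    pieces.append(text[last:])  # remaining text after the final tag
    return "".join(pieces)
```

For example, restore([("<b>", 0), ("</b>", 2)], "hi there") gives back "<b>hi</b> there".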


> the USPTO's idiosyncratic usage of non-obviousness doesn't really match my intuition

Remember that the USPTO gets paid for each patent application and is not penalised when a patent is later invalidated.


Well, it was apparently upheld twice on reexamination, where they could have fixed that. The problem is more that the bar for non-obviousness is so low that it's basically on the floor. Paired with a discipline (software development) where independent reinvention is common, this is just a recipe for disaster.


Everyone knows 1+1=2, so why did Russell spend many many pages/hours on a proof, surely if people know it then it's easy to demonstrate? /s

Programmers are notoriously good at documenting everything after all. /s

It's easy to give documentary evidence for things someone found self-evident and so only wrote a scribbled note about in a workbook 40 years ago. /s

FWIW patent law obviousness is not the same thing as ordinary notions of obviousness either.

All my personal opinion, ofc.



