Powerset?
18 points by anaphoric on Oct 13, 2007 | 28 comments
Does anyone know what's going on with Powerset? I tried to join their Powerlabs program to get a closer look, but so far I haven't heard anything. Anyone know when they are going to really launch?


I don't see what they know that Google doesn't. Google certainly has no shortage of people familiar with symbolic natural language processing methods. It's not like Google hasn't read the major papers and books in that field. If the Xerox technology Powerset licensed has been published anywhere, Peter Norvig and lots of other people there have almost certainly read it very carefully.


You can take what you said and apply it to Google and its published PageRank, or to the established earlier players.

As long as there is a lot of money in search, there will be a lot of people competing for it, and people funding them. At the very least it will keep Google on its toes, which is good.


PageRank is published, but patented.


speaking of which, yahoo's latest additions to its search are pretty impressive, i must say.

their front page is still a bit messy for my taste, but they've really upped the ante for google in ways that no other search engine i've seen thus far has.


search.yahoo.com is the equivalent of www.google.com.

www.yahoo.com is a portal page with a search box.


Anaphoric: Our consumer web search product is still some time away, but we have launched Powerlabs to a small group of users and we are inviting more every week. I'll see if I can get you in to the next batch.


Thanks and sorry about the public grousing to get myself invited :-).

I understand that you guys are taking on a profound and very important challenge. And I am very much wishing the venture success. I think that NLIs eventually will become a dominant interface solution for many tasks, search included. I look forward to looking into what you guys have.

BTW are you doing anything in domain specific search? Feel free to reply at mjm@anaphoric.com or mjm@cs.umu.se.

Regards, MM


We are demoing some domain specific use cases in Labs, but the goal is ultimately an open box search engine at web scale.


Yes it was my impression that it aimed at the open domain. BTW are you doing any dialogue modeling?

I asked about closed-domain work because that's what I am working on. See http://www.youtube.com/watch?v=fWio8bHq4wQ


Can you email me at schen (at my company's domain) so I can send you an invite?


can you get me into your next batch as well? Thanks.


Can you email me at schen (at my company's domain) so I can send you an invite?


After they finish porting it to Hurd.


They apparently use Ruby for their front end and Rails for their internal tools. I am a little curious as to how that turned out. I haven't seen any blog posts recently.

With all due respect, I doubt if they have the brain power to create something to dethrone Google. Google has some incredible people working on search.

A couple of years ago, when the Yahoo vs. Google war for search dominance was at its peak, I asked a friend of mine who worked at Yahoo, "Who is your equivalent of Peter Norvig?" (Norvig was then Google's Director of Search Quality; today he is Director of Research.) After some hemming and hawing he told me, "Well, we don't have anyone like that, but then we are a media company, not a search company." ("We are a media company" was the mantra Terry Semel was repeating at Yahoo then.) I knew then that Yahoo would never beat Google in search.

I wonder what the Powerset guys tell themselves? I find the internal mythologies of companies fascinating.


From what I understand from the presentation given at the Singularity Summit by Barney Pell, Powerset's NLP technology was developed over 30 years by Xerox. (You can listen to the audio at http://www.singinst.org/media/singularitysummit2007 ). That could conceivably give them an edge, or at least a short term lead. :o)


Honestly it may be somewhat heretical to say it, but I doubt very seriously that an old Xerox patent really has so much relevance.

The academic community has been working with semantic indexes for quite some time. I know many of those involved. The real question is whether they have developed something fundamental recently.

As for performance, yes, perhaps they have some innovations. But with NLIs it's usability that matters most in the end.

At a more technical level, I think the question is how expressive/consistent the logical form they are mapping to is. If they have developed a parser that maps to an LF that can support actual inference, then that would mean something. But then they need an open-domain strategy to actually reason over such expressions in a meaningful/useful way. We will see.
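To make the "mapping to a logical form" idea concrete, here is a deliberately tiny sketch. Everything in it is hypothetical (the fact base, the single question pattern): it just shows the pipeline the comment describes, question → logical form → lookup, with none of the coverage a real parser would need.

```python
# Toy sketch: map a question to a crude "logical form" (a relation
# pattern) and answer it against a tiny hand-built fact base.
# Entirely hypothetical; a real NLP parser produces far richer LFs.
import re

FACTS = {
    ("founded", "Xerox"): "Joseph C. Wilson",
    ("founded", "Google"): "Larry Page and Sergey Brin",
}

def to_logical_form(question):
    """Map 'Who founded X?' to the pattern ('founded', X)."""
    m = re.match(r"Who founded (.+)\?", question)
    if m:
        return ("founded", m.group(1))
    return None  # question doesn't match any known pattern

def answer(question):
    lf = to_logical_form(question)
    if lf is None:
        return None
    return FACTS.get(lf)

print(answer("Who founded Google?"))  # Larry Page and Sergey Brin
```

The interesting part, as the comment notes, is what happens after the mapping: with only exact lookups like this, no real inference is possible.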


To me, the idea that they went back to a thirty-year-old algorithm is the one credible thing in the whole story. Almost nobody reads old papers in computer science.

Somewhere out there in the libraries are the computing equivalents of transparent aluminum, but you can barely get researchers to look at this stuff, let alone Joe Javahead.


Good point. And actually, from another post in this thread it sounds like it is more than a patent/algorithm; it is a painstakingly built system. So yes, perhaps they have something. We will see.

As for your comment about researchers not being aware of the literature, I agree 100%. I review from time to time, and the number of papers reinventing the wheel (and doing so in a sloppy way) is staggering. I think the problem is that too many researchers are concerned with racking up as many papers as possible to beat the tenure clock and/or impress their rivals.

Evaluators somehow need to stop bean counting publications as a measure of merit. The problem they face is they don't know how else to evaluate...


I'm sure that many interesting software gems were abandoned because the hardware was not powerful enough at the time.

Plus, when computer science was new, a lot of crazy stuff was being researched; nowadays the academic research seems pretty close-minded.


Not to start an argument (you(==bsg) make a good point), but one should be very careful in judging a technology as an "edge providing" one just because it comes from PARC.

PARC has certainly done brilliant, pioneering work in many areas. What most people miss is that PARC also had many deadwood/unsuccessful projects (and people, and worthless papers) in its time.

I know nothing about what exactly PARC licensed to Powerset, but without more data, I wouldn't automatically assume that it provides an edge.


I was thinking principally of the sheer number of man years that have been put into the system. I know that's not a very accurate yardstick, hence the smiley. :o)

My understanding is that the technology Powerset got from Xerox is a production quality, multiple language parsing system with a language-independent core, with the long development time being due to the engineers being allowed to work on the problem until they had solved it to their satisfaction.

Of course, all of my information comes from the above-mentioned presentation, so a large pinch of salt is almost certainly required. :o)


An important point to remember about Artificial Intelligence technology: Stanford's Stanley robot, which won the DARPA Grand Challenge by driving across a desert course, was based on a conceptual revolution that started 20 years ago, in 1987: the Bayesian revolution. It wasn't based on a big new idea; it was based on a big two-decades-old idea that took that long to finally get right.

The latest cool robots and such that the media wants to report on are generally based on decades-old AI ideas, and the newest, most brilliant AI ideas of today may not yield impressive technology for years to come.

If PARC developed Powerset's ideas 30 years ago, that might just be par for the course.


Powerset is simply a bunch of hype and money thrown out the window. You could build a much better engine overnight using the Yahoo Answers and AnswerBag APIs.


Yes I too am skeptical. I work in the area of natural language interfaces and when I hear their claims I cringe.

My own feeling is that NLIs can be useful, but really only in closed domains. In fact my own efforts are toward NLIs to relational databases.

If I understand their approach, they are building semantic indices over large sets of documents (e.g. Wikipedia). Sure, you can match the user's query using these indices, but the inference thereafter must certainly be very weak. Any functionality this gives could probably be achieved with simple IR-based techniques.

I am curious how this will all go down once they launch a public interface. Still, I will be pleased if they actually show something of value. Time will tell.
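As a toy illustration of the "semantic index" idea being discussed (this is not Powerset's actual method, just a minimal sketch under naive assumptions): instead of a bag-of-words index, documents are indexed by crude subject-verb-object triples, and a query triple is matched against them.

```python
# Toy sketch of a "semantic index": index documents by crude
# subject-verb-object triples rather than bags of words.
# Purely illustrative; a real system would extract triples with a
# full parser, not by splitting on whitespace.
def extract_triple(sentence):
    """Naively treat the first three words as (subject, verb, object)."""
    words = sentence.rstrip(".").split()
    return tuple(words[:3]) if len(words) >= 3 else None

def build_index(documents):
    index = {}
    for doc_id, text in documents.items():
        for sentence in text.split("."):
            triple = extract_triple(sentence.strip())
            if triple:
                index.setdefault(triple, []).append(doc_id)
    return index

docs = {
    "d1": "Dolphins eat fish. Fish avoid dolphins.",
    "d2": "Cats eat fish.",
}
index = build_index(docs)
print(index[("Dolphins", "eat", "fish")])  # ['d1']
```

The sketch makes the commenter's point visible: matching a query against such an index is easy, but nothing here supports any inference beyond exact triple lookup.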


Agreed. Think about it: if they can answer any naturally posed question, are they not Turing test material? This rant, which I linked to a while ago in another post, sums up the situation: http://blog.searchenginewatch.com/blog/061005-095006


Oh, and another thing: I was very polite and considerate when I requested to become a member of their Powerlabs. Four weeks on now, still no response. I consider that a bad sign.


How to build a natural language search engine: Yahoo Answers API + AnswerBag API + index all FAQ webpages + index Yelp and similar Q&A sites. That is it. Then, when you get bought for that secret underlying technology, send me a check. Matter of fact, don't: send it to an educational charity in Africa or the $100 laptop project. Edit: I know this because I once got one of my co-workers to do it for me using only the Yahoo Answers API, and believe it, everyone who tried it loved it, although it was more fun than helpful.
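The recipe above can be sketched in a few lines. This toy version skips the (now-defunct) Yahoo Answers and AnswerBag APIs entirely and stands in an in-memory list of question/answer pairs for whatever those APIs would have returned; ranking is plain word overlap, the simplest possible similarity measure.

```python
# Toy sketch of the "aggregate existing Q&A sites" recipe: answer a
# new question by finding the most similar already-answered question.
# The qa_pairs list stands in for results from the Q&A site APIs.
def tokenize(text):
    return set(text.lower().rstrip("?").split())

def best_answer(question, qa_pairs):
    """Return the answer whose question shares the most words with ours."""
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(q)), a) for q, a in qa_pairs]
    score, answer = max(scored)
    return answer if score > 0 else None  # None when nothing overlaps

qa_pairs = [
    ("How do I boil an egg?", "Simmer it for about ten minutes."),
    ("What is the capital of France?", "Paris."),
]
print(best_answer("capital of France?", qa_pairs))  # Paris.
```

Which is roughly why the commenter calls it "more fun than helpful": word overlap against someone else's answered questions gets surprisingly far, but there is no understanding anywhere in it.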




