
> is their project a spam site or not?

To me that depends a LOT on how they present the advertisements and what they do on pages where they don't have information for a product. My biggest complaint with so-called review sites is that they present more advertisements than content. They also tend to have automatically generated pages for every model number you can imagine, including incorrect ones that they receive searches for. On those pages there tend to be links to shopping sites and prices for unrelated appliances and products. I absolutely consider that to be spam because those pages are content-free.

The problem with Google searches lately, and especially for things like appliances, is that the spammers and content mills are clearly winning. In my current search for a washer and dryer the manufacturer's page was on page four or five of the results. The first few pages were flooded with bogus content pages, sale pages and unrelated pages. Filtering out shopping sites, explicitly targeting specific keywords and excluding others doesn't help. There are a growing number of search topics for which Google is simply broken.



With regard to the style of the pages, I completely agree with you -- probably because we have similar tastes in content. Others, however, might not. I'm not saying that flashing text and floating images are okay, just that the threshold between "mildly annoying" and "total shit" is a personal thing.

The problem is, as you point out, that most people, most of the time, are beginning to see results they don't need or like when they type a search. This is a big problem for both searchers and the companies that provide search. If you create an algorithm for directing people's behavior (a search engine) folks are going to game it. You and I might not like it, but "gaming the way people do things" is called marketing in any other context and has been around for hundreds of years.

This leads me to suspect that no simple (or even complex) system of finding things for people is ever going to work for an extended period of time. It's a radar vs. radar detector problem. It's a natural competitive situation.

But it doesn't have to be all bad. From competition and fitness criteria comes evolution. Spammers and search engines will probably be a key part of how AI evolves. It'll be neat to see if we move beyond Bayes -- and if so, how would that work?

The one thing you bring up that's interesting is what to do with bad searches. How do you deal with a mis-typed part number? Should a system know which part number you have? If so, how would that be done?

I think the spammers covering all the misspellings are doing a service -- as long as the site isn't obnoxious and provides the user with the information they are looking for. We think of it as a failure of Google, but in fact it looks like a win: thousands of little spammers trying to find all the mistakes I make and providing content for them -- as long as they have my best interests in mind (and are not trying to trick me). I'll happily look at an advertisement for a Ford Explorer in return for valuable information on my 1978 dishwasher that I couldn't read the entire part number for. And I hate ads. I like that scenario a lot more than looking for a favorite mp3 for a cell phone ringer and spending the next 3 hours in spammer hell.


> The one thing you bring up that's interesting is what to do with bad searches. How do you deal with a mis-typed part number? Should a system know which part number you have? If so, how would that be done?

There are lots of ways to handle this, but doing a fuzzy search for similar or possibly related part numbers should be easy enough. The user can then be presented with those search results. If the model/part number isn't found you could even give them the option of adding content for that part, if the site takes user-generated content. I don't think I've seen any site take this approach; they go with the spammy show-a-ton-of-ads approach instead.
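For what it's worth, here is a rough sketch of the kind of fuzzy lookup I mean, using Python's standard difflib. The part numbers, the suggest_parts name and the 0.6 cutoff are all made up for illustration; a real site would load its catalog from manufacturer data and tune the threshold.

```python
import difflib

# Hypothetical catalog; these part numbers are invented for the example.
KNOWN_PARTS = ["WD-10455", "DW-20311", "DR-30870", "RF-47120"]


def suggest_parts(query, known=KNOWN_PARTS, limit=3, cutoff=0.6):
    """Return known part numbers resembling the query, best match first."""
    normalized = query.strip().upper()
    if normalized in known:
        # Exact hit: no need to guess.
        return [normalized]
    # difflib scores similarity between 0 and 1; anything under
    # `cutoff` is dropped, so unrelated queries return an empty list.
    return difflib.get_close_matches(normalized, known, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    print(suggest_parts("WD-1045"))  # truncated part number
    print(suggest_parts("ZZZZ"))     # nothing similar in the catalog
```

An empty result is the point where the site could offer "add content for this part" instead of a page full of ads.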

Appliance models and part numbers are actually pretty interesting. A couple of people I know built and maintain a simple desktop application for smaller appliance retailers and parts/service companies. The application contains a database of all valid part numbers issued by every major appliance vendor for the last twenty years or so. This information is updated about once per week by the manufacturers, and they supply it freely to anyone who wants it or to members of specific programs. Some of this information is provided via faxes or emails, which sucks, but data entry can be farmed out to temps. This company aggregates the data and provides it as a service to their customers. The application can do full or partial matches for part numbers and can filter based on appliance type. If a small two-man team that doesn't even work on the project full time can successfully manage that, I don't see why the big web-based sites are so full of bogus content and spam.
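I haven't seen their code, so this is only my guess at the shape of the lookup: full or partial matching on the part number plus an optional filter on appliance type. The Part record, the catalog entries and the find_parts name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    number: str
    appliance_type: str


# Hypothetical catalog; a real one would be built from manufacturer updates.
CATALOG = [
    Part("WD-10455", "washer"),
    Part("WD-10465", "washer"),
    Part("DW-10470", "dishwasher"),
    Part("DR-30870", "dryer"),
]


def find_parts(fragment, appliance_type=None, catalog=CATALOG):
    """Return parts whose number contains the fragment (case-insensitive).

    A complete part number is just the longest possible fragment, so the
    same function covers full and partial matches.
    """
    needle = fragment.strip().upper()
    if not needle:
        # An empty query would match everything; treat it as no match.
        return []
    return [
        part
        for part in catalog
        if needle in part.number.upper()
        and (appliance_type is None or part.appliance_type == appliance_type)
    ]


if __name__ == "__main__":
    for part in find_parts("104", appliance_type="washer"):
        print(part.number, part.appliance_type)
```

A linear scan like this is plenty for a desktop app; a bigger site would presumably push the same query into a database index.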


> There's lots of ways to handle this but doing a fuzzy search for similar or possibly related part numbers should be easy enough. The user can then be presented with those search results. If the model/part number isn't found you could even provide them the option of adding content for that part if the site takes user generated content. I don't think I've seen any site take this approach but instead go with the spammy show a ton of ads approach instead.

Fuzzy logic searches would be awesome.

The problem here, of course, is that the site doesn't own the search program. They can only influence it in certain predefined ways. So if you're building a site for dishwashers from the 1970s and you know that folks consistently misspell some brand name? You either provide a page for that misspelling that Google can crawl or those folks don't get content. Assuming you're doing a quality site, folks who can't spell need content as much as those who can. Yet if you provide a page based on a misspelling folks will yell "spammer!". It puts you in a bind. There's no answer everybody is going to be happy with.

I think people can tell whether site owners are trying to help them out or just trying to trick them using Google. At least I hope so. I know that, as much as I hate ads, I'd be happy if I never saw one again for the rest of my life. I have to be careful not to take that personal opinion and apply it to all site creators, however. There's nothing wrong with noticing that folks are looking for something, can't find it, and providing content in that area.

In a lot of ways Google is a victim of their own success. The net was so new, the algorithm so cool, that it looked a lot like magic. People got used to the magic and forgot that it's just a computer program somewhere. I think we may expect too much.



