You aren't the first company to try grid computing on personal computers, so once people stop talking about how cool the idea of grid computing is or how to protect the web surfers' rights, they'll start asking the hard questions. Like, how you're going to make money.
How do your paying customers feel about their data being strewn about the Internet? How does your processing interfere with the game that the user is playing? What kind of problems have you earmarked as being particularly parallelizable like this (i.e., what is your marketing plan?), and how usable is your API?
At the end of the day, is going through the effort to rewrite a program to work on distributed Java applets going to produce substantially more processing for the customer dollar than a $150 headless dual-core 1 GHz board with a gig of RAM running C++? You say that you can provide a $1500/yr savings over EC2, but this is only for applications that are "embarrassingly parallel", and will certainly require hiring a programmer to write code to run in tiny chunks on lots of clients. Will this cost less than $1,500? Worse, it will be money up front instead of spread out over a year.
You could say that the system is best leveraged by clients looking for a lot of compute power on a budget, but those are the people most likely to build their own grid. When you can get Linux running on a dual-core 2.0 GHz box for $200 per node, a service that costs that much every year, on an untested platform that requires substantial development work and vendor tie-in, is going to be a tough sell.
You will be the only people interested in writing code for this platform, so please repost when you know which algorithm you can run to gross $10 million a year.
These are all very good questions and we certainly know that we're not the first company to do grid computing on PCs. :)
First, $1500/node-year (or even $200/node-year) comes to a very substantial amount of money for our target customers. These customers use 10000s of nodes and buy high-end computers and pay a ton for network, power, cooling, and support personnel.
My background is in HPC and I've faced and solved these issues many times. We have a multi-pronged approach to using the compute power:
1) We are definitely writing our own apps on top of Plura. It's a great source of compute power and we have lots of ways we want to use it ourselves.
2) We have a short list of very high impact embarrassingly parallel customers that are evaluating Plura. We intend to help our customers absorb the porting cost associated with Plura.
3) We are open to letting students and universities use Plura's excess capacity for free. We've had several nibbles at this already.
That being said, we realize Plura is not for everyone. It is for a subset of the class of embarrassingly parallel applications out there.
PC-based grid computing has already been made profitable. The trick is to make it both profitable and legal. Perhaps some variant on fast-flux name resolution could help with a particular subset of content delivery. Perhaps some other legal-but-Russian Business Network-inspired maneuver. Botnets are naturally limited because they're stamped out with extra vigor when they get big and famous; this could have staying power to leverage for tasks even Storm would've found impractical.
Are there ample warnings to, or explicit agreement with, the user that his/her CPU time is going to be used by non-game functions? This seems suspiciously like theft of services.
When I run any process I expect it to restrict all its actions to servicing direct functions related to that process. Plura is expressly unrelated to ANY process/game it is bundled with.
I would personally be annoyed/angry that a program I was using, web-based or not, was using my computer's resources to make money without my knowledge. In fact, you're not just using my computer's resources, you're also drawing more power from my electrical grid, because a processor doing more work consumes more electricity. That's something that directly costs me, the end user, more money. So I would absolutely classify this as theft of service unless the user explicitly agrees to have Plura running in the background.
If there is ample warning to the user, I am fine with this, and would consider it a good idea. Otherwise, it feels really sleazy and wrong.
Thanks for the comments. We really try to be above board, include it in TOS, and encourage disclosure to the users. I mentioned this in another response, but we even have some affiliates that do some sort of opt-out procedure.
Our hope is that the users end up reaping the benefits from this via increased development dollars or reduced ads. We have some affiliates that are exploring ways of giving some form of in-game currency in exchange for Plura time. For example, you might earn more gold, a better performing sword, higher production, or something like that.
Legally speaking, if the game (or whatever) doesn't make 100% clear what is going on ("we are going to be using your CPU") then I'd say it's very shaky territory.
Also, I am pretty sure that in a lot of countries you would be pretty much required to either have a splash screen saying "click yes to play this game and work some units" or have a workable opt-out (I don't know if that exists?)
Hey guys, we just launched our private beta for Plura.
I submitted our game page as the HN link, but www.pluraprocessing.com has more general information.
Take a look and tell me what you think. We've been paying our alpha affiliates for months now and are ready to take on a much larger pool of affiliates in our beta. "Affiliates" are the people that put Plura on their site, browser game, or other content (even download apps) and get paid for their user time.
It seems like there is potential for a much bigger market than games. Streaming video and audio... fantasy sport applets, etc. Is Plura just starting with games since they provide what I suspect are ideal conditions (already using CPU and running long enough for Plura to actually do some work)?
There is a big search in the gaming industry for new sources of revenue. Development costs are skyrocketing but income is not. Retail and subscription are the big two, with online advertisements growing in popularity. If harnessing players' compute time enables a gaming service provider to offer a better product at a lower price, this could be a win for the gamer, the game company, and the ultimate computing consumer.
Yes, we are definitely interested in other sites too. Our main page is pretty generic, but we decided to do a special pitch to game developers for exactly the reasons you state.
I think this is an innovative approach to getting more people involved with grid computing for major computations and a clever way to market it in general. However, I agree with some other people here: there has got to be some sort of licensing issue. As much as the whole "you clicked okay to the TOS, you gave us permission." is a tank in court, it still seems too risky. If nothing else, it would risk users no longer using the games from the providers using that service if they found out that their computer slowed down at all while playing that game because it was doing things it wasn't supposed to. Not that I would likely have a problem with it, but let's face it... the general public is slap-happy with lawsuits.
There are also privacy issues with the entire concept. I didn't notice any way (note: I didn't go out of my way to find one either) for the end users to directly go and see exactly what kind of materials were being processed and what data, if any, was being collected from the user's machine. That might also be a source of trouble in the long run.
Interesting comments. First, let me explain that the client side runs entirely within memory; the hard drive is not touched at all. All processing or computation is done within the Java sandbox. In other words, you are not giving us unfettered access. In fact, it's quite the opposite. :) We follow everything listed at http://java.sun.com/sfaq/, whether we're running an applet from the browser or from within a desktop application.
Second, although we encourage disclosure, we leave the disclosure up to the individual sites and some have chosen to use some form of opt-out or opt-in. That being said, we also provide a TOS that they can use for disclosure if they choose. If the disclosure is done in a positive way, the users should see the benefits of getting more game features for free.
Nice rebuttal, I wish you the best of luck, it seems like something with a good amount of potential. I especially liked your point that, because the gaming services do this and you pay them for it, they can implement better features for their own users.
Oh well, most Flash stuff already runs my CPU to 100% even when it's clear it shouldn't. (100% CPU to display a static "Game Paused" screen? Give me a break!).
---
"After receiving the WU, the user's computer will perform the computation. The game developer can control how much of the user's CPU is devoted to computation by setting the % usage (either statically or dynamically)."
How do you accurately control the amount of CPU a program takes in user-space? Can you really have a process that will restrict itself to using no more than 25% CPU? (Also, 25% of a fast CPU is more than 25% of a slow one)
The apps that we're running right now control it pretty well using a sneaky trick. We can't get much computer information from java in an unsigned app, so we measure how long each sub-part of a work unit has taken and sleep an appropriate amount of time. For example, if we are trying to use 25% and a sub-work takes 100ms, we sleep for 300ms. This effectively gives us 25% usage of the available CPU. It will be no higher than that anyway - it might be lower if something else is using substantial CPU time.
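The measure-then-sleep throttle described in this reply can be sketched as follows. This is a minimal illustration in Python, not Plura's actual applet code (which would be Java); the names `sleep_for_duty_cycle` and `run_throttled` are hypothetical, and the clock and sleep functions are injectable only so the logic can be exercised without real delays.

```python
import time


def sleep_for_duty_cycle(busy_seconds, target_fraction):
    """Sleep time that makes busy / (busy + sleep) equal target_fraction."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return busy_seconds * (1.0 - target_fraction) / target_fraction


def run_throttled(sub_units, target_fraction,
                  clock=time.perf_counter, sleep=time.sleep):
    """Run each sub-unit, time it, then sleep to hold the duty cycle.

    sub_units is an iterable of zero-argument callables (hypothetical
    stand-ins for the sub-parts of a work unit).
    """
    for work in sub_units:
        start = clock()
        work()
        busy = clock() - start  # wall-clock time, not true CPU time
        sleep(sleep_for_duty_cycle(busy, target_fraction))
```

With a 25% target and a sub-unit that took 100 ms, the helper yields a 300 ms sleep, matching the example in the reply. Because it measures wall-clock time, the real CPU share can only come out at or below the target when other processes are competing for the CPU.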
As far as the fast versus slow nodes, we normalize everything based on the average computer that completes work each day. So a fast computer may earn 2X or more what a slow computer earns.
What's with the payment being per-user per-month? Why not just straight per-work unit? Isn't a user who plays the game 20h/m (or at least leaves their browser window open that long) more valuable than the one who only plays for five minutes?
Monthly payment = $2.60 times # avg. simultaneous users times CPU %
It sounds dishonest to summarize this as $2.60 per user per month. It is $2.60 per user, if that particular user plays non-stop for 30 days straight, assuming 100% CPU.
In other words, you get 1 cent per 3 hours of play time. To get $100 per month, you need 10000 users, playing one hour every week or so. (Hopefully, my math and interpretation are correct.)
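The arithmetic in this comment can be checked with a small sketch. It assumes a 30-day month and treats "average simultaneous users" as total user-hours divided by hours in the month; both are assumptions for illustration, and the function names are made up here.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month


def monthly_payment(avg_simultaneous_users, cpu_fraction, rate=2.60):
    """Quoted formula: $2.60 x avg simultaneous users x CPU fraction."""
    return rate * avg_simultaneous_users * cpu_fraction


def avg_simultaneous_users(total_users, hours_per_user_per_month):
    """Convert a user base and play time into average concurrency."""
    return total_users * hours_per_user_per_month / HOURS_PER_MONTH


def hours_per_cent(cpu_fraction=1.0, rate=2.60):
    """Play time needed for one user to earn the affiliate one cent."""
    dollars_per_hour = rate * cpu_fraction / HOURS_PER_MONTH
    return 0.01 / dollars_per_hour
```

At 100% CPU, `hours_per_cent()` comes out just under 3 hours, in line with the estimate above. For 10000 users at roughly 4.3 hours each per month, `monthly_payment(avg_simultaneous_users(10000, 4.3), 1.0)` is on the order of $150, and half that at the 50% CPU setting.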
We put it in terms of average simultaneous users because that's the way a lot of affiliates think. They say, we have an average of N people playing the game at any given time.
For websites and other affiliate types, we say we pay $15/MWU (million work units) where a work unit is 15 seconds on an average node (the system-wide Plura average node performance).
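As a quick consistency check, the $15/MWU figure can be converted into dollars per node-month. The sketch below assumes a 30-day month and a node running at 100% the whole time; the function name is hypothetical.

```python
def mwu_rate_per_node_month(dollars_per_mwu=15.0,
                            seconds_per_work_unit=15.0,
                            days_per_month=30):
    """Dollars earned by one average node computing non-stop for a month."""
    seconds_per_month = days_per_month * 24 * 3600
    work_units_per_month = seconds_per_month / seconds_per_work_unit
    return dollars_per_mwu * work_units_per_month / 1_000_000
```

Under those assumptions the result is about $2.59 per node-month, so the two pricing statements ($2.60 per average simultaneous user and $15/MWU) appear to describe the same rate.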
This does strike me as verging on unethical; you (and your affiliates) need to make very clear statements to end users that you will be using their computers to do other processing while they are playing.
That said this is a pretty cool model, and I'm sure that most people would be happy to let a few of their CPU cycles pay for the game they are playing.
I guess it's time to null-route pluraprocessing.com.
In other words I really really really don't like hidden or undisclosed functionality in whatever the software I am running. Especially if it is explicitly designed to utilize my otherwise idle resources.
Few years back my friends and I were talking about putting a DES brute-forcing client into a flash applet and then sticking it on a high-traffic site. It never went anywhere beyond a discussion though exactly because of the ethical issues involved.
I applaud your effort. Great idea! Personally, I can see this widely used, on both the user and customer side. Man, why didn't I think of that! hahahha.
There is no question there will be plenty of affiliates/CPU time (re)sellers.
The real question is whether there is big enough demand from those who need grid computing.
You had a quote from some quant guy; I suppose the sciences are another market, not sure about others.
Here's a blog post we wrote comparing us to Amazon's EC2 for HPC apps: http://pluraprocessing.wordpress.com/2008/10/23/comparing-pl.... Our numbers should be very compelling for certain types of applications compared to the cost of building your own clusters or using an EC2-type service.
There are definitely certain applications where this form of grid or cloud computing (I hate to use such a popular buzzword) will enable apps that weren't possible before. Previous attempts at grid computing for HPC depended more on philanthropy instead of having a scalable business model that allowed the addition of 100s and 1000s of nodes at a time.
1) I think this is a really smart idea, and hope it works well for you. It seems like a win for everybody, as long as the end user knows what's happening.
2) You maybe should look at the phrase on savings in your paper:
"if your application is suitable for Plura, you can save 7X on your compute costs".
I take this to mean you save 7 times your compute costs, where it should be six sevenths of the cost. There's a difference, and your customers will understand it. Sorry if I've just misunderstood.
Yes and no. We have created a signed applet capability in the Plura infrastructure, but we have not pushed this out yet. In the future, if an affiliate chooses, they can request work units for a signed applet that would be allowed to make HTTP requests. We will probably pay more to affiliates that do this and will certainly require user opt-in. Our normal applet is unsigned and cannot make HTTP requests due to the Java sandbox.
We have one company (80legs.com) that is developing a web crawling solution using Plura. We are using a different model to acquire the nodes for this.
Folding@Home is, as of April 2009, sustaining over 8.1 PFLOPS[1], the first computing project of any kind to cross the four petaFLOPS milestone. This level of performance is primarily enabled by the cumulative effort of a vast array of PlayStation 3 and powerful GPU units.[2]
The entire BOINC averages over 1.5 PFLOPS as of March 15, 2009[3].
SETI@Home averages more than 528 TFLOPS[4]
Einstein@Home is crunching more than 150 TFLOPS[5]
I like what you are doing, I just don't understand the commercial opportunity when these guys do it at universities for free.
No, we tried to do this, but we can't run Plura in Adobe AIR. AIR uses WebKit for its browser, but Adobe has removed support for browser plugins, including Java.
Oh, and most developers use 50% so far. We have a download app that affiliates can distribute too if they want to. We can set that to run at different numbers when idle and in use. For example, I have mine set to 100% when I'm not using the keyboard and mouse and 50% when I am using it.
Really cool concept. One cosmetic issue - when you mouse over the links on the right side, the underline makes the text below look really cramped. A few more pixels of padding would be nice.
I'm curious about where the money is coming from, i.e., what kinds of computational problems are you solving for your clients, and what kinds of clients have hired you?
Hey nice concept!
Maybe it's personal, but I don't like the color scheme.
Game developers work for and with love, so probably they want some colors with love ;)
No, it does not have to be a game. We're just focusing our initial affiliate marketing on games because of the higher engagement.
The type of sites you mention will work well too. Send us a note through the Contact Us page on the website and we can help you figure out how much you could earn.
One of the issues that I see with this approach is that for most applications the scaling issue is data scaling rather than CPU. Here's why I think you will run into scaling issues:
Let's assume you have a gigabit line out of your colo. Let's also assume that your average game client is on a cable modem with a 1 megabit connection. That gives you the capability to stream work units to 1024 clients simultaneously as an upper limit. In keeping with an average 1 megabit client, it will take about 3 minutes to stream down a 20 megabyte work unit, maxing their connection. 1024 concurrent clients * 20 megabytes = 20 gigabytes. So you're looking at 3 minutes of overhead transfer out, and likely another 3 minutes or so of overhead transfer of results back from the client. So that's approximately 6 minutes per 20 gigabyte batch just in transmission overhead. And it gets worse as clients are added to the system, since you would need to scale out your datacenter just to handle coordinating all of the clients. Which begs the question: why aren't all those servers just doing the dang work already?
That kind of overhead limits this technique's usefulness to applications which have relatively high computational complexity and relatively small amounts of data. And those applications do exist; however, they're pretty far from the day-to-day needs of most companies. Sun found this out the hard way with their Sun Grid project, which last time I checked was a failure. Sorry, I really wish you the best of luck.
Yep, we're quite aware of all of these issues. My background is in HPC and my previous company was a successful exit to a major oilfield services firm. My software and its descendants are used on nearly 100,000 CPUs.
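The back-of-the-envelope numbers in this comment can be reproduced with a short sketch. It follows the comment's own convention of treating a gigabit as 1024 megabits and a gigabyte as 1024 megabytes, and ignores protocol overhead; the function names are invented for illustration.

```python
def max_concurrent_clients(uplink_mbit, client_mbit):
    """How many clients can be fed at full speed from one uplink."""
    return int(uplink_mbit // client_mbit)


def transfer_seconds(size_megabytes, link_mbit):
    """Time to move a payload over a link: 8 bits per byte, no overhead."""
    return size_megabytes * 8 / link_mbit


def batch_gigabytes(clients, size_megabytes, mb_per_gb=1024):
    """Total data shipped when every client receives one work unit."""
    return clients * size_megabytes / mb_per_gb
```

Here `transfer_seconds(20, 1)` gives 160 seconds, a little under the 3 minutes quoted, and `batch_gigabytes(1024, 20)` gives 20 gigabytes for one round of work units pushed to 1024 clients.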
We are definitely focused on the applications that have extremely high compute/io ratios. In general, these boil down to either high compute problems with no real data or problems where the data can be shared between multiple work units. An example of the latter is stock market analysis - the nodes download stock data for a few stocks and stay busy running different combinations for a very long time.
So that brings up a good issue, namely the proprietary nature of the code you're distributing. For instance, stock market analysis firms zealously guard their algorithms. Let's say I'm a potential customer: How will you protect my bytecode from being stolen by competitors when it has to be run on unmanaged machines in the wild?
Good question. There really isn't anything that stops someone from reading and trying to interpret the byte code.
There is some protection in the fact that you never know what type of work unit is happening at a given time. The algorithms are typically each snippets of code instead of full applications, so someone would need to piece together quite a lot of information.
Of course, if someone is particularly concerned, they can run a jar obfuscator.