Show HN: TensorDock Core GPU Cloud – GPU servers from $0.29/hr

wintermute9124 · on July 28, 2022

GPU pricing of alternative clouds (lowest price to highest):

[A100 PCI]

Lambda Labs: $1.10/hr

TensorDock: $2.06/hr

Coreweave: $2.46/hr

Paperspace: $3.09/hr

[A100 SXM]

Lambda Labs: $1.25/hr

TensorDock: $2.06/hr

Coreweave: N/A (I think PCI only)

Paperspace: N/A (I think PCI only)

[A40]

TensorDock: $1.28/hr

Coreweave: $1.68/hr

Paperspace: N/A

Lambda Labs: N/A

[A6000]

Lambda Labs: $0.80/hr

Tensordock: $1.28/hr

Paperspace: $1.89/hr

Coreweave: $1.68/hr

[V100 SXM4]

Lambda Labs: $0.55/hr

TensorDock: $0.80/hr

Coreweave: $1.00/hr

Paperspace: $2.30/hr

[A5000]

TensorDock: $0.77/hr

Coreweave: $1.01/hr

Paperspace: $1.38/hr

Lambda Labs: N/A

Jonathan thanks for the post. A question: it sounds like TensorDock partners with 3rd-parties who bought these servers and TensorDock doesn't actually own any of the servers you rent out. If that's the case, how do you ensure security? If not, please ignore.

[References]

https://www.paperspace.com/pricing

https://lambdalabs.com/service/gpu-cloud#pricing

https://coreweave.com/pricing

https://www.tensordock.com/product-core

freediver · on July 28, 2022

A100 is available for $1.03 from GCP and V100 is available at $.27 per hour from Alibaba according to CloudOptimizer [1]

[1] https://cloudoptimizer.io

jonathanlei · on July 29, 2022

Just a side note here, it seems that these are spot instances, whereas all our VMs are non-interruptible. So there is a bit of difference (e.g. you probably wouldn't do a weeklong Blender render on a GCP VM if it could be interrupted and lose your work, whereas you can definitely run it on TensorDock because our VMs are reserved).

Of course, you can set up data checkpointing to save your data, but overall, it is a bit of an extra hassle to run on spot/interruptible instances, and if you do get interrupted, you are wasting valuable time waiting for stock to free up again.

tinus_hn · on July 29, 2022

Kind of sad because a Blender render is exactly the right usecase as it doesn’t really matter that much if it takes a little bit longer.

wintermute9124 · on July 28, 2022

I have never seen this tool. Thanks for sharing!

The Alibaba price you cited is for interruptible/spot. For V100 on-demand (uninterruptible) Oracle is least expensive from that list at $1.275/hr.

freediver · on July 28, 2022

Thanks, I created this tool to exactly find cheapest GPUs then expanded to everything else.

Interruptible is sufficient for model training which is what these high end GPUs are typically used for (and you want cheapest possible)

jonathanlei · on July 28, 2022

Wow, cool! Yes - interruptible can be very cheap... I'll add it to our backlog so we do that instead of idly mining

I was wondering, do you happen to have an API for listing servers? We're launching a marketplace later in August (https://www.tensordock.com/product-marketplace), and we expect pricing to be really really cheap. Like #1 in industry cheap while retaining.

Interruptible, if we add that, would probably be even less than those prices listed.

It'd be really cool if we could auto-update availabilities of GPU servers through an API so that we can list our servers on your tool as well :)

freediver · on July 28, 2022

If your prices really end up being industry cheapest let me know and we can make it happen.

wintermute9124 · on July 28, 2022

Didn't realize you made this tool. It's super useful.

Some unsolicited feedback, if you're still actively developing:

- You should consider some of the lesser known cloud providers (e.g. Coreweave/Lambda Labs/TensorDock).

- Add information about whether the servers support NVLink/NVswitch. For example, A100s come in 3 flavors: PCIe without NVLink bridges, PCIe with NVLink bridges, and SXM with NVlink/NVSwitch fabric.

freediver · on July 28, 2022

There are hundreds of smaller providers, each having a different API , if having it at all, so this is not possible for a one man operation. CloudOptimizer is already the largest cloud comparison tool on the web (12 cloud providers listed)

Havoc · on July 28, 2022

>TensorDock doesn't actually own any of the server

Vaguely recall reading they're using a mixed model, but OP will hopefully confirm

fxtentacle · on July 29, 2022

A100 at €0.44 = about $0.5

https://puzl.ee/cloud-kubernetes/configurator

I'm quite happy with them, but storage is limited to 1GBit/s and you should know basic kubectl to use it fully.

mdda · on July 29, 2022

Isn't that for only one quarter of the A100?

artemisart · on July 31, 2022

Yes... a full A100 (+ reasonable CPU etc) looks closer to 2€.

antupis · on July 29, 2022

Somebody should build "cheap" service to EU. Quick googling and Paperspace is only one with datacenter at EU. Those GDPR requirements are now rather nasty so you usually cannot move customer/third-party data outside EU.

jonathanlei · on July 29, 2022

We will consider! We are working with two potential suppliers in the area... expect more news by the end of the year if those deals go through! :)

jonathanlei · on July 28, 2022

Thanks for the comment! Quick answer: it's complicated.

Long answer: We own a substantial amount of compute ourselves, as far as Singapore where we have fully-owned hardware at Equinix SG1. We started with our own crypto mining operation just outside of Boston, but as wholesale consumers approached us in 2020 due to pandemic surges, we added business internet and power backups. Suddenly we were operating a rudimentary "office data center." Two large reseller sites sell on our fully-owned hardware. Boston's electric costs are very high ($0.23/kWh), so we're gradually moving more hardware to tier 3/4 data centers that are cheaper on a per-unit basis.

But, we partner with 3rd parties too (4 large scale 1000+ GPU operators each to be exact) to resell their compute. This is also how we'll enter Europe... we're working closely with an existing supplier that colocates servers at Hydro66 in Sweeden and another at Scaleway Paris. We provide the software, they provide the hardware, and we pay them a special rate based on the volume we're doing. Partnering with others is the only way we can handle large scale without insanely high capex costs (that being said, we do get preferential pricing as an NVIDIA Inception Program member, which we take advantage of for our own fully-owned hardware).

We have a doc in it here: https://docs.tensordock.com/infrastructure/reservable-instan...

We're also working on a marketplace (client site: https://www.tensordock.com/product-marketplace, host site: https://www.tensordock.com/host). We expect a beta version to be up and running in the next ~2 weeks. With this, we'll have a script that hosts will use to install a small version of OpenStack. Then, they set prices, and customers can deploy directly on that hardware. By aggregating all these hosts together on the same marketplace, we hope we can slash the price of compute.

So far, owning our own hardware has allowed us to negotiate better rates and enter markets where previous services don't exist (namely, Singapore, where we sell subscription servers with 1070 GeForce cards for $150/month — unheard of pricing for an APAC city). Eventually, we hope there'll be suppliers in every city selling on our marketplace or core cloud product so that we can really become the #1 place for ML startups to provision compute. In a way, we want to be the Amazon of cloud computing. Amazon, in a way, created a global marketplace. Yes, they sell their own products, but they also sell others' products. By doing so, you know that you're getting a good deal on whatever you buy. We want to end up being the same thing for compute, but that's still a few years off :)

TL;DR - we own a lot of hardware, and we resell a lot of hardware. But in the future, we want to focus on the reselling aspect to truly be able to nail the user experience and handle demand surges while maintaining low costs.

porker · on July 29, 2022

And data security? Can you answer that part of the question?

Most of my ML training is done using personal data or sensitive documents, and I have not found a cheap provider yet that I can use.

wintermute9124 · on July 29, 2022

Yes. Curious on this too.

Lambda/Paperspace/Coreweave do own their own servers (Lambda being the cheapest). That alleviates some security concerns.

It all depends on your security requirements though.

jonathanlei · on July 29, 2022

Whoops, apologies for missing this! For our core cloud product, we only partner with established providers. Large-scale compute wholesalers with $5m+ of compute each in secure data centers. These companies' entire businesses are built on selling secure compute to customers like us and other medium/large businesses. Basically, this isn't some random dedicated server host off of LowEndTalk :)

We have data protection agreements with all of them, and we can also do bare metal machines on request so that you have full control over your physical machine.

treasurebots · on July 28, 2022

Always cool to see new cloud GPU offerings, but isn't Lambda Labs' cloud offering roughly equivalent (in the case of V100s) or substantially cheaper (in the case of A100s)?

https://lambdalabs.com/service/gpu-cloud

Unless I'm missing something?

jonathanlei · on July 28, 2022

Thanks for the comment! Yes, Lambda Labs recently lowered pricing, so they're now roughly equivalent to us, and they beat us on some configurations.

I haven't used them, so please fact check me, but it seems like the machines come with directly attached storage. So, if you're using an 8x V100 and want to switch to a 1x RTX 6000, you'd have to spin up a new server and manually migrate your data over.

We built our platform with networked storage. You can spin up a CPU-only instance for $0.027/hour (<$20/month), upload your data, convert it into a GPU instance to train your models, and then convert it back. We frequently see users converting servers from 8x A100s (to train workloads) back to 1x RTX 4000s (to run inference). This kind of flexibility saves people time, which equates to money given how expensive ML developers are now.

(Our networked storage model also enables people to shut off their VMs and save money)

I'm sure Lambda Labs is working on something similar, but it seems they are doing dedicated servers based on how they advertise.

I think we also have a higher variety of GPUs (10 SKUs with us vs 4 SKUs). This lets people switch from between, say, an NVIDIA A6000 to A5000 to A4000 to truly "right-size" their compute so they don't pay for anything they don't need.

Cost-wise, we also have better long-term pricing like GeForce 1070s for $100/month in Boston or $150/month in Singapore Equinix SG1 - which is really good pricing for an APAC city in my opinion (https://console.tensordock.com/order_subscription), and we're working on a marketplace to let compute suppliers list their compute on our platform to get closer to cheapest for those who really care about cost (https://www.tensordock.com/product-marketplace).

dimitry12 · on July 29, 2022

Lambda Labs has (slow, low IOPS) cloud filesystem to persist data between instances. Attached storage does not persist but is high bandwidth and high IOPS, which is a necessity if training small-medium sized models.

tekgnos · on July 29, 2022

I agree, it is always cool to see GPU offerings. How is this different from Vast.ai (mentioned 4 months ago here on HN)?

Vast has RTX 3090 GPUs for $.32/hr on-demand or $.18/hr for interruptible. You can see live available offers on the website right now.

[References] https://news.ycombinator.com/item?id=30736459 https://vast.ai/console/create/

jonathanlei · on July 29, 2022

Thanks for the comment! Our core GPU cloud product is all data center-based. People can modify VMs after created, giving greater flexibility. Storage is 3x-replicated, 10 gbps networking, etc. We think of this more as an AWS replacement than a Vast.ai competitor.

We are actually working on something very similar to vast.ai (https://www.tensordock.com/product-marketplace) set to launch into a soft beta mode within the next two weeks and probably a real "Show HN" by the end of August. We'll have a few dozen GPUs scheduled to come online during the launch week at prices similar to Vast.ai. This would be with with full virtualization, which we think is better than Docker containers because we customers can run Windows VMs and do cloud gaming/rendering, thereby generating hosts more income. We might also add VM disk encryption later on, which would be more secure. Still, they are very large, so it'd be large road uphill, but we're working on something similar.

Also, if I remember correctly, with Vast (as a former user myself), an issue can arise when you have a VM in the stored state but someone is claiming the GPUs running an on-demand workload, which prevents you from being able to pull your data out. Because VMs are all booting from network storage and can be rescheduled to other compute nodes, you won't face that issue on our core cloud product here :)

octagons · on July 29, 2022

This looks like a very viable solution for hobbyists, which has me interested in trying it out and I intend to do so.

Some feedback on your landing page: your "Frequently asked billing questions" on the bottom of the pricing page[0] are extremely aggressive sounding even if they are meant to be tongue-in-cheek. For example:

- "I'm a bad customer and want to chargeback" - This doesn't seem like a very common billing question so it seems out of place here.

- "So, don't chargeback or dispute a charge from us, ever" - This is a very aggressive ultimatum that sounds almost threatening. Are you so risk averse that you need to threaten potential customers?

- "We cannot afford to provide services where there is a risk that we could not be paid" - This is an axiom for any business model. Why state it here?

- " Money-back guarantee? Sorry, no. Instead, start off with $5 and scale from there. If you are displeased, we can close your account and refund the $5, nothing more." - This is a fair approach, but you also advertise a $10k/month tier that would receive special treatment. As a business owner, this type of attitude makes me reconsider any kind of partnership when comparing similar services. For example, AWS is well-known for their billing forgiveness in the event of a mistake or other reasonable situations.

- "Credit cards that do not support 3DS are automatically assigned a higher fraud score and are at a higher risk of being placed in a 'fraud check' mode." - It's smart as a startup to try and counteract the risk of fraud in your revenue stream. However, this typically happens in the background and is generally assumed/hoped for by someone typing their credit card number into a website for payment. Combined with your aggressive stance on displeased customers, this smacks of the issues caused by automated punitive measures implemented by YouTube, Twitter, GMail, etc. that often make the front page of Hacker News.

Although this copy isn't necessarily front and center on your landing page, I think TensorDock would benefit from some time spent editing and adding questions that are more likely to be asked, such as, "what payment methods do you support?"

[0]: https://www.tensordock.com/pricing

jonathanlei · on July 29, 2022

Will definitely update! Sorry about the aggressive-sounding questions. Fraud was a very big issue in our early days. People would ask for refunds after crypto mining and then dispute the charges, etc... but we added other protections, so I think we can get rid of those :)

octagons · on July 29, 2022

Totally understandable! Looking forward to trying this out with my $1 of credit!

jonathanlei · on July 29, 2022

Yeah, feel free to shoot me an email for like $25 of credit. For startups, we usually do $100 (just send a request from your startup email address).

Even with $5 credits, we had people abusing the system and creating multiple accounts to cryptocurrency mine. So we chose $1 as an amount that would be so low that nobody would want to take advantage of, haha :)

panabee · on July 28, 2022

congratulations! this is very cool. do you have single GPU instances with more than 16 GB of GPU memory? we use AWS and oracle now.

ryansun · on July 29, 2022

Hello panabee, Ryan from TensorDock here.

We offer Quadro RTX 4000s (which perform up to 2x faster than Tesla T4s) with 8GB VRAM starting at 0.29 cents an hour which are much cheaper than what AWS and Oracle offer for the 16GB GPUs (Tesla T4s. P100s, and V100s). We also have RTX 5000s and A4000s with 16GB of VRAM which will be slightly more expensive than AWS and Oracle, but they have much better performance which will lower the time to train a new machine learning model, thereby lowering the cost. If you're looking for long-term inference worklords, we have subscription servers starting at $100 a month.

https://console.tensordock.com/order_subscription

Feel free to ask anymore questions by emailing me ryan@tensordock.com

Too · on July 29, 2022

Curious on your take of designing your own api rather than using Kubernetes.

Is it due to difficulties of containerizing gpus or is the api space of k8s too big and difficult for a small cloud provider to implement?

Maybe the workloads completely don’t make sense together. I’m just curious what others think about it.

raitom · on July 28, 2022

Pretty cool! I have a project I'm about to launch and I was preparing it for GCP but it would be way cheaper to run it on Tensordock. Plus, I don't need crazy powerful GPU, the RTX4000 are perfect for my usage. Signing up right now and I'll try it this weekend.

atomict · on July 28, 2022

Me too, this looks ideal for me, I'm going to try it out tonight and over the weekend

shermanda1919 · on July 29, 2022

Cool! Feel free to email me at amanda@tensordock.com if you have any questions.

NicoJuicy · on July 28, 2022

Nice, will be experimenting with this for work.

Any examples of a docker file with GPU support?

jonathanlei · on July 28, 2022

Cool, let me know if you need anything at jonathan@tensordock.com!

Unlike other managed Docker hosting services, these are fully virtualized VMs, so you can use whatever tools you want, and most software applications work out-of-the-box. You can cloud game (or run virtual workstations) through Windows, too.

Our ML images come with Docker and NVIDIA-Docker2 preinstalled, so that should work with whatever workloads you'll be running :)

raitom · on July 28, 2022

What would be cool is if we could run some kind of jobs in a docker container on those instances with the nvidia drivers pre-installed.

What I mean is I prepare a container, I go to tensordock and start my container with some parameter. Tensordock prepares an instance in the background for me, run the container and stop the instance once the container stop.

jonathanlei · on July 29, 2022

Oh yes! Our standard Ubuntu 18/20 OS images have nvidia-driver-515 + Docker + NVIDIA-Docker2 preinstalled. The all three ML images also come with CUDA if you need that. We'll probably add cloudinit support sometime so that you'll be able to script background workloads that autoshutdown the VM once completed, but at the moment, you'd have to manually SSH into your VM and run it...

Thanks, Jonathan @ TensorDock

pj_mukh · on July 28, 2022

Ooof, NVIDIA-Docker2 being pre-installed is clutch!

shermanda1919 · on July 29, 2022

@pj_mukh I agree.

ryansun · on July 29, 2022

I agree

noajshu · on July 29, 2022

Is there any information available about the beehives? I saw the link to purchase tensordock honey at the bottom of your homepage.

ryansun · on July 29, 2022

Are there any specific questions that you may have?

Ryan @ TensorDock

jonathanlei · on July 29, 2022

@Ryan - I think OP was asking about the beehives that we sell on the website, haha :)

It was started by Mark... he raises bees in his backyard, so we have some kick-ass honey raised right from our New England backyards :)

1 pound - https://buy.stripe.com/aEU28j7CR6tV81i7su 3 pounds - https://buy.stripe.com/8wM6oz3mB7xZbduaEH

yewenjie · on July 29, 2022

What is something nontrivial but interesting/useful to train on GPUs these days (not talking about big AI level compute) ?

jonathanlei · on July 29, 2022

Disco Diffusion (and generated art in general) is really cool: https://replicate.com/nightmareai/disco-diffusion/examples

Happy to give you whatever credits you need (would $25 be enough?) to try running some cool stuff on our infrastructure. And feel free to email me the results of whatever you do at jonathan@tensordock.com :)

skainswoo · on July 29, 2022

Is it possible to bring our own OS images, ie is it possible to run NixOS on TensorDock?

jonathanlei · on July 29, 2022

Not yet, but it's something we can definitely accommodate. If you just send us a link to a .qcow2 or .img file that's a cloud image configured with DHCP, we'll be able to set you up in no time :)

- Jonathan @ TensorDock

undefuser · on July 29, 2022

Can these server instances run traditional 3D applications, like games or CAD?

jonathanlei · on July 29, 2022

Absolutely! You can use our Windows 10 instances and cloud game (like how airGPU sells cloud gaming containers on us), render blender scenes, or do CAD on remote desktops. These are dedicated virtual machines for you to use :)

andrewstuart · on July 29, 2022

You can get GPU instances on AWS for 12 cents per hour.