jtsymonds's comments

jtsymonds · on Aug 6, 2024

So helpful. That makes total sense.

jtsymonds · on June 29, 2023

Uhh, they were using the AI moniker in 2019. They just circled back to it this year:

https://web.archive.org/web/20190428005007/https://min.io/

jtsymonds · on Aug 12, 2022

This is supported by MinIO, but not "as a service." Essentially you run MinIO everywhere (AWS, GCP, Azure, IBM, on-prem, OpenShift, Tanzu etc). In the public clouds you can either roll your own or use the marketplace offerings.

In effect, you are choosing MinIO object storage over the "stock" object storage (which is incompatible with the other clouds).

You can use MinIO's ILM policies to replicate, tier, etc.

You still pay for compute, network + drive but then pay MinIO vs. S3/Blob. There will be no egress fees.

jtsymonds · on April 13, 2022

Many companies that do this look at MinIO for object storage. Given they run in AWS, GCP and Azure, they will minimize or eliminate your application rewrites. They are cloud-native by design and very fast.

jtsymonds · on Aug 23, 2018

The key here is that TDA is packaged into an application that is designed explicitly for use by practitioners. All of the underlying math (and you know there is lots of it in TDA) is abstracted. What is shown is the groups and the atomic-level explanations (this group is here for these reasons, e.g. they received albuterol upon admittance). Your instinct is correct, but that is what is interesting about this case - the hospital, without a single data scientist, was able to achieve this with only slick SQL skills and engaged doctors.

Screenshots for the app and videos can be found here: https://www.ayasdi.com/solutions/clinical-variation-manageme...

jtsymonds · on Oct 15, 2016

Twirrim,

Slightly different. We have appended all of the data from Factual as well. This includes the location of every business in NYC.

35bge57dtjku · on Oct 15, 2016

Why don't you publish the real data then?

imaginenore · on Oct 15, 2016

Because it's their business?

jtsymonds · on Oct 15, 2016

Hi infinite8s, to get additional information on how that chart was made, you can go to https://www.mapd.com/product/, scroll down to the bar chart, and click “See Details” under the chart. It shows the machines used, the queries, and the source data set and size. Note that the machine configurations used to generate the chart were normalized for equivalent cost on AWS, i.e. the chart is hardware-dollar normalized.

jtsymonds · on Oct 15, 2016

Hi SXP, thanks for your comment. You might want to check out Mark Litwintschik's posts (independent blogger who has benchmarked this dataset across many different databases) for performance on GeForce GTX TITAN X's. 4 x GeForce GTX TITAN X: http://tech.marksblogg.com/billion-nyc-taxi-rides-nvidia-tit.... 8 x K80s: http://tech.marksblogg.com/billion-nyc-taxi-rides-nvidia-tes.... He has additional posts on MapD on Pascal Titan X's and AWS as well. In full disclosure I work at MapD...

sp8962 · on Oct 15, 2016

Nice demo, but it would be even nicer if you could get the blooper fixed: "Mapbox's Openstreetmap"

MapBox is an active and respected participant in the OpenStreetMap project and uses our data in some of its products, but that is it.

jtsymonds · on Oct 15, 2016

Blooper Fixed :)

sxp · on Oct 15, 2016

Thanks for the info. The tl;dr is "It's fantastic to see that I've been able to use a machine that costs 1/10th of the one used in the 8 x Tesla K80s benchmark but still have queries running within 33% of the previous performances witnessed."

However, I'm suspicious of the numbers in those articles since the author lists only 4 data points in each trial and doesn't mention the stdev in his measurements. One of his measurements was .964 vs .891 so it looks like the Titan Xs were 90% as fast as K80s if the numbers can be trusted.
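
For reference, the ratio implied by the two figures quoted above can be checked with a small sketch. It assumes both numbers are query times in seconds and that .964 is the Titan X result and .891 the K80 result; the comment does not say which is which, and the function name is only illustrative.

```python
def relative_speed(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speed of the candidate relative to the baseline (1.0 means equally fast)."""
    return baseline_seconds / candidate_seconds


# Assumed mapping of the two quoted measurements to hardware.
k80_seconds = 0.891
titan_x_seconds = 0.964

ratio = relative_speed(k80_seconds, titan_x_seconds)
print(f"Titan X relative speed: {ratio:.1%}")
```

A single pair of timings says nothing about run-to-run variance, which is the concern raised above; with repeated trials, `statistics.stdev` from the standard library would give the spread.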

Twirrim · on Oct 15, 2016

Both those links got shortened for some reason, into invalid ones.

jtsymonds · on Oct 15, 2016

Yep, sorry about that:

Here is the Titan X link http://bit.ly/2e6C3Gg

Here is the K80 link: http://bit.ly/2eiIwvp

jtsymonds · on Oct 15, 2016

There is not an open source version as yet, but you can spin up these instances on an hourly basis on AWS https://aws.amazon.com/marketplace/pp/B01M0ZY2OV?qid=1475606... and on IBM Softlayer.

vegabook · on Oct 15, 2016

thanks, but at 5 bucks an hour for an entry-level instance (single 12GB GPU) I'm looking at 120 bucks a day if I don't want to constantly re-upload my dataset into MapD (a very slow operation judging by Mark Litwintschik's posts linked by you). That's a very very high price for such a modest hardware configuration, not to mention the more credible one which goes for an eye-watering 30 bucks an hour ie not much change from a grand a day. Not for us startup folk, clearly.

I have to say it seems your pricing for such a new entrant and before having built share, is bound to attract very stiff newcomer competition. "Interesting" business model.
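
The daily figures quoted above follow from simple multiplication, sketched here under the assumption that an instance is left running 24 hours a day at the hourly prices mentioned in the comment (5 and 30 dollars). The function name is only illustrative.

```python
HOURS_PER_DAY = 24


def daily_cost(hourly_rate_usd: float, hours: float = HOURS_PER_DAY) -> float:
    """Cost of keeping one instance running for the given number of hours."""
    return hourly_rate_usd * hours


entry_level_per_day = daily_cost(5)   # single 12GB GPU instance, as quoted
larger_per_day = daily_cost(30)       # the larger configuration, as quoted

print(entry_level_per_day, larger_per_day)
```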

felipe_aramburu · on Oct 15, 2016

You could try BlazingDB if what you are interested in is the GPU-powered SQL component. There is a free community edition available here: https://docs.blazingdb.com/docs/quickstart-guide-to-blazingd...

You can install this on AWS or on your own infrastructure (I run this on my laptop for example).

tmostak · on Oct 15, 2016

MapD has a persistent store and normally customers would keep that on an EBS volume, so they don't have to reload their data every time they spin up an AWS instance.

ruw1090 · on Oct 16, 2016

very interesting. Do you have a link to documentation? I'd like to take a look before I try it out.

vegabook · on Oct 15, 2016

fair enough, but a software cost of 3-4x the (already high) hourly hardware cost seems excessive.

wingless · on Oct 15, 2016

If you find this too pricey, then build a cheaper or free competitor.

jtsymonds · on Oct 15, 2016

Look at the coloring around the rides near bridges. People take the subway down to the closest point and then take a cab home. The hybrid trip is both pocketbook friendly and probably faster.