> When we launched TimescaleDB, we met a fair amount of skepticism.... The top v...

kirse · on May 5, 2021

Having been on HN long enough, what I look for during any idea/startup launch is polarization and intensity of viewpoints. If people are reacting to the idea (for better or worse), it means it's had an impact. Those are often the products that find success. A no-comment launch is far worse than one riddled with criticism.

IMO HN's classic "skepticism" is usually just engineering nerd insecurity projected outwards, with enough techno-jargon to maintain plausible deniability. Folks feel threatened by a great idea so it's safer to find some way to tear it down. Not to dismiss all feedback as projected insecurity of course.

andrenotgiant · on May 5, 2021

I've been working on a rubric for evaluating HN reaction to "Show HN" launch posts:

1. Universally Negative - Either it's cryptocurrency-related, or it depends on source of negativity:

   A. "I read the site and I don't know what this is" - Genuinely bad explanation of an idea that doesn't seem particularly technically interesting or challenging.

   B. Criticism of superficial aspects (e.g. website, related topics) - Genuinely bad explanation of an idea that DOES seem particularly technically interesting or challenging. _(Commenters don't get the message, but are worried they'll appear ignorant if they say it.)_
   
   C. "Nobody needs this" "Why is this a thing" - Either bad or HN is nowhere near the target audience.

   D. "This is not the right way to do it" "You can just do X" - Either bad or revolutionary (and new enough that the idea hasn't clicked with anyone.)

2. Polarization -

    A. If positive people are REALLY positive about it - potentially a disruptive technology, potentially ahead of its time.

    B. If negative people say it's actually much harder to solve - the idea is great in principle but the only reason it hasn't already been solved is it's not possible or very difficult in practice.

3. Universal Adulation - It will transparently never make any money, it is some kind of attempt at decentralization that will never get adoption beyond hardcore nerds.

manigandham · on May 5, 2021

Your comment sounds like wild projection in itself. Most skepticism is based on wisdom and experience gained over years of working in the industry and noticing the patterns of 100s of past companies and projects.

Timescale when it first launched was little more than an automatic-sharding extension for Postgres with some convenience functions for handling time data. It was competing with Postgres itself which added native partitions, other sharding extensions like Citus, and an entire class of column-oriented relational databases that have become much more capable.

Timescale today is very different and has added a lot of the missing functionality to make it a very attractive database option, especially the columnstore/compression feature mentioned in that first HN comment.

etaioinshrdlu · on May 5, 2021

I'm still confused why time-series databases are even a thing. It seems to me that time-series just means you have a date/time column plus an index on it. Which is something typical databases already do well, and like the referenced post mentioned, you could use a column store for better performance.

But I just don't see anything that makes creating an entire database design for one specific index type worthwhile...

I index many tables on my site by num_upvotes so I can find the top ranked items to show. Does this mean that I need an UpvoteDB? I don't think so.

A previous time I argued this point, it was mentioned that you rarely need to update or delete old rows. This allows you to tailor the storage solution better. However, this basically means a compressed column store, which again, doesn't really have much to do with time.

cookguyruffles · on May 5, 2021

The internals are completely different. Given the collection of software technologies we possess today, you can't assemble them around a database using a row-oriented encoding and come up with something that can outperform (in space, time and cost) a column-oriented encoding on the kinds of query styles that column-oriented encodings absolutely murder.

Logically they're the same thing, but engineering is about details, details in this case that could easily be a 2x to 20x budget difference given an appropriate project.

A column store can take 100 years worth of samples occurring every 10ms that yield a constant result and using technology we actually have, represent those ~87 million data points on disk and in CPU using somewhere under 10 bytes.
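
A minimal sketch of why a constant series shrinks to almost nothing, assuming the column is run-length encoded (one common columnar encoding; the comment doesn't name a specific scheme). The function names and the 21.5 reading are made up for illustration.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# One hour of a constant reading sampled every 10 ms (100 samples per second).
samples = [21.5] * (100 * 3600)
encoded = rle_encode(samples)

print(len(samples), "samples ->", len(encoded), "run(s):", encoded)
```

However long the constant run gets, the encoded form stays a single value plus a counter, which is how the on-disk size can stay at a handful of bytes.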

whimsicalism · on May 5, 2021

But there are plenty of non-"time series DB" that are column oriented, maria, monet, etc.

gautamcgoel · on May 5, 2021

That's way more than 87M samples, more like 300B.
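
Quick check of the arithmetic, assuming 365-day years (leap days ignored) and exactly one sample every 10 ms:

```python
SAMPLES_PER_SECOND = 1000 // 10          # one sample every 10 ms
SECONDS_PER_YEAR = 365 * 24 * 60 * 60    # leap days ignored
YEARS = 100

total_samples = YEARS * SECONDS_PER_YEAR * SAMPLES_PER_SECOND
print(f"{total_samples:,}")  # 315,360,000,000
```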

cookguyruffles · on May 5, 2021

Oops :) You're right, fat fingered some quick calc

ironman1478 · on May 5, 2021

Certain time series databases tend to be optimized towards making the most recent data readily available and quick to fetch. There are also certain filtering / compression algorithms that are run on these time series databases that only make sense in a time domain.
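
As one example of a compression scheme that leans on the time domain, here is a rough sketch of delta-of-delta encoding for timestamps. Treat the choice of algorithm as an assumption (the comment doesn't say which ones it means), and the function names are hypothetical. With a steady sampling interval, nearly every second-order delta is zero, which a later bit-packing stage can store very cheaply.

```python
def delta_of_delta_encode(timestamps):
    """Return (first timestamp, first delta, second-order deltas)."""
    if not timestamps:
        return None, None, []
    if len(timestamps) == 1:
        return timestamps[0], None, []
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    second_order = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], second_order


def delta_of_delta_decode(first, first_delta, second_order):
    """Rebuild the original timestamps from the encoded triple."""
    if first is None:
        return []
    if first_delta is None:
        return [first]
    timestamps = [first, first + first_delta]
    delta = first_delta
    for change in second_order:
        delta += change                       # recover the next first-order delta
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# Millisecond timestamps sampled every 10 ms, with one sample arriving 1 ms late.
ts = [1000, 1010, 1020, 1031, 1040, 1050]
encoded = delta_of_delta_encode(ts)
print(encoded)  # (1000, 10, [0, 1, -2, 1])
```

The same trick does little for a column like num_upvotes, where neighbouring rows have no regular spacing to exploit.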

Also, some of these time series databases have very specific use cases and you have to also think about the client tools associated with the database. Many of these databases sit in power plants, factories, etc. and they stream data to tools that are built to visualize or analyze the last few minutes of data and then trigger alerts based on patterns. Also, these databases are very "device" aware and integrate with other systems that already represent their data in a timeseries fashion (like a sensor). A lot of customers who needed this type of database care only about this index because their concern is record keeping and monitoring, not necessarily number crunching (this is changing though).

There are drawbacks to storing your data this way. If your primary index is time, it can be hard to merge that with one based on a coordinate system. So doing certain types of analysis is really difficult unless you replicate your data into some other database with a different index.

pgwhalen · on May 5, 2021

This is a thought exercise I've done myself, and your questions will mostly be answered by looking at the features (https://docs.timescale.com/api/latest/) that TimescaleDB provides.

> However, this basically means a compressed column store, which again, doesn't really have much to do with time.

It does though: which data do you compress? The old data. Why not let the database figure that out for you, so you don't specifically have to tell it.

Other features include:

- Continuous Aggregates: a materialized view aggregating data over time is doable, but why not let the database materialize it for you, and automatically fall back to an un-materialized query for the newest data?

- Retention: deleting (or downsampling) old data is easy to do on your own, but why not let the database do it for you according to a policy?
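
To make those two features concrete, here is a toy in-memory model. It is only a sketch of the idea, not how TimescaleDB implements either feature; the class name, the 60-second bucket width, and the method names are all made up. Closed buckets are materialized once, queries fall back to raw rows for the newest bucket, and retention drops raw rows that are both past the cutoff and already materialized.

```python
BUCKET = 60  # bucket width in seconds (hypothetical)


def bucket_start(ts):
    """Round a timestamp down to the start of its bucket."""
    return ts - ts % BUCKET


class ContinuousAverage:
    """Per-bucket averages, materialized up to a watermark."""

    def __init__(self):
        self.rows = []          # raw (timestamp, value) rows
        self.materialized = {}  # bucket start -> (sum, count)
        self.watermark = 0      # rows with ts < watermark are materialized

    def insert(self, ts, value):
        # Simplification: rows arriving late (ts < watermark) are never aggregated.
        self.rows.append((ts, value))

    def refresh(self, now):
        """Materialize every bucket that closed before `now`."""
        new_watermark = bucket_start(now)
        for ts, value in self.rows:
            if self.watermark <= ts < new_watermark:
                total, count = self.materialized.get(bucket_start(ts), (0.0, 0))
                self.materialized[bucket_start(ts)] = (total + value, count + 1)
        self.watermark = max(self.watermark, new_watermark)

    def query(self):
        """Materialized buckets plus a live aggregate over the newest raw rows."""
        totals = dict(self.materialized)
        for ts, value in self.rows:
            if ts >= self.watermark:
                total, count = totals.get(bucket_start(ts), (0.0, 0))
                totals[bucket_start(ts)] = (total + value, count + 1)
        return {b: total / count for b, (total, count) in sorted(totals.items())}

    def apply_retention(self, now, keep_seconds):
        """Drop raw rows older than the policy, but only if already materialized."""
        cutoff = min(now - keep_seconds, self.watermark)
        self.rows = [(ts, value) for ts, value in self.rows if ts >= cutoff]


agg = ContinuousAverage()
for ts, value in [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (125, 9.0)]:
    agg.insert(ts, value)

agg.refresh(now=130)                           # buckets 0 and 60 are closed
agg.apply_retention(now=130, keep_seconds=60)  # raw rows before ts=70 are dropped
result = agg.query()
print(result)  # {0: 2.0, 60: 6.0, 120: 9.0}
```

All of this is doable by hand with a materialized view and a cron job; the point in the list above is that the bookkeeping (watermark, fallback, cutoff) is the part you'd rather the database own.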

hetspookjee · on May 5, 2021

If I recall correctly TimescaleDB is mostly some extension functions for Postgres, with indeed some specific indices that vastly speed up some often-used insert & lookup queries. You can also just extend it with PostGIS for those really fancy schmancy geographically oriented time series queries. Pretty neat stuff running out of the box. Here's the docker implementation: https://hub.docker.com/r/timescale/timescaledb-postgis/

jandrewrogers · on May 5, 2021

Everything works reasonably well in a relational database if your data is small. As you scale up, the performance will fall off a cliff for any data model that the internals of the database kernel were not specifically designed for. No relational database kernel is optimized for time-series data models, so poor performance is just a matter of scale.

whimsicalism · on May 5, 2021

> IMO HN's classic "skepticism" is usually just engineering nerd insecurity projected outwards

HN is addicted to bikeshedding. It's among the top 3 comments on almost every "Show HN" or new product launch.

mrweasel · on May 5, 2021

There’s also a tendency to think: “I don’t need this, so neither does anyone else”. I know I'm guilty of applying that logic more times than I’d like.

ksec · on May 5, 2021

Similar to Dropbox.

https://news.ycombinator.com/item?id=8863

slashdev · on May 5, 2021

I think it goes to show how impossible it is to judge an idea. YC itself doesn't pretend to do this with the ~15,000 applications they go through each cohort. They try instead to look at the team, look at their progress, imagine what would need to happen for the company to succeed at the level required for them to get the returns they seek.

Founders should have a thick skin when it comes to criticism on HN, because we don't know either.

ignoramous · on May 5, 2021

To be fair, the skepticism wasn't without merit given the lengths TimescaleDB goes to make timeseries work. From their blog entries [0][1], it is evident that they essentially shoe-horn techniques from columnar stores like Apache Druid / Kudu, and file-types like Apache ORC / Parquet into Postgres' row-based data-model. Reminds me of BigTable / HBase, in a way, too.

TimescaleDB's biggest feat here is of course pulling the engineering magic rabbit out of the hat by chipping away at it for 4+ years, and effectively answering the skepticism by delivering on their promise.

Note though, Amazon Redshift is built on Postgres, and (allegedly) so is Amazon Timestream.

[0] https://blog.timescale.com/blog/building-columnar-compressio...

[1] https://blog.timescale.com/blog/time-series-compression-algo...

slver · on May 5, 2021

Technically you don't need a good idea to raise $40 million these days. /s

To me, however, building upon PostgreSQL was a good idea. Long term, all databases gain relational features, through a rather painful process of realizing that RDBMSs actually did some things right. They'll skip that pain and focus on new features.

Microsoft does something similar by offering a Graph DB on top of MS SQL.

Xcelerate · on May 5, 2021

I’m starting to think the best way to succeed as a startup is to get a bunch of negative comments when you announce your work on HN.

ForHackernews · on May 5, 2021

Just because it's a bad idea technically doesn't mean it can't be a business success. VHS beat Betamax.

dlevine · on May 5, 2021

Betamax was slightly technically superior, but the reason it lost was that Sony initially limited recording times to 1 hour. This meant that most movies required 2 Betamax tapes, vs 1 for VHS. Betamax players were also much more expensive.

The point is that a lot of "bad ideas" have something major going for them.

gsich · on May 5, 2021

Because VHS was better in multiple regards.