zX41ZdbW's comments (Hacker News)


Two problematic statements in this article:

1. A test pass rate of 99.98% is not good; the only acceptable rate is 100%.

2. Tests should not be quarantined or disabled. Every flaky test deserves attention.


A test pass rate of 100% is a fairy tale. Maybe achievable on toy or dormant projects, but real-world applications that make money are a bit messier than that.

We do have a 100% pass rate on our tests most of the time (in master, of course). By "most of the time" I mean that on any given day, you should be able to run the CI pipeline 1000 times and it would succeed in all of them, never finding a flaky test in one or more runs.

In the rare case that one is flaky, it's addressed. On the days when there is a flaky test, of course you don't have a 100% pass rate, but on those days fixing it is a top priority.

But importantly: this is library and thick-client code. It should be deterministic. There are no DB locks, Docker containers, network timeouts or similar involved. I imagine that in tiered application tests you always run the risk of various layers not cooperating. Even worse if you involve any automation/UI in the mix.

Obviously there are systems it depends on (source control, package servers) which can fail, failing the build. But that's not a _test_ failure.

If the build fails, it should be because a CI machine or a service the build depends on failed, not because an individual test randomly failed due to a race condition, timeout, test run order issue or similar.


If one is flaky, then you are below 100%, friend.

That's not what I mean. I mean that anything but 100% is a "stop the world this is unacceptable" kind of event. So if there is a day when there is a flaky test, it must be rare.

To explain further:

There is a difference between having 99.99% of tests pass every day (unacceptable), which works out to 99.99% of tests passing for the year, versus having 100% of tests passing on 99% of days and 99% on a single bad day. That might also give a 99.99% pass rate for the year, but here you were productive on 99 out of 100 days. So "100% is the normal" is what I mean. Not that it's 100% pass on 100% of days.
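The arithmetic, with hypothetical numbers (a 10,000-test suite over 100 working days — a sketch, not anyone's actual figures), looks like this:

```python
# Hypothetical numbers: a 10,000-test suite over 100 working days.
n_tests, n_days = 10_000, 100

# Regime A: 99.99% pass every day -> 1 test fails daily, no day fully green.
passed_a = n_days * (n_tests - 1)          # 999,900 test passes over the period
green_days_a = 0

# Regime B: 100% on 99 days, 99% on one bad day (100 tests fail that day).
passed_b = 99 * n_tests + (n_tests - 100)  # also 999,900
green_days_b = 99

assert passed_a == passed_b == 999_900  # same overall pass rate...
print(green_days_a, green_days_b)       # ...but 0 vs 99 fully green days
```

Same aggregate pass rate, completely different day-to-day productivity.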

Having 99.98% of tests pass on any random build is absolutely terrible. It means a handful of tests out of your test suite fail on almost _every single CI run_. If you have 100% test pass as a validation for PRs before merge, that means you'll never merge. If you have 100% test pass as a validation to deploy your main branch, that means you'll never deploy...
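To put numbers on it (a hypothetical 10,000-test suite, treating test failures as independent — both assumptions mine, not from the article):

```python
# Hypothetical 10,000-test suite with the article's 99.98% per-test pass rate.
n_tests, pass_rate = 10_000, 0.9998

# On average, 2 tests fail on every single run:
expected_failures = n_tests * (1 - pass_rate)

# Assuming independent failures, the chance an entire run is green:
p_all_green = pass_rate ** n_tests

print(round(expected_failures, 1), round(p_all_green, 3))  # 2.0 0.135
```

So under these assumptions only about one run in seven comes back fully green, and any gate that requires an all-green run blocks almost every merge or deploy.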

You want 100% pass on 99% of builds. Then it doesn't matter if 1% or 99% of tests pass on the occasional bad build, so long as you have some confidence that "almost all builds pass entirely green".


"most of the time" != 100% pass rate

Read my other response. It's about having 100% be the normal. There is a difference between having 99.99% all of the time, and having 100% nearly all of the time with 99% on rare occasions.

So "100% most of the time" actually makes sense, and is probably as good as you might hope to get on a huge test suite.


When I was at Microsoft my org had a 100% pass rate as a launch gate. It was never expected that you would keep 100% but we did have to hit it once before we shipped.

I always assumed the purpose was leadership wanting an indicator that implied that someone had at least looked at every failing test.


Even something as simple as docker pull fails for 0.02% of the time.
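A common mitigation for such transient infrastructure failures is to retry them with a short backoff before failing the build. A hedged sketch (the command and attempt counts are illustrative, not anyone's actual CI config):

```python
import subprocess
import time

def retry(cmd, attempts=3, delay=1.0):
    """Run cmd, retrying up to `attempts` times with a fixed delay.

    Returns True if any attempt exits with status 0, else False.
    """
    for attempt in range(1, attempts + 1):
        if subprocess.run(cmd).returncode == 0:
            return True
        if attempt < attempts:
            time.sleep(delay)
    return False

# Example (hypothetical image tag):
# retry(["docker", "pull", "alpine:3.19"])
```

With a 0.02% per-attempt failure rate and independent attempts, three tries push the residual failure rate down to roughly 0.0002^3, i.e. negligible.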


On top of 2., new tests should be stress-tested to make sure they aren't flaky, so that the odds of merging a flaky test go down.

I can run flaky tests on my machine a thousand times without failure, whereas they fail in CI sometimes.

Yes, that's why you need to stress test in CI.
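A minimal sketch of what such a CI stress run could look like, assuming a hypothetical ./run_test.sh wrapper around your test runner (the script name and run count are illustrative):

```python
import subprocess

def stress_test(cmd, runs=100):
    """Run the given test command `runs` times, failing fast on the first flake."""
    for i in range(1, runs + 1):
        if subprocess.run(cmd).returncode != 0:
            print(f"flaked on run {i} of {runs}")
            return False
    print(f"passed {runs}/{runs} runs")
    return True

# Hypothetical invocation for a newly added test:
# stress_test(["./run_test.sh", "my_new_test"])
```

Running this in the CI environment (rather than on a developer machine) matters because, as noted above, flakes often only reproduce under CI load and timing.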

Similarly, this is how it was introduced in ClickHouse in 2019: https://github.com/ClickHouse/ClickHouse/pull/4774


I once posted an interesting visualization to r/dataisbeautiful and received hundreds of upvotes, but then it was wiped for an unknown reason. I contacted the mods, both on Reddit and over email, to no avail.

It was a very frustrating experience; I don't recommend anyone try it.


Though ClickHouse is not limited to a single machine or local data processing. It's a full-featured distributed database.


Another alternative is Exasol, which is many times (>10x) faster than ClickHouse and scales much better for complex analytics workloads that join data. There is a free edition for personal use, without a data limit, that can run on any number of cluster nodes.

If you just want to read and analyze single-table data, then ClickHouse or DuckDB are perfect.

Disclaimer: I work at Exasol.


When a single server is not enough, you deploy ClickHouse on a cluster, up to thousands of machines, e.g., https://clickhouse.com/blog/how-clickhouse-powers-ahrefs-the...


This is good news.

I had been trying to add Exasol to ClickBench (https://github.com/ClickHouse/ClickBench/) since 2016, but it was not possible due to its limitations and the fact that it required using a custom virtual machine image.

Now we should try it again...


This is bad news: it is not usable:

> 5.3 Licensee may not disclose any benchmarking or results of evaluating the Software without Exasol's prior written consent


This seems like a leftover from the old enterprise licenses. I will see if we can get that changed.

We'll be happy to be part of ClickBench. Reach out to me and we can work together to make it happen.


I would also like to have something like this, but for "vintage" links - something that looks like it was from the late 90s.

I use them in tests, just for fun: https://github.com/ClickHouse/ClickHouse/blob/master/tests/q...


There was a "shadyurl". The site itself seems to be long gone, but this'll give you some context: https://www.mikelacher.com/work/shady-url/


There's an example shadyurl link in here: https://news.ycombinator.com/item?id=14628529

Funnily enough the domains appear to have been bought up and are now genuinely shady.


GPU databases can run a small subset of production workloads under a narrow combination of conditions.

There are plenty of GPU databases out there: MapD/OmniSci/HeavyDB, AresDB, BlazingSQL, Kinetica, BrytlytDB, SQream, Alenka, ... Some of them are very niche, and the others are not even usable.


The query tab looks quite complex with all these content shards: https://hackerbook.dosaygo.com/?view=query

I have a much simpler database: https://play.clickhouse.com/play?user=play#U0VMRUNUIHRpbWUsI...


Does your database also run offline/locally in the browser? That seems to be the reason for the large number of shards.


You can run it locally, but it is a client-server architecture, which means that something has to run behind the browser.

