At Estuary, we’re creating a real-time data streaming platform that doesn’t rely on Kafka and uses JSON as the primary data format, stored in object storage. Many people are interested in how we achieve millisecond-level latency in our data streams, so we will be publishing a series of articles on this topic!
Imagine a system monitoring payment transactions. Each transaction stream (e.g., purchase events) could be joined with customer account data (e.g., past purchasing patterns or blacklist flags). Streaming joins make it possible to flag potentially fraudulent transactions by leveraging live context, as in the sketch below.
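Here is a minimal Python sketch of that idea, not Estuary's implementation: each incoming transaction is joined against materialized customer state before a fraud rule is applied. The `accounts` table, the `flag_fraud` rules, and all field names are illustrative assumptions.

```python
# Illustrative stream-table join for fraud flagging (not Estuary's code).
# `accounts` stands in for materialized customer state; `transactions`
# stands in for the live purchase stream.

accounts = {
    "cust-1": {"avg_purchase": 40.0, "blacklisted": False},
    "cust-2": {"avg_purchase": 25.0, "blacklisted": True},
}

transactions = [
    {"customer_id": "cust-1", "amount": 42.50},
    {"customer_id": "cust-2", "amount": 10.00},
    {"customer_id": "cust-1", "amount": 900.00},
]

def flag_fraud(txn, account):
    # Hypothetical rules: unknown or blacklisted customer, or an amount
    # far above that customer's historical norm.
    if account is None:
        return True
    return account["blacklisted"] or txn["amount"] > 10 * account["avg_purchase"]

for txn in transactions:
    account = accounts.get(txn["customer_id"])  # the join against live context
    if flag_fraud(txn, account):
        print("FLAGGED:", txn)
```

In a real deployment the `accounts` dict would be continuously updated state derived from another stream, but the join-then-evaluate shape is the same.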
Gazette is at the core of Estuary Flow (https://estuary.dev), a real-time data platform. Gazette's architecture is simpler to reason about and operate than Kafka's. It plays well with k8s and is backed by S3 (or any object storage).
With the two-pass strategy, we can write arbitrarily large row groups while using a fixed amount of memory: probably 100-200 MiB of overhead for the Parquet file processing, depending on how large the metadata is for the scratch file. Without the two-pass strategy, memory usage is proportional to the size of the row group.
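A conceptual sketch of why the memory stays bounded, under my assumptions about the mechanics (this is not the actual Parquet encoder): pass 1 spills each column's data to its own scratch segment as rows stream in, so only small per-column buffers are ever resident; pass 2 concatenates the spilled column chunks in column-major order, which is the layout a row group needs, copying in fixed-size blocks. All names and the toy "encoding" are illustrative.

```python
# Two-pass row-group write with O(1) memory (conceptual stand-in, not real
# Parquet encoding). Memory is bounded by FLUSH_THRESHOLD per column plus
# the copy block in pass 2, regardless of row group size.

import os
import tempfile

FLUSH_THRESHOLD = 64 * 1024  # flush per-column buffers at 64 KiB (illustrative)

def write_row_group(rows, columns, out_path):
    # Pass 1: stream rows, appending each column's values to its own
    # scratch file. Only the small in-memory buffers are held at once.
    scratch = {c: tempfile.TemporaryFile() for c in columns}
    buffers = {c: bytearray() for c in columns}
    for row in rows:
        for c in columns:
            buffers[c] += (repr(row[c]) + "\n").encode()  # toy "encoding"
            if len(buffers[c]) >= FLUSH_THRESHOLD:
                scratch[c].write(buffers[c])
                buffers[c].clear()
    for c in columns:
        scratch[c].write(buffers[c])

    # Pass 2: stitch the column chunks together in column-major order.
    # Copying in fixed-size blocks keeps this pass O(1) in memory too.
    with open(out_path, "wb") as out:
        for c in columns:
            scratch[c].seek(0)
            while block := scratch[c].read(1 << 20):
                out.write(block)
            scratch[c].close()

rows = ({"id": i, "name": f"row-{i}"} for i in range(100_000))
write_row_group(rows, ["id", "name"], "rowgroup.bin")
print(os.path.getsize("rowgroup.bin"), "bytes written")
```

The single-pass alternative would have to buffer every column of the whole row group in memory before it could emit the column-major layout, which is exactly the proportional cost described above.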
I wrote git-genie to automate commit message writing with GPT & pre-commit hooks; it works surprisingly well (most of the time) - https://github.com/danthelion/git-genie
Always interesting to hear how others do things. There are only a handful of usernames that I know, but when I do recognize them it matters because (for example) they are the author/creator of a tool. For example, just a couple of minutes ago, reading the Elixir post[1], I saw that the comments from josevalim are from the creator of the language and the author of TFA. Not knowing that would, I think, make the thread read very differently (IMHO in a bad way), but I can definitely see the appeal of anonymizing at other times.
Maybe the extension should also randomize the ordering of comments, since otherwise I can infer that the top comment is more likely to be from someone influential in the space.