Agreed, I like `ingest` as well. It does somewhat violate principle d), but the other options violate more of them. And to your point, they're principles, not rules.
Prequel | Staff Frontend / Full Stack Engineer / SWE Intern | ONSITE in New York City | Full Time | https://prequel.co
- Prequel is the customer data access platform. We enable SaaS companies like LaunchDarkly, Gong, and LogRocket to make data accessible to their customers.
- Transferring trillions of rows between data stores every month.
- Revenue is up 50% since Jan 1.
- We're launching a new product which is quite frontend heavy, and want to bring a pro onboard who can act as tech lead for it. This is not your standard CRUD app -- a big part of it is building complex SDKs and APIs used by engineers at top-tier companies.
- Frontend in Typescript/React, backend in Go.
- Team is stacked, with alums from Stripe, GIPHY, Google, Flatiron Health, and more.
Disagree on the data silo issue. There's a growing trend of SaaS providers making data available to their customers by feeding it back into the customers' data warehouses. It started with the likes of Segment and Heap, and has now grown to include companies like Stripe, Salesforce, and Zuora, to name a few. I'd wager that making data accessible is only going to become table stakes over time.
Why wouldn't you simply use SQLite (or some other in-memory flavor of SQL) instead of hacking the main Postgres db and adding load to the primary instance?
The author makes a valid point that there's something nice about using familiar tooling (including the SQL interface) for a cache, but it feels like there are better solutions.
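For concreteness, the approach under discussion boils down to something like the following (a rough sketch in Go; the table layout, column names, and the UNLOGGED choice are illustrative assumptions on my part, not details quoted from the post):

```go
// Sketch of a Postgres-backed cache: one table keyed by a string, with a TTL
// column. UNLOGGED skips WAL writes, a common choice for cache tables, but
// it's an assumption here rather than something the post specifies.
package pgcache

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // Postgres driver
)

const schema = `
CREATE UNLOGGED TABLE IF NOT EXISTS cache (
    key        text PRIMARY KEY,
    value      text NOT NULL,
    expires_at timestamptz NOT NULL
)`

type Cache struct{ db *sql.DB }

// Set upserts a value with a TTL.
func (c *Cache) Set(key, value string, ttl time.Duration) error {
	_, err := c.db.Exec(`
		INSERT INTO cache (key, value, expires_at) VALUES ($1, $2, $3)
		ON CONFLICT (key) DO UPDATE
		SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at`,
		key, value, time.Now().Add(ttl))
	return err
}

// Get returns the value if it exists and hasn't expired yet.
func (c *Cache) Get(key string) (string, bool, error) {
	var value string
	err := c.db.QueryRow(
		`SELECT value FROM cache WHERE key = $1 AND expires_at > now()`, key,
	).Scan(&value)
	if err == sql.ErrNoRows {
		return "", false, nil
	}
	return value, err == nil, err
}
```

The appeal is the familiar tooling and plain SQL for introspection; the cost is a network round trip per lookup and write load on a Postgres instance, which is what the rest of the thread is arguing about.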
Because SQLite is in-process. Usually, by the time you start thinking about a cache, you have more than one application server. Each application server running its own cache makes cache invalidation a nightmare (I worked at a company where one genius did that and caused a lot of trouble). And don't point me at any of the SQLite replication projects; that's not how you want your cache to work.
My issue with running PostgreSQL as a cache would be its process-per-connection model and the downsides of MVCC for a cache workload.
> hacking the main Postgres db and adding load to the primary instance
nobody said this had to be on the primary database server, and how is this hacking?
Is every app server going to have its own local "sqlite cache"? Or is it going to use one of the sqlite server/replication things? So why not just use PG?
That's a bit of a strawman. Per the post, you can't leverage this on a read replica; it has to run on the primary. So you're going to stand up and manage a whole new Postgres instance just for this?
I'm sure there are many cases where that makes sense, but there are also many where it's overkill. An in-memory cache inside your server will give you better performance and a lot less infrastructure to maintain.
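For comparison, the in-process option is a few dozen lines of Go with no extra infrastructure at all (a hypothetical sketch, not code from the thread); the catch is exactly the one raised above: each server holds its own copy, so cross-server invalidation is on you.

```go
// Minimal in-process cache with TTL. No network hop, no extra service to run,
// but every application server has its own independent copy of the data.
package memcache

import (
	"sync"
	"time"
)

type entry struct {
	value     string
	expiresAt time.Time
}

type Cache struct {
	mu    sync.RWMutex
	items map[string]entry
}

func New() *Cache {
	return &Cache{items: make(map[string]entry)}
}

// Set stores a value that expires after ttl.
func (c *Cache) Set(key, value string, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expiresAt: time.Now().Add(ttl)}
}

// Get returns the value if present and not yet expired.
func (c *Cache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.items[key]
	if !ok || time.Now().After(e.expiresAt) {
		return "", false
	}
	return e.value, true
}
```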
Prequel | https://prequel.co | Senior/Staff Software Engineer | Full Time | GoLang, Postgres, Typescript, React, K8s | $150k-$200k + equity | ONSITE in NYC
Prequel is an API that makes it easy for B2B companies to sync data directly to their customers' data warehouses, on an ongoing basis.
We're solving a number of hard technical problems that come with syncing tens of billions of rows of data every day with perfect data integrity: building reliable & scalable infrastructure, making data pipelines manageable without domain expertise, and creating a UX that abstracts out the underlying complexity to let the user share or receive data. We're powering this feature at companies like LogRocket, Modern Treasury, Postscript, and Metronome.
I don't think there's necessarily anything there. Microsoft might be burning money because they've decided that browser adoption and usage is worth it to them. It doesn't have to involve OpenAI in any way.
Prequel | https://prequel.co | Senior Software Engineer | Full Time | GoLang, Postgres, Typescript, React, K8s | $150k-$180k + equity | ONSITE in NYC
Prequel is an API that makes it easy for B2B companies to sync data directly to their customers' data warehouses, on an ongoing basis.
We're a tiny team of four engineers based in NYC. We're solving a number of hard technical problems that come with syncing tens of billions of rows of data every day with perfect data integrity: building reliable & scalable infrastructure, making data pipelines manageable without domain expertise, and creating a UX that abstracts out the underlying complexity to let the user share or receive data. We're powering this feature at companies like LogRocket, Modern Treasury, Postscript, and Metronome.
Our stack is primarily K8s/Postgres/DuckDB/Golang/React/Typescript, and we support deployments in both our public cloud and our customers' clouds. Due to the nature of the product, we work with nearly every data warehouse product and most of the popular RDBMSs.
We're looking for a full stack engineer who can run the gamut from CI to UI. If you're interested in scaling infrastructure, distributed systems, developer tools, or relational databases, we have a lot of greenfield projects in these domains. We want someone who can humbly, but effectively, help us keep pushing our level of engineering excellence. We're open to those who don't already know our stack but have the talent and drive to learn.
The "how do we make money" section on their website is interesting.
> We have not made any decisions about how we may charge for the product in the future. That said, we believe your personal AI should always be directly aligned to your interests. We therefore think it's crucial that you are the only person who pays for it, so that will likely be our primary default business model. However, it’s still early days for this new technology. We also recognize that some people would rather access a free service and would prefer to see adverts in return.
I'm sympathetic to the idea that startups need to iterate on their business model to be successful. At the same time, this sounds a whole lot like "we promise that our business model doesn't rely on selling your data, unless we decide otherwise."