Hacker News

Yes, that's also what I think must have happened: missing 2FA. But it seems it's not mandatory within Snowflake accounts either, from what I understand from the general message.

I've never used Snowflake and assumed that because you push all your data into it, it probably has 2FA enabled by default. Is it optional?



I don't quite understand how Snowflake works.

My understanding was that you had to grant storage access (e.g. S3) and compute access (e.g. EC2) from your account to Snowflake, which would then use said resources to perform queries that you issue from their hosted web UI.

In that case it would mean stealing the Snowflake demo account of an SE should not expose your data unless you forgot to revoke their access to your underlying resources.

Can someone explain if that is how it works?


I always wondered why Snowflake doesn't just install a control plane on customers' own cloud resources, a la Databricks. Seems like they'd be able to mitigate a lot of liability that way.


No, Snowflake runs its own storage and compute (on either AWS, GCP, or Azure, depending on what you pick).


You can definitely bring your own storage (e.g. store your data in your own S3 buckets and integrate it with Snowflake) using storage integrations and external tables.

See https://docs.snowflake.com/en/user-guide/data-load-s3-config...

Personally I believe this is the right approach, as the data resides in a location fully under the company's control. You could ditch Snowflake and the data would still reside in your S3 buckets for reuse with another platform (just remove the IAM permissions for Snowflake).
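To make the "just remove the IAM permissions" point concrete, here is a minimal sketch of the kind of S3 bucket policy a storage integration relies on. The role ARN and bucket name are hypothetical; the exact role Snowflake assumes comes out of its storage-integration setup, so treat this purely as an illustration of where the revocation lever lives.

```python
import json

# Hypothetical names for illustration only.
SNOWFLAKE_ROLE_ARN = "arn:aws:iam::111122223333:role/snowflake-integration-role"
BUCKET = "my-company-data-lake"

# A minimal bucket policy granting Snowflake's integration role read access.
# Deleting this statement cuts Snowflake off while the data stays put in
# your own bucket.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSnowflakeRead",
            "Effect": "Allow",
            "Principal": {"AWS": SNOWFLAKE_ROLE_ARN},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
        }
    ],
}

print(json.dumps(bucket_policy, indent=2))
```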


Yes, federated queries (external tables) are supported, but that is a lot slower than ingesting the data into Snowflake's storage and querying it there. Since Snowflake's pricing model is based on computation time, querying external tables is usually more costly because of the worse performance.


And the network ingress/egress costs are higher for cross account/region transfers.


So the customer data is actually stored on Snowflake's AWS accounts?

What difference does it make what underlying storage/provider it uses, then?

Also, does that mean every data query to Snowflake goes out to / in from the internet, incurring egress/ingress costs?


> So the customer data is actually stored on Snowflake's AWS accounts?

Yes.

> Also, does that mean every data query to Snowflake goes out to / in from the internet, incurring egress/ingress costs?

Yes. It's covered comprehensively in their docs, along with the caveats.

> What difference does it make what underlying storage / provider it uses then?

"Snowflake does not charge data ingress fees. However, a cloud storage provider might charge a data egress fee for transferring data from the provider to your Snowflake account."

unsaid: "...and you have to pay for that".

Note that when they say 'your Snowflake account', they mean: our cloud account, which we own and run our workloads in, but which we refer to as 'your' Snowflake account.

Tangibly speaking, what that means is that if you want to check up on your billing, you go through Snowflake; you can't log in to a cloud console and see the actual charges the cloud vendor is levying.

> What difference does it make what underlying storage / provider it uses then?

They pass the specific underlying cloud vendor costs on to you (with, I guess, some markup, though you have no way of knowing what that is :)
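For a rough sense of what those pass-through transfer charges look like, here is a back-of-envelope sketch. The per-GB rate is an illustrative assumption (AWS inter-region transfer rates are commonly around $0.02/GB; check your provider's current price list), not a quoted Snowflake price.

```python
# Illustrative assumption: ~$0.02/GB for cross-region data transfer.
EGRESS_RATE_PER_GB = 0.02

def egress_cost(gb_transferred: float, rate: float = EGRESS_RATE_PER_GB) -> float:
    """Estimated transfer cost in dollars for moving data across accounts/regions."""
    return gb_transferred * rate

# Loading 5 TB from a bucket in another region:
print(f"${egress_cost(5_000):.2f}")  # 5,000 GB * $0.02/GB = $100.00
```

Trivial arithmetic, but it shows why keeping your buckets in the same region as your Snowflake deployment matters once volumes grow.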


Ironically this model most resembles Teradata, which used to sell their own proprietary hardware/software combination at exorbitant rates.

Snowflake compute instance types cost about $0.30-$0.40 an hour on EC2, so it's quite a markup.
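A rough illustration of that markup, under stated assumptions: the ~$0.35/hr EC2 figure is from the comment above; the credit price of ~$2 (Snowflake bills in credits, list price roughly $2-$4 per credit depending on edition and region, with an XS warehouse consuming about 1 credit per hour) is an assumption you should check against your own contract.

```python
# All figures are illustrative assumptions, not quoted prices.
ec2_hourly = 0.35        # assumed raw EC2 cost, $/hour (per the comment above)
credit_price = 2.00      # assumed Snowflake credit price, $/credit
credits_per_hour = 1     # assumed XS warehouse consumption rate

snowflake_hourly = credit_price * credits_per_hour
markup = snowflake_hourly / ec2_hourly
print(f"~{markup:.1f}x over raw EC2")  # ~5.7x with these assumptions
```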

As far as security goes, I do believe they allow customers to set their own storage keys, so there may be some isolation from a global breach.


Snowflake usually unloads data to an internal stage bucket in the same region as your Snowflake account. If you use an S3 gateway endpoint, getting that data is free of egress charges.
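For reference, creating such a gateway endpoint is a one-liner with the AWS CLI; the VPC, region, and route-table IDs below are placeholders. A gateway endpoint keeps S3 traffic on AWS's internal network, which is why reads from a same-region stage bucket avoid internet egress charges.

```shell
# Hypothetical IDs; substitute your own VPC, region, and route table.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc123 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0abc123
```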


At Snowflake's size you get custom price lists from cloud operators.

But I think there was also support for peering with client VPCs (or equivalents), which is why they support AWS, Azure, and GCP - you choose the location that best fits linking with your cloud/physical workloads.


This was the case in ~2019 but are you sure this is still true? I think you can “bring your own account” with Snowflake, but I’m honestly not certain because their docs aren’t exactly clear about it…


All storage/compute/networking etc. is handled Snowflake-side.

For various reasons, you're not getting to touch the actual DB bits.

You can, IIRC, use snowflake-hosted connectors to access external data though.

And there's a "data marketplace" of sorts where clients can publish/consume datasets.


Databricks -does- work that way, iirc.



