When AWS, Azure, or GCP Becomes the Competition (gkogan.co)
186 points by gk1 on Oct 28, 2019 | 89 comments


While AWS has ruffled a bunch of feathers with their Elasticsearch and Kafka managed offerings which can be easily construed as attempts to steamroll the respective open source-first entities, I am actually quite impressed by the mechanism that Azure has employed with Azure Managed Applications: https://docs.microsoft.com/en-us/azure/managed-applications/.... HashiCorp recently announced their collaboration with Azure to bring forth a managed Consul offering using this: https://www.hashicorp.com/blog/announcing-consul-service-on-...

This, in my opinion, is the right way to solve the problem, i.e. provide the customers that want a managed offering the means to get a managed version directly from the entity most active behind the project (e.g. HashiCorp) while getting the most from the cloud provider's infrastructure. I would be willing to trust Confluent or HashiCorp with operating/managing my Kafka or Consul cluster, but taking another dependency on their respective cloud offerings should be of concern.

I am quite surprised that it's Azure that has shown creativity here while AWS and Google have limited themselves to offering marketplace AMIs/images at ridiculous markups.


Google is doing something similar to what you describe. See https://cloud.google.com/blog/products/open-source/bringing-...

> We’ve always seen our friends in the open-source community as equal collaborators, and not simply a resource to be mined. With that in mind, we’ll be offering managed services operated by these partners that are tightly integrated into Google Cloud Platform (GCP), providing a seamless user experience across management, billing and support. This makes it easier for our enterprise customers to build on open-source technologies, and it delivers on our commitment to continually support and grow these open-source communities


How does Azure's offering differ from GCP's fully integrated OSS products?

https://cloud.google.com/blog/products/open-source/bringing-...


I am not. Microsoft was always a partner driven company. Most of their sales are through their partners, and they have learned to respect that.

They have also learned to build a platform: unlike macOS/Android, Windows will never pull APIs out from under you.


And Windows basically sucks, in no small part due to the historical baggage it must carry. That, and it’s not *nix.


That's a very narrow view.

Supporting historical baggage is why a lot of business people trust Windows.

Eg. You can still run VB 6 applications :p


True, but an operating system should be a solid foundation to build on, not quicksand.

We now have a situation where a well designed and complete 5 year old application will arbitrarily stop working because someone made an 'improvement' to the OS.


> I am quite surprised that it's Azure that has shown creativity here while AWS and Google have limited themselves to offering marketplace AMIs/images at ridiculous markups.

Are you sure that is accurate for AWS?

https://docs.aws.amazon.com/marketplace/latest/userguide/sof...

This seems to have launched in 2015: https://aws.amazon.com/blogs/apn/saas-partner-program/

Azure Managed Applications launched in 2017: https://azure.microsoft.com/en-us/blog/managed-applications-...

Google's offering seems to have launched this year.


To add to my previous answer:

> While AWS has ruffled a bunch of feathers with their Elasticsearch and Kafka managed offerings which can be easily construed as attempts to steamroll the respective open source-first entities, I am actually quite impressed by the mechanism that Azure has employed with Azure Managed Applications:

Here ( https://aws.amazon.com/marketplace/pp/B01N6YCISK?qid=1572333... ) is Elastic Co's Elasticsearch SaaS offering on AWS:

> Elasticsearch Service on Elastic Cloud
> Sold by: Elasticsearch Inc.
> The official hosted Elasticsearch & Kibana offering on AWS. Launch, manage, monitor and secure Elasticsearch and Kibana deployments with the latest versions, and add machine learning and powerful hot-warm architecture with optimized templates.

The other popular product/software to hate on AWS about is MongoDB, which we find here: https://aws.amazon.com/marketplace/pp/B077D557RX?qid=1572333...

> MongoDB Atlas for AWS
> Sold by: MongoDB
> MongoDB Atlas delivers the world's leading database for modern applications as a fully automated cloud service with operational and security best practices built in. Easily deploy, operate, and scale MongoDB on AWS by letting Atlas take care of time-consuming administration tasks.

I haven't used any of these, but the reviews seem to show that people are running them.

So, your assertion that Azure has shown creativity, when AWS launched the ability for partners to offer SaaS solutions on AWS (with recent improvements such as PrivateLink to allow SaaS partners to offer managed services inside customer VPCs) 2 years before Azure, indicates that you seem not to be familiar enough with AWS to be making these statements.


When you use the marketplace, you get support directly from the vendor. What’s the difference?


How does that compare to AWS's Marketplace?


> While AWS has ruffled a bunch of feathers with their Elasticsearch and Kafka managed offerings

Also DynamoDB (vs MongoDB)


Instead of DynamoDB, perhaps you meant DocumentDB which is AWS's managed MongoDB compatible database service.


Whoops, right! Hard to keep track of all those AWS services


I think you mean DocumentDB, not DynamoDB.

https://aws.amazon.com/documentdb/


Not quite the same situation. Amazon published the original Dynamo whitepaper with DynamoDB being a non-opensource implementation. Many other NoSQL datastores were built since the whitepaper's publication in 2007.


> Think Twice Before Open-Sourcing

I think it is a problem that it is very hard to monetize open source infrastructure libraries. In the past, you could try to do consulting or else offer paid hosting. Now with the cloud providers doing "zero management" hosting of open source libraries, both those avenues have been seriously curtailed.

Honestly, it seems like if you are the primary author of a popular open source library, your best bet is try to leverage that notoriety into getting hired by one of the cloud companies, and try to work on the open source library as a side project. Trying to use that open source library to support yourself any other way, seems destined for penury and misery.


These days, I wouldn't open source anything other than a small hobby project or utility library/app.


Well.. there is the AGPL.


There are other licenses you could choose.

Of course that will make your project less popular.


No, this is not so black & white. Let's take it in two parts: 'Free' and 'Open Source'. One use of Open Source is to participate in science: define, expand, fork, integrate, extend. A different use of 'Free' is to become the standard upon which the infrastructure depends; hard game, but big results.

One author may steer or originate, but over time, Free and Open Source, is the work of many authors, many architectural components, and many network nodes.

Said another way, to understand the motivations of Free and Open Source, one must be able to see beyond one author, to the effects on systems, and systems of systems. Much of the benefits of both Free and Open Source, are not mainly to one author, but to systems, and over time.

This is important, as much of basic computing would not exist as it is now, except for sacrifice and insights by earlier authors and teams.


If it is being heavily used just invest in making a nice website where you can offer professional services and SaaS deployment with advanced features.


This is quite literally what the parent is arguing that the cloud providers are making obsolete.


For a large, VC-backed open-source startup with many employees, yes, the cloud providers have made this obsolete to a large extent, Open Core notwithstanding.

For a single author of a popular open-source product or library, it's still very possible to earn a good living working on the thing that you built.


If you're doing anything ML related, your best bet is to niche down and solve a very narrow problem for a specific industry / market segment. Building generic AI tooling sounds great but rarely solves anyone's problems.


As an ML practitioner in a large ecommerce company, I totally agree. When I evaluate third party tools, 99% have no use at all for me. Things like Algolia or Rekognition or Clarifai.

Even Sagemaker and Fargate are mostly not useful to me as my company can easily operate its own k8s platform.

Open source tools for model building are great and a small team of engineers can write whatever odds & ends that aren’t covered, or make scalable implementations.

Notebook platforms like Databricks are similarly useless. A notebook is not a frontend to a cluster and notebooks are universally poor development environments even for the very tasks they are advertised.

The main thing that matters for ML development is how quickly & easily you can put an arbitrary container or VM into a deployment environment. You’re always going to need to rapidly change your container or VM, have common ancestors that get specialized to support GPU training or some web server application layer. Making the shortest possible distance between the definition of this environment and deployment of it is the whole game.

The type of tools I’m willing to pay for are specialized database engines, rapid data annotation tools, and specialized hardware environments.

I have no time for some “platform” tool for data provenance, notebook environments, model tracking, model-as-a-service APIs like Rekognition, or anything where I upload data and get back a model.


Could you elaborate further on specialized database engines and rapid data annotation tools? There are plenty of data annotation tools out there that seem to do the job, so I'm not sure what rapid would mean in this instance.


I mean things like Vertica or kdb+ that have specialized performance properties for some use cases. Also to some minor extent managed cluster pipeline tooling like Spark. I don’t mind paying for managed versions of these (not Databricks though). For annotation tools I mean things like Prodigy.


Thanks. Why the interest in managed cluster pipeline but not Databricks? They seem to be the big name in the game for Spark. Would some tooling around snapshotting, sequential AB testing, staged rollouts, ghost models be of use?


Databricks is a notebook frontend to cluster computing. Cluster computing is useful and I’m willing to pay for it as long as I have total & complete control of the environments and tooling used in the cluster workflow.

A notebook frontend, however, is less than useless, and is actively harmful by propagating poor notebook environments even further into aspects of computing where they cause harm and hurt reproducibility and code factoring.

Given this, even if Databricks offered perfectly complete features for all aspects of cluster computing, it would still be inferior to just my own managed EC2 or EMR clusters or equivalent with other providers, where there is no “notebook as control plane” garbage.

But when you add to that the fact that Databricks lacks full features for me to totally own every customized detail of my cluster environment (e.g. how can I run plain Python multiprocessing tasks with zero Spark in Databricks? How can I bring my own custom defined GPU container with custom compiled TensorFlow?) it makes the deal even worse.

Databricks is just another Spark / Hadoop style snake oil seller banking on capturing a bunch of data science teams before people realize that it’s a conceptually junk way to work.

As for the other tooling you mention, I’d almost always say to build it in house. For example, I don’t know of a single A/B testing provider that actually uses frequentist sequential testing to correctly avoid early stopping bias. You actually need real statisticians hired in-house to solve these problems, and the engineering work to set up an extensible A/B test as a service internal tool is not bad. (I’ve built Bayesian A/B test frameworks with teams of 2-3 engineers in 3 different companies). It’s just not cost effective to outsource it on the false hopes of not needing to hire your own in-house statistics experts. Just pony up the dough and hire them.
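
A minimal sketch of the early-stopping problem mentioned above, using only the Python standard library. It simulates A/A tests (both arms share the same true conversion rate, so every "significant" result is a false positive) and compares checking a two-proportion z-test once at the end against checking it at every interim look. The conversion rate, sample sizes, number of looks, and all function names are illustrative assumptions, and this only demonstrates the bias; it does not implement a sequential test that corrects for it.

```python
import math
import random


def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for two arms of n users each (pooled variance)."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled <= 0.0 or pooled >= 1.0:
        return 0.0  # no variance yet, so no evidence of a difference
    std_err = math.sqrt(2 * pooled * (1 - pooled) / n)
    return (conv_a - conv_b) / n / std_err


def run_aa_test(rng, rate, n_max, peek_every, z_crit=1.96):
    """Simulate one A/A test.

    Returns (any_look_significant, final_look_significant).
    """
    conv_a = conv_b = 0
    any_look = False
    for i in range(1, n_max + 1):
        conv_a += rng.random() < rate
        conv_b += rng.random() < rate
        # An analyst who peeks would stop and declare a winner here.
        if i % peek_every == 0 and abs(z_stat(conv_a, conv_b, i)) > z_crit:
            any_look = True
    final_look = abs(z_stat(conv_a, conv_b, n_max)) > z_crit
    return any_look, final_look


def false_positive_rates(n_sims=300, rate=0.1, n_max=1000, peek_every=100, seed=0):
    """Estimate false positive rates as (peeking at every look, fixed horizon)."""
    rng = random.Random(seed)
    peeking = fixed = 0
    for _ in range(n_sims):
        any_look, final_look = run_aa_test(rng, rate, n_max, peek_every)
        peeking += any_look
        fixed += final_look
    return peeking / n_sims, fixed / n_sims


if __name__ == "__main__":
    peek_rate, fixed_rate = false_positive_rates()
    print(f"peeking: {peek_rate:.3f}  fixed horizon: {fixed_rate:.3f}")
```

With a nominal 5% threshold, the fixed-horizon rate should land near 5%, while the peeking rate grows with the number of looks. Closing that gap is what a proper sequential design is for.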


> notebooks are universally poor development environments even for the very tasks they are advertised

Unnecessarily dismissive grandstanding. The bravado of the rejection is not supported at all.

Rich-content Notebooks tell a story; well-worn Notebooks are a familiar context for experiment and viz.


Notebooks are a maintainability maze waiting to happen.

They're incredibly non-portable; they don't play nice with version control; I know how they're supposed to be used, but I'm yet to see them actually used in a way that isn't a breeding ground for the worst software engineering I've seen.


Yet the same could be said of Excel, which is running the whole financial industry.


It can be both a bad environment and very successful.

Personally I would be happy with just proper version control support.


How do you train your models? With Databricks (which is just a nice notebook UI for Apache Spark) I don't have to spend time setting up VMs and libraries for the distributed training.


Your first sentence seems unrelated to your second sentence, but it seems like you think they are related?

Any platform that advertises that you don’t have to spend time defining your training environment, whether it’s Databricks or an out-of-the-box deep learning VM on GCP, is a liability waiting to happen. You always need to define your own training environment, especially because you’ll almost always need specific (likely pinned) versions of all your dependencies, including system dependencies, to manage model training as part of a production life cycle. Very often you also need e.g. custom compiled TensorFlow, custom GPU settings or drivers, etc. It’s very foolish to base that environment on whatever comes out of the box.
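
As a rough illustration of the pinned-versions point, here is a minimal sketch that checks whether the Python packages installed in an environment match a set of pins. It assumes Python 3.8+ for `importlib.metadata`; the function name `find_pin_mismatches`, the `get_version` hook, and the version numbers in the usage note are hypothetical, and it does not cover system dependencies such as GPU drivers.

```python
from importlib import metadata


def find_pin_mismatches(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that violates its pin.

    `found` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for package, wanted in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

For example, `find_pin_mismatches({"numpy": "1.17.3"})` returns an empty dict only if exactly that (illustrative) version is installed. A check like this could run at container start so a drifting base image fails loudly instead of silently changing training results.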

Spark is also not universally useful. For training small models many times, like a workload that trains hundreds of small models all day (this was a production use case I had before that my company pilot tested with Databricks), the overhead of the Py4J connector is insanely bad. It’s a really terrible paradigm for Python software; meanwhile, Scala is a miserable ecosystem for production machine learning models. On top of all this, Spark MLlib has huge gaps in functionality, and whole classes of problems (e.g. large-scale MCMC inference) are not solvable in Spark in a way that is seriously comparable to other tools like Stan & PyMC running on not-Spark with simple multiprocessing.
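
A minimal sketch of the "many small models with simple multiprocessing" pattern described above, assuming each dataset is small enough to be trained inside one worker process. A plain least-squares line fit stands in for whatever small model is actually being trained; `fit_line`, `train_all`, and the default worker count are hypothetical names and values chosen for illustration.

```python
from multiprocessing import Pool


def fit_line(points):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_all(datasets, workers=4):
    """Train one small model per dataset, spreading the work over processes."""
    with Pool(processes=workers) as pool:
        # map() preserves input order, so result i belongs to datasets[i].
        return pool.map(fit_line, datasets)
```

Calling `train_all(datasets)` from under an `if __name__ == "__main__":` guard keeps this working on platforms that spawn rather than fork worker processes. There is no JVM bridge involved: each worker runs ordinary Python on its own slice of the work.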


Interesting, sounds like you have a very specific use case. I'm mostly dealing with huge datasets and Spark is a lifesaver.


My use case is just general computing on huge datasets for ML and statistics.

Spark only serves a few special use cases that usually don’t justify its cost, similar to map reduce on Hadoop.

Databricks though (distinct from Spark) serves no use cases.


Agreed. One of the possible differentiators I list in the article is being an "end-to-end solution." It's going to be a long while before AWS builds AI for accounting, or for optimizing industrial operations, or for... you get it.


I have yet to hear of an example of a factory or plant using AI for optimizing industrial operations. Anybody know of any?


I imagine there are some. A lot of factory operation optimization uses heavy operations research, which isn't a huge jump to some sort of AI/ML.



Interesting, as I went down this route: made a niche ML product, and did not open source. As it is a framework-level library, I also think quite a bit of time will pass before AWS or Azure will even notice the opportunity, because you need to be technical to see it.

Actually, Azure already has ML.NET, but it is about classical machine learning, and their own deep learning product, CNTK, lost the marketing fight with TensorFlow and PyTorch. Which is where I come in :)


I think a lot of this depends on what your product or audience is. If you're a domain-expert in your niche then lots of folks would rather go with you since they'll have you in their corner versus someone just trying to get it running half-assed. There's also the "you're not amazon so we're going with you" mentality amongst a lot of folks out there. Simply not being one of these giants can be a differentiator, especially for folks outside the US.

My anecdotal story is when GCP announced support for running Chromium out-of-the-box. I thought that that would be the end of browserless.io. I held back my knee-jerk reaction to shut down, and after a few months I believe we only lost a single customer. There's probably been a few others that we've lost along the way since the announcement, but it didn't seem to stifle our growth in any significant way.

In any case, it really depends on what your product or service is. Just wanted to chime in and say that the writing isn't necessarily on the wall if this happens to you. Stay the course and offer them the much-needed competition.


Support is where the money is at. Every major cloud vendor has garbage support and smaller, niche companies value their customers more, especially if you're not a huge account.

Also, if I'm using your service, and a competitor announces a similar service, unless they provide something that's much much cheaper, I will probably not switch.


It misses a couple of big ones:

1. Offer a great user experience. Many cloud providers are just pumping products out the door with little to no concern for UX.

2. Have great personality. That could be through offering outstanding support, or through being the kind of company that people root for. (Honeycomb.io is a good example of this.)


Of the many people who hated Microsoft during their rise to power, one of the most coherent groups were people who were mad because MS put a competitor with an arguably better product out of business.

That is in fact how FUD got into the vernacular. They would make vague promises that you couldn't quite call outright lies about having a product coming in that market, and a year later they'd put out something tepid (thereby proving they were lying about their progress earlier), but with some bulk licensing deal.

By the time the MS product came out the competitor would already be experiencing reduced sales growth because people would play wait-and-see for the MS product.

There is nothing stopping any of these folks from pulling the same trick. Having a better product will not save you. A brilliant product might, but even that's not a given.


Who is gonna play wait-and-see in this day and age? We're not in the 90's where one had to wait for the product to be announced, then built, then shipped on a CD.


If a product has lots of integrations within your company, or if you store lots of data in it which would need to be migrated to another product (e.g. a database), or if lots of people in your company need to receive training for the product, then it might make sense to wait 6 months to see what product the big player is going to come out with, rather than choosing a smaller niche vendor.


At the level of five- and six-figure software licenses, "personality" doesn't matter much to buyers.

True UX—not UI, which often gets confused with UX—could matter only if it affects org-wide adoption or ease of use for non-technical users.


UX doesn't usually factor into purchasing decisions at bigcorps anyway. It's usually a matrix with products on rows and desired features in columns. The chosen product is the one that ticks all the boxes. Engineers who bring actual domain knowledge into the discussion are ignored because more complex background information and criteria means more work and more discussions with the board.


Curious, why do you say that about Honeycomb specifically? Not that I've heard of them; it just seems pretty random.


Good question.

While I think a bunch of nice people work there I have the feeling that the industry sees their opinions as quite controversial.


Ah, I had glanced at their landing page before posting above and just saw the product. I didn't realize they were outspoken about things. This makes more sense to me now.


Yes. They're all "own your Ops process, deploy on Fridays" etc.


Sounds like a lot of free marketing! If you're well positioned and a bigger company enters your space, take it for what it is: they're advertising to people in your space. Sure, some will just default to one of them, but most will shop around, and because these guys are going to be dumping money into your space, you'll see more eyes on your product. Be on your A game; it'll be good for you. Maybe they'll acquire you because you do it better.


I wonder if this is a driving force behind why Cognitect has kept Datomic closed source. I’ve seen people rally against Datomic more than once for that reason, but honestly, how can you open-source something truly unique and monetize it without a large risk of cloud provider hijacking?


RICH HICKEY: To those who think that Datomic ought to be open source: We don't see a viable economic model there.

The source is an r/clojure flamewar which I'm not going to link.


There are many more examples that I did not include. Like Redis (and AWS ElastiCache), MongoDB, Cassandra, Akamai (and AWS CloudFront), etc.

Any others?


Timescale/Influx, Neptune/multiple graph DBs, ActiveMQ/MQ, Kafka, Kubernetes, Hadoop/EMR, Athena/Presto, Spark/Glue+Kinesis, Iron.io, even CloudWatch Logs Insights eating other log platforms.


Auth0 (and others) and Cognito (which is a good example of the incumbents thriving because I don't know anybody who's that happy with Cognito)


GitLab


Users want to pay for (managed) services, not software. The rise of AWS and cloud and the downfall of on-prem licensing models over the last decade has proven this beyond any doubt.

It's completely a failure of vendors to keep trying to sell software instead of competing on the services. The ones who have rolled out their own offerings are generating plenty of revenue, but the overall breadth, flexibility and quality of those offerings is still rather poor. I don't understand why these vendors are moving slowly here. They're the best at running their own software. Writing some blog posts isn't going to fix anything.


This is a good post from a16z -

https://a16z.com/2019/10/04/commercializing-open-source/

"As software has eaten the world, open source is eating software.

Today, almost every major technology company, from Facebook to Google, is written on the backs of open source software. Increasingly, these companies are building their own open source projects as well – Airbnb, for example, has more than 30 open source projects, and Google more than 2000!

In the future, the virtuous cycle will continue. Technologically, AI, open source data, and block chain are some examples of emerging innovations. The next generation of business models may include ad-supported OSS, as when a large proprietary enterprise supports open source projects; data-driven revenue; and crypto tokens, which monetize blockchain.

I believe Open Source 3.0 will expand how we think of and define open source businesses. Open source will no longer be RedHat, Elastic, Databricks, and Cloudera; it will be – at least in part – Facebook, Airbnb, Google, and any other business that has open source as a key part of its stack. When we look at open source this way, then the renaissance underway may only be in its infancy. The market and possibilities for open source software are far greater than we have yet realized."


Wait, isn't it like any other market which already has long lasting players?

If you want no competition -- find a new market (and win it). Otherwise you have to be smarter and faster than existing players.


You can probably draw a parallel between the retail sector and the phenomenon discussed in the article. For example:

Decades ago (before Walmart, Target, etc), it was common to have many small specialized stores that did just 1 thing really well (ie. selling just books, selling just shoes, selling just office supplies). Over time, Walmart/Target came along and bundled all of those specialty stores into 1 giant store that sells everything, causing a lot of the small specialty stores to eventually fade away.

AWS/GCP/Azure is sort of like the "Walmart of B2B Internet Services" in that they are in direct competition with companies like MongoDB, Akamai, and the other examples from the article re: offering comparable managed services.

As a market matures, the market consolidates into a small number of big players. To survive, specialty shops need to do things that the big players won't or can't, which I think the article does a good job discussing.


Oligopoly is common, that does not mean oligopoly is good.


The other option is to partner up with Big Cloud. They all offer marketplace programs that let customers install third-party software quickly and easily, and take care of the billing too (for a price, of course).


I address this in the article. Unfortunately (for the startup) being in one of the Big Cloud partner programs does not prevent them from launching directly competitive products.


It probably only makes it easier for them as well since they can see what add ons/tools are the most consumed. Then they can use that data to figure out what features they should build next for their tools


There will not be another Dropbox.


What does that mean?


Is there any way to create your own AWS/Azure/GCP on your own server?


AWS has Outposts, which is "EC2 mechanics in your own datacenter": https://aws.amazon.com/outposts/


Azure has Azure Stack, but you need a deep wallet: https://azure.microsoft.com/en-gb/overview/azure-stack/


Interesting, because I was seriously thinking of building a product where you get a good amount of features from these cloud providers and run it on your own servers... although obviously there's going to be some drawbacks, you may save a lot of money.



Oh. Haha. Looks like I learned something new today. Thanks for the info!


Google has Anthos (kubernetes on prem, with access to GCP marketplace).


This one is easy: patents



Nice plug in the middle of the article, exactly at the right spot:

> By the way, I write an article like this every month or so, covering lessons learned from growing B2B software startups. Get an email update when the next one is published

This is the closest I got to subscribing to a newsletter.


> This is the closest I got to subscribing to a newsletter.

Almost got you, huh? :)

I've A/B tested various locations and the inline-embed is the best, by far. This is also why many news sites now include related stories in the middle of the article, not at the end.


Wish the RSS/Atom feed was better represented, rather than only existing in metadata on your site. I must be a dying breed, but I refuse to sign up for email newsletters. I refuse to hand over that much of my attention.

Edit: https://www.gkogan.co/feed.xml


That’s interesting. So RSS didn’t stick because it didn’t allow advertisers to harvest email addresses.

Which is stupid because you still can’t reach your audience if you just buy harvested addresses, so if it’s just people who subscribed to your newsletter, you’re just better off letting them subscribe to your RSS. Perhaps the quantifiability of newsletters vs RSS makes email win. If only RSS had embedded ways to count views and see what users looked at.


Pro-tip: create a dedicated email for newsletters so you don't feel like they're spamming you or clogging your inbox. Ever since I did this I really enjoy finding the occasional unique newsletter.


Terrible blog post, they completely forget to mention Mesos which owned the orchestration market before Docker Swarm and Kubernetes


Yikes, that’s a little harsh considering the good discussion going on before and after your comment.


There is good discussion, but there is a lot of hyperbole. How many companies can you name that have been killed by these hyperscale cloud providers?

Many seem to be doing well: https://a16z.com/2019/10/04/commercializing-open-source/

Some just may not be worth their oversized valuations in the long run but made choices to be valued much larger than their actual worth.



