Databricks vs Snowflake: A Rivalry to Last or Lunch for Cloud Vendors?

The competition between Databricks and Snowflake in the cloud data and analytics space is the latest tech industry rivalry to get a lot of attention. It joins the marquee rivalries of the past 100 years, such as those between IBM and HP, SAP and Oracle, or AWS and Azure. Read on to learn more about the similarities and differences between these two big data service providers and how to make better buying decisions when choosing between them.

What do Databricks and Snowflake do?

For the uninitiated, Databricks focuses on analyzing data at scale regardless of its location. It can broadly be considered a data and analytics platform that helps enterprises extract value from their data. Snowflake is a cloud-based data warehousing platform that positions itself as a simple replacement for complex offerings from traditional vendors such as Oracle and even from cloud vendors such as AWS, Microsoft, and Google.

Both platforms apply AI to enterprise data problems. In that sense, they are Enterprise AI companies that plan to transform how enterprises use data, whether by using AI to integrate data lakes and warehouses, crunching data at massive scale to make decisions, or simply serving as an intelligent analytics platform.

Where are the firms today?

Snowflake went public in 2020 at a valuation of US$33 billion, in what was then the largest software IPO in history. Databricks, on the other hand, remains private and recently reached a valuation of US$38 billion. While money is less of a problem, mindshare, being first to market, and the threat from cloud hyperscalers are bigger challenges. Both vendors also struggle with the significant talent demand-supply mismatch, as we covered in our research earlier.

The management teams of both companies have strong respect for each other. Databricks, for example, understands that Snowflake had a head start. Snowflake, in turn, realizes that some Databricks features need to be built into its own platform as well.

What is happening?

The two vendors are well covered in the public arena, and many have written with an almost romantic spin about their roots, success, and management background. The firms have different management styles, with Snowflake run by a professional CEO and Databricks by a founder. However, clients care little about the internal operating models of vendors. They are more concerned about whether to bet on these firms, given that cloud vendors have been reshaping the industry. In addition, both companies depend on cloud vendors to host their own platforms.

Both vendors have taken potshots at each other with competing offerings that carry similar-sounding names, such as Data Ocean from Snowflake and Data Lakehouse from Databricks. They also collaborate and maintain connectors to each other’s platforms while they keep developing their own versions of these offerings. The sales and technical teams of each vendor point out the weaknesses of the other’s platform to clients. Databricks contrasts Snowflake’s proprietary model with its own open-source platform, while Snowflake emphasizes that its compute scaling is faster and its data compression is better.

What will happen?

Developers, operators, and data professionals have strong views on which platform(s) they plan to leverage. Because Snowflake builds its platform from a warehousing perspective, enterprises find it easier to migrate to. Coming from a data lake perspective, Databricks has a tougher battle to fight. Moreover, Snowflake is perceived as simpler to adopt than Databricks. The bigger issue for both vendors is the threat from cloud providers. Not only do these vendors run their platforms on cloud hyperscalers, but the hyperscalers have also built their own suites of data-related offerings.

Both Snowflake and Databricks are running at a loss. Innovation will be needed to compete with cloud vendors, and innovation is costly. Beyond cloud, the other big challenge these two vendors face is the growing trend toward decentralization of data. As data fabric and data mesh concepts gain traction, building a lake or warehouse may lose relevance. Therefore, both vendors will need to meet data where it is generated or consumed, which means building connectors to as many platforms as possible. Moreover, as more open-source data platforms gain traction, earlier powerhouses such as Oracle, SAP, Microsoft, and IBM may decline. That decline will affect these two vendors as well, unless they extend their offerings to these open-source databases, messaging, and event platforms.

What should enterprises do?

A large number of Databricks clients are known to be Snowflake customers as well. We recommend the following to enterprises:

  • Segregate the applications: With multi-cloud gaining traction, enterprises are comfortable investing in multiple data platforms as well. Enterprises need to separate their workloads on classical Oracle, SAP, Teradata, and similar platforms from the newer workloads they plan to build or modernize, generally on open-source databases. As the data types supported by applications evolve, enterprises will need help from data vendors.
  • Evaluate partner innovation: In addition to the issues around talent availability, enterprises should evaluate the ecosystem around these two vendors. The innovation that other technology and service companies are building for these data platforms should be an important decision criterion.
  • Bet on architecture: Snowflake and Databricks have fundamentally different views of the data market. Though their offerings may converge, one brings a warehouse perspective and the other a lakehouse perspective. Enterprises, however, should think about their own architecture for the future. With architectural complexity on the rise, enterprises should ensure their current data management bets align with their business needs 5-10 years down the road.

The market is still divided on cloud’s role in data transformation, given the challenges around cost and latency. However, as these platforms bring down the total cost of ownership by separating compute from storage, cloud data platforms will see growing adoption.

The general questions about the best sourcing methods will persist irrespective of technology. Enterprises will need to address questions such as lock-in, security, risk management, spend control, and exit strategy when making their purchasing decisions.

What has your experience been in using Snowflake and Databricks? Please reach out to me at [email protected].
