March 29, 2018

How can companies harness the power of big data?

Have you ever wondered how your Netflix home screen selects what programs to show you? The company collects information on what you watch, how long you watch, when you hit pause, whether you stop watching, and how long you spend looking for something else. Data collected at this scale is called big data. The information is then stored and processed by an algorithm that selects recommendations just for you.

Another example is Spotify’s new Discover Weekly feature, which uses big data capabilities to create a personalised playlist every week for each of its 70m+ users.

Whenever you go online you leave a trail of data behind — think how Amazon knows what products to recommend to you, and Facebook knows which friends’ posts you want to read — both examples of big data.

The amount of data being created daily is astonishing:

  • Google processes over 3.5 billion search queries per day
  • Facebook processes more than 600 terabytes of data per day
  • Walmart is building a private cloud to process 2.5 petabytes of data every hour

But how do companies manage all of this data?

Running in the background of your Netflix home screen or Facebook Newsfeed are sophisticated databases and technologies that are able to store, process and analyse enormous amounts of information instantaneously.

Traditionally, businesses would use expensive databases sold by Oracle and IBM to store their data, but increasingly companies such as Netflix are choosing to migrate to open source technologies such as Apache Cassandra and Apache Spark. The benefits of using open source software are that it’s free to use, it’s constantly being improved, and it prevents companies from being locked into expensive long-term contracts with big tech companies.

However, the challenge with open source software is that it is complex to manage and operate, and most companies don’t have the resources or expertise to manage it in-house.

Instaclustr to the rescue

One of Bailador’s portfolio companies, Instaclustr, is helping to solve this problem. Instaclustr uses its proprietary platform to remove the pain of designing, installing, configuring, managing and scaling open source technologies so that customers can focus on their core business.

It has already helped global companies such as Atlassian, Sonos, Adstage, Blackberry and Campaign Monitor power their big data applications.

All companies need big data capabilities to stay relevant

Big data is no longer only the domain of cutting-edge tech companies such as Netflix, Facebook and Atlassian. It has now become table stakes for all companies that want to compete with these data-driven disruptors.

Recently there have been examples of traditional companies such as Tesco, Citibank and BP leveraging the power of data to drive better consumer experiences and operational improvements.

Instaclustr is seeing more traditional companies and industries sign up for its big data technology solutions. Instaclustr recently signed up a 200+ year old textile manufacturer that is using one of the most advanced open source databases to store its data.

The future of big data: we’re just getting started

The amount of data produced in the world is expected to double between 2017 and 2019, and then quadruple by 2025.

Internet Trends 2017 — Code Conference, Mary Meeker, May 31 2017

The growth will be driven by increased personalisation across more industries, and the growing number of connected devices — such as autonomous vehicles — that will come online over the next few years.

Intel’s Brian Krzanich has claimed that every autonomous car will generate the data equivalent of almost 3,000 people, and a conservative estimate of one million autonomous cars worldwide would equal the data of three billion people.
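The arithmetic behind that claim is simple to check. The short Python sketch below multiplies the two figures quoted above. The constant and function names are illustrative only, and the inputs are the rounded numbers from the claim, not measured data.

```python
# Rounded figures quoted in the article (not measured data).
PEOPLE_EQUIVALENT_PER_CAR = 3_000   # "almost 3,000 people" per autonomous car
AUTONOMOUS_CARS = 1_000_000         # the "conservative estimate" of cars worldwide


def people_equivalent(cars: int, per_car: int = PEOPLE_EQUIVALENT_PER_CAR) -> int:
    """Return how many people's worth of data a fleet of cars would generate."""
    return cars * per_car


total = people_equivalent(AUTONOMOUS_CARS)
print(f"{total:,} people")
```

Changing AUTONOMOUS_CARS shows how quickly the total scales: every additional million cars adds the data equivalent of roughly another three billion people.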

At Bailador, we believe that the explosion in data being produced will continue to drive demand for technologies and services that are able to store, process and analyse large amounts of information. Instaclustr will continue to be a beneficiary of this trend as it aims to become the platform of choice for all companies using open source big data technologies.