Data Streams Lead to the Cloud

Treasure Data allows massive torrents of data to be easily streamed into a cloud service for storage and analysis.

At some point during the past decade everyone forgot the hard-knock lessons from the pre-SaaS era and decided that running a highly available, on-premises big data infrastructure was a good idea. We heartily disagree. We have been looking for a cloud-based big data-as-a-service offering for a few years. The need was clear: it should allow easy data ingest from any device, in real time, and with a simplicity that requires no more than 1/10th of an engineer to manage. A cursory glance at any 'Big Data Market Map' reveals that this is a very hard problem to solve. When we first met Treasure Data it was clear that the company occupies a unique position in the landscape. The Treasure Data offering has four characteristics that combine to form a very valuable service:

1. Simplicity

I added the Treasure Data software to one of my web applications and was streaming data into their cloud service within 5 minutes. In about the same amount of time I was able to connect from Tableau and generate visualizations of the data. Any company that empowers individuals to that extent has a strong wedge into a market. The service dramatically reduces both upfront capital expenditure and ongoing operational costs.

2. Easy Ingest to the Cloud

Treasure Data sponsors Fluentd, the open-source data collector that has become the standard in the industry. Streaming data from myriad devices with intermittent web connectivity is a hard problem, and Treasure Data has solved it elegantly.
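To make the intermittent-connectivity problem concrete, here is a minimal store-and-forward sketch in Python. It is a generic illustration of buffering records on the device and retrying later, not Fluentd's or Treasure Data's actual code. The names `BufferedForwarder` and `FlakyUplink`, the `sensor.temp` tag, and the buffer size are all hypothetical.

```python
from collections import deque


class BufferedForwarder:
    """Queue records locally and forward them in order when the uplink works."""

    def __init__(self, send, max_buffered=10_000):
        # `send` is any callable(tag, record) that raises ConnectionError on failure.
        self._send = send
        # Bounded buffer: once full, the oldest records are dropped first.
        self._buffer = deque(maxlen=max_buffered)

    def emit(self, tag, record):
        """Buffer a record, then opportunistically try to drain the buffer."""
        self._buffer.append((tag, record))
        self.flush()

    def flush(self):
        """Send buffered records oldest-first; stop at the first failure."""
        sent = 0
        while self._buffer:
            tag, record = self._buffer[0]
            try:
                self._send(tag, record)
            except ConnectionError:
                break  # keep the record and retry on the next flush
            self._buffer.popleft()
            sent += 1
        return sent

    @property
    def pending(self):
        return len(self._buffer)


class FlakyUplink:
    """Hypothetical stand-in for a network connection that can go offline."""

    def __init__(self):
        self.online = False
        self.received = []

    def __call__(self, tag, record):
        if not self.online:
            raise ConnectionError("uplink unavailable")
        self.received.append((tag, record))


uplink = FlakyUplink()
forwarder = BufferedForwarder(uplink)

# The device is offline, so both records stay in the local buffer.
forwarder.emit("sensor.temp", {"celsius": 21.5})
forwarder.emit("sensor.temp", {"celsius": 21.7})
pending_while_offline = forwarder.pending

# Connectivity returns; the next flush drains the buffer in order.
uplink.online = True
sent_after_reconnect = forwarder.flush()
```

A production collector would also need to persist the buffer to disk and back off between retries; the sketch leaves those out to keep the core idea visible.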

3. High Volume Data Streams

Any time engineers stream data at intervals approaching 1 second, the torrent of JSON blobs becomes staggeringly large. Handling a million incoming data records every second is a huge technical achievement. It is one that will provide a lot of value as corporations continue to connect sensors to the web, as mobile devices capture location and application usage data, and as web applications begin to capture not only system logs but also user activity logs.
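A quick back-of-the-envelope calculation shows what a million records per second adds up to. The ingest rate comes from the paragraph above; the 200-byte average record size is purely an assumption for illustration, not a Treasure Data figure.

```python
RECORDS_PER_SECOND = 1_000_000   # rate cited in the text
ASSUMED_BYTES_PER_RECORD = 200   # hypothetical average size of one JSON record
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

records_per_day = RECORDS_PER_SECOND * SECONDS_PER_DAY
bytes_per_day = records_per_day * ASSUMED_BYTES_PER_RECORD
terabytes_per_day = bytes_per_day / 10**12  # decimal terabytes

print(f"{records_per_day:,} records per day")
print(f"{terabytes_per_day:.2f} TB per day at {ASSUMED_BYTES_PER_RECORD} bytes per record")
```

Under that assumption, a single day of ingest is 86.4 billion records and roughly 17 TB of raw data, before any replication or indexing.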

4. Real-time SQL Access

The team at Treasure Data has coupled a flexible schema with SQL access from all the usual business intelligence tools. Allowing easy query access into the data store is clever and will massively reduce the friction associated with adopting a big data platform.

It is far easier to describe these characteristics than it is to build them. At ScaleVP we like to invest in companies that are both well aligned with technology trends and protected by a deep technical moat. It is wizardry to be able to handle large amounts of data and completely abstract away the complexity of the underlying architecture. The capital efficiency with which the Treasure Data team has accomplished this feat is rare and is a big part of our enthusiasm in partnering with them as an investor.

Originally published January 16, 2015.