Software companies are acquiring and accumulating ever-larger sets of data, driven by reduced costs and new database paradigms. But as my colleague Cack Wilhelm wrote earlier this week, “‘Big data,’ however large, sitting idle in a data store is not adding value to an enterprise… Data must be consumerized easily for business stakeholders in order to uncover insights and drive predictions that are specific to solving each business problem.” Software companies are moving beyond retrospective analysis and are instead looking to leverage this data to better understand and predict what is to come, often by using machine learning. We call this trend among the next generation of software companies “automation by algorithm,” and I predict that it is going to eat up software.
The technology of using machine learning for predictive analytics is not new. However, a quick glance at Google Trends shows increasing interest in predictive analytics. It is both interesting and important to understand why this trend is growing.
[Chart: Google Trends, search term “Predictive Analytics”]
The rise of big data (both its decreasing cost and the arrival of non-relational database formats) is likely the largest catalyst. But there are other factors as well, such as the proliferation of the cloud, distributed computing, and better hardware, all of which have made machine learning algorithms faster and easier to run. There is also a broader organizational shift: cutting-edge software companies were the first to adopt these techniques, and smaller organizations are now following the trend as they acquire talent from larger companies. There has also been recent innovation in machine learning algorithms, with random forest (early 2000s) and deep learning (2000s) both emerging over roughly the past decade.
It is easiest to understand “automation by algorithm” by looking at examples such as our portfolio company Sailthru. It is common practice to optimize the frequency of email sends against open rates. This quickly becomes complex as you add dozens of possible variables (content of the email, time of day, subject, recipient demographic, pricing, etc.). A data scientist may be able to optimize one instance, but the ongoing complexity means that only a machine-learning algorithm can continually optimize for the best result, incorporating feedback as performance changes. Sailthru has developed such technology (and more), allowing any customer to easily integrate this functionality without requiring in-house knowledge of machine learning. There are many alternatives when seeking an email vendor, but the ability to automate the algorithm and deliver it as an application or service is what distinguishes the company from its peers.
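To make the idea of "continually optimizing while incorporating feedback" concrete, here is a minimal sketch using an epsilon-greedy bandit, one common textbook technique for this kind of problem. It is not a description of Sailthru's technology; the class name, the variant labels, and the `epsilon` parameter are all assumptions made for illustration.

```python
import random


class EpsilonGreedySendOptimizer:
    """Illustrative epsilon-greedy bandit over hypothetical email variants.

    Each variant could stand for a combination such as (send time, subject line).
    """

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.variants = list(variants)
        self.epsilon = epsilon            # share of sends spent exploring
        self.rng = random.Random(seed)
        self.sends = {v: 0 for v in self.variants}
        self.opens = {v: 0 for v in self.variants}

    def open_rate(self, variant):
        """Observed open rate so far (0.0 if the variant was never sent)."""
        sends = self.sends[variant]
        return self.opens[variant] / sends if sends else 0.0

    def choose(self):
        """Pick the variant for the next send."""
        # Try every variant at least once before trusting the estimates.
        untried = [v for v in self.variants if self.sends[v] == 0]
        if untried:
            return untried[0]
        # Occasionally explore, so a shift in performance can be noticed.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variants)
        # Otherwise exploit the variant with the best observed open rate.
        return max(self.variants, key=self.open_rate)

    def record(self, variant, opened):
        """Feed the outcome of one send back into the estimates."""
        self.sends[variant] += 1
        self.opens[variant] += int(opened)


def simulate(true_open_rates, rounds=5000, seed=0):
    """Run the optimizer against made-up 'true' open rates (assumed values)."""
    optimizer = EpsilonGreedySendOptimizer(true_open_rates, seed=seed)
    outcome_rng = random.Random(seed + 1)
    for _ in range(rounds):
        variant = optimizer.choose()
        opened = outcome_rng.random() < true_open_rates[variant]
        optimizer.record(variant, opened)
    return optimizer
```

Calling `simulate({"morning": 0.10, "evening": 0.25})` with these invented rates would route most sends to whichever variant performs best in the simulation, while still sampling the other occasionally. A production system would have to handle many more variables and changing behavior over time, which is the complexity the paragraph above describes.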
Rather than broad, horizontal solutions, we believe the greatest near-term opportunity exists in specific use-case applications that leverage machine learning. This allows for the strongest selection and optimization of the machine-learning algorithm specific to the use case. Further, we have noticed that companies with the largest data sets have a clear advantage and therefore the most success. The data set can be accumulated, accessed through partners, or purchased; the source is less relevant than the volume (though the source is a potential basis for long-term defensibility). Early adoption of automation by algorithm is evident in data-rich areas such as marketing and finance. However, we’ve also seen great use cases in human resources, sales, and industry-specific vertical software. Without intending to be exhaustive, a few areas we have found particularly interesting are:
Predictive Pipeline Analytics
Back in April, my colleague Stacey Bishop blogged about our growing interest in this area. We saw that several of our portfolio companies had either too many leads or too few. Those with too many are left struggling to figure out which leads to focus on. Those with too few are often challenged to understand where they should find their next lead. We’re really excited about companies using automation by algorithm to help companies focus on the right leads and identify new ones by leveraging existing customer and pipeline data.
Managing hundreds or thousands of customers is tough. Monitoring all the events that can lead to a churn event becomes impossible. Automation by algorithm can be used in retention software to predict future customer behavior and churn events. We are particularly excited about companies that not only leverage direct sources of data such as payment schedule, engagement, and feature requests, but also external data about that customer. For example, monitoring headcount might reveal a sudden, significant reduction in force, which could indicate financial hardship (in which case a temporary easement on payment may prevent a churn event).
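As a rough illustration of combining direct and external signals into a single churn estimate, the sketch below scores a customer with a logistic function. Everything in it is an assumption made for the example: the field names, the weights, the bias, and the 0.5 threshold are invented, and in practice the weights would be learned from historical churn data rather than set by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class CustomerSignals:
    late_payments: int        # direct signal: late payments in the recent period
    engagement_change: float  # direct signal: fractional change in usage (-0.5 = 50% drop)
    headcount_change: float   # external signal: fractional change in the customer's headcount


# Hypothetical, hand-picked weights for illustration only.
WEIGHTS = {
    "late_payments": 0.8,       # more late payments -> higher risk
    "engagement_change": -2.0,  # falling engagement -> higher risk
    "headcount_change": -3.0,   # shrinking headcount -> higher risk
}
BIAS = -2.0


def churn_risk(signals: CustomerSignals) -> float:
    """Return an illustrative churn probability between 0 and 1."""
    z = (
        BIAS
        + WEIGHTS["late_payments"] * signals.late_payments
        + WEIGHTS["engagement_change"] * signals.engagement_change
        + WEIGHTS["headcount_change"] * signals.headcount_change
    )
    return 1.0 / (1.0 + math.exp(-z))


def needs_outreach(signals: CustomerSignals, threshold: float = 0.5) -> bool:
    """Flag customers whose estimated risk exceeds an assumed threshold."""
    return churn_risk(signals) > threshold
```

Under these made-up weights, a customer with two late payments, a 40% drop in engagement, and a 30% cut in headcount scores well above a customer with none of those signals, which is the kind of case where easing payment terms might be considered.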
Reviewing thousands of resumes is inefficient and suffers from human bias. Meanwhile, candidates are putting all their career data online, publishing their work online, and demonstrating qualifications in dozens of other ways across the Internet. The resulting data footprint holds the potential for strong insights into the recruiting process. We believe that there is a large opportunity for recruiting software to leverage automation by algorithm and change the way managers recruit.
Credit Risk Monitoring
Online fraud is exploding. Many first- and second-generation software solutions are struggling as location has become mobile and online personas have become more complicated to verify. At the same time, more behavioral, social, and other third-party data is available than ever before. We are very excited about companies that are incorporating all of this data into their software and using automation by algorithm to help companies combat fraudsters.
Note: We were very fortunate to have Xiaonan Zhao join us for the summer. Her previous experience in machine learning while at Google and her enthusiasm for the area were crucial in the development of our thinking. She is currently a second-year student at Harvard Business School and contemplating her post-graduation plans.