While big data scientists working in finance may be obsessed with detecting sentiment from a sea of diverse data sources, what could be a better indicator of confidence than investing your own money in a stock?

According to Kavout, a machine learning financial services startup, sentiment should be gleaned from both structured and unstructured data sets. Unstructured data sets include news and social media, while structured data sets include transaction data — for example, when CEOs buy or sell shares.

Kavout, which boasts PhDs and engineers who worked on Google search and Microsoft's machine learning division, combines signals from a range of categories. Kavout's CEO Alex Lu said: "We have signals from company fundamentals [and] we look at accounting information, we look at the balance sheet, we look at the cash flow; then we look at technical signals, which is based on pricing, trading volumes, transaction data, and then it comes down to the sentiment."

"Sentiment is something new to the financial industry because when we talk about sentiment we are looking at social media data, the news, blogs and other data. But we don't stop at unstructured data sets. We also look at the sentiment by transaction. For example, we look at insider trading and some other signals. Actually, this is also called sentiment and, based on transaction datasets, more than the textual datasets.

This sentiment factor is the attitude with which traders approach a stock, either positive or negative. As Lu points out, nothing could be more positive than showing that you are willing to get your money out and invest.

"So, building our sentiment model, we are leveraging not only social media and the news, but also the actual transaction data of the investors that we can obtain."

HFT and machine learning

Lu said his company has been approached by a number of high frequency trading (HFT) shops, which differ from traditional investors and traders in that they mostly look at tick data.

"They look at a lot of the tick data and look at a lot of the real time streaming quotes, but they are very interested in leveraging machine learning to do the data mining of new insights and new trading strategies for HFT."

Lu said his team has been asked to come into HFT firms and teach their traders about machine learning. "It's definitely a very interesting area for us. We are not looking at HFT datasets; we are looking at more comprehensive and more sophisticated datasets and signals. Our sources are more diverse, whereas HFT shops use more unified data sources. Most of them are just looking at one single source and it's a very different game.

"But I believe machine learning can be leveraged in that area too. I know the likes of Bridgewater and Two Sigma are building machine learning teams to do this mining and some other hedge funds are looking at it too. I think it's just a very promising area."

"Kai" is the core AI and machine learning system that powers Kavout's main functionalities. From historical SEC filings to real-time stock quotes, Kai examines millions of data points every second to analyse stocks from a purely objective standpoint and rank stocks just like a human analyst.

In addition to sentiment analysis, the company is about to launch chart pattern recognition, which Lu said will identify classical chart patterns: "head and shoulders, double bottoms with double tops, flags, triangles - all these patterns that many technical traders use a lot every day. We are going to do that automatically by using our machine learning algorithms".
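For readers unfamiliar with chart patterns, the Python sketch below shows what detecting one of them, a double bottom, might look like. It is a simple rule-based heuristic rather than machine learning, and it is not Kavout's algorithm: the function names and both thresholds (`tolerance`, `min_rebound`) are assumptions chosen for illustration, and prices are assumed to be positive.

```python
def local_minima(prices):
    """Indices where the price is strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(prices) - 1)
        if prices[i] < prices[i - 1] and prices[i] < prices[i + 1]
    ]


def find_double_bottoms(prices, tolerance=0.03, min_rebound=0.05):
    """Return (first_low, second_low) index pairs that look like a double bottom.

    Two consecutive lows qualify when they are within `tolerance` of each
    other and the peak between them is at least `min_rebound` above the
    higher of the two lows. Both thresholds are illustrative.
    """
    lows = local_minima(prices)
    patterns = []
    for a, b in zip(lows, lows[1:]):
        first, second = prices[a], prices[b]
        peak = max(prices[a : b + 1])
        higher_low = max(first, second)
        similar_lows = abs(first - second) / first <= tolerance
        clear_rebound = (peak - higher_low) / higher_low >= min_rebound
        if similar_lows and clear_rebound:
            patterns.append((a, b))
    return patterns


prices = [10, 9, 8, 9, 10, 9, 8.1, 9, 10]
double_bottoms = find_double_bottoms(prices)
```

A learned model would replace the hand-set thresholds with parameters fitted to labelled examples, but the input (a price series) and the output (the location of a pattern) would be much the same.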

Deep learning v machine learning

Kavout is also in the process of deploying deep learning models to its data. IBTimes asked Lu to explain the difference between machine learning and deep learning. He said: "The advantage of deep learning compared to the traditional machine learning programs is, I think, something called auto-encoding; more feature representation or automatic feature selection by the deep learning models."

"Traditionally, when we do the machine learning, we have to label - especially for the supervised machine learning - we have to manually label training datasets. That's a lot of effort."

"The second thing we are going to do in the traditional machine learning is called feature engineering. So, similar to machine learning happening at Google and Microsoft and Baidu, you spend a lot of time to do this feature engineering and feature selection."

"But for deep learning the advantage is the feature engineering side; feature selection and feature representation is all done by this auto-encoding and neural networks, so they can learn this by themselves and also it does not need so many human labels training datasets. That's two advantages of deep learning mechanisms over traditional machine learning techniques."

While back-testing portfolios of stocks with different Kai scores, Lu found deep learning shows much better performance compared to traditional machine learning algorithms. "This is something we are building right now. We are going to integrate deep learning techniques into our models in the next couple of weeks."

While Kavout finalises the platform, the company will invite users to try it for free and ask for their feedback. "So it's different from open source," said Lu. "Some of our features will be free, which means open to everybody. But for some of the features we will charge; premier features we are going to charge by subscription model, and also we are going to look for other opportunities to do this white labelling with institutions, we could also white label our platform for some institutions."

"In addition, we are also building up our wealth management and trading business. I think later this year, we are going to launch mobile trading apps. All of this is going to bring revenue to our company."