Much of the current machine learning revolution originated in applications such as computer vision that have nothing to do with finance. An interesting question is how far the latest artificial intelligence and deep learning techniques can cross over into finance.

Financial data modelling is beset by a low signal-to-noise ratio, whereas the data used to teach a computer to identify a picture of a cat, for example, is unambiguous. The financial universe is a non-stationary environment with variable patterns of correlation between stocks, bonds and other instruments. Not least, the task in hand is essentially about predicting things that haven't happened yet.

For nearly 30 years now, UK hedge fund manager Man AHL has been trawling through enormous historical datasets trying to understand what is predictable and what's just noise. Today the company employs a team from a diverse range of scientific backgrounds and uses a combination of data science and machine learning techniques to manage significant amounts of client money.

Anthony Ledford, Man AHL's chief scientist, emphasises the importance of diversity in all things and knows never to have too much faith in any one prediction model.

Ledford said: "In that kind of data environment, where the 'ground truth' is changing, you can't just have a model that expects some static view, like learning the characteristics of a cat.

"Also, it's easy to look at historical data and to decompose it and say this is related to that, and do that retrospectively. But if you've actually got to make a snapshot based on the data you've got today and understand what predictability there is about what happens next, that is a much harder problem."

Ledford believes it's important to convey to people that this is not some magic black box, where nobody can understand what's going on inside.

"I think that's an unacceptable way of thinking of these things. You absolutely have to be able to understand how the algorithms are behaving; what they are actually learning and articulate that in ways that people can have some feeling for. In the past, there wasn't very much work done on this, whereas a lot of academic work now is looking specifically at the interpretability of these machine learning systems and I think that's a very good thing," he said.

Some financial data scientists also see the complexity of certain types of financial models constructed in the past couple of decades as problematic. Indeed, practitioners of quantitative finance can be broadly divided into two camps: those who like writing down equations for how the world should behave, and those who look at data and try to detect patterns of behaviour. The former can come up with elegant formulas for pricing things, but the assumptions behind these equations may not hold robustly in the real world.

Anthony Ledford, chief scientist, Man AHL

"You can end up with market shocks that are much larger than the assumptions of the underlying probability distributions," said Ledford. "It's because of having too much faith in your model. We realise this at Man AHL. If you ask me how much faith do I have in any particular model or being able to predict an individual price of a financial instrument, well I have very little faith in it. It's slightly more than a coin flip.

"The point is to understand the uncertainty and build the portfolio in a rational way. You would never bet the house on a very concentrated trade; you have to diversify by trading hundreds of individual instruments and you have to trade them for a long period of time.

"In every single instrument that we are trading, for every single trade that we put on, we seek to have a tiny edge that is working in our favour. The way you can turn that into something that makes sense from an investment point of view, is to distil those tiny statistical edges down into something that, at the portfolio level, makes sense as an investment product," he said.

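Ledford's point about distilling tiny edges can be illustrated with some back-of-envelope arithmetic. The sketch below is not Man AHL's method: it assumes a hypothetical 51% win rate, equal-sized bets that pay +1 or -1, and complete independence between bets, which real instruments do not satisfy. Under those assumptions the ratio of expected profit to its standard deviation grows with the square root of the number of bets, and a normal approximation (rough for small numbers of bets) gives the chance of ending in profit.

```python
import math

def edge_summary(p_win, n_bets):
    """Return (mean/std ratio of total P&L, normal approximation of P(total > 0)).

    Assumes n_bets independent bets, each paying +1 with probability p_win
    and -1 otherwise. Both the payoff shape and independence are simplifications.
    """
    mean = 2.0 * p_win - 1.0                 # expected P&L of one bet
    std = math.sqrt(1.0 - mean ** 2)         # standard deviation of one bet
    ratio = math.sqrt(n_bets) * mean / std   # mean/std of the sum of all bets
    prob_positive = 0.5 * (1.0 + math.erf(ratio / math.sqrt(2.0)))
    return ratio, prob_positive

# 0.51 is an illustrative win rate, "slightly more than a coin flip".
for n in (1, 100, 10_000, 1_000_000):
    ratio, prob = edge_summary(0.51, n)
    print(f"{n:>9} bets: mean/std = {ratio:6.2f}, P(profit) ~ {prob:.3f}")
```

Correlation between bets would slow this improvement considerably, which is one reason the quote stresses trading hundreds of different instruments over a long period.
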
A toolbox for learning

Starting out with data rather than a hypothesis is how machine learning works. It could be described as a toolbox for learning from data where you don't pre-define the shape of the model. The software has to try and fathom that out by itself, so an order of magnitude more intelligence or complexity is involved compared to standard models.

This approach is enhanced by Bayesian methodology, which is based on a theorem that updates belief probabilities as more evidence or information becomes available. Man AHL uses a branch of machine learning called Bayesian nonparametrics: methods that can determine an appropriate model complexity directly from data.
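
To make the idea of updating beliefs concrete, here is a minimal sketch of a Bayesian update. It uses a simple parametric Beta-Bernoulli model, which is far simpler than the Bayesian nonparametric methods the article mentions and is chosen only for illustration. The observations are made up, and the function name `update_beta` is hypothetical.

```python
def update_beta(alpha, beta, outcomes):
    """Update a Beta(alpha, beta) belief about a success probability.

    Each True outcome adds one to alpha, each False outcome adds one to beta.
    """
    for hit in outcomes:
        if hit:
            alpha += 1
        else:
            beta += 1
    return alpha, beta

alpha, beta = 1.0, 1.0  # uniform prior: no initial view on the success rate
observations = [True, False, True, True, False, True]  # made-up data

for i, hit in enumerate(observations, 1):
    alpha, beta = update_beta(alpha, beta, [hit])
    print(f"after {i} observations: posterior mean = {alpha / (alpha + beta):.3f}")
```

The belief starts at 0.5 and shifts a little with each new piece of evidence, which is the behaviour the theorem describes.
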

Ledford said: "In the Bayesian area, your numerical belief in the parameters updates as the data comes in. Now, there is a very genuine question as to whether you need to update those beliefs as every single data point comes in, or can you update them for example on a calendar period basis – once a month or once a year, or whatever. I could follow the full Bayesian prescription and have all these things update as every single new data point comes in, but it's important to assess whether the impact of updating at that rate is sufficient to be worth that computational effort."
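
The trade-off Ledford describes can be sketched with a toy comparison. The code below assumes a simple Beta-Bernoulli model and simulated data, neither of which comes from Man AHL, and the 50-observation batch size is arbitrary. In a conjugate model like this one, updating after every observation and updating once per batch reach the same belief at each batch boundary; what differs is how stale the batched belief is in between, and how often the update has to be computed.

```python
import random

def posterior_means(outcomes, update_every):
    """Posterior mean of a Beta(1, 1) belief after each observation,
    refreshing the belief only once every `update_every` observations."""
    alpha, beta = 1.0, 1.0
    pending_hits = pending_misses = 0
    means = []
    for i, hit in enumerate(outcomes, 1):
        pending_hits += hit          # True counts as 1, False as 0
        pending_misses += not hit
        if i % update_every == 0:    # fold the pending batch into the belief
            alpha += pending_hits
            beta += pending_misses
            pending_hits = pending_misses = 0
        means.append(alpha / (alpha + beta))
    return means

rng = random.Random(0)
data = [rng.random() < 0.52 for _ in range(1000)]  # simulated outcomes

per_point = posterior_means(data, update_every=1)
batched = posterior_means(data, update_every=50)
largest_gap = max(abs(a - b) for a, b in zip(per_point, batched))

print(f"final belief: per-point {per_point[-1]:.4f}, batched {batched[-1]:.4f}")
print(f"largest gap between the two beliefs along the way: {largest_gap:.4f}")
```

Whether the gap between updates matters depends on how quickly the underlying behaviour changes, which is the judgement call the quote raises.
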

Summing up, Ledford said understanding the uncertainty of what you are doing is probably even more important than understanding the alpha generation side. Risk has to be managed every day; you can take too little risk or too much.

"It's tempting to have an idealised view of the world where everything is stationary and you get your risk right and then your returns per day come in steadily. But in reality you tend to see that the results of strategies arise in bursts; they don't come in uniformly. There are periods when it works and there are periods when it doesn't work.

"You can try and build systems that turn on when they see these things happening and turn off when they don't. That's quite hard to do. What you find easier to do is actually build systems that are capable of performing well when the effect they are trying to capture is there; and when it's not there, they are very lean in terms of the risk they take and amount of trading they do.

"It's really about understanding the nature of this effect we are trying to capture. And that goes right the way through from the first things we have been doing in the traditional momentum and carry space to our machine learning systems."

Anthony Ledford is a speaker at Newsweek's forthcoming Data Science and Capital Markets event in London on 1 and 2 March.