How do data scientists differentiate themselves in a world of rapidly advancing artificial intelligence and machine learning? This is something that hedge funds and asset management firms, which are heavily invested in new technology, think about a great deal.
The components most often cited, in order of importance, are data, algorithms and hardware. But while these are crucial, the secret sauce (a fitting epithet in this case) is often how these ingredients are combined.
It's well documented that data-hungry buyside firms scour land, sea and space looking for data sets that their peers haven't encountered yet. Meanwhile, Google's Tensor Processing Unit (TPU) and other application-specific processors for machine learning are in a hardware arms race. Add to this the vibrant big data open source community, which has even been driven by hedge funds like AQR and Man AHL.
"The question is what will become a commodity and what will differentiate," said Sharad Sachdev, managing director, Accenture, Artificial Intelligence and Analytics.
"If everyone can afford the same hardware, is it the algorithm or the data – and to what extent does data also become a commodity? I think it's a trick question. All of these are important, but it's more about how you piece the data, hardware and algorithm together. Isn't that your secret sauce, the thing that differentiates you?"
Accenture will be speaking at Newsweek's next AI and Data Science in Capital Markets event in New York City.
Sachdev's colleague Oscar Garcia, head of Capital Markets Nordic and global lead for Innovation in Capital Markets at Accenture FS, says that one differentiator amid all the objectively verifiable data is sentiment: the way in which news is interpreted and processed.
Garcia said: "Companies will achieve differentiation in how they interpret and process news, and information that is not objective. This is why human thinking and AI will become more and more relevant. The ability to process and understand news in real time will bring uniqueness and differentiation."
His point opens up practical and ethical concerns. If everyone ends up using the same systematic approach, especially in the world of trading, these factors get priced in as more models are formulated. "In theory, we will see a narrowing in pricing and this will have an effect on market making, especially where latency and speed are key competitive advantages," said Garcia. "However, I'm not convinced there will be multiple firms using the same algorithmics. There is always a human angle and differentiation brought by humans."
Robots and AI taking people's jobs is a well-known motif. The world awaits the impact on employment as AI becomes more widespread, and the transition in skills and functions that will accompany it. Another ethical question concerns inequality of access to information, which in the capital markets realm translates to the creation of wealth. In other industries, such as the health space, the benefits to society are obvious.
Another question concerns transparency and interpretability of the technology that firms are banking on. "Are we investing in technology that will favour a lack of transparency?" asks Garcia. "Especially considering the regulatory wave we are seeing today. What regulators are trying in general to facilitate within the industry is transparency."
On the topic of black box ethics, Sachdev pointed out that in some cases there is less concern about how something was done; all that matters is that it was done well. "So that's why you are seeing more adoption and maturity in certain areas, and less adoption in other areas like bank underwriting, insurance, health underwriting - just because of the lack of interpretability of these algorithms."
There is quite a bit of interest in how to train people to become good data scientists (really an intersection of disciplines like software engineering, statistics and data warehousing). Accenture has partnered with the likes of MIT and Stanford on research that is driving training programmes for its own staff. Sachdev said helpful prerequisites include some maths and programming skills, but you can still get an understanding without these.
He said: "We have a lot of people now trained in machine learning. Also, we run an internal advanced machine learning forum every two weeks, where people can come and talk about the next machine learning algorithm they have written, how they wrote it, what worked and what didn't."
When it comes to building out machine learning capabilities at your firm, one option is to do this from scratch, "building these models in Python, in R, and using a library of algorithms that are now available open source, from machine learning to deep learning," noted Sachdev.
"Now, you need special skills to understand how to build these models and that's why the job of data scientist has been called the sexiest job yet the hardest job in the market, because there is a certain maths skill required and programming skill to do that.
"The second way to build these solutions is to use the platforms like Google TensorFlow or IBM Watson," he said.
"Rather than start from scratch to build the model, you are using an API to call the underlying capabilities that Google or IBM has built. You need to know which API is the most relevant, and how to train the algorithm that gives you access to that API.
"Everyone has access to all of these libraries. So, with art as with science, it's about how you piece it together. With the same ingredients I could be a Michelin Star chef, or I could be running a fast food joint."