Barclays Techstars start-up Seldon drives open source machine learning

Some 250 billion billion (250 × 10^18) transistors were produced in 2014. That means that every second of that year, on average, eight trillion transistors were produced - about 25 times the number of stars in the Milky Way (this statistic is from 2014, so according to Moore's Law production should now have doubled).
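The per-second figure can be checked with a few lines of arithmetic. The sketch below uses the 250 × 10^18 total quoted above. The star count of 300 billion is an assumption (published estimates for the Milky Way range from roughly 100 billion to 400 billion), so the resulting ratio is indicative only.

```python
# Total transistors produced in 2014, as quoted in the article.
TRANSISTORS_2014 = 250e18

# Seconds in a non-leap year such as 2014.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Assumption: about 300 billion stars in the Milky Way.
MILKY_WAY_STARS = 300e9

per_second = TRANSISTORS_2014 / SECONDS_PER_YEAR
ratio_to_stars = per_second / MILKY_WAY_STARS

print(f"{per_second:.2e} transistors per second")
print(f"about {ratio_to_stars:.0f} times the assumed star count")
```

The result is a little under eight trillion per second, which matches the rounded figure in the text.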

The enormous surge in computing power which we are witnessing heralds some other lapel-grabbing metrics: 58% of job activities can be automated; 47% of jobs will be lost to cognitive machines in the next ten years.

Taking advantage of this exponential growth is a wave of machine learning, deep learning and AI specialists. One such company is Seldon, a start-up selected to join the Barclays Accelerator powered by Techstars. Its enterprise-grade machine learning platform helps businesses deploy real-time recommendations and predictions.

Alex Housley, founder and CEO, Seldon, told IBTimes: "We started off as a recommendation engine - like you see on Amazon, where you purchase products that are recommended to you. That's big business. Recommendations are based on machine learning, which requires skilled data scientists not just to build the algorithms and models but also to scale these effectively. We built a recommendation engine focused on media and it worked with newspapers such as the Trinity Mirror to suggest relevant articles. Our API serves hundreds of millions of recommendations per month."

The current "AI summer" is being driven by gargantuan computational power being applied to larger and larger data sets. Housley said that around 2014 he saw a few different market forces at work, "an increasing commoditisation of machine learning and AI technology; popular big data technologies such as Apache Spark and Hadoop were bundling machine learning libraries as part of their systems."

He pointed to more of a social trend, with consumers expecting smarter apps and increasing automation of workforce activities, which is driving big data analytics. "Most companies are sitting on massive silos of data. Not just their structured data - their website activities which are very highly ordered - but also all the documents that are flowing through their systems."

Seldon is also at the forefront of a trend among deep learning and machine learning providers towards open source. In 2015 it released the platform as a fully open source project, effectively moving from pure SaaS, where it provided a black box system, to an end-to-end open source machine learning platform. The open source platform is gaining traction, with over 600 stars and 100 forks on GitHub.

Last year Google, IBM and others moved to open source their machine learning technology: Google launched TensorFlow; IBM launched SystemML; and Microsoft has just released a distributed AI platform.

"It's a trend. It's a great thing," said Housley. "One of the things as a start-up you've got to think about is - why are they doing that? Google is doing that because it wants to grow the market and it wants that technology to come back to Google.

"It's great for recruitment. If people are using your platform, the next logical part is that they may well be interested in working with your company. And generally they also work better with the broader cloud ecosystem that each of those respective companies creates.

"I think as a start-up, even though it sounds slightly audacious, it's important to also try and grow the market, which you can do with open source. It saves people the time they would have otherwise spent building technology in-house. It means the whole industry moves faster as people build their own technologies on top."

As well as providing technology free of charge under a very permissive open source licence, Seldon continues to provide a SaaS platform. Housley said adding open source drives this massively and, interestingly, Seldon didn't lose any SaaS customers when it made the open source version available.

He explained that Seldon's architecture is very pluggable, using microservices technology to enable third parties - other open source machine learning toolkits and libraries, or closed proprietary software or APIs - to connect into it. "We are building a flexible ecosystem of the best machine learning tech and making it very easy for data scientists to deploy a solution which they can leverage."

To illustrate how the technology works Housley used the analogy of a mixing desk with levels like treble and bass; algorithms and their parameters are blended in a similar way.
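The mixing desk analogy can be sketched as a weighted blend of scores from several algorithms. This is an illustration only, not Seldon's code: the function name `blend_scores`, the algorithm names, the article identifiers and every number below are hypothetical.

```python
def blend_scores(scores_by_algorithm, weights):
    """Combine per-item scores from several algorithms using fader-style weights.

    Weights are normalised so they sum to 1, like relative levels on a desk.
    """
    total_weight = sum(weights.values())
    blended = {}
    for name, scores in scores_by_algorithm.items():
        level = weights[name] / total_weight
        for item, score in scores.items():
            blended[item] = blended.get(item, 0.0) + level * score
    return blended


# Hypothetical scores produced by two different algorithms for two articles.
scores_by_algorithm = {
    "clusters": {"article_a": 0.9, "article_b": 0.2},
    "recency": {"article_a": 0.1, "article_b": 0.8},
}

# Hypothetical fader settings: the cluster-based algorithm is turned up higher.
weights = {"clusters": 3.0, "recency": 1.0}

blended = blend_scores(scores_by_algorithm, weights)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

Turning the `recency` weight up relative to `clusters` would push fresher articles higher, which is the kind of adjustment the analogy describes.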

"We build our recommendation models by combining algorithms with behavioural data. One of the algorithms we use for content recommendations finds common attributes and patterns and groups people into these logical clusters, then uses that to decide how to make a recommendation. You could be part of many groups so you are weighted across the groups.

"There's an algorithm called matrix factorisation which is all about building out a grid of possible actions and filling in the gaps mathematically — that works much better in a low churn environment. For instance Netflix would use that to recommend movies. On the other hand financial news uses a high churn algorithm because it's generally out of date the next day."
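A minimal sketch of the "filling in the gaps" idea behind matrix factorisation follows. It is a generic textbook version trained with stochastic gradient descent, not Seldon's or Netflix's implementation. The ratings grid, the names `factorise` and `predict`, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def factorise(ratings, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user and item factor vectors from the observed ratings only."""
    rng = random.Random(seed)
    user_f = [[rng.random() for _ in range(k)] for _ in range(n_users)]
    item_f = [[rng.random() for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), rating in ratings.items():
            pred = sum(user_f[u][f] * item_f[i][f] for f in range(k))
            err = rating - pred
            for f in range(k):
                pu, qi = user_f[u][f], item_f[i][f]
                # Gradient step on squared error with L2 regularisation.
                user_f[u][f] += lr * (err * qi - reg * pu)
                item_f[i][f] += lr * (err * pu - reg * qi)
    return user_f, item_f


def predict(user_f, item_f, u, i):
    """Estimated rating: dot product of the user's and the item's factors."""
    return sum(p * q for p, q in zip(user_f[u], item_f[i]))


# Hypothetical grid: (user, item) -> rating from 1 to 5. Missing pairs are the gaps.
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 3): 5,
    (3, 0): 1, (3, 3): 4,
    (4, 1): 1, (4, 2): 5, (4, 3): 4,
}

user_f, item_f = factorise(ratings, n_users=5, n_items=4)

# Fill in a gap: user 0 has no rating for item 2.
print(round(predict(user_f, item_f, 0, 2), 2))
```

Because the factors are learned from accumulated history, the approach suits catalogues that change slowly, which is consistent with Housley's point about low churn.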

He said Seldon currently deals mostly with structured data, but is moving towards unstructured data. "I'm a massive fan of deep learning; it's changing the world. But it can be very useful to be able to understand how the algorithm reached its conclusion, something that's more achievable with standard machine learning techniques than with deep learning. In some cases, such as the banking environment, it's important to have a layer of auditability in there. We are exploring this right now."
