Bringing machine learning to the masses [Q&A]
Big businesses are increasingly recognizing the potential of data science and machine learning. Until recently, however, these technologies haven't been readily available to smaller organizations and individuals.
But now companies like Amazon and Google are beginning to make machine learning available more widely. Is this the start of a new trend? What will it mean for businesses, and will we see the rise of a new generation of ‘citizen data scientists’?
We spoke to Mike Weston, CEO of data science specialist Profusion, to find out more.
BN: Isn't machine learning something that's only for techies and big organizations?
MW: No. This is a common misconception. You do not have to be a big organization to benefit from machine learning; you just have to be willing to invest time.
Machine learning is a branch of data science which focuses on using a form of artificial intelligence to help solve problems. Initially, a data scientist will train a computer with questions and answers relevant to a business' issues and the computer will then learn how to answer similar questions in the future. This training can be tailored to a business' needs and characteristics -- including its size.
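The "questions and answers" idea can be illustrated with a very small sketch. It assumes a made-up set of customer records (visits per month and average spend) paired with known outcomes, and answers a new question by copying the answer of the most similar known example (a nearest-neighbour rule). The data, the labels and the `predict` function are invented for illustration; real systems use far more data and more capable models.

```python
from math import dist

# Hypothetical training data: (monthly_visits, average_spend) -> known answer.
# These numbers are invented purely for illustration.
training_examples = [
    ((1, 10.0), "likely to leave"),
    ((2, 15.0), "likely to leave"),
    ((8, 60.0), "likely to stay"),
    ((10, 75.0), "likely to stay"),
]


def predict(question):
    """Answer a new question by copying the answer of the closest known example."""
    _, answer = min(training_examples, key=lambda ex: dist(ex[0], question))
    return answer


print(predict((9, 70.0)))
print(predict((1, 12.0)))
```

Adding more labelled examples to `training_examples` is the "training" step in this toy version: the more relevant examples the computer has seen, the better its answers to similar questions tend to be.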
BN: Are we seeing a move towards data science being available to all?
MW: Most definitely -- large tech companies are starting to take notice of machine learning and data science, and are developing tools to make them more accessible to 'non-techies'. You have Amazon and Microsoft integrating machine learning tools within their AWS and CRM platforms, while Google has recently released TensorFlow. This trend is only going to increase as more companies become aware of the importance and potential of data science.
I also expect that as these machine learning tools become more commonplace, we’ll see an increase in the number of 'citizen data scientists' within companies. These are people who are embedded in different departments and who have enough knowledge to understand the value and applications of data science and who can undertake rudimentary analysis. It will be these people who become data science’s greatest cheerleaders.
BN: Thanks to smart devices and the Internet of Things, enterprises are collecting more data than ever. Is there a risk that much of this is being wasted?
MW: At the moment, I am pretty certain that many businesses are wasting their data. The waste is only going to grow as smart devices and the IoT become more mainstream. The main issue is that many businesses do not see the full potential of the mountain of data they are sitting on, and they do not know how to use it.
BN: What opportunities can machine learning bring to businesses?
MW: There are a great many opportunities. For example, a supermarket retailer could look at all its transactional data and see that eggs, milk, vanilla pods and flour are often bought in the same basket. Overlaying information on common recipes along with seasonal and weather data may then tell that retailer that all these items are commonly used to bake a cake, and that baking increases when the weather is overcast and peaks in winter. The supermarket could then use this information to predict demand for certain items of stock, and to recommend baking goods and recipe books.
The advantage of using a machine learning algorithm for this is that the computer constantly learns and improves. Although a retailer may have many different products and millions of transactions, using an algorithm saves the time it would take a team of human beings to sift through all this data. As the algorithm learns from its mistakes, it gets increasingly accurate over time.
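The first step of the supermarket example, spotting items that are often bought together, can be sketched in a few lines. This is only a rough illustration: the baskets below are invented, the `pair_counts` helper is hypothetical, and simple pair counting stands in for the more sophisticated techniques a retailer would apply to millions of transactions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
baskets = [
    {"eggs", "milk", "flour", "vanilla pods"},
    {"eggs", "milk", "flour"},
    {"milk", "bread"},
    {"eggs", "flour", "vanilla pods"},
    {"bread", "butter"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives every pair a single canonical ordering
        counts.update(combinations(sorted(basket), 2))
    return counts


counts = pair_counts(baskets)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs that appear together unusually often are candidates for the kind of overlay described above, where recipe, seasonal and weather data help explain why the items are bought together.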
Data science can also be applied to other problems and industries. It has the potential to streamline an organization's supply chain, inform public services and infrastructure, support pharmaceutical research and clinical trials, and be applied to employee wellness. Data science's potential is game-changing.
BN: On the other side of the coin, what concerns is it likely to raise?
MW: Whenever businesses work with personal data, there are always concerns over the security and use of that data. It is important that organizations consider how they store their data, along with who has access to it. Companies will have to make some difficult decisions regarding how far they go in using their consumers' and staff's data. I recommend that businesses create guidelines explaining how the data will be used, and why.
If businesses fail to use data in an ethical way, they could face a backlash and loss of trust from their consumers and staff. This would be damaging to the entire tech industry and is definitely something we wish to avoid.
BN: Will we see simplified data science opening the door for more complex projects?
MW: Simplified data science will almost certainly open the door to more complex projects.
The machine learning platforms offered through Amazon and Microsoft have made data science more accessible, but only for a limited number of applications. Amazon's offering emphasizes e-commerce and targets users of Amazon Web Services. Meanwhile, Microsoft's system targets marketing and CRM.
Users of these systems will be able to detect fraud and fake reviews, predict if a customer is likely to leave and tailor a marketing strategy. However, these users will not be able to do more complicated data science. They will lack the technical knowledge required to write algorithms, clean data or choose which technique to use to solve a particular problem. In this way, businesses which see the benefits of using the simpler data science tools may then turn to data scientists for more complex work.