Have you met that know-it-all expert who always rushes to spoil the AI party by pronouncing that they have seen it all before in the 1980s, when expert systems were flying the space shuttle, DEC was using an AI configurator to reduce its errors in configuring complex mainframe systems, and American Express used AI to approve your credit card purchases in real time? Actually, often that guy was me – I have indeed seen it all before, but it was very different then and it is very different now. How so? Well, let me explain the two branches of AI.
Way back then we had limited data, and the knowledge that software needed to act intelligently was provided by humans in a “ready-made” encapsulated form – expert system rules, knowledge frames or, more recently, ontologies. This was complemented by a “reasoning engine” that could derive useful actions from that knowledge, and the results were indeed spectacular – but only within the very narrow field of expertise encoded in the explicit knowledge given to the machine. This was called “symbolic AI”, and the pinnacle effort to expand the range of problems a single AI system could be applied to was the CYC project led by Doug Lenat, first at MCC and later at Cycorp – a well-funded and long-running effort to create a machine-readable encyclopedia of human knowledge. Check it out at www.cyc.com.
The reason we did not see many more applications of symbolic AI is simple: encoding all that knowledge by human experts was time-consuming and expensive, so the business case for such systems was quite limited. The situation changed at the turn of the 21st century with the advances in data warehousing and then big data storage platforms. Data was now cheap and plentiful. Computer scientists found a way to use it by deriving knowledge from the data itself. Instead of formally capturing the flow of a purchase order through our organization, we give the software a list of the events that happened to 1,000 purchase orders, and it learns the process linking these events together – exceptions included. This is Machine Learning (ML), the more popular type of AI today. So much so that many companies consider it synonymous with AI, and are seldom aware of the older, symbolic type of AI.
There are two main types of ML: unsupervised and supervised.
Unsupervised Machine Learning can uncover patterns in data and group the data according to these patterns without human supervision. An example is clustering, which groups data based on similarities in values. This makes it ideal for exploratory data analysis, when we want to understand our data better. Unsupervised ML is applied in several real-world settings, for example:
- News articles on the same story are grouped under one label in electronic newsfeeds without human intervention.
- Defining different types of customers according to their purchases and demographics allows us to understand their purchasing behavior and helps businesses align their messaging accordingly.
- Car configurators can recommend relevant add-ons for more effective cross-selling strategies.
- Credit card fraud can be detected by learning the pattern of use for a customer and generating an alert when a transaction falls outside of the pattern.
- In webshops, customers are directed to products (or hotels) matching their past purchases.
In the iHelp project, we also use unsupervised ML. For example, the patients who are part of the Hospital de Dénia-MarinaSalud (HDM) pilot are clustered according to factors deemed relevant to the development of pancreatic cancer, such as weight, age, activity levels, and blood sugar levels; the aim is then to detect the inherent risk of developing pancreatic cancer for each cluster. The specific clustering mechanism used is K-Means – check it out at https://en.wikipedia.org/wiki/K-means_clustering
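To make the idea concrete, here is a minimal K-Means sketch in plain Python (no libraries). The patient records and feature values (weight in kg, age, blood sugar) are invented for illustration only – the iHelp pilot works with real clinical data and far richer features.

```python
import random

def kmeans(points, k, iterations=20, seed=42):
    """Cluster `points` (lists of numbers) into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k random starting centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = [sum(dim) / len(cluster) for dim in zip(*cluster)]
    return centroids, clusters

# Hypothetical patient records: [weight_kg, age, blood_sugar]
patients = [
    [95, 62, 140], [98, 65, 150], [102, 70, 155],  # heavier, older, higher sugar
    [60, 30, 85],  [65, 28, 90],  [58, 35, 80],    # lighter, younger, normal sugar
]
centroids, clusters = kmeans(patients, k=2)
```

On this toy data the algorithm separates the two groups of patients without ever being told the groups exist – which is exactly what “without human supervision” means above.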
Supervised Machine Learning is the subject of our next blog. For the moment, let us just say that we now tell the software which of our patients have developed diabetes, for example, and the software tries to learn the pattern in the other parameters of the data that indicates a patient is likely to develop diabetes. We can then apply this learned pattern to new data, where we do not know whether the patients will develop diabetes, and use it to predict which of them are most likely to do so.
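The supervised idea above can be sketched with the simplest possible rule: predict for a new patient whatever happened to the most similar patient we already know about (a 1-nearest-neighbour rule, shown here only as an illustration – the features and labels are invented, and real supervised models are far more sophisticated).

```python
def predict(labeled, new_point):
    """Return the label of the labeled example closest to `new_point`."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    features, label = min(labeled, key=lambda item: dist(item[0], new_point))
    return label

# Historical records with a known outcome: ([weight_kg, age, blood_sugar], developed_diabetes)
history = [
    ([95, 62, 150], True),
    ([102, 70, 160], True),
    ([60, 30, 85], False),
    ([65, 28, 90], False),
]

# A new patient whose outcome we do not yet know:
print(predict(history, [99, 66, 155]))  # → True: the nearest known record developed diabetes
```

The “learning” here is trivial (just storing the labeled history), but the workflow is the same as in real supervised ML: known outcomes teach a pattern, and the pattern is then applied to patients whose outcome is still unknown.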