Knowledge graph: an introduction for beginners
Author: Hongliu CAO, PhD
I became interested in knowledge graphs (KGs) last year, and I find the field very interesting and promising. In this blog article, I introduce KGs at a high level and in a less ambiguous way, so that everyone can follow.
What is a KG?
The definition of a knowledge graph is a little tricky, because there is no agreement on it in either industry or academia. In the paper Toward a better definition of Knowledge graphs, the authors list and discuss various definitions of KG from the literature (shown in the table below).
The good news is that the authors of this paper also propose their own definition of a KG, with three key parts: 1) it integrates information, 2) it uses an ontology, and 3) it applies a reasoner to derive new knowledge.
Even though this definition is widely cited, two key points are still missing: 1) what is knowledge? 2) why a graph? The definition neither explains what knowledge is nor mentions a graph.
A simple definition
In this blog, I’ll try to give a simple, clear and less ambiguous definition of a knowledge graph, summarized in the following figure. Knowledge can be information, facts, understanding, perception, observations and so on. For example, a general-world KG behind Google Search or Microsoft Bing contains a lot of facts about the world, while a company KG can contain the various data the company has collected (user data, customer data, financial data, etc.). In either case, different data sources are needed, and an ontology is necessary to define the shared vocabulary, definitions and constraints of the knowledge. According to many studies, the ontology also enables reasoning over the KG. Finally, the knowledge is represented in the form of a graph, for reasons that range from database management to data representation (which will be discussed in future posts).
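To make the role of the ontology a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the class names, the relation names and the helper functions `all_classes` and `is_valid` are hypothetical, and real systems usually rely on dedicated ontology languages and reasoners rather than dictionaries. The sketch only shows two ideas: a shared vocabulary with constraints, and a very simple form of reasoning (a Painter is also an Artist and a Person).

```python
# Hypothetical mini-ontology: a class hierarchy plus, for each relation,
# the class expected for its subject (domain) and for its object (range).
SUBCLASS_OF = {"Painter": "Artist", "Artist": "Person"}
RELATION_SCHEMA = {
    "created": ("Artist", "Artwork"),
    "born_in": ("Person", "Place"),
}


def all_classes(cls):
    """Return cls together with every ancestor class in the ontology."""
    found = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def is_valid(triple, entity_class):
    """Check a (subject, relation, object) triple against the relation schema."""
    subject, relation, obj = triple
    if relation not in RELATION_SCHEMA:
        return False
    if subject not in entity_class or obj not in entity_class:
        return False
    domain, range_ = RELATION_SCHEMA[relation]
    return (domain in all_classes(entity_class[subject])
            and range_ in all_classes(entity_class[obj]))


# Each entity is declared with its most specific class only.
entity_class = {
    "Vincent van Gogh": "Painter",
    "The Starry Night": "Artwork",
    "Zundert": "Place",
}

# Reasoning: the class hierarchy lets us derive that a Painter is a Person.
print(sorted(all_classes("Painter")))
# Constraint checking: an artist can create an artwork, not the other way round.
print(is_valid(("Vincent van Gogh", "created", "The Starry Night"), entity_class))
print(is_valid(("The Starry Night", "created", "Vincent van Gogh"), entity_class))
```

Note that "Vincent van Gogh" is only declared as a Painter, yet the "born_in" relation (which expects a Person) would still accept him, because the hierarchy lets us derive the broader classes.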
Basically, a graph makes everything (well, many things) connected, or in Google’s words: things, not strings. For example, in the example above, Van Gogh is not just a string; it is an entity connected to many other entities through various relations such as birth date, birthplace, artworks, etc. And when you search for Van Gogh on Google, this general information is shown automatically (Figure 2) thanks to the knowledge graph.
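The "things, not strings" idea can be sketched with a handful of (subject, relation, object) triples. This is only a toy illustration in plain Python, not how Google stores its knowledge graph: the relation names and the helpers `build_index`, `describe` and `follow` are made up for this post.

```python
from collections import defaultdict

# A tiny graph stored as (subject, relation, object) triples.
# The relation names are illustrative, not a standard vocabulary.
triples = [
    ("Vincent van Gogh", "birth_date", "1853-03-30"),
    ("Vincent van Gogh", "birth_place", "Zundert"),
    ("Vincent van Gogh", "created", "The Starry Night"),
    ("Vincent van Gogh", "created", "Sunflowers"),
    ("Zundert", "located_in", "Netherlands"),
]


def build_index(triples):
    """Index the triples as entity -> relation -> list of connected objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in triples:
        index[subject][relation].append(obj)
    return index


def describe(index, entity):
    """Collect everything directly connected to an entity (an 'info box')."""
    return {rel: list(objs) for rel, objs in index.get(entity, {}).items()}


def follow(index, entity, *relations):
    """Walk a chain of relations starting from an entity."""
    frontier = [entity]
    for relation in relations:
        frontier = [obj
                    for node in frontier
                    for obj in index.get(node, {}).get(relation, [])]
    return frontier


index = build_index(triples)
print(describe(index, "Vincent van Gogh"))
# In which country was Van Gogh born? Follow two edges in the graph.
print(follow(index, "Vincent van Gogh", "birth_place", "located_in"))
```

Because "Zundert" is itself an entity and not just a string, the second query can keep walking from the birthplace to the country, which a flat text field would not allow.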
Conclusion
Generally speaking, a knowledge graph integrates information from multiple sources with the help of an ontology, and the knowledge is represented in the form of a graph. Why a graph? There will be several future posts on this interesting topic. A knowledge graph transforms your messy data silos or data lake into unified, connected, meaningful, interoperable and reusable data, with several advantages:
•Knowledge graphs allow the easy integration of heterogeneous data of varying quality from different providers.
•Knowledge graphs are easy to extend and scale.
•Users benefit from the fact that they can easily and conveniently retrieve all relevant data.
•The integration of algorithms and rules additionally allows logical and arithmetic operations to be executed directly during data retrieval, which saves programming effort.
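As a rough illustration of the first and last points, the sketch below merges two differently shaped sources into one set of triples and then applies a simple rule at retrieval time. The data, the identifiers and the functions `integrate`, `total_billed` and `high_value_customers` are all hypothetical; a production KG would typically use a graph database and a query language instead.

```python
# Two hypothetical sources that describe the same customers in different shapes.
crm_rows = [
    {"id": "c1", "name": "Alice", "city": "Paris"},
    {"id": "c2", "name": "Bob", "city": "Lyon"},
]
billing_rows = [("c1", 1200.0), ("c2", 80.0), ("c1", 300.0)]


def integrate(crm_rows, billing_rows):
    """Map both sources into one set of (subject, relation, object) triples."""
    triples = set()
    for row in crm_rows:
        customer = f"customer:{row['id']}"
        triples.add((customer, "name", row["name"]))
        triples.add((customer, "lives_in", row["city"]))
    for number, (customer_id, amount) in enumerate(billing_rows):
        invoice = f"invoice:{number}"
        # The shared customer identifier is what connects the two sources.
        triples.add((invoice, "billed_to", f"customer:{customer_id}"))
        triples.add((invoice, "amount", amount))
    return triples


def total_billed(triples, customer):
    """Arithmetic at retrieval time: sum the amounts of a customer's invoices."""
    invoices = {s for s, r, o in triples if r == "billed_to" and o == customer}
    return sum(o for s, r, o in triples if r == "amount" and s in invoices)


def high_value_customers(triples, threshold=1000.0):
    """A simple rule: a customer is 'high value' if total billing >= threshold."""
    customers = {o for s, r, o in triples if r == "billed_to"}
    return sorted(c for c in customers if total_billed(triples, c) >= threshold)


graph = integrate(crm_rows, billing_rows)
print(total_billed(graph, "customer:c1"))
print(high_value_customers(graph))
```

Adding a third source here only means adding more triples to the same set, which hints at why such graphs are comparatively easy to extend.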