Knowledge Graphs are currently on the technology hype train, with various vendors, projects and articles springing up all over the place. The hype has been pushed by large companies such as Google, who have published their own knowledge graphs.
There has been significant traction around graph databases, which use graph structures to store data. These tools and projects are used to model background knowledge and to define relationships between the entities in it. Such graphs can be used to augment training data when it is sparse, and general graphs are used in a form of data analytics called Network Science, which is applied to tasks such as web search and information retrieval.
The Semantic Web uses mark-up languages such as RDF (Resource Description Framework) and OWL (Web Ontology Language) to encode semantics, which is simply the meaning of data, together with the data itself. The mark-up languages are therefore used to formally represent metadata, which is data that describes data.
There are a number of technologies or formal models that are assumed to be part of the semantic web. These models include ontologies and taxonomies. Ontologies at their most simple are directed graphs with edges and nodes. Each node represents an entity and each edge (connection) represents the relationship between the two connected entities. However, it is not the mechanics of an ontology that are important, but rather what the ontology represents. A common definition of what an ontology represents is “a partial, simplified conceptualization of the world as it is assumed to exist by a community of users – a conceptualization created for an explicit purpose and defined in a formal, machine-processable language”.
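The node-and-edge view can be sketched in a few lines of Python. This is only an illustration, assuming each edge is stored as a (subject, predicate, object) triple in the style of RDF; the facts and the helper names `build_graph` and `nodes` are hypothetical and do not come from any particular library.

```python
from collections import defaultdict

# Hypothetical example facts; each edge is a (subject, predicate, object) triple.
TRIPLES = [
    ("Tim_Berners-Lee", "invented", "World_Wide_Web"),
    ("Tim_Berners-Lee", "type", "Person"),
    ("World_Wide_Web", "type", "InformationSystem"),
]


def build_graph(triples):
    """Index edges by their source node: node -> list of (relationship, target)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def nodes(triples):
    """Every entity that appears at either end of an edge."""
    return {t[0] for t in triples} | {t[2] for t in triples}


graph = build_graph(TRIPLES)
print(sorted(nodes(TRIPLES)))
print(graph["Tim_Berners-Lee"])
```

Nothing in this structure says what "invented" or "Person" mean; that is the part an ontology is meant to supply.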
Unlike simple graphs, ontologies can come with schemas, or with a Description Logic (DL). Schemas for ontologies are often written in RDFS (RDF Schema, also known as the RDF Vocabulary Description Language), and such a schema is “a set of classes with certain properties using the RDF extensible knowledge representation data model, providing basic elements for the description of ontologies”.
Description Logics, on the other hand, are different to standard schemas, even ones written in RDFS. A Description Logic is a set of logical axioms which describe the application domain, and therefore includes assumptions and statements about roles and concepts. It is used by reasoners, which are inference tools, to make deductions about the domain described by an ontology. Description Logics have two types of statements:
- TBox (Terminological Box)
- ABox (Assertional Box).
The TBox statements are sentences that describe concept hierarchies, such as relations between concepts, whereas ABox statements are ground sentences stating where in the hierarchy the individuals belong.
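The split between TBox and ABox, and the kind of deduction a reasoner makes from them, can be illustrated with a toy example. This is a minimal sketch, assuming a TBox that contains only an acyclic single-parent concept hierarchy; real Description Logic reasoners handle roles, restrictions and far richer axioms. The domain and the function names `superconcepts` and `infer_types` are hypothetical.

```python
# TBox: concept hierarchy, child concept -> parent concept (hypothetical domain).
TBOX = {
    "Dog": "Mammal",
    "Cat": "Mammal",
    "Mammal": "Animal",
}

# ABox: ground statements placing individuals in the hierarchy.
ABOX = {
    "rex": "Dog",
    "tom": "Cat",
}


def superconcepts(concept, tbox):
    """All concepts that subsume `concept`, walking up the hierarchy.

    Assumes the hierarchy is acyclic; a cycle would loop forever.
    """
    result = []
    while concept in tbox:
        concept = tbox[concept]
        result.append(concept)
    return result


def infer_types(individual, abox, tbox):
    """The asserted concept plus every concept entailed by the TBox."""
    asserted = abox[individual]
    return [asserted] + superconcepts(asserted, tbox)


print(infer_types("rex", ABOX, TBOX))  # ['Dog', 'Mammal', 'Animal']
```

The ABox only states that rex is a Dog; the facts that rex is also a Mammal and an Animal are deduced from the TBox rather than stored.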
After these brief descriptions, can it be concluded that knowledge graphs and ontologies are similar, or at least members of the same family? The short answer is no.
Knowledge Graphs are at best a very informal version of an ontology because they only hold data. It is a little surprising that the IT community has returned to graphs for data storage, as this was tried before the advent of relational databases and was shown to be less than effective. Ontologies, however, can be seen as metadata, i.e. data describing data, and an ontology can then be populated with facts which conform to that metadata.
Because ontologies are metadata descriptors, they can be seen as a tight definition of a domain, whereas knowledge graphs are more like a data lake where data can be dumped without thinking about its role in the wider domain. Consequently, Knowledge Graphs are less reliable in tasks such as ad-hoc data integration and discovery, where data concepts and their relationships have to be clearly defined and known to all of the actors in the domain.
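The contrast between dumping facts and populating a defined domain can be sketched as follows. This is an illustration under stated assumptions, not how any particular tool works: the schema, the class assignments and the `conforms` function are hypothetical, and domain and range are treated here as closed-world constraints used for checking, which is a simplification (RDFS itself uses domain and range to infer types rather than to reject data).

```python
# Hypothetical schema: property -> (required subject class, required object class).
SCHEMA = {
    "employs": ("Company", "Person"),
    "locatedIn": ("Company", "City"),
}

# Hypothetical class assignments for individuals.
TYPES = {
    "Acme": "Company",
    "alice": "Person",
    "Paris": "City",
}


def conforms(fact, schema, types):
    """True if the fact uses a declared property with correctly typed ends."""
    subject, prop, obj = fact
    if prop not in schema:
        return False
    domain, range_ = schema[prop]
    return types.get(subject) == domain and types.get(obj) == range_


facts = [
    ("Acme", "employs", "alice"),    # conforms to the schema
    ("Acme", "locatedIn", "alice"),  # object has the wrong class
    ("alice", "likes", "Paris"),     # property is not declared at all
]

accepted = [f for f in facts if conforms(f, SCHEMA, TYPES)]
rejected = [f for f in facts if not conforms(f, SCHEMA, TYPES)]
print(accepted)
print(rejected)
```

A schema-less graph would store all three facts without complaint, leaving every consumer to guess what "likes" means or whether a company can be located in a person.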
The Berners-Lee idea of a machine-readable web is some way off. However, the recent activity of large companies in producing knowledge graphs is likely to push the adoption of semantic web tools such as ontologies, because software professionals will eventually run into the limitations of knowledge graphs.