
Evolution of Knowledge Graphs

The “Evolution of Knowledge Graphs” webinar, held virtually on August 24 at 12:00 PM EDT, brought together a distinguished panel of experts from various data-related domains. The event aimed to delve deep into the world of knowledge graphs, discussing their innovations, practical applications, and the challenges they present. With speakers hailing from renowned organizations such as Google, WordLift, Infoloom, Yext, Eccenca GmbH, The QA Company, and Semantic Web Company, the webinar provided a comprehensive overview of the field. Topics ranged from cutting-edge developments in knowledge graph technologies to the enhancement of language models using knowledge graphs, emphasizing the role of knowledge graphs in driving data-driven cultures and fostering data coherence and consistency.


Full transcript:

Dhaval: Thank you, everyone. Really appreciate you making time for this experience.

We have the best people from around the planet joining us, ready to share their knowledge on knowledge graphs. Knowledge graphs are such a fascinating field. I had just a little bit of exposure to them while I was working at a large bank, and it was such an interesting experience that I've been studying them ever since. I love the topic of knowledge graphs, so we have assembled a team of people who are well known in this space, and Abdul, who's going to be the moderator of this experience, will be leading us through it.

So Abdul, please take it over from here.

Abdul: Thank you so much. Thank you, Dhaval. So let's just get started. As you all know, the topic of our webinar today is "Evolution of Knowledge Graphs: From Theory to Real-World Applications," right? So without further ado, I'm going to start introducing our panelists, and again, I really thank you all for coming out today and sharing your knowledge. I'm going to start off with Sireesha. She's the tech lead of Knowledge Graph at Google, and she has over a decade of experience with data, from data analysis to data warehousing.

Not only that, she also shares her knowledge and talks at different events. We thank her for her time. Moving on to Andrea. Andrea is CEO of WordLift and co-founder of Insideout10 and RedLink. He's a visionary entrepreneur who has been working for the last 25 years with the semantic web, SEO, and AI. He also has various publications, and he regularly speaks at different conferences.

Moving on to Michel Biezunski. He is the founder of InfoLoom and an innovator who pioneered knowledge graph applications. So that is awesome, we've got the top people here. He's also the co-creator of the Topic Maps standard. You know, when I read his profile, I was amazed: he started off in scientific publishing, then he became a professor, got into consultancy.

And then he became an entrepreneur. One of his notable achievements is the Tax Map for the IRS. So awesome. Thank you, Michel, for your time. Next up is Michael Lannelli. Michael is working as a senior data analyst at Yext. He's passionate about machine learning and is currently working on knowledge-graph-augmented large language models, with a background ranging from analyzing astronomical data gathered from telescopes to working with networks.

And as an instructor, he also shares his experience.

Moving on to Chris Brockmann. Chris is founder and CEO of Eccenca, as well as brox IT Solutions. He has over 30 years of experience and has been working with data compliance, supply chain, and IoT. Next up is Dennis. Dennis is CEO and CDO of The QA Company. He's also a PhD researcher and entrepreneur, and he has over 20 publications.

And he speaks regularly at conferences. Next up is Martin. He is co-founder and managing partner at Semantic Web Company. He has been leading several national and international projects, both with industry and government, and he has been working with enterprise knowledge graphs, metadata, and interoperability topics.

He’s also a lecturer, and he speaks at national and international conferences. So let’s just get started. So my first question is for Michael. So Michael, how does natural language search play a role in enhancing the utility of knowledge graphs?

Michael Lannelli: Hello, everyone. My name is Michael Lannelli. Just to add some things to that great introduction: I'm a senior data scientist at Yext. My background actually starts in engineering, then I made my way through government research and edtech, and I found my way to Yext about two years ago, where I've been ever since.

Here we build brand management software, and two of the core products are a knowledge graph management suite, which many customers use, and a search engine that you can build and customize, which operates over the knowledge graph that you build. So you can imagine it's full of a lot of interesting problems that I get to study.

So just to repeat the question: how does natural language search enhance the utility of knowledge graphs? Right now, knowledge graphs in their current form are kind of a large investment. Sure, they're worth it, but you have to spend significant time to design them, configure them, and curate them, and sometimes you even have to hire specialists to do it. It's not really a surprise to anyone here, but they're also very difficult for the layperson to use. Oftentimes the user needs to formulate their question in the form of a SPARQL query, and that can be a large hurdle to get past. Meanwhile, everyone uses Google, and now a lot of people actually use Bing as well this year, surprisingly.

We would like to get to that experience, the experience of using Google and Bing, over a knowledge graph. So natural language search makes using the knowledge graph ergonomic at both ends: at the creation and curation of the knowledge graph, as well as at its consumption.

The obvious way is on the consumption side: it allows you to access the information contained in the knowledge graph with plain English, something we all use every day. You ask it a question, and if the answer is contained in the knowledge graph, you get an answer back, also in plain English.

On the other side, search is also an ingredient. We offer search as a product, but it's also an ingredient in our knowledge graph creation and curation algorithms. These could be things like schema alignment, entity resolution, even the CRUD operations, create, read, update, delete, that you do to manage the entities within the knowledge graph.

It all starts with a search. If you want the layperson using it, you have to do a search to get the entity, and then you can start interacting with that entity. One of the products we actually develop here, Chat, gives you an autonomous agent that allows you to interact with your knowledge graph entities.
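To make that search-first workflow concrete, here is a minimal Python sketch. The in-memory entity store and the `search`/`update` helpers are invented for illustration; they stand in for whatever entity API a real knowledge graph platform exposes.

```python
# Minimal sketch of the "it all starts with a search" workflow:
# find an entity by free-text search first, then operate on it (CRUD).
# The in-memory store and helper names are hypothetical.

entities = {
    "ent-001": {"name": "Trail Runner Shoe", "type": "Product", "stock": 12},
    "ent-002": {"name": "Road Runner Shoe", "type": "Product", "stock": 0},
}

def search(query: str) -> list[str]:
    """Return IDs of entities whose name matches the query (toy lexical search)."""
    q = query.lower()
    return [eid for eid, ent in entities.items() if q in ent["name"].lower()]

def update(entity_id: str, **fields) -> None:
    """The 'U' in CRUD: patch fields on an entity previously found via search."""
    entities[entity_id].update(fields)

hits = search("trail runner")   # a layperson's query in plain English
if hits:
    update(hits[0], stock=11)   # e.g. after a sale
    print(entities[hits[0]])
```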

Abdul: Perfect. Awesome. So, is there anything else you would like to add at this moment, you know, from your experience?

Michael Lannelli: Yeah. So the key point here is that the knowledge graph can be a little bit difficult to use for the layperson, right? Natural language gives us this interface, much like the LLMs and autonomous agents that are popular now, with which we can go and interact with the knowledge graph and overcome that hurdle. Take something like product operations: imagine going in, rather than having to write SPARQL, or SQL if your knowledge graph is contained in a regular relational database, right?

You can go in and tell your agent: okay, agent, I'd like to find a certain pair of shoes, right? And then, maybe the customer is this brand, you go, you find the shoes, and you say: okay, tell me how many are left in stock. Where can I find them? Which places have them right now in a size 10? That is the kind of goal, that is where we would like to get to, where we can go and talk to our knowledge graphs.

Abdul: Awesome. Thank you, Michael. So moving on to Martin. Martin, from your perspective, how have linked data and metadata evolved in shaping the foundation of knowledge graphs?

Martin Kaltenbock: Yeah, thank you, Abdul, and hello everybody. So, Martin Kaltenbock, CFO and co-founder of the Semantic Web Company; our product, as you see on my t-shirt, is the PoolParty Semantic Suite. Thank you very much for the question and for having me. I would like to play the historian here a little bit, because you mentioned linked data and linked open data.

And it's nice to meet Michel, because we even used topic maps some time ago, before we switched to RDF. So that's also very interesting. I think this is nothing new for the people here on the panel, but maybe for the audience it's interesting to ask: okay, what happened? How did we come to this knowledge graph movement? In principle, it was all about the semantic web.

So when did it start? In principle, when Tim Berners-Lee started his vision of the semantic web in 1991, he was at CERN in Switzerland, and he already put the vision of the semantic web in his notes. It was not the World Wide Web as we know it today, where we just interlink documents.

It was a vision of interlinking data. And in principle, the semantic web principles that we had at that time turned into linked data; you can just put the "open," as you put it in your question, into brackets. There was a big movement of open data at that time, and we wanted to link the open data, mainly provided by governments, together to provide knowledge.

And in knowledge graphs, at least here at the Semantic Web Company, we are using exactly these principles; we're coming from semantic web technology. What happened there: we had the linked data principles, that was around 2010, 2011. To get a little bit more into detail, you have to use URIs, Uniform Resource Identifiers, as names for things.

The second one is that you use HTTP URIs so that people can look up these names: you have a thing, and you can look it up on the web. The third one is that when someone looks up a URI, you have to provide useful information, and the recommendation is that you use standards for that. And then you include links to other URIs so that people can discover more things. We said at that time: follow your nose, right? You go from one resource to the other.
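As a hedged illustration of those four principles, here is a small Python sketch using the rdflib library; the example.org URIs are made up, standing in for real dereferenceable ones.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import OWL, RDFS

# Illustrative only: example.org stands in for a real, dereferenceable namespace.
EX = Namespace("http://example.org/id/")

g = Graph()
thing = EX["water-bottle-42"]          # 1) a URI as the name for a thing,
                                       # 2) an HTTP URI, so it can be looked up
g.add((thing, RDFS.label, Literal("water bottle", lang="en")))  # 3) useful info via standards
g.add((thing, OWL.sameAs, URIRef("http://dbpedia.org/resource/Bottle")))  # 4) links to other URIs

print(g.serialize(format="turtle"))    # "follow your nose" from the sameAs link
```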

These are the basic principles that Tim Berners-Lee already put in those notes when he designed the World Wide Web. But then we started, as said before, to interlink documents instead of interlinking the data that we have inside the web pages. A little bit later, the World Wide Web Consortium, the W3C, published the recommendations and standards that we call the Semantic Web Stack.

It's also called the Semantic Web Cake or Semantic Web Layer Cake, and it illustrates all of this architecture, all these technologies, like Topic Maps, or RDF, or SPARQL, our basic query language, and so on and so forth. I don't want to go into all the details along this history.

And this is where Sireesha comes in: Google then coined a very nice term in 2012; they said, oh, there's a knowledge graph. And what does it mean? They said: we are working with things and not strings. So that means, if I have a thing here, for example my water bottle, that's the stupid example I always use, this thing is unique, but it can have a lot of different names.

The thing will get a URI, an identifier, this bottle that I have here, and it can have different names. If you give a machine a string, say "water bottle," or "bottle," or "Flasche" in German, as we call it, the machine already starts to struggle because the strings are different.

"Flasche" and "bottle," two different strings: the machine starts to struggle. But if I give this thing a clear URI, an identifier, the machine easily finds out that it is one unique thing, no matter what we call it, no matter what it looks like, and so on and so forth. So these are the basics of the history we came from, and what we still use today for knowledge graph development, at least in semantic web technologies. And then we had a lot of big developments in these technologies over the last decade, I would say: availability, performance, stability, security. Have a look at the evolution of graph databases, whether RDF graphs or property graphs, for example Ontotext GraphDB; you see that this is stable in the meantime.
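Here is a short, hedged sketch of the "things, not strings" idea with rdflib: one URI, several language-tagged labels, and either string resolves to the same thing. The URI is invented for the example.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/id/")
g = Graph()

bottle = EX["bottle-1"]                # one thing, one URI, many names
g.add((bottle, RDFS.label, Literal("water bottle", lang="en")))
g.add((bottle, RDFS.label, Literal("Flasche", lang="de")))

# Two different strings resolve to the same identifier:
for label in (Literal("water bottle", lang="en"), Literal("Flasche", lang="de")):
    for thing in g.subjects(RDFS.label, label):
        print(label, "->", thing)      # both point to http://example.org/id/bottle-1
```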

Ten or fifteen years ago we really had research projects, and we struggled a lot. So I would say the whole technology stack matured over time. The same goes for graph modeling software, as we have colleagues in the room who provide such software.

That really works now. And the visualization of knowledge graphs is another topic. So that's where I would say linked open data evolved into knowledge graphs. You also mentioned metadata, and I would like to use that as a second statement. On metadata, I remember very well: we had systems in place, document management systems, content management systems, where we maintained the objects manually and individually. What does that mean? I opened a document and then added classes from a taxonomy, or even not from a taxonomy; I annotated the document manually. That was done with taxonomies mainly in the English-speaking world.

In the German-speaking world, taxonomies are coming up more and more, and graph models as well. But now we're using these taxonomies or classification systems in metadata management more automatically. We don't have to do it manually, and we do it centrally: we have a knowledge graph in the middle, a central knowledge model, to say it like this.

And when you annotate documents and objects in lots of different systems, so you enrich the metadata, you use the centralized system automatically and add the metadata from the knowledge graph to all these objects. Thereby you link your document or your object into that knowledge graph, which is very nice.

And you can make changes at one central point. You don't go into 25 systems and open up all the objects, as I really saw 15 years ago; you just do it once, you make one change, you add one thing in the center, in the knowledge graph. That means you save a lot of cost and a lot of effort. In the meantime, Gartner calls what we do dynamic or augmented metadata.

So we automatically send the machine into a data set, into a document. It is analyzed, and we find all the interesting concepts, the things, so to say: we find the strings, but we map them to things. And we link them with the knowledge graph, and by that we have very dynamic, augmented metadata in place, created automatically.
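A minimal, hedged sketch of this automatic annotation in Python: a central taxonomy maps labels in several languages to concept URIs, and a document is scanned for those strings so the matched things can be attached as metadata. The taxonomy and URIs are invented, and real systems use far more robust NLP than substring matching.

```python
# Hypothetical central taxonomy: strings (labels) mapped to things (concept URIs).
TAXONOMY = {
    "water bottle": "http://example.org/id/bottle-1",
    "flasche":      "http://example.org/id/bottle-1",   # German label, same thing
    "supply chain": "http://example.org/id/supply-chain",
}

def annotate(document: str) -> set[str]:
    """Return concept URIs found in the document (toy string matching)."""
    text = document.lower()
    return {uri for label, uri in TAXONOMY.items() if label in text}

doc = "Unsere Flasche is part of a sustainable supply chain."
print(annotate(doc))
# Both the German and English labels resolve to the same central concepts,
# so a change made once in the knowledge graph propagates to every system.
```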

And as I said before, having this metadata in place enables powerful applications like search or recommender systems. I would say that a knowledge graph, and knowledge graph principles for metadata management, is something you really should think about.

Have a look at it, because as I said before, it lowers the cost and the effort you need for your metadata management. So I hope that was a little bit insightful regarding metadata management and knowledge graphs. Thank you.

Abdul: It is. Thank you so much, Martin. Thank you. That was very insightful.

So moving forward to Sireesha. So Sireesha, given your experience at Google Knowledge Graph, how does data storytelling impact the way we perceive and interact with knowledge graphs?

Sireesha: Hello, everyone. I'm Sireesha. I work for Google, and I support teams that manage the knowledge graph within Google with analytics.

So, coming to the question of how we perceive knowledge graphs in terms of data storytelling, which means how we would interact with them and consume them.

Taking a step back, when you look at data storytelling, it is a process that involves building a narrative and adding context to the data in order to turn it into knowledge, which can then drive action. If you look at the data life cycle and its value chain, much of the knowledge generation usually happens towards the end, outside the data stores, and is often only captured in slide decks or wherever you're presenting your story, right?

But knowledge graphs embed knowledge, it's what the name says, which means semantics and context along with the data. That moves the value generation upstream. Once you have all the interconnected data with its context, and a way to interact with it and ask complex questions, you can make informed decisions and solve problems faster. At Google, the Knowledge Graph helps improve search results and enables a better user experience by providing related and contextual information. It also helps with providing answers to the direct questions that users pose. Pretty much everything beyond the web results that you see on the search results page comes from the Knowledge Graph, right?

Beyond that, Google Cloud has an offering in this space, Enterprise Knowledge Graph, which is in preview right now. It enables organizations to build their private enterprise knowledge graphs and leverage Google's KG to identify and reconcile entities, which is key for building products such as Customer 360, Supplier 360, or Product 360 that uncover a lot of use cases for organizations, improving customer journeys, product design, and so on.

This is actually a little bit difficult to achieve in non-graph solutions. Coming to interacting with a KG, what I see is mostly in the form of OLTP: building applications on top of the KG to address specific use cases such as showing related entities, providing answers to questions, identifying anomalies, or detecting fraud. These are, as I said, OLTP, online transaction processing, use cases that provide fast, real-time results over a very small portion of the graph. There are also tools that allow users to visually explore the KG, to navigate the relationships and find hidden patterns, but I feel that is not completely mainstream.

There are limitations to that kind of consumption. As Michael pointed out earlier, the consumption itself needs to be simplified as the technologies and tools evolve. I believe more powerful stories can be told with knowledge graphs by looking at the graph at an aggregated level. That is, you would like to be able to run analytical OLAP (Online Analytical Processing) workloads on knowledge graphs.

That means looking at the entire KG, or even just a huge portion of it. In the relational world, you create what you call OLAP cubes, or the so-called semantic models, that embed the relationships between your various relational tables, to whatever extent is possible and available.

And then you run business intelligence queries over them, asking questions about aggregated metrics over multiple dimensions that you slice and dice. In the KG world we can still do that: you can run the traditional OLAP queries to compute aggregated metrics, and that is still beneficial compared to doing the same on relational platforms, by virtue of the diverse set of interconnected data and the relationships embedded within the KG, which are not possible in relational structures. However, I believe we need not stop there. Similar to relational OLAP cubes, we could create OLAP networks or graphs: we should look at OLAP knowledge graph models as aggregated networks over multiple dimensions, where the relationships are aggregated in addition to the entities, which could be abstracted or condensed. The nodes would be the abstracted or condensed entities, and even the edges, the relationships, would represent the weight of that relationship in an aggregated manner. This is useful in cases where awareness of the aggregated network structure is needed.

Beyond just numerical applications, we don't want only to count numbers of entities and so on. This network structure is helpful in applications such as targeted marketing on social network platforms, for example. And there are approaches like KG-OLAP and GraphCube that address these OLAP use cases.
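To make the aggregated-network idea concrete, here is a hedged Python sketch with networkx; the toy data is invented. An entity-level graph is rolled up into a type-level graph whose edge weights count the underlying relationships, roughly the flavor of aggregation that approaches like GraphCube formalize.

```python
import networkx as nx

G = nx.Graph()
# Toy entity-level graph: people and companies.
G.add_node("alice", kind="Person")
G.add_node("bob", kind="Person")
G.add_node("acme", kind="Company")
G.add_node("globex", kind="Company")
G.add_edges_from([("alice", "acme"), ("bob", "acme"),
                  ("bob", "globex"), ("alice", "bob")])

# Roll-up: condense entities into their types; edge weights aggregate relations.
agg = nx.Graph()
for u, v in G.edges():
    ku, kv = G.nodes[u]["kind"], G.nodes[v]["kind"]
    if agg.has_edge(ku, kv):
        agg[ku][kv]["weight"] += 1
    else:
        agg.add_edge(ku, kv, weight=1)

print(list(agg.edges(data=True)))
# e.g. [('Person', 'Company', {'weight': 3}), ('Person', 'Person', {'weight': 1})]
```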

This is still an active research area, so I'm very excited about this line of research and I'm watching this space closely, because I believe that aggregated networks open up more complex analytical use cases and enable us to tell more powerful stories.

Abdul: Perfect. That was so insightful, Sireesha. Is there anything else you would want to add at this point?

Sireesha: I want to plus-one what Martin shared earlier about metadata and its importance. I believe that knowledge graphs are the best platform for us to build the data catalog and embed all the metadata and the relationships, so that we can leverage that metadata: you can easily discover it and understand the business data within the context of the metadata. There are two ways to do that. One is to build your data catalog as a standalone KG, where all you have in there are the assets and all the business and technical metadata.

The other way of doing it is to have your business KG, with the business concepts as entities and relationships, and add the relevant metadata as context to this graph, so that you have both the business data and the metadata in the same graph and can use this context to better understand the quality.

And when you think about business metadata, it could be the way you classify your business data, right? It could be PII, PHI, or some other classification relevant to your business. So that's another area I'm very excited about, where I feel it should become mainstream. And I would like to end with one observation: unified, interconnected enterprise data as a single source of truth has been the promise of the centralized enterprise data warehouse for decades, but I believe that in its truest sense it can only be achieved through knowledge graphs, because of the limits on how many kinds of relationships you can actually model in a relational structure.

So, yeah, you all know the KG is flexible: you can evolve the schema without disrupting what exists, and you can onboard new and diverse data sources, including unstructured data sets, where NLP and LLMs will help get that knowledge into the knowledge graphs.

Abdul: Awesome. Thank you, Sireesha. So my next question is for Martin: how do human-in-the-loop mechanisms enhance the efficiency of knowledge graphs, especially in industry and public administration?

Martin Kaltenbock: Thank you very much for that question. In principle, human-in-the-loop mechanisms have become very prominent over the last years, but we've had a bit of a game changer: now we have generative AI that can partly replace the human in the loop. I don't want to jump into this too deeply, because I know LLMs will be part of this discussion a little later on. Nevertheless, it would be very interesting to learn from the group how you see the importance of human in the loop, or whether LLMs can take over from the humans. I hope not, and I don't think so, at least at the very beginning.

So what can a human in the loop provide to knowledge graphs? Modeling, for sure, although exactly here LLMs are a discussion: can an LLM replace a person who brings domain knowledge into your knowledge graph, or at least into the core model that you need in place? Then, for sure, it's about data quality, the quality of the knowledge graph. A human being, I would say, still has the more advanced knowledge when taking a look at a knowledge graph and saying this is correct or this is incorrect.

Then it's also about languages, multilingualism I would say, because we have a lot of machine translation systems, but if you go into a domain-specific area, I've seen them fail heavily. So a human can still help when you want to translate knowledge graphs, or when you want a multilingual knowledge graph, a system in, let's say, the 24 languages that we have in the European Union, so that all citizens can really work on a topic in government, for example. That's a big topic; we call it European language equality, so that in every country of the European Union, even if you don't speak the local language, you can really understand the systems of, for example, health, insurance, or public administration.

So you need this in different languages, not only in English, and that's a bit of an issue: we have mainly English in place in all these systems at the moment, from our perspective. Now, when we do modeling, the modeling-first approach for a knowledge graph, I would say 90 to 95% is something we can do fully automatically in the meantime.

Nevertheless, for the remaining 5 to 10% we need a human, or we want a human, making the decisions, including the modeling decisions. It would be interesting to learn how others see this. Then there is, for sure, what we call responsible or explainable AI. Human in the loop is a basis for such systems, because machines can easily push a lot of bias into your graph or your AI system; we can hopefully avoid that with the humans we have in place doing the modeling. So explainable AI and responsible AI are areas where knowledge graphs anyway could do a lot of work, but we still need humans to do it properly, because otherwise, you know, you train the system and build the knowledge graph fully automatically, but it depends on how you have trained your machine learning or AI system: is there bias inside, what was the training data, which algorithms do you use, and so on and so forth. When you have an identifiable human who has modeled this, you can at least say: okay, there was an expert doing that. Finally, one word regarding generative AI: I think there could be, I would call it maybe, a human knowledge graph prompt engineer.

So, in principle, your knowledge graph can help you a lot with prompt engineering for an LLM. We show this with a little demo we call "PoolParty meets ChatGPT." You can type in a question about ESG; we take the question and enrich it with a knowledge graph in the back, so we do prompt engineering using the knowledge graph, and then we push the question into ChatGPT. And we see that the question enriched using a knowledge graph gives much better answers than what we've seen from a human just typing it in. So there could be a position in the middle, maybe just Martin's future vision: a person who helps to do this in that area. And finally, I think knowledge graphs are very important for interoperability, to enable the interoperability of systems, mainly data integration. Here you need the human in the loop, because interoperability is about semantics, so it's about meaning, but it's also about legal and organizational topics, and I don't think machines can do that fully. As a short example, if we have a few minutes: we organized a workshop last year on interoperability in data spaces together with the World Wide Web Consortium and the International Data Spaces Association. The output was: we are missing vocabularies and knowledge graphs for a lot of domains, and we need human beings to help create them. And we are missing them in other languages; I said that before, but I repeat it because I think it's important: there's a lot of material in English, but not a lot in other languages around the world.

And we have a lack of standardization. So the human in the loop can help close these gaps that we have identified; that is my opinion, my statement here. We still need humans for modeling, for quality control, and to ensure the understanding of machine suggestions for decision making: if you just give me information and I don't use my brain, I might make a very wrong decision that ends up in, let's say, cost increases, costs that I don't want to have. So that's my statement on human in the loop. For your information, we're doing a second workshop for people interested in interoperability in data spaces on the 20th of September at the SEMANTiCS conference in Leipzig, so please join in; I'm happy to share information about that.
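A hedged Python sketch of the kind of knowledge-graph prompt enrichment the "PoolParty meets ChatGPT" demo describes. The mini graph, the enrichment logic, and the `call_llm` stub are all invented for illustration; a real setup would query a managed knowledge graph and a real LLM API.

```python
# Hypothetical knowledge-graph snippet about ESG concepts.
KG = {
    "ESG": {
        "definition": "Environmental, Social and Governance criteria",
        "related": ["CSRD", "GHG Protocol", "double materiality"],
    },
}

def enrich_prompt(question: str) -> str:
    """Prompt engineering with the graph: attach definitions and related
    concepts for every graph term mentioned in the question."""
    context_lines = []
    for concept, info in KG.items():
        if concept.lower() in question.lower():
            context_lines.append(
                f"{concept}: {info['definition']}; related: {', '.join(info['related'])}"
            )
    context = "\n".join(context_lines)
    return f"Background knowledge:\n{context}\n\nQuestion: {question}"

def call_llm(prompt: str) -> str:   # stub standing in for a real LLM call
    return f"[LLM answer grounded in:\n{prompt}]"

print(call_llm(enrich_prompt("What does ESG reporting require?")))
```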

Abdul: Awesome. Thank you so much, Martin. So moving on to Andrea: how does integrating LLMs with knowledge graphs enhance AI's generative capabilities?

Andrea: I think that's an area where we see a clear advantage of a systematic approach to the human in the loop that Martin was talking about. We have developed a workflow inside our WordLift platform that allows editors to iteratively pull data from the knowledge graph and create dynamic prompts based on this data. These dynamic prompts can then be sent to, possibly, a fine-tuned model. And when fine-tuning the model, I was just testing fine-tuning on ChatGPT 3.5, I also use the knowledge graph, because with the knowledge graph we can quickly look at what portion of the data we need to define the training dataset. So with this workflow we take the data from the knowledge graph, we create these dynamic prompts with a templating language, and then we can generate the content.

But still, there's not enough quality in this generation unless we start adding some alignment. So once again we use the knowledge graph for AI alignment. That means, practically, that the editor can define validation rules on top of the generated content, along with potential fixes that will again get data from the graph and feed it into the prompt that fixes the potential mistakes made during generation. So there is a kind of synergy between the semantic memory that a knowledge graph represents and the ability of a language model to reason. One scenario, of course, is that we can blend the knowledge graph and the language models for creating content.

And that scales extremely well, because we can then create content every time we have a new campaign, every time there is a seasonal change, or every time the business requires reshuffling part of the catalog, but we can do it with the quality that an editorial team would bring. The human in the loop is involved because, of course, they are helping us find the training dataset for creating the model in the first place, and then they are defining the validation rules that will be applied to validate the generated content.

But when we look at AI agents, which truly represent one of the most interesting aspects of today's use of LLMs, then once again the knowledge graph allows us to create the data repository that an LLM can reason on. So I can have all the people data on one side and all the, you know, company data on the other side, and these are just queries out of the same graph; the agent can reason over and look at these data sets as sources for providing answers in a more traditional RAG system, retrieval-augmented generation as it is called. In retrieval-augmented generation, once again, you need to get the right data and feed it into the model. So I see this continuous interplay.
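As a hedged sketch of the generate-then-validate loop Andrea describes: generation runs against a prompt built from graph data, and an editor-defined validation rule checks the output against graph facts, re-prompting with a fix when the rule fails. Every name here is invented to show the shape of the workflow, not WordLift's actual implementation.

```python
# Toy "semantic memory": trusted facts pulled from the knowledge graph.
FACTS = {"product": "Trail Runner", "material": "recycled mesh"}

def generate(prompt: str) -> str:
    """Stub for a (possibly fine-tuned) LLM; a real call would hit an API."""
    if FACTS["material"] in prompt:                    # the fix was fed back in
        return "The Trail Runner is made of recycled mesh."
    return "The Trail Runner is made of leather."      # unaligned first draft

def rule_material(text: str) -> str | None:
    """Editor-defined validation rule: stated material must match the graph."""
    if FACTS["material"] not in text:
        return f"Rewrite so the material is '{FACTS['material']}'."
    return None

draft = generate(f"Describe the {FACTS['product']}.")
fix = rule_material(draft)
if fix:  # feed graph data back into the prompt to repair the generation
    draft = generate(f"Describe the {FACTS['product']}. {fix}")
print(draft)
```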


Abdul: Okay. That was very insightful. Andrea, is there anything else you would want to add at this point, or maybe later?

Andrea: I think what is interesting is really the ability that a language model has to understand the structure of the data. By integrating the structure of the data, we can also start creating data agents that actually enable the vision Michael introduced, you know, of a user who goes in and says: okay, find me all the entities that deal with this problem, and find me all the potential solutions to this problem within the context of the graph. A language model is very well trained for creating either SPARQL queries or GraphQL queries, and it can then combine all this data and bring it back to the human. I think this is also a very interesting frontier, where we see that the interaction with the knowledge graph is extremely fruitful.

Abdul: Awesome. Thank you, Andrea. So my next question is for Michael. Michael, with your passion for machine learning and its integration with knowledge graphs, how do you see this shaping the future of AI-driven solutions?

Michael Lannelli: So, broadly speaking, we use machine learning to improve the ergonomics of using the knowledge graph; I touched upon that a little before. The way we use it is to create and curate the knowledge graph so we can open it to a broader set of users.

There's a lot of machine learning used for something like open information extraction, where you are given some input in some mode, it could be audio, an image, or text, and you want to use it to put edges into the knowledge graph, which you can then use for your other downstream products.
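To show the shape of open information extraction, here is a deliberately naive, hedged Python sketch: a single regular expression pulls (subject, relation, object) triples out of text. Real open IE systems, as described here, use trained models over audio, images, and text rather than hand-written patterns.

```python
import re

# Naive pattern: "<Subject> <relation> <Object>." over title-cased names.
PATTERN = re.compile(r"([A-Z][\w ]*?) (acquired|founded|employs) ([A-Z][\w ]*?)\.")

def extract_edges(text: str) -> list[tuple[str, str, str]]:
    """Return (subject, relation, object) triples found in the text."""
    return PATTERN.findall(text)

text = "Acme acquired Globex. Jane Smith founded Acme."
for subject, relation, obj in extract_edges(text):
    print(f"edge: {subject} -[{relation}]-> {obj}")
# These candidate facts can then be searched, verified against the existing
# graph, and merged in: the creation and curation loop described above.
```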

And if you consider those edges facts, it's useful for searching over those facts and verifying them, so you can keep building and curating the knowledge graph. You can also use it to verify the output of your models; I think someone asked about that earlier. Something that comes out of your gen AI system, for example, shouldn't contradict information in the knowledge graph; someone has put that information there deliberately.

Also, the same way search is a key ingredient in our knowledge graph creation algorithms, machine learning is a key ingredient in search. You can't have semantic search without machine learning: you need machine learning to actually build the vectors that go into the vector databases that everyone's using right now.

You need it to actually go and retrieve those vectors, and you need it for the re-ranking, the second- and third-stage re-ranking that happens after you search the vector database. And finally, you need it to take those documents and convert them into a direct answer. Say you are using an LLM: you want to take the context you've added to the prompt and turn it into a generative direct answer; you need machine learning for that too. So machine learning is pretty much everywhere here. It's almost like electricity, right? You can't have anything without it in our knowledge graph workloads.

Another thing I'll mention, because of GPT, is that another big, popular topic right now is vector databases; they've been raising a lot of money this year. They're very important for generative AI because, in the workflow, they enable that prompt engineering step where you grab context out of the user's database, so you can take the user's documents and use the LLM as a reasoning agent to return an answer to the user. The query you enter into the system at the beginning is itself turned into a vector and then searched against the vector database.

What I see happening with respect to knowledge graphs is that, shortly, in the future, we won't just be using vector databases; we'll be using knowledge graphs as the backend as well, along with, you know, standard lexical search like Elastic and anything that's TF-IDF based. And that gets into the topic of knowledge graph embeddings. They're very useful, but they're not used in search at the moment because they're not very popular, and they're a little bit difficult to get into the same space as natural language. So very few people are using them in the vector database examples; in fact, the use cases always show you the simplest case, and very few people are actually using the knowledge graph.

In fact, I encounter a lot of people surprised that knowledge graph embeddings even exist at the moment. I think they will be good in the future and we eventually will be using them in semantic search and generative direct answers.
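As a hedged, self-contained sketch of the retrieve, re-rank, and answer pipeline described above, in Python with numpy: the toy character-trigram "embedding" stands in for a real learned model, which is exactly the part that requires machine learning, and the re-ranking stage is just a placeholder heuristic.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size vector.
    A real system would use a trained model here."""
    text = text.lower()
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = [
    "The Trail Runner shoe comes in size 10.",
    "Our return policy lasts 30 days.",
    "Store hours are 9am to 6pm.",
]
index = np.stack([embed(d) for d in docs])    # the "vector database"

query = embed("what sizes does the trail runner come in?")
scores = index @ query                        # first stage: vector retrieval
top = np.argsort(scores)[::-1][:2]            # top-k candidates

# Second/third stage: re-rank candidates (placeholder heuristic standing in
# for another learned model).
reranked = sorted(top, key=lambda i: (-scores[i], len(docs[i])))
print(docs[reranked[0]])  # context you would hand an LLM for a direct answer
```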

Abdul: Perfect. Perfect. Awesome.

Okay. So moving forward to Michel Biezunski.

So what do you believe are some unmet expectations users have from graph based technologies and how can we address them?

Michel Biezunski: Okay. So, first, thank you for inviting me. From the point of view of the users, I think there is some kind of disappointment with knowledge graphs, in the sense that if you think about the way our brain functions, we are able to associate anything with anything. It's free association: nobody can really imagine what anybody else is thinking, and you make associations that are completely out of the blue for other people. And the knowledge graph promise is to be able to connect basically anything to anything. However, when you start looking into most existing technology, and I'm not talking about all of it, but much of it, you're still a prisoner of some kind of schema, which I believe is inherited from the object database period, where you have to define an object with properties, then define rules on the objects, and so you have to have an ontology behind it, etc.

So there are a lot of things going on, and you cannot really define what you want the way it appears. And the problem here is the long-term maintenance of knowledge graphs: if you start feeding your knowledge graph with some kind of taxonomy which is already outdated, and you have new information coming in that doesn't completely fit and rules that don't completely apply, then you're faced with a problem.

So either people force their way in, cheating a little bit about the nature of the information, saying: well, it's similar to that, so let's put it there. Or it actually forces you to revamp your application, and that can lead to a very costly situation, because sometimes a lot of stuff is built on top of existing schemas.

So anyway, you're faced with a real problem here, which is basically the inability to keep up with evolving data, with kinds of data that nobody could have imagined would show up, etc. For me, that's a problem, and in a way it comes from the database tradition, where everything has to have a place and be well defined.

So I am pushing to respect the complexity of the world we live in, and especially to allow messy information to coexist with non-messy information. You were talking about my experience: I was working with the U.S. government, especially with the IRS, to improve their customer service.

And you would think this is well-organized information, but it's not really. It's very tricky, very complex, and not rational at all. There are many, many different kinds of subtle differences. So in order to handle that, we basically had to allow messy information to coexist with non-messy information, and to create iterative curation processes where you decide to clean your graph one sector at a time. You decide to focus on one part, but the rest is still messy. And if you think about it, it's like the way we live: in our house there are places that are messy, and that's okay, and maybe sometimes we decide to clean a part of it. It's exactly the same thing. So what I am pushing for is trying to make information repositories and data systems compatible with our usual way of life, not trying to adapt to the limitations of computer systems, which require things to be pretty well defined, but instead creating hospitable systems that allow us to deal with the messy situation we are in all the time. That's basically my answer.

Abdul: So do you think it's going to happen anytime soon, or is it a long shot?

Michel Biezunski: The interesting thing is that there are people who are now basically saying: we don't want to take care of all of that. So they rely on LLMs and AI without even knowing what they're doing, and they say: okay, well, we don't have to do anything, the system is going to do it for us. The problem with that is that you're losing control over your data. And it really depends on the system for data management. I think you can divide them, simplistically, into two parts. One part is when you are looking for information you don't know about.

You're looking for, I don't know, new customers, so you're exploring data sets provided by other sources; it's the domain of big data. That's one side. The other side is when you want to manage your own data, like your company's internal data or your own products. You know what you're talking about, you know exactly what they are, and you want to make sure to describe them in the best possible way and to allow multiple ways to access them, so that the people who need them don't miss them.

These are two different kinds of situations, and I think the first thing to understand is which situation we are in. If we are mastering our own data, we could use more flexible architectures and other approaches; whereas if we are in the middle of, you know, just exploring things we don't know about, that is different. So I would say this is the first step. And it requires a change in approach, and new questions that you will want to ask about who owns the data, who is in control, et cetera.

Abdul: All right. Thank you so much. So my next question is for Chris. Chris, your venture Eccenca promises a unified semantic data source. How does this address the challenge of data loss and inconsistency in the industry?

Chris Brockmann: Thank you, everyone, for having me. I'm Chris Brockmann, the CEO and co-founder of Eccenca. I would like to make a couple of remarks about knowledge graphs before I go into your question, if I may, because as an advocate of this technology, I'd like to contextualize it a little bit. A lot of the people who have spoken before me have already given us much of that, but I want to summarize it a little. The key concepts of knowledge graphs that excite me are the traceability, the ability to link to evidence, explainability, sustainability, ownership of intellectual property (a big thing these days), governance and quality assurance with SHACL shapes and the like, scalability, and of course reasoning in deductive and inductive ways, which I think is very, very powerful and also heavily underrated. I know that our society is excited, almost obsessed, with the idea that generative AI, that general AI, can be created from deep learning alone.

When my kids asked me about it, I said: one could argue that humans have intelligence, right? We all know that humans are smart. Actually, we would even say that our intelligence is probably still a thousand times more advanced than what we see in these exciting LLMs, but at the same time, we still make our kids go to school. Why do we do this? Because with great power comes great responsibility, blah, blah, and great opportunity, and we want to make sure that our children use this power and these opportunities responsibly. By the way, this is also the reason we take them to church, or teach them family values, or teach them algebra and other sciences, right? Why? Because we want to make sure their future decisions are built on a solid and reliable foundation of values and knowledge. And I'm actually seeing this in the customers we work for: they want their AI to be trustworthy, and they want it to be traceable and evidence-related.

They want it to be explainable. They want complete ownership of the IP that's involved in the decision processes, and they want full governance. And of course, as Martin pointed out, in a knowledge graph you can manage this knowledge centrally and disperse it across, you know, 25, 50, 100,000 different decision processes and still manage it centrally. Managing a knowledge graph is still a critical problem that requires great tools, and that is the answer to the question of what we do: yes, we help maintain the knowledge graph and de-silo the data. But it's really critical to keep these things in line. We want this for our children, we believe these concepts are important, and we want to make sure our children are taught the necessary basics so that they will use their powers and opportunities responsibly. And I think we want that same thing for AI. Knowledge graphs are the treasure trove of humanity's greatest achievements: we are able to digitize knowledge, experience, morals, values, societal rules, right? And if we translate them into a machine, we can scale them amazingly. We had a discussion today about how we can enhance prompt engineering and how we can use them to validate decision making. But what I see in my clients is that they understand that if they want the machines to produce outcomes that they can trust, that they want to believe in, that they want to trust their company with, or maybe even,

in a governmental scenario, make decisions about shooting a rocket or not, or something like that, they want to be sure that they're not making mistakes, and knowledge graphs are a really exciting technology to assure this capability, right? Eccenca has built arguably a world-leading platform to fuse knowledge and data and build this other side of the coin. I think this is something we don't really see in the conversation very much, but just as muscles can't work without bones and bones can't move without muscles, I think knowledge graphs and generative AI are really two sides of one coin.

They can be brought together in a wonderful way that is very, very powerful. We have the opportunity to work with some of the world's largest organizations on some of the greatest problems. And it's also, I think, very important for the audience to understand the knowledge graph. In the introduction there was a discussion about whether knowledge graphs are a niche technology; they may be, but I think they shouldn't be. Not only because of their ability to be the other side of the coin in AI, but also in and of themselves, I think knowledge graphs can do incredibly powerful things. And we see this in amazing ways. For instance, just yesterday I had a call with a customer who reported that they actually use a knowledge graph to achieve about half a billion dollars in annual operational savings.

So knowledge graphs are not only a store for knowledge; they are also a very, very powerful tool for the best to get even better. Despite my enthusiasm for the potential of AI, I want to make sure that when I allow an AI to make a decision for me, that AI takes into account all the facts, all my values, all my rules.

I want a guarantee that I will be able to explain why and how it came to a certain conclusion. And if you are in this audience and you want your organization to leverage AI, to grow, to compete for the future, then you will want this AI to guarantee for you that decisions are made well, in consideration of your knowledge and your values.

So even if you don't really know why you need the knowledge graph in the first place, or you haven't really found the proper place for it, I think it is time for you to start building this type of infrastructure. Because at the end of the day, if you believe that AI will be relevant to your organization's success in the future, I think knowledge graphs will be relevant to your organization, and you will have to build them now, because others are doing it and having great success with it. In a competitive scenario, it's time to start using it. So thank you for that. And yes, Eccenca builds a tool, and at eccenca.my you're even able to use it freely. It's available as a sandbox, complete with a training program, et cetera, because we wanted people to be able to build their knowledge graphs without code, even without knowing SPARQL and all that.

That's why we provide it for free, you know, because I think it's such a critical element in our way forward in embracing AI in the future.

Abdul: Awesome. Thank you. So I'm going to move forward to Dennis. Dennis, question answering over knowledge graphs has been transformative, right? So how do you see the role of platforms like QAnswer evolving?

Dennis: Hi Abdul, thank you for your question. When we started with QAnswer, basically our main goal was to make enterprise knowledge accessible via natural language.

And we started with knowledge graphs. If you think about it: where are knowledge graphs used today? Where do they touch our lives? One of the main touch points is Google Search. So, in fact, we are accessing a knowledge graph every day, and probably most people do not even know it.

Most of the people do not know. So if you ask a question, like a director of a film, or if you Google a person and you’ve got this. Nice knowledge panel. And, with the description and the image, all this is fueled from the knowledge graph, and basically we see that question answering or search is in fact, one of the biggest use case for knowledge graphs.

With this idea in mind, we started QAnswer, which today is basically a platform that enables you to take your proprietary knowledge graph, in an enterprise, for example, but it could also be for an individual, index it, and get a search experience like Google Search on top of your own knowledge graph.

This is, I think, one of the biggest use cases for knowledge graphs, and that's why we invested so much in it. At the same time, while knowledge graphs are the perfect data repository for structured data, we saw that in enterprises there is a lot of unstructured data in the form of PDFs, Word documents, websites, and so on.

So already some years ago we started investigating new technology to access unstructured data. And then there was this big change with large language models some months ago: these instruction-based large language models understand language to a new degree. We applied this immediately, because it was easier, to our unstructured part. Basically, today you can do things that one year ago we didn't think were possible: you can answer questions on top of unstructured data, and the answers are very accurate, very nice to read, very user friendly. And to set this up, you really need minutes. You can take your content, you can take your website, and you can very easily create search or chatbot assistants on top of it. This technology is very disruptive, because the chatbot and search domains, or technologies, have basically merged into the same thing.

And we do this in-house: we use our own large language models. They are not as large as other big, well-known language models, but they are good enough to fulfill these goals. And the nice thing is that you can run them on your enterprise infrastructure, and the data will never leave your infrastructure.

And still you get this amazing search experience. After having learned this, after having investigated how the combination of large language models and unstructured data works, which is relatively easy, we are now basically refactoring our technology stack for the structured part, because we strongly believe that knowledge graphs are the perfect data repository for structured data in an enterprise.

For example, we are constructing a well-known graph, the EU Knowledge Graph. It's a graph maintained for the European Commission, and it aggregates information about the countries, the capitals, the different member states. It has geometries of regions, populations, and projects financed by the Commission, with their beneficiaries, all integrated in one huge graph that basically creates interoperable vocabularies for all the member states, and it is very successful. We see a lot of traction and interest in different parts of the Commission in reusing the data, adding data, and making it more interoperable. So we really see that, to aggregate institutional or enterprise structured data, knowledge graphs are a really, really good place. And currently we would basically like to reinvent our technology stack for question answering over knowledge graphs, because we think there is a lot to do there: asking questions, generating analytical queries over this data, but also visualization, and basically making all this data easily accessible to decision makers, to citizens, and in general to people who are not experts and basically don't know anything about knowledge graphs.
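For readers curious what querying such a graph looks like, here is a hedged Python sketch using the SPARQLWrapper library. The endpoint URL is an assumption based on the publicly documented EU Knowledge Graph (verify the current address before relying on it), and the query is deliberately generic.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Assumed public endpoint of the EU Knowledge Graph; verify before relying on it.
ENDPOINT = "https://query.linkedopendata.eu/sparql"

sparql = SPARQLWrapper(ENDPOINT)
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    # Deliberately generic query: fetch a few English labels from the graph.
    SELECT ?item ?label WHERE {
        ?item rdfs:label ?label .
        FILTER(LANG(?label) = "en")
    } LIMIT 5
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["item"]["value"], "-", row["label"]["value"])
```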

Abdul: Perfect. Dennis, thank you. Moving forward, my question to all of you is: where do you see knowledge graphs in the next five years?

We're going to start off with Sireesha. So, five years from now, what will be the future of knowledge graphs?

Sireesha: I would expect to see knowledge graphs become more mainstream as tools and methodologies evolve. The main thing is that they need to be easier to consume: only when they are easy to consume can we build more applications and more use cases on knowledge graphs. Eventually, even building knowledge graphs will be made simpler than it is today. In terms of becoming mainstream, one area where I see or expect progress is OLAP knowledge graphs.

Also, and maybe we already have this today, we may see the expert systems of the 70s and 80s making a comeback, right? Those were all rule-based; now we have knowledge graphs, and with the help of gen AI and LLMs we can build domain-specific expert systems powered by knowledge graphs that can help in areas like diagnosis and all the things expert systems were originally meant to address. That's one thing. Another piece, again with gen AI and LLMs, and we've talked about it today as well, is that KGs and LLMs are complementary to each other: you can use LLMs to build and enhance KGs by looking into unstructured datasets and increasing the quality of the KG itself, and you can use the KG to augment the responses of LLMs, because we all know at present that LLMs hallucinate when they don't know certain things. I think the best way to counter that is to ground them with knowledge graphs.

Abdul: Perfect. Thank you.

So I'm going to move forward with Andrea. What do you think? Five years from now, where are knowledge graphs going to stand?

Andrea: I think knowledge graphs will be more common, you know. As we say, primarily they will be a way for humans to transfer knowledge to advanced AI systems,

so the problem of alignment seems to me one of the biggest that a knowledge graph can tackle. That also means that I expect to see the development of personal knowledge graphs, which right now are starting to exist but haven't been properly formalized. And in this regard,

I also expect the vision that Michel introduced, of these schemaless graphs, to finally become a reality. At the moment, it is true that any way of formalizing knowledge representation goes through some level of structure, which doesn't fit well with the way the human brain works.

So I expect that personal knowledge graphs on one side and schemaless graphs on the other will advance, but there will still be a way for us to control, or to build, more advanced systems. That's how I see them in the next five years. Of course, we're highly focused on SEO,

So we just want to understand how we can use these graphs to make Google and other AI systems more aware of a company’s or a person’s content. That’s why I’m particularly interested in personal knowledge graphs, and in how interoperability will make this happen, because things have to be interoperable.

We’ve seen that any knowledge graph that WordLift creates has some level of impact on Google’s knowledge graph. We do not know exactly what’s happening in between, but we know that the use of structured data matters in a generative AI ecosystem, where content can be generated on the edge and is dynamic, but is generated based on some data, some structured knowledge, that is passed to the machine.

So that’s the way I see it.

Abdul: Awesome. Awesome. So moving forward to Michel Biezunski, since you are one of the early people who got into it very early. So where do you think it will be in five years?


Michel Biezunski: Yeah, one possible way would be to abolish the distinction between what is currently called metadata as opposed to data, because when you start integrating things together, one person’s metadata is someone else’s data, and that creates messy and tricky situations. Actually, I would generalize that to processes and semantic relations. For example, when someone creates a piece of data, it’s a process of creation, but it’s also a relation between that person and the data point. So if you start to integrate all of that together, and basically level the difference between process and relationship on one side and data and metadata on the other, then an unlimited number of possibilities appears.

And of course, the problem with that is that it requires much more computing power, because you will generate lots and lots of data. So what I think is that when computers get more powerful, maybe quantum computers or something that gets us to another generation of computing, we may be able to integrate and transform things in a different way than we do today.

And so I think that today we’re still sticking to some distinctions without many differences. So that’s my perspective on where it goes.

Abdul: Okay, awesome. Awesome. So, Michael Lannelli, what do you think? Five years from now, knowledge graphs. Where are we at?

Michael Lannelli: Sure. So I think several speakers have already mentioned it: they’re going to be more popular, certainly more popular than they are now. And the way that’s going to happen is through tighter and tighter integration with LLMs. It’s the way every industry is heading right now. Many different industries are going to be transformed by LLMs.

Knowledge graphs will be no different. You’ll interact with your knowledge graph through an agent, in natural language. You’ll query your knowledge graphs in natural language. You’ll edit the entities and the edges in natural language, and you’ll consume the output via an LLM. That’s the direction we’re heading.

Those are the problems that I work on. That’s where we see it going in the near future.
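
A toy sketch of the edit-in-natural-language idea Michael describes: a command is parsed into a graph mutation. In a real agent, an LLM would parse free-form language; the regex and command syntax below are crude, hypothetical stand-ins for that translation layer.

```python
# Toy sketch of editing a knowledge graph in natural language. In a real
# agent an LLM would parse free-form language; the regex and the command
# syntax here are crude, hypothetical stand-ins for that layer.
import re

graph: set[tuple[str, str, str]] = set()  # (subject, relation, object) edges

def edit(command: str) -> str:
    """Handle commands like 'add: Yext -- locatedIn --> New York'."""
    m = re.match(r"(add|remove):\s*(.+?)\s*--\s*(.+?)\s*-->\s*(.+)", command)
    if not m:
        return "Sorry, I couldn't parse that edit."
    action, subj, rel, obj = m.groups()
    triple = (subj, rel, obj)
    if action == "add":
        graph.add(triple)
        return f"Added edge {triple}."
    graph.discard(triple)
    return f"Removed edge {triple}."

print(edit("add: Yext -- locatedIn --> New York"))
print(graph)
```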

Abdul: Awesome. Chris, what about you?

Chris Brockmann: I think the merging of the two domains is going to happen, and I think it’s going to accelerate the adoption of knowledge graphs to the point where people won’t really be talking so much about knowledge graphs anymore.

They’re going to be much more commoditized. Right now it’s a big topic because you have to speak SPARQL and OWL and those kinds of things, but since LLMs will abstract all of that away, I don’t think we will be speaking much about knowledge graphs. It will just be the way we store certain types of knowledge in our agents, how we train systems, and how we share knowledge between those agents or systems. It’ll be just a commodity, like a data or knowledge operating system that you’ll just have.

And the crazy thing with LLMs is that they’re going to accelerate this whole thing in an exponential fashion. So I don’t think we’ll be talking much about knowledge graphs in five years. They’ll just be there.
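
The abstraction Chris describes usually looks like a thin translation layer: the user asks a question in natural language, a model emits SPARQL, the store executes it, and the user never sees the query. A minimal sketch, with the LLM call stubbed out since it depends on whichever model and API you use:

```python
# Sketch of the "LLM abstracts the query language away" pattern: the user
# asks in natural language and never sees SPARQL. The translation step is
# stubbed out, since it depends on whichever model and API you use.
from SPARQLWrapper import SPARQLWrapper, JSON

def translate_to_sparql(question: str) -> str:
    # Placeholder: a real implementation would prompt an LLM with the
    # question plus a description of the graph's schema. The canned
    # query below is for illustration only.
    return """
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?label WHERE { ?s rdfs:label ?label . }
        LIMIT 5
    """

def answer(question: str, endpoint: str) -> list[str]:
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(translate_to_sparql(question))
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    # The caller only ever sees results, never the query that produced them.
    return [row["label"]["value"] for row in rows]
```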

Abdul: So Dennis, what do you think five years from now?

Dennis: My guess is that knowledge graphs will be the backbone of data storage for many enterprises. My intuition says that you start small, in a department or a particular area, where you start adopting knowledge graph technology. Then the fact that the data is open and exposed will make others interested in participating.

Then you create a positive effect in the enterprise, so that you have this layer of semantics interconnecting different parts of the enterprise, different departments, different people. I also think that there will be a very big interplay between natural language and knowledge graphs, meaning knowledge graphs will be used to annotate documents.

They will be used to index documents. We will access them via natural language. We will partially construct them from unstructured text. So there will be an interplay between NLP and knowledge graphs that is much, much stronger than today, just because natural language processing has become much stronger.
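
A small sketch of the annotation interplay Dennis describes: run named-entity recognition over a document and link the mentions to knowledge-graph URIs. spaCy performs the NER here; the URI lookup table is a hypothetical stand-in for a real entity-linking step.

```python
# Sketch of annotating a document with knowledge-graph entities. spaCy
# performs the named-entity recognition; the URI lookup table is a
# hypothetical stand-in for a real entity-linking step.
import spacy

# Requires the model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

# Hypothetical mapping from surface forms to knowledge-graph URIs.
KG_URIS = {
    "European Commission": "https://example.org/kg/European_Commission",
    "Belgium": "https://example.org/kg/Belgium",
}

def annotate(text: str) -> list[dict]:
    """Return entity mentions with their (possibly missing) KG URIs."""
    doc = nlp(text)
    return [
        {"mention": ent.text, "type": ent.label_, "uri": KG_URIS.get(ent.text)}
        for ent in doc.ents
    ]

for ann in annotate("The European Commission funds projects in Belgium."):
    print(ann)
```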

Abdul: Thank you, Dennis. That was insightful. So moving to Martin. Martin, what do you think?

Martin Kaltenbock: Thank you very much. A lot has been said already. I think we’re on a journey. Somebody said a long time ago, “a little semantics goes a long way,” if you remember. So now a little knowledge graph goes a long way, but we’re on that journey, and I see it very similarly.

I would say knowledge graphs will be used across all industries and in all kinds of applications and business cases. I see them mainly, though that’s very broad, in metadata, data, information, and knowledge management solutions, where they will be the backbone. And I also see them, as has been said before, as part of semantic AI, or call it hybrid AI.

So I think there will be a merge between statistical AI and symbolic AI, meaning knowledge graphs, and we’ll bring those together. I think language technology and NLP will also play a big part in that game. And then I see knowledge graphs as a backbone: they will be inside all solutions regarding data integration and explainable, responsible AI.

And, as was said before, interoperability, or semantic interoperability. I think it will merge into the systems. Hopefully we still call them knowledge graphs, but maybe the name disappears and we find a new way, or a new name.

Abdul: Awesome. Awesome.

So thank you all for making time. We are all in different time zones, and we really appreciate you coming out today, talking to us, and helping educate people about knowledge graphs and much more. I’m personally thankful that you all made time and spent so much of it with us.

It is going to be insightful for people in our audience. And further, we have more events coming up, including a hackathon, so keep an eye out on our socials. And that’s it.
