Finding Representative Entities From Entity Graph By Using Neighborhood Based Entity Similarity
Shingavi, Ankit Anil
Many applications rely on large entity graphs. Because such graphs are available from numerous data sources and serve a wide range of applications, choosing a single entity graph for a particular need is challenging. To give a comprehensible overview of an entity graph, we can present preview tables, each a compact representation of a single entity type in the dataset. To show the coverage of a dataset, we need to find representative entities for a given entity type. In this paper, we propose a method for finding them. Each entity of a given type is represented by a multi-dimensional label vector built from its neighborhood nodes. We apply the k-means clustering algorithm to the label vectors of entities of the same type, dividing the entities into k disjoint clusters. The entity nearest to the centroid of each cluster is taken as a representative entity for that type. In experiments on the Freebase dataset, the method produced diverse and important representative entities for the TV, film, and location domains. These representative entities can be used to generate preview tables, helping data workers understand the coverage of a particular entity type in the dataset.
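The pipeline described in the abstract (label vectors, k-means, nearest entity to each centroid) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not say how label vectors are built, so the sketch assumes each dimension counts how often a label occurs among an entity's neighbors. The function names, the toy film entities, and the fixed random seed are all hypothetical, and the k-means loop is a plain textbook version with random initialization.

```python
import random
from collections import Counter


def label_vector(neighbor_labels, vocabulary):
    """Assumed encoding: count of each vocabulary label among the neighbors."""
    counts = Counter(neighbor_labels)
    return [float(counts[label]) for label in vocabulary]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, iterations=100, seed=0):
    """Textbook k-means; returns (centroids, cluster index per vector)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = None
    for _ in range(iterations):
        # Assign every vector to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # Assignments stopped changing, so the loop has converged.
        assignment = new_assignment
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def representative_entities(entities, k, seed=0):
    """Pick, for each non-empty cluster, the entity nearest to its centroid.

    `entities` maps an entity name to the list of labels of its neighbors.
    """
    vocabulary = sorted({l for labels in entities.values() for l in labels})
    names = list(entities)
    vectors = [label_vector(entities[n], vocabulary) for n in names]
    centroids, assignment = k_means(vectors, k, seed=seed)
    representatives = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            best = min(members, key=lambda i: squared_distance(vectors[i], centroids[c]))
            representatives.append(names[best])
    return representatives


# Hypothetical toy data: six entities of one type with two kinds of neighborhood.
toy_films = {
    "film_a": ["actor", "actor", "director"],
    "film_b": ["actor", "actor", "director", "genre"],
    "film_c": ["actor", "director"],
    "film_d": ["award", "festival", "festival"],
    "film_e": ["award", "festival"],
    "film_f": ["award", "award", "festival", "festival"],
}
print(representative_entities(toy_films, k=2))
```

With k = 2, the sketch returns one entity from each neighborhood pattern. On real data the result depends on the choice of k and on the initialization, so a production version would likely use several restarts or a library implementation.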
Showing items related by title, author, creator and subject.
Ashman, Jared M. (Computer Science & Engineering, 2011-03-03) Identifying the semantic similarity between named entities has many applications in NLP, including information extraction and retrieval, word sense disambiguation, text summarization and type classification. Similarity ...
Gupta, Mahesh (Computer Science & Engineering, 2013-03-20) The World Wide Web today has evolved into a rich repository of entities where many knowledge bases containing entity-related information are directly available. Such knowledge bases are often in the form of entity-relationship ...
Syed, Abu Ayub Ansari (2017-08-14) Fact-checking in real-time for events such as presidential debates is a challenging task. These fact-checking processes have a difficult and rigorous task in having the best accuracy in classifying facts, finding topics, ...