Easily labelling hierarchical document clusters.
Author(s): MOURA, M. F.; MACACINI, R. M.; REZENDE, S. O.
Summary: One problem with automatic models that generate topic taxonomies is building the list of the most significant terms that discriminate each document group. This paper proposes a new method for labelling hierarchical document clusters that is completely independent of the clustering method. The method automatically decides the number of words in each label list, avoids repeating words within a tree branch, and provides a kind of cut of the cluster tree. The resulting labels were tested as search queries in a retrieval process and performed very well. In addition, specialists in the domain of the text collection tried the method, in order to assess their understanding of the results and their expectations of them.
Publication year: 2008
Types of publication: Paper in annals and proceedings
Keywords: Cluster analysis, Semantic data, Taxonomy
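
The summary names three properties of the labelling method: a label length that is decided automatically, no repeated words along a tree branch, and a kind of cut of the cluster tree. The sketch below is only an illustration of how such properties could fit together, not the authors' algorithm. The `Cluster` class, the per-term scores, the `min_score` threshold and the rule that drops unlabelled leaf clusters are all assumptions introduced for this example; the record does not say how the paper scores terms or cuts the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A node in a cluster tree (hypothetical structure, not from the paper)."""
    name: str
    term_scores: dict[str, float]  # assumed: one relevance score per term
    children: list["Cluster"] = field(default_factory=list)
    label: list[str] = field(default_factory=list)


def label_tree(node, min_score=0.5, used=frozenset()):
    """Label a cluster tree top-down under the assumptions stated above."""
    # Keep every term that reaches the threshold and was not already used
    # by an ancestor, so the label length varies from node to node.
    candidates = [
        (term, score)
        for term, score in node.term_scores.items()
        if score >= min_score and term not in used
    ]
    # Highest score first; ties are broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    node.label = [term for term, _ in candidates]

    # Terms chosen here are unavailable to every descendant on this branch.
    inherited = used | set(node.label)

    kept = []
    for child in node.children:
        label_tree(child, min_score, inherited)
        # Assumed cutting rule: drop a child that ends up with no label
        # and no surviving children of its own.
        if child.label or child.children:
            kept.append(child)
    node.children = kept
    return node


# Toy data, invented for illustration only.
root = Cluster(
    "root",
    {"soil": 0.9, "crop": 0.8, "yield": 0.3},
    children=[
        Cluster("a", {"soil": 0.95, "nitrogen": 0.7, "crop": 0.6}),
        Cluster("b", {"soil": 0.9, "crop": 0.85}),
    ],
)
label_tree(root)
print(root.label, [(c.name, c.label) for c in root.children])
```

In this toy tree, cluster "b" has no term left once its parent's terms are excluded, so the assumed rule removes it. Because the function reads only term scores and the tree shape, it does not depend on how the clusters were produced, which mirrors the independence from the clustering method claimed in the summary.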