Article

Correlated concept based dynamic document clustering algorithms for newsgroups and scientific literature

The growing number of documents in corpora such as newsgroups, government organizations, the internet and digital libraries has made categorizing and retrieving them more complex. Incorporating semantic features into clustering improves the accuracy of document retrieval and makes it possible to organize and retrieve documents from large corpora more efficiently. Although semantics-based clustering enhances the quality of clusters, the scalability of the system remains a challenge. In this paper, three dynamic document clustering algorithms are proposed: Term frequency based MAximum Resemblance Document Clustering (TMARDC), Correlated Concept based MAximum Resemblance Document Clustering (CCMARDC) and Correlated Concept based Fast Incremental Clustering Algorithm (CCFICA). Of these, TMARDC is based on term frequency, whereas CCMARDC and CCFICA are based on a correlated-terms (terms and their related terms) concept extraction algorithm. The proposed algorithms were compared with existing static and dynamic document clustering algorithms through experimental analysis on a dataset chosen from 20Newsgroups and scientific literature. F-measure and Purity were used as metrics for evaluating the performance of the algorithms. The experimental results show that the proposed algorithms perform better than the four existing document clustering algorithms.
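The abstract names Purity and F-measure as evaluation metrics without defining them. As a rough illustration only, the sketch below computes both for a toy clustering. It assumes the standard purity definition and one common class-weighted form of the clustering F-measure; the record does not say which variant the paper uses, and the function names `purity` and `clustering_f_measure` are hypothetical.

```python
from collections import Counter


def purity(labels, clusters):
    """Fraction of documents that belong to the majority class of their cluster."""
    by_cluster = {}
    for label, cluster in zip(labels, clusters):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(labels)


def clustering_f_measure(labels, clusters):
    """Class-weighted F-measure (assumed variant): for each true class take the
    best F1 over all clusters, then weight by the class's share of documents."""
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    pair_counts = Counter(zip(labels, clusters))
    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n = pair_counts[(cls, clu)]
            if n == 0:
                continue  # no overlap, F1 is zero for this pair
            precision = n / n_clu
            recall = n / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        total += n_cls / len(labels) * best
    return total


# Toy data: six documents, two true classes, two clusters (made up for illustration).
true_labels = ["a", "a", "a", "b", "b", "b"]
assigned = [0, 0, 1, 1, 1, 1]
print(purity(true_labels, assigned))
print(clustering_f_measure(true_labels, assigned))
```

Both scores lie in [0, 1] and reach 1 only when every cluster contains documents of a single class and every class sits in a single cluster.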

Language
English

Published in
Journal: Decision Analytics ; ISSN: 2193-8636 ; Volume: 1 ; Year: 2014 ; Issue: 1 ; Pages: 1-21 ; Heidelberg: Springer

Classification
Economics
Subject
Static and dynamic document clustering
MAximum resemblance data labeling (MARDL) technique
Term frequency
Inverse document frequency (TFIDF)
Concepts
Semantic similarity

Event
Intellectual creation
(who)
Jayabharathy, Jayaraj
Kanmani, Selvadurai
Event
Publication
(who)
Springer
(where)
Heidelberg
(when)
2014

DOI
doi:10.1186/2193-8636-1-3
Last updated
10.03.2025, 11:43 CET

Data partner

This object is provided by:
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften - Leibniz-Informationszentrum Wirtschaft. If you have questions about this object, please contact the data partner.

Object type

  • Article

Contributors

  • Jayabharathy, Jayaraj
  • Kanmani, Selvadurai
  • Springer

Created

  • 2014
