
Inference Algorithms in Latent Dirichlet Allocation for Semantic Classification

Mohammad Zubir, W.M.A. and Abdul Aziz, I. and Jaafar, J. and Hasan, M.H. (2018) Inference Algorithms in Latent Dirichlet Allocation for Semantic Classification. Advances in Intelligent Systems and Computing, 662 . pp. 173-184.

Full text not available from this repository.

Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

Latent Dirichlet Allocation (LDA) has been implemented as a semantic classifier that organizes data for efficient retrieval. However, learning or inferring the posterior distribution of the model is non-trivial. Inferring the posterior distribution directly can cause the computation time to grow exponentially, because the hyperparameters are coupled. Several inference algorithms have been paired with LDA to address this issue. The inference algorithm used in this work is Gibbs sampling. Prior research shows that Gibbs sampling gives promising results compared with other inference algorithms, particularly in performance. However, it still takes a long time to compute the topic distribution of the data, so there is still room for improvement in running time. The performance of the algorithm was evaluated on two datasets. Results show that Gibbs sampling as the inference algorithm predicts the optimal number of topics of the data better than Variational Expectation Maximization (VEM). © 2018, Springer International Publishing AG.
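
For readers unfamiliar with the inference step named in the abstract, the following is a minimal sketch of collapsed Gibbs sampling for LDA. It is not the authors' implementation. The function name lda_gibbs, the symmetric hyperparameters alpha and beta, the iteration count, and the toy corpus are all assumptions chosen for illustration. Each token's topic is resampled from its full conditional given all other assignments, so the coupled posterior is never evaluated directly.

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of integer word ids."""
    rng = random.Random(seed)
    # Count tables: document-topic, topic-word, and per-topic totals.
    ndk = [[0] * n_topics for _ in docs]
    nkw = [[0] * vocab_size for _ in range(n_topics)]
    nk = [0] * n_topics
    # Assign every token a random initial topic and record it in the counts.
    z = []
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(zd)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Unnormalised full conditional p(z = t | all other assignments).
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1
    # Smoothed estimate of per-document topic proportions from the final counts.
    theta = [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, nkw

# Hypothetical toy corpus: four short documents over a six-word vocabulary.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 2, 1, 0], [5, 3, 4, 4]]
theta, topic_word = lda_gibbs(docs, n_topics=2, vocab_size=6)
```

Every sweep visits each token once, so the cost grows with corpus size, number of topics, and number of iterations, which is consistent with the long running time the abstract reports.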

Item Type: Article
Impact Factor: cited By 1; Conference of International Conference on Computational Methods in Systems and Software, CoMeSySo 2017; Conference Date: 12 September 2017 Through 14 September 2017; Conference Code: 197849
Uncontrolled Keywords: Classification (of information); Computational methods; Information retrieval; Maximum principle; Sampling; Search engines; Semantics; Statistics; Text processing, Expectation - maximizations; Latent Dirichlet allocation; Latent dirichlet allocations; Posterior distributions; Semantic classification; Text classification; Topic distributions; Topic model, Inference engines
ID Code: 21887
Deposited By: Ahmad Suhairi
Deposited On: 01 Aug 2018 01:14
Last Modified: 01 Aug 2018 01:14
