Few-shot Classification of Smartphone Photos using Hidden Markov Model and Siamese Network
The growing use of smartphones produces so many images that organizing them by hand is nearly impossible. The problem arises when a person needs to classify these photos into groups or classes. Smartphones are low-performance devices compared with desktop or cloud-based computers. Many image classification solutions based on various types of Convolutional Neural Network (CNN) run on massive cloud-based supercomputers, which are often equipped with high-end, specialized graphics processing units (GPUs) at considerable cost. To implement classification on most smartphones currently on the market, we need an algorithm that requires less computation. For this reason, we propose a Hidden Markov Model (HMM), which requires fewer parameters. The aim of this research is to examine the HMM method for classifying photos taken with a smartphone. For comparison, we also report results from a Siamese CNN. The same data are used to train and test both models. For the HMM, we use the Discrete Cosine Transform (DCT) to extract salient features of the images. The number of training examples is very small compared with the test set, so the task is treated as few-shot classification. In the training phase, we used the Baum-Welch algorithm, which is based on the Maximum Likelihood (ML) criterion. Two variants were used: isolated training was applied first, followed by jointly embedded Baum-Welch estimation of the parameters. For recognition with the HMM, the Viterbi algorithm is applied. The performance of both procedures was measured. On the test set, the HMM achieves a precision of 0.94, recall of 0.85, F1 score of 0.89, and accuracy of 0.90, while the Siamese CNN achieves 0.87, 0.98, 0.92, and 0.91, respectively. The results show that the HMM, which has the advantage of fewer parameters, remains competitive with the Siamese CNN, with only a slight decrease in performance. We conclude that the HMM is better suited than the Siamese CNN for implementation on low-performance devices such as cellphones.
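As a quick consistency check on the reported metrics, the F1 score can be recomputed from precision and recall as their harmonic mean. The short Python sketch below is illustrative only: the helper name `f1_score` and the dictionary layout are assumptions made here, not part of the authors' code, and the inputs are the precision and recall values quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the abstract.
reported = {
    "HMM": (0.94, 0.85),
    "Siamese CNN": (0.87, 0.98),
}

for model, (p, r) in reported.items():
    # Round to two decimals, matching the precision used in the abstract.
    print(f"{model}: F1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, the recomputed values agree with the F1 scores stated in the abstract (0.89 for the HMM and 0.92 for the Siamese CNN).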
Copyright (c) 2025 Zulkarnaen Hatala, Muhammad Hudzaly (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).





