A Comparative Study of Machine Learning Methods for Automated Customer Service Dialogue Quality Assessment

Yajing Zhang

Abstract

The rapid expansion of digital customer service channels has created an urgent need for automated quality assessment methods capable of evaluating dialogue interactions at scale. This paper presents a comprehensive comparative study of machine learning approaches for automated assessment of customer service dialogue quality, examining traditional machine learning algorithms, deep learning architectures, and transformer-based models. A multi-dimensional quality assessment framework is proposed, incorporating three primary evaluation categories: information accuracy, communication appropriateness, and process compliance. An experimental evaluation on a customer service dialogue dataset demonstrates that BERT-based models achieve superior overall classification accuracy (94.2%), while traditional methods offer advantages in computational efficiency and interpretability. The analysis reveals significant performance differences across service defect categories, with transformer models excelling at detecting subtle compliance violations and attitude-related issues. These findings provide practical guidance for enterprises seeking to implement standardized quality-monitoring systems that align with consumer protection regulations.
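The distinction the abstract draws between overall classification accuracy and performance across service defect categories can be illustrated with a minimal sketch. It is not the study's evaluation code: the category identifiers (including the added `no_defect` class), the function names, and the toy labels are all assumptions made for illustration, and per-category recall is only one of several metrics that could be reported.

```python
from collections import Counter

# Hypothetical defect categories, named after the three evaluation
# dimensions in the abstract; "no_defect" is an added assumption.
CATEGORIES = [
    "information_accuracy",
    "communication_appropriateness",
    "process_compliance",
    "no_defect",
]


def overall_accuracy(y_true, y_pred):
    """Fraction of dialogues whose predicted label matches the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled dialogue is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_category_recall(y_true, y_pred, categories=CATEGORIES):
    """Recall per category: correctly detected / total dialogues in that category.

    Categories with no reference examples map to None rather than 0.0.
    """
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        c: (hits[c] / totals[c] if totals[c] else None)
        for c in categories
    }


# Toy labels, invented purely for illustration (not data from the study).
y_true = [
    "process_compliance", "process_compliance", "information_accuracy",
    "communication_appropriateness", "no_defect", "no_defect",
]
model_a = [  # stand-in for one model's predictions
    "process_compliance", "no_defect", "information_accuracy",
    "communication_appropriateness", "no_defect", "no_defect",
]
model_b = [  # stand-in for another model's predictions
    "process_compliance", "process_compliance", "information_accuracy",
    "no_defect", "no_defect", "no_defect",
]

for name, preds in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(overall_accuracy(y_true, preds), 3),
          per_category_recall(y_true, preds))
```

In this toy example both stand-in models have the same overall accuracy but miss different categories, which is why a single headline figure is usually read alongside a per-category breakdown.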

How to Cite

Zhang, Y. (2026). A Comparative Study of Machine Learning Methods for Automated Customer Service Dialogue Quality Assessment. Journal of Science, Innovation & Social Impact, 2(1), 317-327. https://sagespress.com/index.php/JSISI/article/view/113
