Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models

Farhod Rahimi; Fayzali Saduiioevich Komiliyon; Manuchehr Farhodovich Rahimov; Mehrdod Rahmatullovich Yorov

doi:10.11648/j.acm.20261503.11

Research Article | Peer-Reviewed

Published in Applied and Computational Mathematics (Volume 15, Issue 3)

Received: 26 March 2026; Accepted: 25 April 2026; Published: 11 May 2026

Abstract

In the context of the ongoing digital transformation of governmental and corporate information systems, the development of intelligent document management solutions capable of efficient processing, structuring, and analysis of textual data has become increasingly important. Particular challenges arise in the processing of multilingual data and low-resource languages, such as Tajik, due to the limited availability of annotated corpora. The aim of this study is to develop and formalize a mathematical model of an intelligent document management system based on microservice architecture and transformer-based natural language processing techniques. The proposed approach integrates a distributed microservice architecture using gRPC with a named entity recognition (NER) model based on multilingual BERT. To address data scarcity, a synthetic data generation mechanism is introduced to augment the training corpus. The NER task is formulated as a probabilistic sequence labeling problem, and the training procedure includes fine-tuning of the transformer model and comparison with baseline approaches, including rule-based methods, Conditional Random Fields (CRF), and BiLSTM-CRF models. Experimental evaluation is conducted on a curated corpus of Tajik-language documents, divided into training, validation, and test subsets. The results demonstrate that the proposed model achieves an F1-score of 0.93, outperforming all baseline methods. In addition, the system exhibits near-linear scalability under horizontal scaling conditions and ensures fault tolerance through a hybrid mechanism that switches to a rule-based extractor in case of service unavailability. The proposed model provides a scalable and robust framework for intelligent document processing systems and can be effectively applied in governmental and corporate environments undergoing digital transformation.
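
The hybrid fault-tolerance mechanism mentioned in the abstract (switching to a rule-based extractor when the NER service is unavailable) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `ServiceUnavailable` exception, the `extract_with_fallback` function, the two regular-expression rules, and the sample string are hypothetical and are not taken from the article. The model call is passed in as a plain callable standing in for a gRPC client, so no real gRPC API is used.

```python
import re
from typing import Callable, List, Tuple

Entity = Tuple[str, str]  # (entity text, label)


class ServiceUnavailable(Exception):
    """Hypothetical error raised when the NER microservice cannot be reached."""


# Hypothetical rule set for illustration only; not the authors' rules.
RULES = [
    ("DATE", re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")),
    ("DOC_NO", re.compile(r"№\s?\d+")),
]


def rule_based_extract(text: str) -> List[Entity]:
    """Apply each regex rule in order and collect the matches with their labels."""
    entities: List[Entity] = []
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            entities.append((match.group(0), label))
    return entities


def extract_with_fallback(
    text: str, ner_call: Callable[[str], List[Entity]]
) -> Tuple[List[Entity], str]:
    """Try the model-backed extractor first; fall back to rules if it is down.

    Returns the entities together with the name of the path that produced them.
    """
    try:
        return ner_call(text), "model"
    except ServiceUnavailable:
        return rule_based_extract(text), "rules"


def unavailable_service(text: str) -> List[Entity]:
    """Stand-in for a remote NER call that simulates an outage."""
    raise ServiceUnavailable("NER service is down")


entities, source = extract_with_fallback(
    "Order № 125 dated 11.05.2026", unavailable_service
)
print(source, entities)
```

Returning the path name alongside the entities lets a caller record when the lower-accuracy rule-based path was used; this is one possible design choice, not one described in the source.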

Published in	Applied and Computational Mathematics (Volume 15, Issue 3)
DOI	10.11648/j.acm.20261503.11
Page(s)	68-74
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Microservice Architecture, Intelligent Document Management, Mathematical Modeling, BERT, Named Entity Recognition, Distributed Systems

Cite This Article

APA Style

Rahimi, F., Komiliyon, F. S., Rahimov, M. F., Yorov, M. R. (2026). Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models. Applied and Computational Mathematics, 15(3), 68-74. https://doi.org/10.11648/j.acm.20261503.11

ACS Style

Rahimi, F.; Komiliyon, F. S.; Rahimov, M. F.; Yorov, M. R. Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models. Appl. Comput. Math. 2026, 15(3), 68-74. doi: 10.11648/j.acm.20261503.11

AMA Style

Rahimi F, Komiliyon FS, Rahimov MF, Yorov MR. Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models. Appl Comput Math. 2026;15(3):68-74. doi: 10.11648/j.acm.20261503.11

BibTeX

@article{10.11648/j.acm.20261503.11,
  author = {Farhod Rahimi and Fayzali Saduiioevich Komiliyon and Manuchehr Farhodovich Rahimov and Mehrdod Rahmatullovich Yorov},
  title = {Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models},
  journal = {Applied and Computational Mathematics},
  volume = {15},
  number = {3},
  pages = {68-74},
  doi = {10.11648/j.acm.20261503.11},
  url = {https://doi.org/10.11648/j.acm.20261503.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.acm.20261503.11},
  year = {2026}
}

RIS

TY  - JOUR
T1  - Mathematical Modeling of an Intelligent Document Management System Based on Microservice Architecture and BERT Models
AU  - Farhod Rahimi
AU  - Fayzali Saduiioevich Komiliyon
AU  - Manuchehr Farhodovich Rahimov
AU  - Mehrdod Rahmatullovich Yorov
Y1  - 2026/05/11
PY  - 2026
N1  - https://doi.org/10.11648/j.acm.20261503.11
DO  - 10.11648/j.acm.20261503.11
T2  - Applied and Computational Mathematics
JF  - Applied and Computational Mathematics
JO  - Applied and Computational Mathematics
SP  - 68
EP  - 74
PB  - Science Publishing Group
SN  - 2328-5613
UR  - https://doi.org/10.11648/j.acm.20261503.11
VL  - 15
IS  - 3
ER  -

Author Information

Farhod Rahimi
Physical-Technical Institute, National Academy of Sciences of Tajikistan, Dushanbe, Tajikistan
ORCID: http://orcid.org/0009-0004-4010-2754

Fayzali Saduiioevich Komiliyon
Faculty of Mathematics, Tajik National University, Dushanbe, Tajikistan

Manuchehr Farhodovich Rahimov
Institute of Mathematics, National Academy of Sciences of Tajikistan, Dushanbe, Tajikistan

Mehrdod Rahmatullovich Yorov
Faculty of Mathematics, Tajik National University, Dushanbe, Tajikistan
