Learning Path Generation of ITS Using Markov Decision Process

Song-Hwan Kwon; Jong-Nam Rim; Chung-Song Ko; Un-Song Ryu; Yong-Jin Pak; Hyon-Il Son

doi:10.11648/j.sr.20261401.11

Research Article | Peer-Reviewed

Published in Science Research (Volume 14, Issue 1)

Received: 27 October 2025 Accepted: 17 November 2025 Published: 30 January 2026

Abstract

Probabilistic and stochastic models such as Bayesian and Hidden Markov models cope well with system uncertainty, but they perform learning state prediction and learning path generation independently. This raises the problem of how to connect the two steps, and the overall effectiveness of the system may be lost even after they are connected. With a Markov Decision Process (MDP), a kind of reinforcement learning model, the prediction of a student’s learning state and the generation of a learning path can be implemented simultaneously in a single model, and the overall error can also be reduced. In this paper, we propose modeling an intelligent tutoring system (ITS) as an MDP, with the aim of reducing learning path generation error and improving system performance. In addition, we propose a learning state evaluation method that uses the MDP model to carry out the student’s learning state estimation and the system’s action selection simultaneously. We also propose a method for applying the value-iteration algorithm to the action selection computation in the MDP model. The proposed approach was compared with previous models, and its effectiveness was verified.
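As an illustration of the value-iteration step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The three learning states, the two tutoring actions, the transition probabilities, the rewards, and the discount factor are all hypothetical values chosen for the example; only the use of value iteration for action selection in an MDP comes from the abstract.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Generic value iteration for a finite MDP.

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[a, s] is the expected immediate reward for taking action a in state s.
    Returns the state values and a greedy policy (one action index per state).
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Hypothetical toy tutoring model (all numbers are assumptions for illustration).
# States: 0 = not learned, 1 = partially learned, 2 = mastered (absorbing).
# Actions: 0 = review current material, 1 = advance to new material.
P = np.array([
    [[0.4, 0.6, 0.0],   # review
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]],
    [[0.8, 0.1, 0.1],   # advance
     [0.1, 0.2, 0.7],
     [0.0, 0.0, 1.0]],
])
# Each tutoring step costs 1 until the mastered state is reached.
R = np.array([
    [-1.0, -1.0, 0.0],
    [-1.0, -1.0, 0.0],
])

V, policy = value_iteration(P, R)
print("state values:", V)
print("greedy action per state:", policy)
```

The greedy policy selects, for each estimated learning state, the action with the highest expected discounted return, which is the sense in which action selection and state evaluation are computed together in a single model.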

Published in	Science Research (Volume 14, Issue 1)
DOI	10.11648/j.sr.20261401.11
Page(s)	1-13
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Reinforcement Learning, MDP, Learning Path Generation

Cite This Article

APA Style

Kwon, S., Rim, J., Ko, C., Ryu, U., Pak, Y., et al. (2026). Learning Path Generation of ITS Using Markov Decision Process. Science Research, 14(1), 1-13. https://doi.org/10.11648/j.sr.20261401.11

ACS Style

Kwon, S.; Rim, J.; Ko, C.; Ryu, U.; Pak, Y., et al. Learning Path Generation of ITS Using Markov Decision Process. Sci. Res. 2026, 14(1), 1-13. doi: 10.11648/j.sr.20261401.11

AMA Style

Kwon S, Rim J, Ko C, Ryu U, Pak Y, et al. Learning Path Generation of ITS Using Markov Decision Process. Sci Res. 2026;14(1):1-13. doi: 10.11648/j.sr.20261401.11

@article{10.11648/j.sr.20261401.11,
  author = {Song-Hwan Kwon and Jong-Nam Rim and Chung-Song Ko and Un-Song Ryu and Yong-Jin Pak and Hyon-Il Son},
  title = {Learning Path Generation of ITS Using Markov Decision Process},
  journal = {Science Research},
  volume = {14},
  number = {1},
  pages = {1-13},
  doi = {10.11648/j.sr.20261401.11},
  url = {https://doi.org/10.11648/j.sr.20261401.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.sr.20261401.11},
  year = {2026}
}

TY  - JOUR
T1  - Learning Path Generation of ITS Using Markov Decision Process
AU  - Song-Hwan Kwon
AU  - Jong-Nam Rim
AU  - Chung-Song Ko
AU  - Un-Song Ryu
AU  - Yong-Jin Pak
AU  - Hyon-Il Son
Y1  - 2026/01/30
PY  - 2026
N1  - https://doi.org/10.11648/j.sr.20261401.11
DO  - 10.11648/j.sr.20261401.11
T2  - Science Research
JF  - Science Research
JO  - Science Research
SP  - 1
EP  - 13
PB  - Science Publishing Group
SN  - 2329-0927
UR  - https://doi.org/10.11648/j.sr.20261401.11
VL  - 14
IS  - 1
ER  -

Author Information

Song-Hwan Kwon

Department of Information Science, University of Science, Pyongyang, Democratic People’s Republic of Korea

http://orcid.org/0009-0005-2162-081X
Jong-Nam Rim

Department of Information Science, University of Science, Pyongyang, Democratic People’s Republic of Korea
Chung-Song Ko

Department of Information Science, University of Science, Pyongyang, Democratic People’s Republic of Korea
Un-Song Ryu

Department of Information Science, University of Science, Pyongyang, Democratic People’s Republic of Korea
Yong-Jin Pak

Institute of Information Technology, University of Science, Pyongyang, Democratic People’s Republic of Korea
Hyon-Il Son

Information Department, Sinuiju College of Industrial Technology, Sinuiju, Democratic People’s Republic of Korea
