LLM-Orchestrated Microservices: A New Paradigm for Intelligent Distributed Systems

Journal of Advanced Engineering Technology and Management

ISSN (Online): 3049-3684

Volume: 1 Issue: 2 | Open Access | September 2025 

Avneet Bansal¹

¹Independent Researcher, avneetbansal9815@gmail.com

Abstract

Microservices architecture has become a popular approach for building scalable, cloud-native distributed applications because it supports modular and flexible deployment, isolation, and elasticity. Decomposing a monolithic application into microservices that can be developed, deployed, and scaled independently improves agility, resilience, and continuous delivery. However, although technologies such as Kubernetes and service meshes have matured considerably, current orchestration solutions still rely on static, imperative models that only react to changing conditions: they scale up or down when a threshold is crossed, execute predefined workflows, and enforce predetermined policies. These solutions work well under normal conditions but falter in the face of uncertainty, mixed workloads, cascading failures, and changing dependencies.
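To make the threshold-driven behaviour concrete, the following minimal sketch mirrors the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of observed to target metric, rounded up, with no change when the ratio is within a tolerance band. The function name, the default bounds, and the sample numbers are illustrative assumptions, not part of the paper.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Reactive, threshold-style scaling rule (illustrative sketch).

    Follows the documented HPA formula:
        desired = ceil(current_replicas * current_metric / target_metric)
    The min/max bounds and the helper name are assumptions for this example.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Within the tolerance band the rule does nothing at all.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    proposed = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds, regardless of why load changed.
    return max(min_replicas, min(max_replicas, proposed))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # A cascading failure that inflates the metric is treated the same way
    # as real demand: the rule simply scales to the upper bound.
    print(desired_replicas(4, 300, 60))
```

The rule sees only one number per service. It has no notion of why the metric moved, which is the limitation the paragraph above describes.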

Large language models (LLMs) are advancing rapidly, and their capabilities in semantic understanding, multi-step reasoning, and context awareness make them candidates for a general-purpose control plane for distributed applications. Augmenting traditional cloud-native platforms with LLMs enables new forms of orchestration that reason about system state at runtime and take appropriate actions. In this paper, we present LLM-Orchestrated Microservices (LOM), a system that uses LLMs to reason about microservices running on Kubernetes with the Istio service mesh. Its key features include semantic service-level reasoning, SLA-based scaling, failure root cause analysis, automated failure recovery planning, and dynamic workflow resolution.
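One way such a control plane could be wired is sketched below, under stated assumptions: the system state is serialized into a prompt, a planner proposes an action as JSON, and the proposal is validated against an allow-list and replica bounds before anything is applied. This is not the LOM implementation. Every name here (`summarize`, `validate`, `decide`, `stub_planner`, the action schema, the metric fields) is hypothetical, and the planner is a deterministic stub standing in for an LLM call so that the example runs without any external service.

```python
import json
from typing import Callable

# Hypothetical action vocabulary; the paper does not specify one.
ALLOWED_ACTIONS = {"scale", "restart", "noop"}
NOOP = {"action": "noop"}


def summarize(state: dict) -> str:
    """Serialize observed state into the text handed to the planner."""
    return json.dumps(state, sort_keys=True)


def validate(raw: str, max_replicas: int = 20) -> dict:
    """Accept a planner reply only if it is well-formed and within bounds.

    Anything unparseable, unknown, or out of range falls back to a no-op,
    so a malformed reply can never trigger an action.
    """
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError:
        return dict(NOOP)
    if not isinstance(plan, dict):
        return dict(NOOP)
    action = plan.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return dict(NOOP)
    if action == "scale":
        replicas = plan.get("replicas")
        is_int = isinstance(replicas, int) and not isinstance(replicas, bool)
        if not is_int or not 1 <= replicas <= max_replicas:
            return dict(NOOP)
    return plan


def decide(state: dict, planner: Callable[[str], str]) -> dict:
    """One control-loop step: observe, ask the planner, validate the reply."""
    return validate(planner(summarize(state)))


def stub_planner(prompt: str) -> str:
    """Deterministic stand-in for an LLM (assumption for this sketch)."""
    state = json.loads(prompt)
    if state["p99_latency_ms"] > state["sla_latency_ms"]:
        return json.dumps({"action": "scale",
                           "service": state["service"],
                           "replicas": state["replicas"] + 2})
    return json.dumps(NOOP)


if __name__ == "__main__":
    observed = {"service": "checkout", "replicas": 3,
                "p99_latency_ms": 450, "sla_latency_ms": 300}
    print(decide(observed, stub_planner))
    # Free-form text from the planner is rejected rather than executed.
    print(decide(observed, lambda _prompt: "scale everything!"))
```

Keeping validation outside the planner means the reasoning component can be swapped or can fail without bypassing the bounds enforced by the platform.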

We implemented a prototype of LOM and evaluated it on a variety of workloads, including e-commerce, healthcare API, and smart factory microservice workflows. We benchmarked it against the traditional Kubernetes Horizontal Pod Autoscaler, rule-based orchestrators, and static workflow scheduling. We found that LLM-based orchestration makes better adaptive decisions, recovers from failures faster, improves SLA compliance, and increases the overall resilience of the distributed system, at the cost of additional overhead and complexity.

Keywords: Large Language Models, Microservices, Kubernetes, Cloud Computing, Service Mesh, Intelligent Orchestration, Distributed Systems, Autonomous Infrastructure
