Retrieval-Augmented Generation (RAG)-Driven Knowledge Service: Principles, Paradigms, and Evaluation

Abstract

This paper examines the limitations of artificial general intelligence (AGI) in professional knowledge service domains, particularly its inability to reconcile the breadth of general-purpose corpora with the depth required for specialized expertise, a challenge exacerbated by the static nature of training data and the inherent trade-off between generalization and precision. These limitations stem from AGI’s dependence on open-source, nonspecialized training data, which excludes high-value, peer-reviewed resources, and from its static architecture, which struggles to adapt to the dynamic evolution of domain knowledge. To address these challenges, this study proposes Retrieval-Augmented Generation (RAG), a hybrid framework that integrates the semantic comprehension and generative fluency of large language models (LLMs) with the authority and precision of structured knowledge bases. RAG operates through three interconnected phases: vectorization, in which domain-specific texts and user queries are transformed into high-dimensional embeddings that capture semantic nuances; retrieval, in which similarity search extracts contextually relevant knowledge snippets from the vector database, ensuring alignment with professional standards; and generation, in which the LLM synthesizes the retrieved content with the user input to produce outputs that balance readability with factual accuracy. Implementing RAG requires meticulous attention to knowledge base construction: domain content must be digitized and standardized through metadata tagging, ontology development, and knowledge graph integration to ensure semantic consistency. Embedding model selection further influences performance: open-source options such as BGE offer flexibility for niche domains but may lack scalability, whereas commercial solutions such as Aliyun’s text embedding provide robust multilingual support at higher cost.
LLM selection must align with application needs: models such as DeepSeek-V3 excel in Chinese-language contexts because of localized optimization, whereas GPT-4 is advantageous for multilingual tasks despite privacy concerns. Experimental validation on simulated datasets (professional technical manuals, educational quizzes, and popular biographies) demonstrated RAG’s efficacy. In professional scenarios, RAG achieves excellent accuracy by leveraging structured knowledge bases; in educational and popular contexts, accuracy decreases slightly but remains acceptable. For the publishing industry, RAG offers transformative potential but demands strategic adaptations. Infrastructure localization is paramount for safeguarding proprietary content; hybrid cloud architectures can balance cost efficiency with data security, whereas blockchain integration ensures immutable copyright tracking. Workflow optimization should automate metadata tagging during editorial processes and integrate consistency checks into proofreading stages, reducing manual labor. Data standardization must address multimodal challenges, such as aligning image annotations with textual descriptions, to support emerging applications such as interactive textbooks. Copyright protection requires granular access controls and encryption, particularly for subscription-based services. Despite these advancements, RAG faces unresolved challenges: multimodal data integration remains computationally intensive, real-time updates increase system latency, and conflicting knowledge sources necessitate advanced conflict-resolution frameworks. Future research should explore adaptive retrieval algorithms, federated learning for decentralized knowledge bases, and hybrid human-AI validation mechanisms to increase reliability.
By bridging AGI’s generative capabilities with domain expertise, RAG not only elevates the precision and adaptability of knowledge services but also catalyzes innovation in digital publishing, enabling industries to harness their authoritative content as dynamic, interactive assets in an increasingly data-driven world.
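
The three phases named in the abstract (vectorization, retrieval, generation) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The bag-of-words `embed` function is a toy stand-in for a real embedding model such as BGE, the three-sentence knowledge base is invented for illustration, and the generation phase is reduced to assembling the prompt that would be sent to an LLM. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy vectorization: a sparse bag-of-words count vector.

    Assumption: a real system would call an embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=2):
    """Retrieval: return the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, snippets):
    """Generation input: combine retrieved snippets with the user query."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base, invented for this example.
KNOWLEDGE_BASE = [
    "The torque limit for the M8 bolt is 25 newton metres.",
    "Replace the coolant filter every 500 operating hours.",
    "The author was born in 1920 and published her first novel in 1948.",
]

# Vectorize the knowledge base once, then answer queries against the index.
index = [(doc, embed(doc)) for doc in KNOWLEDGE_BASE]
query = "What is the torque limit for the M8 bolt?"
top = retrieve(query, index, k=1)
prompt = build_prompt(query, top)
```

In this sketch `prompt` would then be passed to whichever LLM the application selects. Swapping `embed` for a dense embedding model changes only the vector representation, not the overall retrieve-then-generate flow.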

Keywords: knowledge service; retrieval-augmented generation; RAG; artificial intelligence; evaluation

Published: 30 April 2025

Cite this article:

Liang WANG. Retrieval-Augmented Generation (RAG)-Driven Knowledge Service: Principles, Paradigms, and Evaluation[J]. Science-Technology & Publication, 2025, 44(4): 37-46.

URL:

http://kjycb.tsinghuajournals.com/EN/ OR http://kjycb.tsinghuajournals.com/EN/Y2025/V44/I4/37

