

Friday, August 2, 2024

Enterprise Brain and the RAG Model at the 2024 WAIC: WPS AI, Office Document Software

The 2024 World Artificial Intelligence Conference (WAIC), held from July 4 to 7 at the Shanghai World Expo Center, attracted numerous AI companies showcasing their latest technologies and applications. Among these, applications based on Large Language Models (LLMs) and Generative AI (GenAI) drew particular attention. This article focuses on the Enterprise Brain (WPS AI) exhibited by Kingsoft Office at the conference and the underlying Retrieval-Augmented Generation (RAG) model, analyzing its significance, value, and growth potential in enterprise applications.

WPS AI: Functions and Value of the Enterprise Brain

Kingsoft Office launched its AI document products several years ago. At this year's WAIC, WPS AI, targeting enterprise users, aims to enhance work efficiency through the Enterprise Brain. The core idea of the Enterprise Brain is to integrate all documents related to products, business, and operations within an enterprise, and to use the capabilities of large models to support employee knowledge Q&A. This functionality significantly simplifies the information retrieval process, thereby improving work efficiency.

Traditional document retrieval often requires employees to search for relevant materials in the company’s cloud storage and then extract the needed information from numerous documents. The Enterprise Brain allows employees to directly get answers through text interactions, saving considerable time and effort. This solution not only boosts work efficiency but also enhances the employee work experience.

RAG Model: Enhancing the Accuracy of Generated Content

The technical model behind WPS AI resembles the RAG (Retrieval-Augmented Generation) approach. RAG combines retrieval and generation techniques, generating answers by referencing information from external knowledge bases, which gives it strong interpretability and customization capabilities. The RAG pipeline consists of two stages, a retrieval layer and a generation layer:

  1. Retrieval Layer: After the user inputs information, the retrieval layer neural network generates a retrieval request and submits it to the database, which outputs retrieval results based on the request.
  2. Generation Layer: The retrieval results from the retrieval layer, combined with the user’s input information, are fed into the large language model (LLM) to generate the final result.

This design mitigates model hallucination, where the model produces inaccurate or nonsensical answers. WPS AI supports content credibility by displaying the original document sources in the model's responses. If the model cites a document, the content is more likely to be reliable; otherwise, its accuracy needs further verification. Additionally, employees can click on the referenced documents for more detailed information, enhancing the transparency and trustworthiness of the answers.
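To make the two-layer mechanism concrete, here is a minimal RAG-style sketch in Python. It is an illustration under simple assumptions, not Kingsoft's implementation: the two-document knowledge base, the keyword-overlap retriever, and the call_llm stub are all hypothetical stand-ins, where a production system would use vector embeddings and a real LLM endpoint.

```python
# Minimal RAG-style sketch: retrieve relevant documents, then hand them to a
# language model together with the question, and surface the sources used.
# The corpus, the scoring, and call_llm are hypothetical placeholders.

KNOWLEDGE_BASE = [
    {"title": "Product Manual", "text": "WPS AI supports content expansion and content extraction."},
    {"title": "HR Policy", "text": "Annual leave requests are submitted through the portal."},
]

def retrieve(query: str, top_k: int = 1) -> list[dict]:
    """Retrieval layer: rank documents by naive keyword overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc["text"].lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def call_llm(prompt: str) -> str:
    """Generation layer stub; a real system would call an LLM API here."""
    return "(model-generated answer grounded in the context above)"

def answer(query: str) -> str:
    docs = retrieve(query)
    context = "\n".join(f"[{d['title']}] {d['text']}" for d in docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    sources = ", ".join(d["title"] for d in docs)
    # Surfacing the retrieved sources lets users verify the answer themselves.
    return f"{call_llm(prompt)}\nSources: {sources}"

print(answer("Does WPS AI support content extraction?"))
```

Because the prompt constrains the model to the retrieved context and the sources are shown alongside the answer, users can check claims against the cited documents rather than trusting the model blindly.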

Industry Applications and Growth Potential

The application of the WPS AI enterprise edition in the financial and insurance sectors showcases its vast potential. Insurance products are diverse, and their terms frequently change, necessitating timely information for both internal staff and external clients. Traditionally, maintaining a Q&A knowledge base manually is inefficient, but AI digital employees based on large models can significantly reduce maintenance costs and improve efficiency. Currently, the application in the insurance field is still in the co-creation stage, but its prospects are promising.

WPS AI also offers basic capabilities such as content expansion, content formatting, and content extraction, which are highly practical for enterprise users.

The WPS AI showcased at the 2024 WAIC demonstrated the immense potential of the Enterprise Brain in enhancing work efficiency and information retrieval within enterprises. By leveraging the RAG model, WPS AI not only mitigates model hallucination but also enhances the credibility and transparency of its content. As the technology continues to evolve, enterprise applications of large-model-based AI will become increasingly widespread, with considerable value and growth potential.

Compared with Office 365 Copilot, WPS AI offers a somewhat different experience and feature set; we will analyze these differences in depth in a future post.

TAGS

Enterprise Brain applications, RAG model benefits, WPS AI capabilities, AI in insurance sector, enhancing work efficiency with AI, large language models in enterprise, generative AI applications, AI-powered knowledge retrieval, WAIC 2024 highlights, Kingsoft Office AI solutions


Wednesday, June 19, 2024

The Future of Large Language Models: Technological Evolution and Application Prospects from GPT-3 to Llama 3

At the 2024 Zhiyuan Conference, Meta research scientist and core author of Llama 2 and Llama 3, Dr. Thomas Scialom, delivered a keynote speech titled "The Past, Present, and Future of Large Language Models." In his presentation, he discussed the development trajectory and future prospects of large language models in depth. Drawing on flagship products from companies such as OpenAI, DeepMind, and Meta, Thomas examined the technical details and significance of key techniques like Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) used in models like Llama 2. He also shared his views on the future development of large language models from the perspectives of multimodality, Agents, and robotics.

Development Trajectory of Large Language Models

Thomas began by highlighting the pivotal moments in the history of large models, reflecting on their rapid development in recent years. The emergence of GPT-3, for instance, marked a milestone indicating that AI had achieved real practical utility, thereby broadening the scope and application of AI technology. A large language model is, in essence, a collection of weights based on the Transformer architecture, trained through self-supervised learning on vast amounts of data to predict the next token with minimal loss.
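Written out, "predicting the next token with minimal loss" is the standard autoregressive objective: minimize the negative log-likelihood of each token given the tokens before it.

\[
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})
\]

Here theta denotes the Transformer weights and x_1, ..., x_T is a training sequence drawn from the corpus.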

Two Ways to Scale Model Size

There are two primary ways to scale the size of large language models: increasing the number of model parameters and increasing the amount of training data. In their research on GPT-3, OpenAI discovered that enlarging the model parameters significantly enhanced performance, prompting a substantial increase in model size. However, DeepMind's research highlighted the importance of training strategies and data volume, introducing the Chinchilla model, which optimizes computational resources to achieve excellent performance even with smaller parameter sizes.
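As a rough illustration of the Chinchilla result, two widely cited rules of thumb from that line of work are that training compute is approximately C ≈ 6·N·D FLOPs (for N parameters and D training tokens) and that compute-optimal training uses on the order of 20 tokens per parameter. The sketch below applies those heuristics; the exact coefficients in the paper differ slightly, so treat the numbers as order-of-magnitude estimates.

```python
def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Given a FLOPs budget, return (params, tokens) under the rough
    Chinchilla heuristics C ~= 6 * N * D and D ~= 20 * N."""
    # Substituting D = 20 * N into C = 6 * N * D gives N = sqrt(C / 120).
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Example: a 1e23-FLOP budget suggests a ~29B-parameter model on ~577B tokens.
params, tokens = chinchilla_optimal(1e23)
print(f"params ~ {params / 1e9:.0f}B, tokens ~ {tokens / 1e9:.0f}B")
```

The point of the heuristic is that, for a fixed compute budget, a smaller model trained on more tokens can outperform a larger, under-trained one.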

Optimization of the Llama Series Models

In the training process of the Llama series models, researchers rethought how to optimize computational resources to ensure efficiency in both training and inference phases. Although Llama 2's pre-training parameter scale is similar to Llama 1, it includes more training data tokens and employs a longer context length. Additionally, Llama 2 incorporates SFT and RLHF technologies during the post-training phase, further enhancing its ability to follow instructions.

Supervised Fine-Tuning (SFT)

SFT is a method used to align models with instructions by having annotators generate content based on given prompts. Thomas's team invested significant resources to have annotators produce high-quality content, which was then used to fine-tune the model. Although costly, SFT significantly improves the model's ability to handle complex tasks.
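A minimal sketch of how SFT is commonly implemented with the Hugging Face Transformers library follows. The model name and training pair here are placeholders rather than the Llama team's actual setup; the key idea is that prompt positions are masked out of the loss with the label value -100, so the model learns only to reproduce the annotator-written response.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model and data; the real Llama SFT setup differs in scale.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

prompt = "Summarize: the meeting moved to 3pm on Friday.\nSummary:"
response = " The meeting is now Friday at 3pm."

prompt_ids = tok(prompt, return_tensors="pt").input_ids
full_ids = tok(prompt + response, return_tensors="pt").input_ids

# Mask prompt positions with -100 so the cross-entropy loss is computed
# only on the response tokens.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=full_ids, labels=labels).loss  # next-token loss
loss.backward()
optimizer.step()
```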

Reinforcement Learning from Human Feedback (RLHF)

Compared to SFT, RLHF has annotators compare different model-generated answers and select the better one. This preference feedback is used to train a reward model that scores candidate answers and steers the model toward preferred outputs. By expanding the dataset and scaling the model, Thomas's team continuously improved the reward model, ultimately achieving performance that surpasses GPT-4.
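The standard way to turn such pairwise preferences into a reward model is a Bradley-Terry style loss: the reward assigned to the chosen answer should exceed that of the rejected one. Below is a toy PyTorch sketch; the tiny linear scorer and random embeddings are illustrative stand-ins, since real reward models reuse the LLM backbone with a scalar head.

```python
import torch
import torch.nn.functional as F

# Toy reward model: maps a (pre-embedded) answer to a scalar score.
reward_model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-4)

# Stand-in embeddings for a chosen and a rejected answer to one prompt.
chosen, rejected = torch.randn(1, 16), torch.randn(1, 16)

r_chosen = reward_model(chosen)
r_rejected = reward_model(rejected)

# Bradley-Terry pairwise loss: push r_chosen above r_rejected.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
optimizer.step()
```

Once trained, the reward model scores candidate generations during reinforcement learning, standing in for the human annotators at scale.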

Combining Human and AI Capabilities

Thomas emphasized that the real strength of humans lies in judging the quality of answers rather than creating them. Therefore, the true magic of RLHF is in combining human feedback with AI capabilities to create models that surpass human performance. The collaboration between humans and AI is crucial in this process.

The Future of Large Language Models

Thomas believes that the future of large language models lies in multimodality, integrating images, sounds, videos, and other diverse information to enhance their processing capabilities. Additionally, Agent technology and robotics research will be significant areas of future development. By combining language modeling with multimodal technologies, we can build more practical Agent systems and robotic entities.

Importance of Computational Power

Thomas stressed the critical role of computational power in AI development. As computational resources increase, AI model performance improves significantly. From the ImageNet competition to AlphaGo's conquest of Go, AI technology has made rapid strides. In the future, as computational resources continue to expand, the AI field is poised to witness more unexpected breakthroughs.

Through Thomas's insightful speech, we not only gained a comprehensive understanding of the development trajectory and future direction of large language models but also recognized the pivotal role of technological innovation and computational resources in advancing AI. The research and application of large language models will continue to have profound impacts across technological, commercial, and social domains.

TAGS

Meta research scientist, Thomas Scialom, Llama 2 model, Llama 3 advancements, GPT-3 development, Transformer architecture, Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), Chinchilla model optimization, Multimodal AI future, Agent technology in AI, Robotics in AI development, Computational power in AI, AI model scaling, AI performance breakthroughs, DeepMind research, OpenAI innovations, AI training strategies, AI application prospects, Zhiyuan Conference 2024, Future of large language models.