
Thursday, January 30, 2025

Analysis of DeepSeek-R1's Product Algorithm and Implementation

Against the backdrop of rapid advances in large models, reasoning capability has become a key metric for evaluating the quality of Large Language Models (LLMs). DeepSeek-AI recently introduced the DeepSeek-R1 series, which demonstrates outstanding reasoning capabilities. User trials indicate that its reasoning chains are more detailed and clearer, aligning closely with user expectations. Compared to OpenAI's O1 series, DeepSeek-R1 provides a more interpretable and reliable reasoning process. This article offers an in-depth analysis of DeepSeek-R1's product algorithm, implementation approach, and advantages.

Core Algorithms of DeepSeek-R1

Reinforcement Learning-Driven Reasoning Optimization

DeepSeek-R1 enhances its reasoning capabilities through Reinforcement Learning (RL), incorporating two key phases:

  • DeepSeek-R1-Zero: Applies reinforcement learning directly to the base model without relying on Supervised Fine-Tuning (SFT). This allows the model to autonomously explore reasoning pathways, exhibiting self-verification, reflection, and long-chain reasoning capabilities.
  • DeepSeek-R1: Introduces Cold Start Data and a multi-stage training pipeline before RL to enhance reasoning performance, readability, and user experience.
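DeepSeek-AI's technical report describes training with Group Relative Policy Optimization (GRPO), which scores each prompt's sampled completions against one another instead of using a learned critic. A deliberately minimal sketch of that group-relative normalization (the function name and simplifications are illustrative, not DeepSeek's implementation):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each sampled completion's reward
    against the mean and std of its group (one prompt, many samples)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Four samples for one prompt: two rewarded, two not.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Because the advantage is computed within each sampled group, no separate value network is needed, which keeps the RL phase comparatively cheap.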

Training Process

The training process of DeepSeek-R1 consists of the following steps:

  1. Cold Start Data Fine-Tuning: Initial fine-tuning with a large volume of high-quality long-chain reasoning data to ensure logical clarity and readability.
  2. Reasoning-Oriented Reinforcement Learning: RL training on specific tasks (e.g., mathematics, programming, and logical reasoning) to optimize reasoning abilities, incorporating a Language Consistency Reward to improve readability.
  3. Rejection Sampling and Supervised Fine-Tuning: Filtering high-quality reasoning pathways generated by the RL model for further fine-tuning, enhancing general abilities in writing, Q&A, and other applications.
  4. Reinforcement Learning for All Scenarios: Integrating multiple reward signals to balance reasoning performance, helpfulness, and harmlessness.
  5. Knowledge Distillation: Transferring DeepSeek-R1’s reasoning capability to smaller models to improve efficiency and reduce computational costs.
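Step 3 above can be pictured as a filter over sampled reasoning traces: generate many candidates, keep only the best-scoring ones for supervised fine-tuning. A simplified sketch, using trace length as a stand-in reward (a real pipeline would use rule-based checks or a reward model):

```python
def rejection_sample(candidates, reward_fn, keep_top=2):
    """Keep the highest-reward reasoning traces for supervised fine-tuning."""
    scored = sorted(candidates, key=reward_fn, reverse=True)
    return scored[:keep_top]

traces = ["short wrong trace", "a detailed, correct chain of reasoning", "ok trace"]
best = rejection_sample(traces, reward_fn=len, keep_top=2)
print(best[0])  # the longest (here, stand-in for "best") trace survives
```

The surviving traces then become training targets, so qualities the reward favors (correctness, clarity) are distilled back into the model's default behavior.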

Comparison Between DeepSeek-R1 and OpenAI O1

Logical Reasoning Capability

Experimental results indicate that DeepSeek-R1 performs on par with or even surpasses OpenAI O1-1217 in mathematics, coding, and logical reasoning. For example, in the AIME 2024 mathematics competition, DeepSeek-R1 achieved a Pass@1 score of 79.8%, slightly higher than O1-1217’s 79.2%.
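Pass@1 here is simply the rate at which a model's sampled answers are correct on the first try. A minimal illustration of how such a figure might be computed when several samples per problem are averaged (the benchmark's exact evaluation protocol may differ):

```python
def pass_at_1(per_problem_samples):
    """Estimate Pass@1: for each problem, the fraction of sampled answers
    that are correct, averaged across all problems."""
    rates = [sum(samples) / len(samples) for samples in per_problem_samples]
    return sum(rates) / len(rates)

# Two problems, four samples each (True = correct answer).
samples = [[True, True, False, True], [True, False, False, False]]
print(pass_at_1(samples))  # 0.5
```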

Interpretability and Readability

DeepSeek-R1’s reasoning process is more detailed and readable due to:

  • The use of explicit reasoning format tags such as <think> and <answer>.
  • The introduction of a language consistency reward during training, reducing language-mixing issues.
  • Cold start data ensuring initial stability in the RL phase.
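The value of explicit tags is easy to see in miniature: a hypothetical parser (the function names are illustrative) can cleanly separate reasoning from answer, and a crude token-level check approximates what a language consistency reward measures:

```python
import re

def split_reasoning(completion: str):
    """Separate the chain of thought from the final answer using the
    explicit <think>/<answer> tags described above."""
    think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return (think.group(1).strip() if think else "",
            answer.group(1).strip() if answer else "")

def language_consistency(cot: str) -> float:
    """Crude proxy for a language consistency reward: the fraction of
    whitespace-separated tokens that are pure ASCII (here, 'English')."""
    tokens = cot.split()
    return sum(t.isascii() for t in tokens) / max(len(tokens), 1)

cot, ans = split_reasoning("<think>79.8 > 79.2</think><answer>DeepSeek-R1</answer>")
print(cot, "|", ans)  # 79.8 > 79.2 | DeepSeek-R1
```

A reward built on such a signal penalizes completions that drift between languages mid-chain, which is one source of the readability gains noted above.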

In contrast, while OpenAI’s O1 series generates longer reasoning chains, some responses lack clarity, making them harder to comprehend. DeepSeek-R1’s optimizations improve interpretability, making it easier for users to understand the reasoning process.

Reliability of Results

DeepSeek-R1 employs a self-verification mechanism, allowing the model to actively reflect on and correct errors during reasoning. Experiments demonstrate that this mechanism effectively reduces logical inconsistencies and enhances the coherence of the reasoning process. By comparison, OpenAI O1 occasionally produces plausible yet misleading answers without deep logical validation.
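From the outside, self-verification can be pictured as a propose-check-revise loop. This is only an analogy: in DeepSeek-R1 the reflection happens inside a single generated reasoning chain, not in an external controller, and the helpers below are hypothetical:

```python
def answer_with_verification(solve, verify, problem, max_rounds=3):
    """Analogy for self-verification: propose an answer, check it,
    and revise until the check passes or attempts run out."""
    answer = solve(problem)
    for _ in range(max_rounds):
        if verify(problem, answer):
            break
        answer = solve(problem)  # in the real model, revision is in-context
    return answer

# Toy example: a flaky solver whose first answer the verifier rejects.
attempts = iter([5, 4])
result = answer_with_verification(
    solve=lambda p: next(attempts),
    verify=lambda p, a: a == 2 + 2,
    problem="2 + 2",
)
print(result)  # 4
```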

Conclusion

DeepSeek-R1 excels in reasoning capability, interpretability, and reliability. By combining reinforcement learning with cold start data, the model provides a more detailed analysis, making its working principles more comprehensible. Compared to OpenAI's O1 series, DeepSeek-R1 has clear advantages in interpretability and consistency, making it particularly suitable for applications requiring structured reasoning, such as mathematical problem-solving, coding tasks, and complex decision support.

Moving forward, DeepSeek-AI may further refine the model’s general capabilities, enhance multilingual reasoning support, and expand its applications in software engineering, knowledge management, and other domains.

Join the HaxiTAG Community to engage in discussions and share datasets for Chain-of-Thought (CoT) training. Collaborate with experts, exchange best practices, and enhance reasoning model performance through community-driven insights and knowledge sharing.


Sunday, November 24, 2024

Case Review and Case Study: Building Enterprise LLM Applications Based on GitHub Copilot Experience

GitHub Copilot is a code-generation tool powered by a Large Language Model (LLM), designed to enhance developer productivity through automated suggestions and code completion. This article analyzes the success of GitHub Copilot to explore how enterprises can effectively build and apply LLMs, particularly in terms of technological innovation, usage methods, and operational optimization in enterprise application scenarios.

Key Insights

The Importance of Data Management and Model Training
At the core of GitHub Copilot is its data management and training on a massive codebase. By learning from a large amount of publicly available code, the LLM can understand code structure, semantics, and context. This is crucial for enterprises when building LLM applications, as they need to focus on the diversity, representativeness, and quality of data to ensure the model's applicability and accuracy.

Model Integration and Tool Compatibility
When implementing LLMs, enterprises should ensure that the model can be seamlessly integrated into existing development tools and processes. A key factor in the success of GitHub Copilot is its compatibility with multiple IDEs (Integrated Development Environments), allowing developers to leverage its powerful features within their familiar work environments. This approach is applicable to other enterprise applications, emphasizing tool usability and user experience.

Establishing a User Feedback Loop
Copilot improves the quality of its suggestions through ongoing user feedback. Enterprises applying LLMs need to establish a similar feedback mechanism to continuously improve model performance and user experience. In complex enterprise scenarios especially, the model must be dynamically adjusted based on actual usage.
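Such a feedback mechanism can begin as simple instrumentation: log whether each suggestion was accepted and track acceptance rates per feature. A minimal, hypothetical sketch (class and field names are illustrative, not Copilot's telemetry):

```python
from collections import defaultdict

class FeedbackLog:
    """Minimal suggestion feedback loop: record whether users accept each
    suggestion and report per-feature acceptance rates."""

    def __init__(self):
        self.stats = defaultdict(lambda: [0, 0])  # feature -> [accepted, shown]

    def record(self, feature: str, accepted: bool):
        self.stats[feature][1] += 1
        if accepted:
            self.stats[feature][0] += 1

    def acceptance_rate(self, feature: str) -> float:
        accepted, shown = self.stats[feature]
        return accepted / shown if shown else 0.0

log = FeedbackLog()
log.record("code-completion", True)
log.record("code-completion", False)
print(log.acceptance_rate("code-completion"))  # 0.5
```

Aggregated over time, a metric this simple is enough to flag which suggestion types to retrain, re-rank, or retire.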

Privacy and Compliance Management
In enterprise applications, privacy protection and data compliance are crucial. While Copilot deals with public code data, enterprises often handle sensitive proprietary data. When applying LLMs, enterprises should focus on data encryption, access control, and compliance audits to ensure data security and privacy.

Continuous Improvement and Iterative Innovation
LLM and Generative AI technologies are rapidly evolving, and part of GitHub Copilot's success lies in its continuous technological innovation and improvement. When applying LLMs, enterprises need to stay sensitive to cutting-edge technologies and continuously iterate and optimize their applications to maintain a competitive advantage.

Application Scenarios and Operational Methods

  • Automated Code Generation: With LLMs, enterprises can achieve automated code generation, improving development efficiency and reducing human errors.
  • Document Generation and Summarization: Use LLMs to automatically generate technical documentation or summarize content, accelerating project progress and improving the accuracy of information sharing.
  • Customer Support and Service Automation: Generative AI can assist enterprises in building intelligent customer service systems, automatically handling customer inquiries and enhancing service efficiency.
  • Knowledge Management and Learning: Build intelligent knowledge bases with LLMs to support internal learning and knowledge sharing within enterprises, promoting innovation and employee skill enhancement.

Technological Innovation Points

  • Context-Based Dynamic Response: Leverage LLM’s contextual understanding capabilities to develop intelligent applications that can adjust outputs in real-time based on user input.
  • Cross-Platform Compatibility Development: Develop LLM applications compatible with multiple platforms, ensuring a consistent experience for users across different devices.
  • Personalized Model Customization: Customize LLM applications by training on enterprise-specific data to meet the specific needs of particular industries or enterprises.

Conclusion
By analyzing the successful experience of GitHub Copilot, enterprises should focus on data management, tool integration, user feedback, privacy compliance, and continuous innovation when building and applying LLMs. These measures will help enterprises fully leverage the potential of LLM and Generative AI, enhancing business efficiency and driving technological advancement.
