Recent Articles

Collective Communication Profiling: Unveiling GPU Interconnect Bottlenecks in LLMs
Collective communication analysis is essential, not optional, for ensuring high-throughput, resilient deployment of today's and tomorrow's LLM workloads.
By ds66 · 2024-11-09
5C Prompt Contracts: A Minimalist, Creative-Friendly, Token-Efficient Prompt Framework for Individuals & SMEs
The 5C Prompt Contract offers a simple yet robust prompt design framework—perfectly suited for solo creators and SMEs.
By ds66 · 2024-11-09
Enhancing Food-Domain Question Answering with a Multimodal Knowledge Graph: Hybrid QA Generation and Diversity Analysis
This paper marks a major advance in food-centric AI: by integrating a large multimodal KG, hybrid QA generation, and joint text–image fine-tuning…
By ds66 · 2024-11-08
CogniSQL‑R1‑Zero: Reinforced Reasoning for Efficient, High-Fidelity Text-to-SQL
This work emphasizes that aim, alignment, and simplicity together constitute a new paradigm for efficient, responsible system design.
By ds66 · 2024-11-08
Lights, Camera, Language Models: Evaluating GPT‑4o, Gemini‑2.0 & DeepSeek‑V3 for Movie Review Generation 🎬
This in-depth study demonstrates that LLMs are now fluent enough to craft structurally coherent, sentiment-laced movie reviews…
By ds66 · 2024-08-05
Insights into DeepSeek‑V3: Tackling Scaling Challenges with Hardware–Model Co‑Design
Its ISCA paper offers a roadmap: hardware and model architects must collaborate closely to break the next frontier in AI scale.
By ds66 · 2024-08-04
Benchmarking GPT‑4.0 vs DeepSeek‑V3 for Code-Smell Detection: Accuracy, Cost, and Practical Guidance
Our benchmarking positions LLMs as powerful additions to code-quality ecosystems. GPT‑4.0 and DeepSeek‑V3 both significantly outperform static analyzers on nuanced smell detection…
By ds66 · 2024-08-02
DeepSeek‑V3, GPT‑4, Phi‑4, and LLaMA‑3.3: Automating LoRaWAN Engineering with LLM Code Generation
This study underscores that LLMs—large and lean—can reliably generate domain‑specific engineering code.
By ds66 · 2024-07-29
DeepSeek‑V3 Technical Report: Redefining Efficient Language Model Training
DeepSeek‑V3 stands out as a pivotal demonstration that smarter architectures trump bigger budgets. Through MLA, MoE routing, FP8 precision, and network-aware designs…
By ds66 · 2024-07-28
Argument Mining with Large Language Models: An Extensive Evaluation from LLAMA to GPT-4o and DeepSeek-R1
As LLMs continue to evolve, tools for interpretable and domain-specific argument analysis will become vital across law, journalism…
By ds66 · 2024-07-27
Bridging Technology and Humanities: Evaluating DeepSeek‑R1 in Social Sciences Research
DeepSeek‑R1 stands as a pioneering example of reasoning-capable LLMs tailored to the humanities and social sciences.
By ds66 · 2024-07-26
Evaluating Test-Time Scaling LLMs for Legal Reasoning: OpenAI o1, DeepSeek‑R1, and Beyond
Building hybrid legal AI systems—including retrieval-augmented pipelines, domain fine-tuning, and human oversight—will be the most productive path forward.
By ds66 · 2024-07-23