DeepSeek V4 Preview Released: 1M Context, Flash and Pro Tiers, Pricing from 1 Yuan


DeepSeek V4 Preview Goes Live: 1M Context Window and Dual Versions

Chinese AI lab DeepSeek has released a preview of its next-generation large language model, DeepSeek V4, according to coverage on AIBase. The release includes two variants, DeepSeek-V4-Flash and DeepSeek-V4-Pro, and introduces a 1,000,000-token context window, matching Gemini 1.5 Pro's 1M-token offering and far exceeding the 200K window of competitors like Claude 3.5 Sonnet. Pricing is aggressive: the Flash version starts at 1 yuan (approximately $0.14) per million tokens, making enterprise-scale long-context usage economically feasible at this performance level for the first time.
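At that rate, long-context costs are easy to estimate. A minimal sketch in Python, using the announced 1-yuan-per-million-token Flash price and the article's approximate exchange rate (the function name and the exchange-rate constant are our assumptions):

```python
# Back-of-the-envelope cost estimate at the announced Flash rate.
CNY_PER_MTOK_FLASH = 1.0   # announced starting price, CNY per million tokens
USD_PER_CNY = 0.14         # approximate exchange rate (assumption)

def flash_cost_usd(total_tokens: int) -> float:
    """Estimated USD cost of processing `total_tokens` at the Flash rate."""
    return total_tokens / 1_000_000 * CNY_PER_MTOK_FLASH * USD_PER_CNY

# One full 1M-token context pass costs roughly $0.14;
# 500 such passes stay around $70.
print(flash_cost_usd(1_000_000))        # → 0.14
print(flash_cost_usd(500 * 1_000_000))  # → 70.0
```

At these prices, re-reading an entire corpus per query becomes a viable design choice rather than a budget line item.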

DeepSeek is positioning V4 as a direct challenger to the top closed-source models. The company's announcement claims that V4's performance "approaches the level of the best closed-source models" on reasoning, coding, and multilingual benchmarks. The Flash variant is optimized for speed and cost-efficiency, while the Pro version prioritizes inference quality and nuanced understanding. Both models are available via API and through Tencent Cloud's TokenHub service, which launched support for the preview simultaneously.
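DeepSeek's existing API follows the OpenAI chat-completions format; assuming the V4 preview keeps that convention, a request body could be assembled as below. Note that the model identifier "deepseek-v4-flash" is a placeholder, since official model names for the preview have not been published.

```python
import json

# Sketch of a chat-completions request body, assuming the V4 preview
# keeps DeepSeek's OpenAI-compatible API. The model name
# "deepseek-v4-flash" is a placeholder, not a confirmed identifier.
def build_request(document: str, question: str,
                  model: str = "deepseek-v4-flash") -> str:
    payload = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Answer strictly from the provided document."},
            {"role": "user",
             "content": f"{document}\n\nQuestion: {question}"},
        ],
        "stream": False,
    }
    return json.dumps(payload)
```

With a 1M-token window, `document` can be an entire contract, report, or repository dump; the body is then POSTed to the provider's chat-completions endpoint with an API key.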

This release marks a significant step in the commoditization of long-context AI. Until recently, processing one million tokens in a single query was a premium feature reserved for expensive enterprise APIs. DeepSeek V4's pricing undercuts many competitors by an order of magnitude, while maintaining competitive quality. According to data shared by DeepSeek, the Pro variant scores comparably to GPT-4 Turbo on MMLU and HumanEval, though independent benchmarks are still pending.


Open-Weight Strategy and Ecosystem Implications

DeepSeek has historically released models under permissive open-source licenses, but the V4 preview is currently offered as a closed API. The company has not yet confirmed whether the final version will be open-weight. This hybrid approach allows DeepSeek to gather usage data and fine-tune performance before a potential open release. If V4 does follow its predecessors (DeepSeek V2, V3) and becomes open-weight, it would be among the first 1M-context models available for self-hosting.

The integration with Tencent Cloud's TokenHub is notable. TokenHub aggregates multiple LLM APIs, giving Chinese developers a unified billing and access layer. By listing DeepSeek V4 on TokenHub, Tencent is betting on the model's adoption for enterprise workflows like document analysis, code repository understanding, and long-form content generation. The 1M context is particularly relevant for legal, financial, and research domains where entire contracts or research papers must be processed in one pass.

DeepSeek's timing is also strategic. OpenAI, Google, and Anthropic have all recently extended context windows, but at significantly higher per-token costs. DeepSeek V4's price point could pressure these companies to lower their rates, especially in the Asia-Pacific market where DeepSeek already has a strong developer following. However, the model's performance on non-Chinese languages and cultural contexts remains to be fully evaluated by the global community.


What to Watch: Flash vs. Pro, and the Road to Open Source

For developers evaluating DeepSeek V4, the key decision is between Flash and Pro. The Flash variant targets real-time applications like chatbots and live translation, where latency and cost are paramount. Our analysis of the pricing suggests Flash is roughly 60% cheaper than Pro per million tokens, making it suitable for high-volume production use. Pro is likely necessary for tasks requiring deep reasoning or factual precision, such as medical diagnosis support or legal reasoning.
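The tradeoff can be framed as a blended-cost calculation. A sketch, taking the ~60% discount figure from the pricing analysis above; the absolute Pro rate is derived from that figure and is an assumption, not a published price:

```python
# Blended Flash/Pro cost model. Flash's announced rate is 1 CNY per
# million tokens; the Pro rate is derived from the ~60% discount figure
# and is an assumption, not a published price.
FLASH_CNY_PER_MTOK = 1.0
PRO_CNY_PER_MTOK = FLASH_CNY_PER_MTOK / (1 - 0.60)  # ≈ 2.5

def blended_cost_cny(mtok_per_month: float, pro_fraction: float) -> float:
    """Monthly cost when `pro_fraction` of traffic needs Pro-level reasoning."""
    pro = mtok_per_month * pro_fraction * PRO_CNY_PER_MTOK
    flash = mtok_per_month * (1 - pro_fraction) * FLASH_CNY_PER_MTOK
    return pro + flash

# Routing only the hardest 20% of a 1,000-Mtok monthly workload to Pro
# keeps the bill near 1,300 CNY, versus 2,500 CNY for all-Pro.
```

In practice this suggests a router pattern: default to Flash and escalate to Pro only when a query is classified as reasoning-heavy.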

DeepSeek has not disclosed the exact model architecture, but the company's previous papers (e.g., DeepSeek-V2) used a Mixture-of-Experts (MoE) design. It is reasonable to assume V4 continues this approach, enabling efficient scaling to 1M context without exploding inference costs. The preview period will likely last several weeks, during which early adopters can test the model's consistency at max context length — a known challenge for many long-context models that suffer from "lost in the middle" accuracy degradation.
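Early adopters can run exactly this kind of consistency check themselves. Below is a minimal needle-in-a-haystack sketch: the `ask_model` callable is a placeholder for whatever API client is used, and tokens are crudely approximated as words (a real harness would use the provider's tokenizer).

```python
# Minimal "needle in a haystack" probe for lost-in-the-middle testing.
# `ask_model` is a placeholder for your API client; tokens are crudely
# approximated as whitespace-separated words.
def make_haystack(needle: str, filler: str, depth: float, size_words: int) -> str:
    """Bury `needle` at relative position `depth` (0.0 to 1.0) in filler text."""
    base = filler.split()
    words = (base * (size_words // len(base) + 1))[:size_words]
    pos = int(len(words) * depth)
    return " ".join(words[:pos] + [needle] + words[pos:])

def run_probe(ask_model, needle="The secret code is 7431.",
              depths=(0.0, 0.25, 0.5, 0.75, 1.0), size_words=900_000):
    """Return {depth: retrieved?} for the needle placed at several depths."""
    results = {}
    for d in depths:
        haystack = make_haystack(needle, "lorem ipsum dolor sit amet", d, size_words)
        answer = ask_model(haystack + "\n\nWhat is the secret code?")
        results[d] = "7431" in answer
    return results
```

A model that is robust at 1M tokens should retrieve the needle at every depth; a dip at middle depths (0.25 to 0.75) is the classic lost-in-the-middle signature.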

The AI community should watch for third-party evaluations, particularly the LongBench and RULER benchmarks, which specifically test long-context capabilities. If DeepSeek V4 performs strongly, it could accelerate the shift toward on-premise long-context models, reducing reliance on cloud API providers. For now, the DeepSeek V4 preview represents a compelling option for developers who need to process large documents cheaply, and a signal that the battle for long-context dominance is heating up.

Source: AIbase
345tool Editorial Team

We are a team of AI technology enthusiasts and researchers dedicated to discovering, testing, and reviewing the latest AI tools to help users find the right solutions for their needs.

