Chinese artificial intelligence (AI) firm DeepSeek unveiled its latest open-source image generation model, Janus Pro 7B.
The release follows the company’s recent success with its fully open-source frontier foundation models, including the reasoning-focused DeepSeek-R1. DeepSeek asserts that Janus Pro 7B outperforms OpenAI’s DALL-E 3 across multiple benchmarks. The model is available under a permissive MIT license for academic and commercial use.
Key Features of Janus Pro 7B
- Advanced Architecture:
  - Janus Pro 7B succeeds the Janus and Janus Pro 1B models and features significant functional upgrades.
  - It uses an autoregressive framework that unifies multimodal understanding and generation, with improvements to the architecture and encoder.
- Efficient Processing:
  - The model decouples visual encoding into separate pathways for understanding and generation, while a single unified transformer handles the processing.
  - For multimodal understanding, it employs the SigLIP-L vision encoder; for generation, it uses a tokenizer with a downsample rate of 16.
- Benchmark Performance:
  - In DeepSeek’s internal testing, Janus Pro 7B scored 80% on GenEval and 84.2 on DPG-Bench, outperforming DALL-E 3 and Stable Diffusion models.
  - Independent testing in the coming days is expected to provide further insight into its capabilities.
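To make the tokenizer's downsample rate of 16 concrete, the short sketch below works out how many image tokens a given resolution would produce. The function name `image_token_grid` and the 384×384 input size are illustrative assumptions, not details from DeepSeek's announcement.

```python
def image_token_grid(height: int, width: int, downsample: int = 16) -> tuple[int, int, int]:
    """Return (rows, cols, total_tokens) for an image tokenizer that
    shrinks each spatial dimension by the given downsample rate."""
    if height % downsample or width % downsample:
        raise ValueError("image dimensions must be multiples of the downsample rate")
    rows = height // downsample
    cols = width // downsample
    return rows, cols, rows * cols


# Hypothetical 384x384 input (an assumed resolution, not stated in the article).
rows, cols, total = image_token_grid(384, 384)
print(f"{rows} x {cols} grid = {total} image tokens")  # 24 x 24 grid = 576 image tokens
```

Under these assumptions, each generated image corresponds to a few hundred tokens, which the autoregressive transformer would predict one at a time.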
Availability and Licensing
- Janus Pro 7B is available for download on GitHub and Hugging Face.
- It is released under an MIT license, allowing free use for both academic and commercial purposes.
- A demo of the model is also available, though DeepSeek has not yet announced an API for integration.
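For readers who want to fetch the weights from Hugging Face, a minimal sketch using the `huggingface_hub` library might look like the following. The repository ID `deepseek-ai/Janus-Pro-7B` and the local directory name are assumptions that should be checked against DeepSeek's official model page, and the helper `download_janus_pro` is a hypothetical name used only for illustration.

```python
def download_janus_pro(repo_id: str = "deepseek-ai/Janus-Pro-7B",
                       local_dir: str = "janus-pro-7b") -> str:
    """Download a model snapshot from Hugging Face and return its local path.

    The default repo_id is an assumption; confirm it on the model page first.
    """
    # Imported lazily so that merely defining this helper does not require
    # the huggingface_hub package (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_janus_pro()` would download the full set of model files, which for a 7B-parameter model runs to several gigabytes, so sufficient disk space and bandwidth are needed.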
In related news, Perplexity, an AI platform, has announced support for DeepSeek-R1 alongside OpenAI’s o1 model. Perplexity CEO Aravind Srinivas described DeepSeek-R1 as the “world’s most powerful reasoning model” and confirmed its availability to all users.