Multimodal AI

MUSE Multimodal AI Generation Model - High-Quality Text-to-Image Synthesis

MUSE is a Transformer-based multimodal generation model for high-quality text-to-image synthesis. It combines the strengths of diffusion models and Transformers.

MUSE, Multimodal, Text-to-Image, Transformer

File Size

18.7 GB

Upload Date

2025-02-03

Downloads

12,400

Rating

4.8/5.0

Download Resources

By downloading this resource, you agree to our Terms of Service and Privacy Policy.

BLIP-2 is a vision-language model for image captioning. It interprets image content and generates accurate, expressive descriptions, supports zero-shot learning, and achieves leading results on multiple vision-language benchmarks.

BLIP-2, Vision-Language, Image Captioning

6.8 GB · 2025-04-07

LLaVA Vision-Language Model - Conversational AI with Image Understanding

LLaVA is a vision-language model for conversational AI with image understanding. It pairs a visual encoder with a language model to support image-grounded dialogue and reasoning, and suits scenarios such as education and customer service.

LLaVA, Vision-Language, Conversational AI

15.3 GB · 2025-04-13

Flamingo Vision-Language Model - Few-Shot Visual Language Understanding

Flamingo is a vision-language model for few-shot visual language understanding. It combines image and text information to support multimodal tasks such as question answering and description generation, and generalizes well to new tasks.

Vision-Language, Multimodal, Flamingo

72.6 GB · 2025-03-11

Related Resources