
InternLM XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

From InternLM · Updated May 29, 2026

The project is written primarily in Python, distributed under the Apache License 2.0, and was first published in 2023. It has 2,925 stars and 175 forks on GitHub. Key topics include chatgpt, foundation, gpt, gpt-4, and instruction-tuning.

<p align="center"> <img src="./assets/logo_en.png" width="650"/> </p> <p align="center"> <b><font size="6">InternLM-XComposer-2.5</font></b> </p> <div align="center"> InternLM-XComposer2.5 <a href="https://huggingface.co/internlm/internlm-xcomposer2d5-7b">🤗</a> <a href="https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b"><img src="./assets/modelscope_logo.png" width="20px"></a> &nbsp;| XComposer2.5 Technical Report <a href="https://arxiv.org/abs/2407.03320"> 📄 </a>

English | 简体中文 (Simplified Chinese)

</div> <p align="center"> Thanks to the community for the <a href="https://huggingface.co/spaces/Willow123/InternLM-XComposer">HuggingFace Demo</a> | <a href="https://openxlab.org.cn/apps/detail/WillowBreeze/InternLM-XComposer">OpenXLab Demo</a> of InternLM-XComposer-2.5. </p> <p align="center"> 👋 Join us on <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a> </p> <p align="center"> <a href="https://trendshift.io/repositories/5245" target="_blank"><img src="https://trendshift.io/api/badge/repositories/5245" alt="InternLM%2FInternLM-XComposer | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> </p> <br>

🔥🔥🔥 InternLM-XComposer2.5-Reward

We release InternLM-XComposer2.5-Reward <a href="https://huggingface.co/internlm/internlm-xcomposer2d5-7b-reward">🤗</a> (IXC-2.5-Reward, ACL 2025 Findings), a simple yet effective multi-modal reward model, including training code, evaluation scripts, and parts of the training data. Please refer to the project page for details.

🔥🔥🔥 InternLM-XComposer2.5-OmniLive

We release InternLM-XComposer2.5-OmniLive, a comprehensive multimodal system for long-term streaming video and audio interactions. Please refer to the project page for details.

<br>

Multimodal Projects of Our Team

InternLM-XComposer-2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

InternLM-XComposer-2.5-OmniLive: A Specialized Generalist Multimodal System for Streaming Video and Audio Interactions

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Models

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

<img src="https://raw.githubusercontent.com/ShareGPT4V/ShareGPT4V-Resources/master/images/share4video_tight.png" style="vertical-align: -20px;" :height="25px" width="25px">ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

<img src="https://raw.githubusercontent.com/ShareGPT4V/ShareGPT4V-Resources/master/images/logo_tight.png" style="vertical-align: -20px;" :height="25px" width="25px">ShareGPT4V: Improving Large Multi-modal Models with Better Captions

<img src="https://github.com/Liuziyu77/MMDU/blob/main/asset/logo.png" style="vertical-align: -20px;" :height="25px" width="25px">MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

<br>

InternLM-XComposer-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely a 7B LLM backend. IXC-2.5 is trained with 24K interleaved image-text contexts and can seamlessly extend to 96K long contexts via RoPE extrapolation. This long-context capability allows IXC-2.5 to perform exceptionally well in tasks requiring extensive input and output contexts.
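
The README does not describe the exact extrapolation scheme, so the following is only a minimal sketch of the general property that makes rotary position embeddings (RoPE) usable beyond the training length: the attention score between a rotated query and key depends on their relative offset, not on their absolute positions. The function names, the toy vectors, and the `base=10000.0` default are assumptions for illustration and are not taken from the IXC-2.5 code.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply a rotary position embedding to an even-length vector."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Each pair of channels rotates at its own frequency.
        theta = position * base ** (-i / dim)
        x, y = vec[i], vec[i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def attention_score(query, key, q_pos, k_pos):
    """Dot product of a query and key after rotating both by their positions."""
    q = rope_rotate(query, q_pos)
    k = rope_rotate(key, k_pos)
    return sum(a * b for a, b in zip(q, k))


# Toy vectors (hypothetical values, chosen only for the demonstration).
query = [0.3, -1.2, 0.7, 0.5]
key = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (6), once near the start and once far beyond 24K.
score_short = attention_score(query, key, q_pos=10, k_pos=4)
score_long = attention_score(query, key, q_pos=90010, k_pos=90004)
print(abs(score_short - score_long) < 1e-6)
```

Because only the offset matters in this sketch, positions past the training window still produce well-defined scores. How well a trained model actually behaves at such positions is an empirical question that the technical report addresses.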

  • Ultra-High Resolution Understanding: IXC-2.5 enhances the dynamic resolution solution proposed in IXC2-4KHD with a native 560 × 560 ViT vision encoder, supporting high-resolution images with any aspect ratio.

  • Fine-Grained Video Understanding: IXC-2.5 treats videos as an ultra-high-resolution composite picture consisting of tens to hundreds of frames, allowing it to capture fine details through dense sampling and higher resolution for each frame.

  • Multi-Turn Multi-Image Dialogue: IXC-2.5 supports free-form multi-turn multi-image dialogue, allowing it to naturally interact with humans in multi-round conversations.

  • Webpage Crafting: IXC-2.5 can be readily applied to create webpages by composing source code (HTML, CSS, and JavaScript) following text-image instructions.

  • Composing High-Quality Text-Image Articles: IXC-2.5 leverages specially designed Chain-of-Thought (CoT) and Direct Preference Optimization (DPO) techniques to significantly enhance the quality of its written content.

  • Awesome performance: IXC-2.5 has been evaluated on 28 benchmarks, outperforming existing open-source state-of-the-art models on 16 benchmarks. It also surpasses or competes closely with GPT-4V and Gemini Pro on 16 key tasks.
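
To make the dynamic-resolution idea from the first bullet more concrete, here is a rough sketch of how an image of arbitrary aspect ratio could be covered by 560 × 560 tiles. This is not the project's actual partitioning logic: the `tile_grid` function and the `max_tiles=24` budget are hypothetical, and only the 560-pixel tile size comes from the text above.

```python
import math

TILE = 560  # native ViT input size stated in the feature list


def tile_grid(width, height, max_tiles=24):
    """Return (cols, rows) of TILE-sized patches for an image.

    max_tiles is a hypothetical budget; the real model may use a
    different limit and a different rounding strategy.
    """
    cols = math.ceil(width / TILE)
    rows = math.ceil(height / TILE)
    if cols * rows > max_tiles:
        # Shrink both sides by the same factor so the aspect ratio is
        # roughly preserved while the grid fits the budget.
        scale = math.sqrt(max_tiles / (cols * rows))
        cols = max(1, math.floor(cols * scale))
        rows = max(1, math.floor(rows * scale))
    return cols, rows


print(tile_grid(1120, 560))   # wide image
print(tile_grid(3840, 2160))  # 4K frame, reduced to fit the budget
```

Under this assumption a video could be handled the same way, with sampled frames laid out as additional tiles of one large composite image.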

<p align="center"> <img src="assets/Benchmark_radar.png" width="1000"/> </p>

Please refer to Technical Report for more details.
<br>

Demo Video

🔥 For the best experience, please keep the audio on while enjoying the video.

https://github.com/InternLM/InternLM-XComposer/assets/147793160/8206f07f-3166-461e-a631-9cbcdec6ae75

Youtube Video

Please refer to Chinese Demo for the demo of the Chinese version.

News and Updates

  • 2024.02.02 🎉🎉🎉 The finetune code of InternLM-XComposer2-VL-7B is publicly available.
  • 2024.01.26 🎉🎉🎉 The evaluation code of InternLM-XComposer2-VL-7B is publicly available.
  • 2024.01.26 🎉🎉🎉 InternLM-XComposer2-7B and InternLM-XComposer2-VL-7B are publicly available on Hugging Face and ModelScope.
  • 2024.01.26 🎉🎉🎉 We release a technical report with more details on the InternLM-XComposer2 series.
  • 2023.11.22 🎉🎉🎉 We release ShareGPT4V, a large-scale, highly descriptive image-text dataset generated by GPT4-Vision, and a superior large multimodal model, ShareGPT4V-7B.
  • 2023.10.30 🎉🎉🎉 InternLM-XComposer-VL achieved the top 1 ranking in both Q-Bench and Tiny LVLM.
  • 2023.10.19 🎉🎉🎉 Support for inference on multiple GPUs. Two 4090 GPUs are sufficient for deploying our demo.
  • 2023.10.12 🎉🎉🎉 The 4-bit demo is supported; model files are available on Hugging Face and ModelScope.
  • 2023.10.8 🎉🎉🎉 InternLM-XComposer-7B and InternLM-XComposer-VL-7B are publicly available on ModelScope.
  • 2023.9.27 🎉🎉🎉 The evaluation code of InternLM-XComposer-VL-7B is publicly available.
  • 2023.9.27 🎉🎉🎉 InternLM-XComposer-7B and InternLM-XComposer-VL-7B are publicly available on Hugging Face.
  • 2023.9.27 🎉🎉🎉 We release a technical report with more details on our model series.
<br>

Model Zoo

| Model | Usage | Transformers(HF) | ModelScope(HF) | Release Date |
|---|---|---|---|---|
| InternLM-XComposer-2.5 | Video Understanding, Multi-image Multi-turn Dialog, 4K Resolution Understanding, Web Craft, Article creation, Benchmark | 🤗internlm-xcomposer2.5 | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2.5 | 2024-07-03 |
| InternLM-XComposer2-4KHD | 4K Resolution Understanding, Benchmark, VL-Chat | 🤗internlm-xcomposer2-4khd-7b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-4khd-7b | 2024-04-09 |
| InternLM-XComposer2-VL-1.8B | Benchmark, VL-Chat | 🤗internlm-xcomposer2-vl-1_8b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-vl-1_8b | 2024-04-09 |
| InternLM-XComposer2 | Text-Image Composition | 🤗internlm-xcomposer2-7b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-7b | 2024-01-26 |
| InternLM-XComposer2-VL | Benchmark, VL-Chat | 🤗internlm-xcomposer2-vl-7b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-vl-7b | 2024-01-26 |
| InternLM-XComposer2-4bit | Text-Image Composition | 🤗internlm-xcomposer2-7b-4bit | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-7b-4bit | 2024-02-06 |
| InternLM-XComposer2-VL-4bit | Benchmark, VL-Chat | 🤗internlm-xcomposer2-vl-7b-4bit | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer2-vl-7b-4bit | 2024-02-06 |
| InternLM-XComposer | Text-Image Composition, VL-Chat | 🤗internlm-xcomposer-7b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer-7b | 2023-09-26 |
| InternLM-XComposer-4bit | Text-Image Composition, VL-Chat | 🤗internlm-xcomposer-7b-4bit | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer-7b-4bit | 2023-09-26 |
| InternLM-XComposer-VL | Benchmark | 🤗internlm-xcomposer-vl-7b | <img src="./assets/modelscope_logo.png" width="20px" /> internlm-xcomposer-vl-7b | 2023-09-26 |

Evaluation

We evaluate InternLM-XComposer-2.5 on 28 multimodal benchmarks, including the image benchmarks MMDU, MMStar, RealWorldQA, Design2Code, DocVQA, Infographics VQA, TextVQA, ChartQA, OCRBench, DeepForm, WTQ, VisualMRC, TabFact, MathVista, MMMU, AI2D, MME, MMBench, MMBench-CN, SEED-Bench, HallusionBench, and MM-Vet, and the video benchmarks MVBench, MLVU, Video-MME, MMBench-Video, and TempCompass.

See Evaluation Details here.

Compared with closed-source APIs and previous SOTAs on Video and Structural High-resolution images.

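| | MVBench | MLVU | MME-Video | MMBench-Video | TempCompass | DocVQA | ChartVQA | InfoVQA | TextVQA | OCRBench | DeepForm | WTQ | VisualMRC | TabFact |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Previous SOTA (model) | VideoChat2 | InternVL1.5 | LIVA | InternVL1.5 | Qwen-VL | InternVL1.5 | InternVL1.5 | InternVL1.5 | InternVL1.5 | GLM-4v | DocOwl 1.5 | DocOwl 1.5 | DocOwl 1.5 | DocOwl 1.5 |
| Previous SOTA (size) | 7B | 26B | 34B | 26B | 7B | 26B | 26B | 26B | 26B | 9B | 8B | 8B | 8B | 8B |
| Previous SOTA (score) | 60.4 | 50.4 | 59.0 | 42.0 | 52.9 | 90.9 | 83.8 | 72.5 | 80.6 | 77.6 | 68.8 | 40.6 | 246.4 | 80.2 |
| GPT-4V | 43.5 | 49.2 | 59.9 | 56.0 | --- | 88.4 | 78.5 | 75.1 | 78.0 | 51.6 | --- | --- | --- | --- |
| Gemini-Pro | --- | --- | 75.0 | 49.3 | 67.1 | 88.1 | 74.1 | 75.2 | 74.6 | 68.0 | --- | --- | --- | --- |
| Ours | 69.1 | 58.8 | 55.8 | 46.9 | | 90.9 | 82.2 | 69.9 | 78.2 | 69.0 | 71.2 | 53.6 | 307.5 | 85.2 |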

Compared with closed-source APIs and previous SOTAs on Multi-Image dialog and General Visual QA Benchmarks.

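| | MVBench | MLVU | MME-Video | MMBench-Video | TempCompass | DocVQA | ChartVQA | InfoVQA | TextVQA | OCRBench | DeepForm | WTQ | VisualMRC | TabFact |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Previous SOTA (model) | VideoChat2 | InternVL1.5 | LIVA | InternVL1.5 | Qwen-VL | InternVL1.5 | InternVL1.5 | InternVL1.5 | InternVL1.5 | GLM-4v | DocOwl 1.5 | DocOwl 1.5 | DocOwl 1.5 | DocOwl 1.5 |
| Previous SOTA (size) | 7B | 26B | 34B | 26B | 7B | 26B | 26B | 26B | 26B | 9B | 8B | 8B | 8B | 8B |
| Previous SOTA (score) | 60.4 | 50.4 | 59.0 | 42.0 | 58.4 | 90.9 | 83.8 | 72.5 | 80.6 | 77.6 | 68.8 | 40.6 | 246.4 | 80.2 |
| GPT-4V | 43.5 | 49.2 | 59.9 | 56.0 | --- | 88.4 | 78.5 | 75.1 | 78.0 | 51.6 | --- | --- | --- | --- |
| Gemini-Pro | --- | --- | 75.0 | 49.3 | 70.6 | 88.1 | 74.1 | 75.2 | 74.6 | 68.0 | --- | --- | --- | --- |
| Ours | 69.1 | 58.8 | 55.8 | 46.9 | 67.1 | 90.9 | 82.2 | 69.9 | 78.2 | 69.0 | 71.2 | 53.6 | 307.5 | 85.2 |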

Requirements

  • python 3.8 and above
  • pytorch 1.12 and above, 2.0 and above are recommended
  • CUDA 11.4 and above are recommended (this is for GPU users)
  • flash-attention2 is required for high-resolution usage of InternLM-XComposer2.5.
    <br>
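
As a convenience, the minimum versions listed above could be checked with a small script before installing anything heavy. The sketch below is an assumption-laden helper, not part of the repository: `parse_version` and `meets_minimum` are hypothetical names, and only the Python check runs here.

```python
import sys


def parse_version(text):
    """Turn a string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    parts = []
    for piece in text.split('+')[0].split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2' and '2.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


python_version = '.'.join(str(n) for n in sys.version_info[:3])
print('python >= 3.8:', meets_minimum(python_version, '3.8'))
```

With PyTorch installed, `torch.__version__` and `torch.version.cuda` (which is `None` on CPU-only builds) could be passed to `meets_minimum` in the same way, against `'1.12'` and `'11.4'` respectively.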

Installation

Before running the code, make sure you have set up the environment and meet the requirements above, then install the dependent libraries.
Please refer to the installation instructions.

Quickstart

We provide a simple example to show how to use InternLM-XComposer-2.5 with 🤗 Transformers.

<details> <summary> <b>Video Understanding</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

# first turn: describe the video
query = 'Here are some frames of a video. Describe this video in detail'
image = ['./examples/liuxiang.mp4',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
# The video opens with a shot of an athlete, dressed in a red and yellow uniform with the word "CHINA" emblazoned across the front, preparing for a race.
# The athlete, Liu Xiang, is seen in a crouched position, focused and ready, with the Olympic rings visible in the background, indicating the prestigious setting of the Olympic Games. As the race commences, the athletes are seen sprinting towards the hurdles, their determination evident in their powerful strides.
# The camera captures the intensity of the competition, with the athletes' numbers and times displayed on the screen, providing a real-time update on their performance. The race reaches a climax as Liu Xiang, still in his red and yellow uniform, triumphantly crosses the finish line, his arms raised in victory.
# The crowd in the stands erupts into cheers, their excitement palpable as they witness the athlete's success. The video concludes with a close-up shot of Liu Xiang, still basking in the glory of his victory, as the Olympic rings continue to symbolize the significance of the event.

# second turn: follow-up question that reuses the chat history
query = 'tell me the athlete code of Liu Xiang'
image = ['./examples/liuxiang.mp4',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, history=his, do_sample=False, num_beams=3, use_meta=True)
print(response)
# The athlete code of Liu Xiang, as displayed on his uniform in the video, is "1363".
```

</details> <details> <summary> <b>Multi-Image Multi-Turn Dialog</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

# first turn: one <ImageHere> placeholder per image, in order
query = 'Image1 <ImageHere>; Image2 <ImageHere>; Image3 <ImageHere>; I want to buy a car from the three given cars, analyze their advantages and weaknesses one by one'
image = ['./examples/cars1.jpg',
         './examples/cars2.jpg',
         './examples/cars3.jpg',]
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
# To analyze the advantages and disadvantages of each car, we need to consider factors such as brand reputation, performance, design, cost, and maintenance.
# 1. Mercedes-Benz: - Advantages: Known for its luxury and reliability, the Mercedes-Benz brand offers a high level of comfort, advanced technology, and superior craftsmanship. The vehicle in the image appears to be an SUV, which is versatile for both city driving and off-road conditions. - Disadvantages: Typically, Mercedes-Benz vehicles are more expensive compared to other brands, and they may require more frequent maintenance due to their luxury status.
# 2. Bugatti: - Advantages: Bugatti is renowned for producing some of the fastest and most powerful cars in the world. The vehicle in the image is a sports car, likely offering an exhilarating driving experience with its high-performance engine and advanced aerodynamics. - Disadvantages: Bugatti cars are extremely expensive, making them less accessible to the average consumer. They also require specialized knowledge for maintenance and may not be suitable for everyday driving due to their high-performance nature.
# 3. BMW: - Advantages: BMW is a well-respected manufacturer known for its innovative engineering and design. The car in the image appears to be an electric vehicle (EV), which offers advantages such as lower emissions, reduced noise pollution, and potentially lower operating costs. - Disadvantages: EVs like the one in the image may have a shorter range compared to traditional gasoline-powered vehicles, limiting their suitability for long-distance travel.
# Additionally, the initial cost of purchasing an EV can be higher than that of a conventional car. In conclusion, each car has its own set of advantages and disadvantages. The Mercedes-Benz offers luxury and versatility, the Bugatti provides high performance and exclusivity, and the BMW represents innovation in electric vehicle technology. The choice ultimately depends on the buyer's priorities, such as comfort, performance, environmental impact, and budget.

# second turn: append a fourth image and pass the history
query = 'Image4 <ImageHere>; How about the car in Image4'
image.append('./examples/cars4.jpg')
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, history=his, use_meta=True)
print(response)
# The car in Image4 is a red sports car, which appears to be a Ferrari. Ferrari is a renowned Italian manufacturer known for producing some of the most iconic and high-performance sports cars in the world. - Advantages: Ferrari vehicles are synonymous with speed, luxury, and engineering excellence.
# The car in the image likely offers an exhilarating driving experience with its powerful engine, advanced aerodynamics, and high-quality craftsmanship. The red color adds to the car's aesthetic appeal, making it stand out on the road. - Disadvantages: Ferrari cars are extremely expensive, making them less accessible to the average consumer.
# They also require specialized knowledge for maintenance and may not be suitable for everyday driving due to their high-performance nature. In conclusion, the Ferrari in Image4 represents a pinnacle of automotive engineering and design, offering unmatched performance and luxury.
# However, its high cost and specialized maintenance requirements make it less practical for everyday use compared to the other vehicles in the images.
```

</details> <details> <summary> <b>High Resolution Image Understanding</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

query = 'Analyze the given image in a detail manner'
image = ['./examples/dubai.png']
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
print(response)
# The infographic is a visual representation of various facts about Dubai. It begins with a statement about Palm Jumeirah, highlighting it as the largest artificial island visible from space. It then provides a historical context, noting that in 1968, there were only a few cars in Dubai, contrasting this with the current figure of more than 1.5 million vehicles.
# The infographic also points out that Dubai has the world's largest Gold Chain, with 7 of the top 10 tallest hotels located there. Additionally, it mentions that the crime rate is near 0%, and the income tax rate is also 0%, with 20% of the world's total cranes operating in Dubai. Furthermore, it states that 17% of the population is Emirati, and 83% are immigrants.
# The Dubai Mall is highlighted as the largest shopping mall in the world, with 1200 stores. The infographic also notes that Dubai has no standard address system, with no zip codes, area codes, or postal services. It mentions that the Burj Khalifa is so tall that its residents on top floors need to wait longer to break fast during Ramadan.
# The infographic also includes information about Dubai's climate-controlled City, with the Royal Suite at Burj Al Arab costing $24,000 per night. Lastly, it notes that the net worth of the four listed billionaires is roughly equal to the GDP of Honduras.
```

</details> <details> <summary> <b>Instruction to Webpage</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

query = 'A website for Research institutions. The name is Shanghai AI lab. Top Navigation Bar is blue.Below left, an image shows the logo of the lab. In the right, there is a passage of text below that describes the mission of the laboratory.There are several images to show the research projects of Shanghai AI lab.'
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response = model.write_webpage(query, seed=202, task='Instruction-aware Webpage Generation', repetition_penalty=3.0)
print(response)
# see the Instruction-aware Webpage Generation.html
```

See the Instruction to Webpage results here.

</details> <details> <summary> <b>Resume to Webpage</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

# the input should be a resume in markdown format
query = './examples/resume.md'
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response = model.resume_2_webpage(query, seed=202, repetition_penalty=3.0)
print(response)
```

See the Resume to Webpage results here.

</details> <details> <summary> <b>Screenshot to Webpage</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

query = 'Generate the HTML code of this web image with Tailwind CSS.'
image = ['./examples/screenshot.jpg']
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response = model.screen_2_webpage(query, image, seed=202, repetition_penalty=3.0)
print(response)
```

See the Screenshot to Webpage results here.

</details> <details> <summary> <b>Write Article</b> </summary>

```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval().half()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
model.tokenizer = tokenizer

# Chinese essay prompt: read the material about the film "Chang'an" (《长安三万里》) and write an
# essay of at least 800 characters built around the keywords "frustration" (失意) and "poetry" (诗意).
query = '阅读下面的材料，根据要求写作。 电影《长安三万里》的出现让人感慨，影片并未将重点全落在大唐风华上，也展现了恢弘气象的阴暗面，即旧门阀的资源垄断、朝政的日益衰败与青年才俊的壮志难酬。高适仕进无门，只能回乡>沉潜修行。李白虽得玉真公主举荐，擢入翰林，但他只是成为唐玄宗的御用文人，不能真正实现有益于朝政的志意。然而，片中高潮部分《将进酒》一节，人至中年、挂着肚腩的李白引众人乘仙鹤上天，一路从水面、瀑布飞升至银河进入仙>宫，李白狂奔着与仙人们碰杯，最后大家纵身飞向漩涡般的九重天。肉身的微贱、世路的“天生我材必有用，坎坷，拘不住精神的高蹈。“天生我材必有用，千金散尽还复来。” 古往今来，身处闲顿、遭受挫折、被病痛折磨，很多人都曾经历>了人生的“失意”，却反而成就了他们“诗意”的人生。对正在追求人生价值的当代青年来说，如何对待人生中的缺憾和困顿?诗意人生中又有怎样的自我坚守和自我认同?请结合“失意”与“诗意”这两个关键词写一篇文章。 要求:选准角度，确定>立意，明确文体，自拟标题;不要套作，不得抄袭;不得泄露个人信息;不少于 800 字。'
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response = model.write_artical(query, seed=8192)
print(response)
# Model output (a Chinese essay titled "A poetic life lies in perseverance"):
# 诗意人生，贵在坚守
# 《菜根谭》有云:“闲时要有吃紧的心思,忙里要留吃闲工夫。”人生在世,总有失意之时,当面对缺憾和困顿,诗意地生活着才能为人生增添一抹亮色。何谓诗意地生活? 所谓诗意地生活，便是在于坚守本心、直面遗憾、超越自我,在失意中寻找人生价值。
# 诗意地生活,需坚守本心,淡然处之。
# 陶渊明曾执意辞去彭泽县令,归隐田园,“采菊东篱下,悠然见南山”,在山水间寄情自娱；王维面对仕途失意,终日沉醉于诗酒之中,“兴来每独往,胜事空自知”,在诗酒中闲逸自如;李白仕途不顺,被赐金放还,但他依旧豪气干云,“天生我才必有用,千金散尽还复来”,在失意中坦然豁达。坚守本心，便能在遭遇失意之时守住自己的精神家园,让生活充满诗意。反之,若不能坚守本心,而只是一味迎合世俗以求得升迁,那纵使身居高位,亦会丧失生活的乐趣。
# 诗意地生活,需直面遗憾,超越自我。
# “西塞山前白鹭飞,桃花流水鳜鱼肥。青箬笠,绿柳枝,半斤酒,一纶丝。五湖四海皆如此,何妨到此处归。”白居易的《渔歌子》写出了多少人的愿望:没有权势纷扰,没有贫困凄凉,只有青山绿水、白鹭鸥鸟作伴,如此自由自在的生活令人神往。然而,白居易却并没有因此真的归隐山林,而是直面人生,超越自我,写下了一首首诗意而富有现实关怀的作品。如果白居易只顾逃避人生,那又怎会拥有“大弦嘈嘈如急雨,小弦切切如私语”的绝美比喻呢?如果白居易只顾归隐山林,那又怎会写出“此曲只应天上有,人间哪得配白居易”这样的诗句呢?
# 诗意地生活,需直面遗憾,坚守本心。
# 李文波患有渐冻症,医生说他活不过五年,但他没有因此放弃对音乐的热爱,而是与病魔作斗争,演奏出美妙的乐曲;孙家林自幼患有脑瘫,但他不甘于命运的捉弄,终成全国最美教师;史铁生饱受疾病折磨,但他仍能发出“我常常在我的心头清点,我有什么?”的叩问,并由此走上文学道路,为后世留下丰厚的文化遗产。这些人没有逃避,而是选择直面人生的缺憾,在坚守本心的同时超越自我,最终实现了自己的价值。
# 诗意地生活,是于失意中坚守本心,于缺憾中超越自我。当面对人生的缺憾与挫折,坚守本心、超越自我的同时,也必将书写属于自己的辉煌篇章。
# 愿你我都能诗意地生活着!

query = 'Please write a blog based on the title: French Pastries: A Sweet Indulgence'
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response = model.write_artical(query, seed=8192)
print(response)
# French Pastries: A Sweet Indulgence
# The French are well known for their love of pastries, and it's a love that is passed down through generations. When one visits France, they are treated to an assortment of baked goods that can range from the delicate macaron to the rich and decadent chocolate mousse. While there are many delicious types of pastries found in France, five stand out as being the most iconic. Each of these pastries has its own unique qualities that make it special.
# 1. Croissant
# One of the most famous pastries from France is the croissant. It is a buttery, flaky pastry that is best enjoyed fresh from the bakery. The dough is laminated with butter, giving it its signature layers. Croissants are typically eaten for breakfast or brunch, often accompanied by coffee or hot chocolate.
# 2. Macaron
# The macaron is a small, delicate French confection made from almond flour, powdered sugar, and egg whites. The macaron itself is sandwiched with a ganache or jam filling. They come in a variety of colors and flavors, making them a popular choice for both casual snacking and upscale desserts.
# 3. Madeleine
# The madeleine is a small shell-shaped cake that is light and sponge-like. It is often flavored with lemon or orange zest and sometimes dipped in chocolate. Madeleines are perfect for an afternoon snack with tea or coffee.
# 4. Éclair
# The éclair is a long, thin pastry filled with cream and topped with chocolate glaze. It is a classic French treat that is both sweet and satisfying. Éclairs can be found in bakeries all over France and are often enjoyed with a cup of hot chocolate.
# 5. Tarte Tatin
# The tarte Tatin is an apple tart that is known for its caramelized apples and puff pastry crust. It is named after the Tatin sisters who created the recipe in the late 19th century. Tarte Tatin is best served warm with a scoop of vanilla ice cream.
# These pastries are just a few of the many delicious treats that France has to offer. Whether you are a seasoned traveler or a first-time visitor, indulging in French pastries is a must-do activity. So go ahead, treat yourself, you deserve it!
```

</details>
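
The multi-image dialog example writes one `<ImageHere>` placeholder per image by hand. A small helper could generate that prefix instead. The sketch below only mirrors the string format visible in that example; `build_multi_image_query` is a hypothetical name and not part of the model's API.

```python
def build_multi_image_query(question, num_images, start_index=1):
    """Prefix a question with 'ImageN <ImageHere>' tags, one per image.

    start_index lets a follow-up turn continue the numbering,
    for example Image4 after three images were already sent.
    """
    tags = [f'Image{start_index + i} <ImageHere>' for i in range(num_images)]
    return '; '.join(tags) + '; ' + question


first_turn = build_multi_image_query('Compare the three cars', num_images=3)
follow_up = build_multi_image_query('How about the car in Image4', num_images=1, start_index=4)
print(first_turn)
print(follow_up)
```

The resulting strings can be passed as `query` to `model.chat` together with an image list of matching length, as in the examples above.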

Inference on Multiple GPUs

If you have multiple GPUs but no single GPU has enough memory to hold the entire model, you can split the model across them. First, install accelerate with pip install accelerate. Then run the following script for chat:

# chat with 2 GPUs
python example_code/example_chat.py --num_gpus 2
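
For intuition, splitting a model across devices amounts to assigning each layer to one GPU. The sketch below shows a contiguous, near-equal assignment in plain Python. It is an illustration only and not the logic used by `example_chat.py`: the `split_layers` helper, the `layers.N` key format, and the layer count of 32 are all assumptions.

```python
def split_layers(num_layers, num_gpus):
    """Assign layer indices to GPUs in contiguous, near-equal chunks."""
    base, extra = divmod(num_layers, num_gpus)
    device_map = {}
    layer = 0
    for gpu in range(num_gpus):
        # The first `extra` GPUs take one additional layer each.
        count = base + (1 if gpu < extra else 0)
        for _ in range(count):
            device_map[f'layers.{layer}'] = gpu
            layer += 1
    return device_map


# Hypothetical 32-layer model on 2 GPUs.
device_map = split_layers(32, 2)
print(device_map['layers.0'], device_map['layers.15'], device_map['layers.16'])
```

Keeping consecutive layers on the same device limits cross-GPU transfers to one hand-off per boundary, which is why contiguous chunks are the usual choice.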

Inference Acceleration by LMDeploy

If InternLM-XComposer2d5 model inference optimization is required, we recommend using LMDeploy.

In the following subsections, we will introduce the usage of LMDeploy with the internlm-xcomposer2d5-7b model as an example.

First of all, please install the pypi package with pip install lmdeploy. By default, it depends on CUDA 12.x. For a CUDA 11.x environment, please refer to the installation guide.

Offline Inference Pipeline

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# build the inference pipeline and run one image-text query
pipe = pipeline('internlm/internlm-xcomposer2d5-7b')
image = load_image('examples/dubai.png')
response = pipe(('describe this image', image))
print(response.text)
```

For more on using the VLM pipeline, including multi-image inference and multi-turn chat, please see this guide.
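
As a rough illustration of multi-image inference, the helper below sends one prompt together with several images. It assumes the pipeline accepts a `(prompt, list_of_images)` tuple, as the LMDeploy VLM guide describes; please confirm against that guide for your installed version. `describe_images` and `fake_pipe` are hypothetical names, and the stand-in pipeline exists only so the sketch runs without a GPU.

```python
from types import SimpleNamespace


def describe_images(pipe, prompt, images):
    """Send one prompt with several images through an LMDeploy-style pipeline."""
    response = pipe((prompt, list(images)))
    return response.text


def fake_pipe(request):
    """Stand-in for a real pipeline (illustration only): echoes the request."""
    prompt, images = request
    return SimpleNamespace(text=f'{prompt} [{len(images)} images]')


print(describe_images(fake_pipe, 'compare these images', ['a.png', 'b.png']))
```

With a real setup, `pipe` would be the object returned by `pipeline('internlm/internlm-xcomposer2d5-7b')` and `images` a list of results from `load_image`.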

4-Bit Model

We offer 4-bit quantized models via LMDeploy to reduce memory requirements. For a memory usage comparison, please refer to here.

```python
from lmdeploy import TurbomindEngineConfig, pipeline
from lmdeploy.vl import load_image

# tell the engine that the checkpoint is AWQ-quantized
engine_config = TurbomindEngineConfig(model_format='awq')
pipe = pipeline('internlm/internlm-xcomposer2d5-7b-4bit', backend_config=engine_config)
image = load_image('examples/dubai.png')
response = pipe(('describe this image', image))
print(response.text)
```
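
A back-of-the-envelope calculation shows why 4-bit weights reduce memory. The sketch assumes roughly 7 billion parameters (inferred from the "7b" in the model name) and counts weights only. It ignores the vision encoder, quantization scales, activations, and the KV cache, so actual usage will be higher; the linked comparison has measured figures.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024 ** 3


NUM_PARAMS = 7e9  # assumption based on the "7b" model name

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)
print(f'16-bit weights: {fp16_gib:.2f} GiB')
print(f'4-bit weights:  {int4_gib:.2f} GiB')
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint.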

Finetune

  1. Please refer to our finetune scripts.
  2. Inference and finetune support from ModelScope Swift

Gradio Deploy

We provide code for users to build a web UI demo. Please use gradio==4.13.0.

Please run the command below for Chat / Composition:

# For Multimodal Chat
python gradio_demo/gradio_demo_chat.py

# For Free-form Text-Image Composition
python gradio_demo/gradio_demo_composition.py

The user guide for the UI demo is available here. If you wish to change the default folder of the model, please use the --code_path=new_folder option.
<br>

Citation

If you find our models / code / papers useful in your research, please consider giving a ⭐ and citing 📝 our work. Thank you :)

BibTeX
@inproceedings{internlmxcomposer2_5_reward, title={InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model}, author={Yuhang Zang and Xiaoyi Dong and Pan Zhang and Yuhang Cao and Ziyu Liu and Shengyuan Ding and Shenxi Wu and Yubo Ma and Haodong Duan and Wenwei Zhang and Kai Chen and Dahua Lin and Jiaqi Wang}, booktitle={Findings of ACL}, year={2025} }
BibTeX
@article{internlmxcomposer2_5_OL, title={InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions}, author={Pan Zhang and Xiaoyi Dong and Yuhang Cao and Yuhang Zang and Rui Qian and Xilin Wei and Lin Chen and Yifei Li and Junbo Niu and Shuangrui Ding and Qipeng Guo and Haodong Duan and Xin Chen and Han Lv and Zheng Nie and Min Zhang and Bin Wang and Wenwei Zhang and Xinyue Zhang and Jiaye Ge and Wei Li and Jingwen Li and Zhongying Tu and Conghui He and Xingcheng Zhang and Kai Chen and Yu Qiao and Dahua Lin and Jiaqi Wang}, journal={arXiv preprint arXiv:2412.09596}, year={2024} }
BibTeX
@article{internlmxcomposer2_5, title={InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output}, author={Pan Zhang and Xiaoyi Dong and Yuhang Zang and Yuhang Cao and Rui Qian and Lin Chen and Qipeng Guo and Haodong Duan and Bin Wang and Linke Ouyang and Songyang Zhang and Wenwei Zhang and Yining Li and Yang Gao and Peng Sun and Xinyue Zhang and Wei Li and Jingwen Li and Wenhai Wang and Hang Yan and Conghui He and Xingcheng Zhang and Kai Chen and Jifeng Dai and Yu Qiao and Dahua Lin and Jiaqi Wang}, journal={arXiv preprint arXiv:2407.03320}, year={2024} }
BibTeX
@article{internlmxcomposer2_4khd, title={InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD}, author={Xiaoyi Dong and Pan Zhang and Yuhang Zang and Yuhang Cao and Bin Wang and Linke Ouyang and Songyang Zhang and Haodong Duan and Wenwei Zhang and Yining Li and Hang Yan and Yang Gao and Zhe Chen and Xinyue Zhang and Wei Li and Jingwen Li and Wenhai Wang and Kai Chen and Conghui He and Xingcheng Zhang and Jifeng Dai and Yu Qiao and Dahua Lin and Jiaqi Wang}, journal={arXiv preprint arXiv:2404.06512}, year={2024} }
BibTeX
@article{internlmxcomposer2, title={InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model}, author={Xiaoyi Dong and Pan Zhang and Yuhang Zang and Yuhang Cao and Bin Wang and Linke Ouyang and Xilin Wei and Songyang Zhang and Haodong Duan and Maosong Cao and Wenwei Zhang and Yining Li and Hang Yan and Yang Gao and Xinyue Zhang and Wei Li and Jingwen Li and Kai Chen and Conghui He and Xingcheng Zhang and Yu Qiao and Dahua Lin and Jiaqi Wang}, journal={arXiv preprint arXiv:2401.16420}, year={2024} }
BibTeX
@article{internlmxcomposer, title={InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition}, author={Pan Zhang and Xiaoyi Dong and Bin Wang and Yuhang Cao and Chao Xu and Linke Ouyang and Zhiyuan Zhao and Shuangrui Ding and Songyang Zhang and Haodong Duan and Wenwei Zhang and Hang Yan and Xinyue Zhang and Wei Li and Jingwen Li and Kai Chen and Conghui He and Xingcheng Zhang and Yu Qiao and Dahua Lin and Jiaqi Wang}, journal={arXiv preprint arXiv:2309.15112}, year={2023} }
<br>

License & Contact Us

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English) / application form (Chinese). For other questions or collaborations, please contact internlm@pjlab.org.cn.


This article is auto-generated from InternLM/InternLM-XComposer via the GitHub API. Last fetched: 6/1/2026