The TCMLLM (Traditional Chinese Medicine Large Language Model) project aims to bring large language models to clinical support in traditional Chinese medicine, focusing on tasks such as diagnosis and prescription recommendation. In this guide, we will cover how to get started with TCMLLM, troubleshoot common issues, and walk through an analogy that makes the model easier to understand.
Getting Started with TCMLLM
To use TCMLLM effectively, follow these steps:
- First, download the original model code and parameters for ChatGLM-6B and set up your environment.
- Next, download the TCMLLM model parameters and extract the checkpoint file into the ChatGLM-6B tuning directory. The compressed file is available via the project's Baidu link (extraction code: iwg3). A minimal loading sketch follows this list.
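Once the environment is ready, a quick smoke test against the base ChatGLM-6B model confirms that the dependencies and GPU are working. This is a minimal sketch of the base model only, using the standard transformers API (the chat() helper is part of ChatGLM-6B's released code); how the extracted TCMLLM checkpoint is applied on top depends on the tuning method the repository uses, so follow its tuning scripts for that step. The query below is purely illustrative.

```python
# Minimal smoke test: load the base ChatGLM-6B model and run one query.
# This does NOT apply the TCMLLM checkpoint; see the repo's tuning scripts.
from transformers import AutoModel, AutoTokenizer

BASE_MODEL = "THUDM/chatglm-6b"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
model = AutoModel.from_pretrained(BASE_MODEL, trust_remote_code=True).half().cuda()
model = model.eval()

# Illustrative query only; phrase real queries per the project's prompt format.
query = "Patient presents with chronic cough and copious phlegm; recommend a prescription."
response, history = model.chat(tokenizer, query, history=[])
print(response)
```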
Understanding the Instruction Fine-Tuning Dataset
The instruction fine-tuning dataset for TCMLLM comprises 68,000 entries (approximately 10 million tokens) drawn from authentic clinical cases and traditional textbooks. The data come from eight sources, including:
- Four classic TCM medical textbooks (Internal Medicine, Surgery, Gynecology, and Pediatrics)
- The 2020 edition of the Chinese Pharmacopoeia
- Traditional Chinese Medicine Clinical Cases collected from multiple major hospitals
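To make the dataset's shape concrete, here is a hypothetical entry in the common instruction/input/output layout used by many instruction-tuning corpora. The field names and the medical content are illustrative assumptions, not the project's confirmed schema:

```python
import json

# Hypothetical entry in the Alpaca-style instruction/input/output layout;
# the actual TCMLLM schema and content may differ from this sketch.
entry = {
    "instruction": "Recommend a traditional Chinese medicine prescription for this case.",
    "input": "Chief complaint: recurrent cough with copious white phlegm and poor appetite.",
    "output": "Er Chen Tang, modified: ...",
}
print(json.dumps(entry, ensure_ascii=False, indent=2))
```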
Explaining TCMLLM-PR Model Creation with an Analogy
Imagine a chef mastering the art of cooking. At first, they gather numerous recipes (the dataset), organizing them into categories based on cuisine types (traditional texts and clinical cases). Then, they practice cooking these dishes under various conditions until they become adept at creating them from memory (the fine-tuning process). Finally, when someone comes to them with specific dietary restrictions (patient symptoms), the chef swiftly identifies the most suitable dish from their repertoire (prescription recommendations). Similarly, TCMLLM uses vast amounts of data to learn effective responses to various healthcare inquiries.
Training and Inference Details
For anyone looking to train or fine-tune the TCMLLM-PR model, consider the following:
- During training, two NVIDIA RTX 3090 GPUs (24 GB of memory each) are recommended. With a batch size of 16, each GPU consumes about 23 GB of memory.
- For inference, memory usage is approximately 14 GB. Adjust the batch size to fit your available GPU memory; a rough sizing helper is sketched after this list.
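As a starting point for that adjustment, the sketch below scales batch size to the GPU's total memory. The per-sample and base-footprint figures are back-of-the-envelope assumptions derived from the numbers above (batch size 16 at roughly 23 GB on a 24 GB card, with fp16 6B-parameter weights taking around 13 GB); calibrate them against your own runs.

```python
import torch

def suggested_batch_size(per_sample_gb: float = 0.6, base_gb: float = 13.0) -> int:
    """Hypothetical helper: scale batch size to the GPU's total memory.

    per_sample_gb (~0.6 GB of activations per sample) and base_gb (~13 GB
    for fp16 model weights) are rough assumptions, consistent with batch
    size 16 using ~23 GB on a 24 GB card. Calibrate against real runs.
    """
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return max(1, int((total_gb - base_gb) // per_sample_gb))

print(suggested_batch_size())
```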
Troubleshooting Common Issues
Even with cutting-edge technology, obstacles can arise. Here are some troubleshooting tips:
- If everything is set up correctly but the model does not respond as expected, verify that all dependencies are installed and that the model and checkpoint paths are configured correctly. A small environment sanity check is sketched after this list.
- For out-of-memory errors or crashes, reduce the batch size or switch to a GPU with more memory.
- If prescription outputs are inconsistent, consider refining the dataset or updating to newer model parameters.
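A quick pre-flight script can catch the two most common setup problems (no visible GPU, missing checkpoint directory) before a full model load. The checkpoint path below is a hypothetical placeholder; point it at wherever you extracted the TCMLLM parameters:

```python
import os
import torch

# Hypothetical path; replace with your actual checkpoint location.
CHECKPOINT_DIR = "./ptuning/output/tcmllm-pr"

def sanity_check() -> None:
    # Confirm a CUDA-capable GPU is visible before loading the model.
    assert torch.cuda.is_available(), "No CUDA GPU detected"
    gpu = torch.cuda.get_device_properties(0)
    print(f"GPU: {gpu.name}, {gpu.total_memory / 1024**3:.1f} GB")
    # Confirm the extracted checkpoint directory actually exists.
    assert os.path.isdir(CHECKPOINT_DIR), f"Missing checkpoint dir: {CHECKPOINT_DIR}"
    print("Environment looks OK")

sanity_check()
```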
For more insights and updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

