# Spark-Chemistry-X1-13B

**Repository Path**: iflytekopensource/Spark-Chemistry-X1-13B

## Basic Information

- **Project Name**: Spark-Chemistry-X1-13B
- **Description**: iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-28
- **Last Updated**: 2025-10-11

## Categories & Tags

**Categories**: llm

**Tags**: None

## README

# Model Card for iFLYTEK Spark Chemistry-X1-13B

## Model Introduction

iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team. Fine-tuned from the iFLYTEK Spark-X1 foundation model on diverse chemical task datasets, it demonstrates exceptional proficiency in solving complex chemical problems while maintaining strong general capabilities. The model achieves strong performance across chemistry-related benchmarks and shows clear advantages over leading general-purpose models on most evaluation metrics.

**Key Features**

- **Deep Reasoning Architecture**: Unified framework combining long Chain-of-Thought (CoT) with dual-process theory, supporting both fast (reactive) and slow (deliberative) thinking modes
- **Hybrid Training Stability**: Novel attention masking mechanisms decouple training phases for the different reasoning modes, preventing interference between data distributions
- **Chemical Domain Enhancement**: Multi-stage optimization for specialized tasks, including:
  - Advanced knowledge Q&A
  - Chemical name conversion
  - Molecular property prediction

## Model Summary

| Parameter            | Value |
|----------------------|-------|
| Total Parameters     | 13B   |
| Context Length       | 32K   |
| Window Length        | 32K   |
| Number of Layers     | 40    |
| Attention Hidden Dim | 5120  |
| Attention Heads      | 40    |
| Vocabulary Size      | 130K  |
| Attention Mechanism  | GQA   |
| Activation Function  | GeLU  |

## Evaluation Results

*Bold = Global SOTA*

| Task                   | Metric | Spark Chemistry-X1-13B | DeepSeek-R1 | Gemini 2.5 Pro | GPT-4.1 | O3-mini |
|------------------------|--------|------------------------|-------------|----------------|---------|---------|
| Advanced Knowledge Q&A | Acc    | **84.00**              | 77.00       | 64.00          | 76.00   | 80.00   |
| Name Conversion        | Acc    | **71.00**              | 6.00        | 15.00          | 4.00    | 6.00    |
| Property Prediction    | Acc    | **85.33**              | 41.73       | 51.19          | 51.66   | 67.58   |

**Evaluation Notes**:

1. All results are zero-shot performance averages.
2. A consistent evaluation protocol was applied across all models.
3. DeepSeek-R1, Gemini 2.5 Pro, GPT-4.1, and O3-mini were evaluated using Chain-of-Thought (CoT) reasoning with API verification.
4. Spark Chemistry-X1-13B was evaluated using Chain-of-Thought (CoT) reasoning in a local environment on NVIDIA A800 80GB GPUs.
5. The evaluation dataset was self-constructed.

## Usage

**Requirements**

```bash
cd /path/to/Spark-Chemistry-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .
```

**Quickstart**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Reactive (fast-thinking) mode: no system prompt.
# Example question (zh): "Whether a polymer material is flexible mainly depends
# on the mobility of ( ). A. main-chain segments  B. side groups
# C. functional groups or atoms within the side groups"
chat_history = [
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,  # with top_k=1, sampling is effectively greedy decoding
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Deliberative (slow-thinking) mode: a system prompt (zh) instructs the model to
# first analyze the key points and internal logic of the question, generate a
# thinking process delimited by the model's start/end markers, and then answer
# based on that thinking process. The user question is the same as above.
chat_history = [
    {
        "role": "system",
        "content": "请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以开头,在结尾处用标注结束,后为基于思考过程的回答内容"
    },
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
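**Optional: Streaming Output**

For interactive use, it can help to print tokens as they are generated instead of waiting for `generate()` to return. The sketch below is a minimal, optional variation of the Quickstart using the `TextStreamer` utility from `transformers`; it is not part of this repository's documented API, and it assumes `transformers` is available alongside `modelscope`. The generation parameters mirror the Quickstart, and the example question is illustrative.

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
from transformers import TextStreamer  # assumes transformers is installed with modelscope
import torch

model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Illustrative prompt; any chemistry question works here.
chat_history = [{"role": "user", "content": "What is the IUPAC name of CH3COOH?"}]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

# TextStreamer prints each decoded chunk to stdout as soon as it is generated;
# skip_prompt=True suppresses echoing the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    inputs,
    streamer=streamer,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
```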
**Optional: Convert FP32 Weights to BF16**

The released weights of Spark Chemistry-X1-13B are stored in FP32 precision. For inference efficiency, users can optionally convert the weights to bfloat16 (BF16) format.

```python
from modelscope import AutoModelForCausalLM
import torch

model_name = "/path/to/Spark-Chemistry-X1-13B"

# Load FP32 weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # explicitly FP32
    device_map="auto",
    trust_remote_code=True
)

# Convert to BF16
model = model.to(torch.bfloat16)

# Save BF16 weights for later fast loading
save_path = "./Spark-Chemistry-X1-13B-bf16"
model.save_pretrained(save_path, safe_serialization=True)

# The saved checkpoint can later be reloaded directly in BF16, e.g.:
# model = AutoModelForCausalLM.from_pretrained(
#     save_path, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
# )
```

## License Agreement

iFLYTEK Spark Chemistry-X1-13B is licensed under Apache 2.0.