# Spark-Chemistry-X1-13B

**Repository Path**: iflytekopensource/Spark-Chemistry-X1-13B

## Basic Information

- **Project Name**: Spark-Chemistry-X1-13B
- **Description**: iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-28
- **Last Updated**: 2025-10-11

## Categories & Tags

**Categories**: llm

**Tags**: None

## README

# Model Card for iFLYTEK Spark Chemistry-X1-13B

## Model Introduction

iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team. Fine-tuned from the iFLYTEK Spark-X1 foundation model on diverse chemical task datasets, it demonstrates exceptional proficiency in solving complex chemical problems while maintaining strong general capabilities. The model achieves strong performance across chemistry-related benchmarks and shows clear advantages over leading general-purpose models on most evaluation metrics.

**Key Features**

- **Deep Reasoning Architecture**: Unified framework combining long Chain-of-Thought (CoT) with dual-process theory, supporting both fast (reactive) and slow (deliberative) thinking modes
- **Hybrid Training Stability**: Novel attention masking mechanisms decouple training phases for the different reasoning modes, preventing interference between data distributions
- **Chemical Domain Enhancement**: Multi-stage optimization for specialized tasks, including:
  - Advanced knowledge Q&A
  - Chemical name conversion
  - Molecular property prediction

## Model Summary

| Parameter            | Value |
|----------------------|-------|
| Total Parameters     | 13B   |
| Context Length       | 32K   |
| Window Length        | 32K   |
| Number of Layers     | 40    |
| Attention Hidden Dim | 5120  |
| Attention Heads      | 40    |
| Vocabulary Size      | 130K  |
| Attention Mechanism  | GQA   |
| Activation Function  | GeLU  |

## Evaluation Results

*Bold = Global SOTA*

| Task                   | Metric | Spark Chemistry-X1-13B | DeepSeek-R1 | Gemini 2.5 Pro | GPT-4.1 | O3-mini |
|------------------------|--------|------------------------|-------------|----------------|---------|---------|
| Advanced Knowledge Q&A | Acc    | **84.00**              | 77.00       | 64.00          | 76.00   | 80.00   |
| Name Conversion        | Acc    | **71.00**              | 6.00        | 15.00          | 4.00    | 6.00    |
| Property Prediction    | Acc    | **85.33**              | 41.73       | 51.19          | 51.66   | 67.58   |

**Evaluation Notes**:

1. All results are zero-shot performance averages.
2. A consistent evaluation protocol was applied across all models.
3. DeepSeek-R1, Gemini 2.5 Pro, GPT-4.1, and O3-mini were evaluated using Chain-of-Thought (CoT) reasoning with API verification.
4. Spark Chemistry-X1-13B was evaluated using Chain-of-Thought (CoT) reasoning in a local environment on NVIDIA A800 80GB GPUs.
5. The evaluation dataset was self-constructed.

## Usage

**Requirements**

```bash
cd /path/to/Spark-Chemistry-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .
```

**Quickstart**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Reactive (fast-thinking) mode: no system prompt.
# Example question (zh): "Whether a polymer material is flexible mainly depends
# on the mobility of ( ). A. main-chain segments  B. side groups
# C. functional groups or atoms within the side groups"
chat_history = [
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,  # with top_k=1, sampling is effectively greedy decoding
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Deliberative (slow-thinking) mode: a system prompt (zh) instructs the model to
# first analyze the key points and internal logic of the question, generate a
# thinking process delimited by the model's start/end markers, and then answer
# based on that thinking process. The user question is the same as above.
chat_history = [
    {
        "role": "system",
        "content": "请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以开头,在结尾处用标注结束,后为基于思考过程的回答内容"
    },
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
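**Optional: Streaming Output**

For interactive use, it can help to print tokens as they are generated instead of waiting for `generate()` to return. The sketch below is a minimal, optional variation of the Quickstart using the `TextStreamer` utility from `transformers`; it is not part of this repository's documented API, and it assumes `transformers` is available alongside `modelscope`. The generation parameters mirror the Quickstart, and the example question is illustrative.

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
from transformers import TextStreamer  # assumes transformers is installed with modelscope
import torch

model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Illustrative prompt; any chemistry question works here.
chat_history = [{"role": "user", "content": "What is the IUPAC name of CH3COOH?"}]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

# TextStreamer prints each decoded chunk to stdout as soon as it is generated;
# skip_prompt=True suppresses echoing the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    inputs,
    streamer=streamer,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
```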
**Optional: Convert FP32 Weights to BF16**

The released weights of Spark Chemistry-X1-13B are stored in FP32 precision. For inference efficiency, users can optionally convert the weights to bfloat16 (BF16) format.

```python
from modelscope import AutoModelForCausalLM
import torch

model_name = "/path/to/Spark-Chemistry-X1-13B"

# Load FP32 weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # explicitly FP32
    device_map="auto",
    trust_remote_code=True
)

# Convert to BF16
model = model.to(torch.bfloat16)

# Save BF16 weights for later fast loading
save_path = "./Spark-Chemistry-X1-13B-bf16"
model.save_pretrained(save_path, safe_serialization=True)

# The saved checkpoint can later be reloaded directly in BF16, e.g.:
# model = AutoModelForCausalLM.from_pretrained(
#     save_path, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
# )
```

## License Agreement

iFLYTEK Spark Chemistry-X1-13B is licensed under Apache 2.0.