diff --git a/docs/mindformers/docs/source_en/advanced_development/precision_optimization.md b/docs/mindformers/docs/source_en/advanced_development/precision_optimization.md
index f3da95bb46b0ddd8569be149499ea4e6fb7cdf05..23350d86c8af68d4d8b0bcce0a884bd5afddf773 100644
--- a/docs/mindformers/docs/source_en/advanced_development/precision_optimization.md
+++ b/docs/mindformers/docs/source_en/advanced_development/precision_optimization.md
@@ -120,11 +120,11 @@ MindSpore's Dump tool is enabled by configuring a JSON file, which Dumps out all
         "op_debug_mode": 0,
         "dump_mode": 0,
         "path": "/absolute_path",
-        "net_name": "ResNet50",
+        "net_name": "Qwen3",
         "iteration": "0|5-8|100-120",
         "saved_data": "tensor",
         "input_output": 0,
-        "kernels": ["Default/Conv-op12"],
+        "kernels": ["Default"],
         "support_device": [0,1,2,3,4,5,6,7]
     },
     "e2e_dump_settings": {
@@ -148,6 +148,8 @@ After setting the environment variables, start the program training to get the c
 
 In addition to the full amount of operator Dump introduced above, the tool also supports partial data Dump, overflow Dump, specified-condition Dump and so on. Limited to space, interested users can refer to [Dump function debugging](https://www.mindspore.cn/tutorials/en/r2.7.0rc1/debug/dump.html) for configuration and use. In addition, the msprobe precision debugging tool is provided. msprobe is a tool package under the precision debugging component of the MindStudio Training Tools suite. It mainly includes functions such as precision pre-check, overflow detection, and precision comparison. For more information, refer to [msprobe User Guide](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe).
 
+When **deterministic computation** is enabled or the **Dump** feature is used, the training **performance** will significantly degrade. This may cause larger training steps and slower execution, which is expected behavior.
+
 ## Generalized Processes for Precision Positioning
 
 Quickly troubleshoot the problem by using the [Precision Problems Location CheckList](#precision-problems-location-checklist) section. If the precision problem still exists after completing the CheckList and there is no obvious direction, you can narrow down the scope of the problem by using the precision location generic process in this section for further troubleshooting. The current generalized process is mainly for benchmarked scenarios, and the following section will take the scenario of comparing the precision of GPU+PyTorch and Ascend+MindSpore as an example to introduce the precision localization process.
@@ -217,7 +219,7 @@ The training process fixes randomness and turns on deterministic computation in
     torch.cuda.manual_seed_all(seed)
     torch.cuda.manual_seed(seed)
     torch.backends.cudnn.deterministic = True
-    torch.backends.cudnn.enable = False
+    torch.backends.cudnn.enabled = False
     torch.backends.cudnn.benchmark = False
 
 if __name__ == "__main__":
diff --git a/docs/mindformers/docs/source_zh_cn/advanced_development/precision_optimization.md b/docs/mindformers/docs/source_zh_cn/advanced_development/precision_optimization.md
index aad5a8037fdc334265369c1cb660065e4712fb17..443ad7eb3310c2147aecd8e9099b14502d09da37 100644
--- a/docs/mindformers/docs/source_zh_cn/advanced_development/precision_optimization.md
+++ b/docs/mindformers/docs/source_zh_cn/advanced_development/precision_optimization.md
@@ -120,11 +120,11 @@ MindSpore的Dump工具通过配置JSON文件进行使能,该方式Dump出网
         "op_debug_mode": 0,
         "dump_mode": 0,
         "path": "/absolute_path",
-        "net_name": "ResNet50",
+        "net_name": "Qwen3",
         "iteration": "0|5-8|100-120",
         "saved_data": "tensor",
         "input_output": 0,
-        "kernels": ["Default/Conv-op12"],
+        "kernels": ["Default"],
         "support_device": [0,1,2,3,4,5,6,7]
     },
     "e2e_dump_settings": {
@@ -148,6 +148,8 @@ export MINDSPORE_DUMP_CONFIG=${JSON_PATH}
 
 除了上述介绍的全量算子Dump,工具还支持部分数据Dump、溢出Dump、指定条件Dump等。限于篇幅,感兴趣的用户可以参考[Dump功能调试](https://www.mindspore.cn/tutorials/zh-CN/r2.7.0rc1/debug/dump.html)进行配置使用。此外,还提供了msprobe精度调试工具。msprobe是 MindStudio Training Tools 工具链下精度调试部分的工具包,主要包括精度预检、溢出检测和精度比对等功能,详细请参考[msprobe使用手册](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe)。
 
+需要特别注意的是,开启**确定性计算**和使用**Dump**功能时,模型训练的**性能**会明显下降。这可能导致训练步长变大、运行速度变慢,这是正常现象。
+
 ## 模型迁移精度定位通用流程
 
 通过章节[精度问题定位CheckList](#精度问题定位checklist)进行快速的排查。若完成CheckList的检查后,精度问题依然存在且无明显指向时,可通过本章节的精度定位通用流程缩小问题范围,进行下一步排查。当前通用流程主要针对有标杆的场景,下文将以 GPU+PyTorch 与 Ascend+MindSpore 精度对比的场景为例,对精度定位流程进行介绍。
@@ -217,7 +219,7 @@ MindSpore与PyTorch均支持`bin`格式数据,加载相同的数据集进行
     torch.cuda.manual_seed_all(seed)
     torch.cuda.manual_seed(seed)
     torch.backends.cudnn.deterministic = True
-    torch.backends.cudnn.enable = False
+    torch.backends.cudnn.enabled = False
     torch.backends.cudnn.benchmark = False
 
 if __name__ == "__main__":
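
Reviewer note on the `"iteration": "0|5-8|100-120"` value kept in both dump-config hunks: the sketch below shows one way to check which training steps such a string selects before launching a run, which may help given that Dump slows training. It assumes that `|` separates entries and that `a-b` is an inclusive range, as the sample value suggests. The helper name `expand_iterations` is hypothetical and is not part of MindSpore; the sketch covers only the numeric form of the field.

```python
from typing import List


def expand_iterations(spec: str) -> List[int]:
    """Expand a dump `iteration` string such as "0|5-8|100-120" into step numbers.

    Assumption: entries are separated by "|" and "a-b" is an inclusive range.
    """
    steps = set()
    for part in spec.split("|"):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"range start exceeds end: {part!r}")
            steps.update(range(start, end + 1))  # include the upper bound
        else:
            steps.add(int(part))
    return sorted(steps)


if __name__ == "__main__":
    selected = expand_iterations("0|5-8|100-120")
    print(f"{len(selected)} steps selected, first few: {selected[:5]}")
```

Returning a sorted list of unique steps keeps overlapping entries (for example `5-8|7`) from being counted twice.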