diff --git a/docs/lite/docs/source_en/mindir/converter_tool.md b/docs/lite/docs/source_en/mindir/converter_tool.md
index 207042077218c25bcb4ffc98c4aaf51638633bf3..09852034feb523231c785baf9f8ac0977cfa0d18 100644
--- a/docs/lite/docs/source_en/mindir/converter_tool.md
+++ b/docs/lite/docs/source_en/mindir/converter_tool.md
@@ -10,7 +10,7 @@ The currently supported input formats are MindSpore, TensorFlow Lite, Caffe, Ten
The `mindir` model converted by the converter supports the converter companion and higher versions of the Runtime inference framework to perform inference.
-Note: Due to interface compatibility issues, the conversion tool cannot be run in CANN packages below version 7.5 due to interface compatibility issues. The version number of the CANN package is the content in the latest/version.cfg directory of the CANN package installation directory.
+Note: Due to interface compatibility issues, the conversion tool cannot be run in CANN packages below version 7.5. The version number of the CANN package is recorded in latest/version.cfg under the CANN package installation directory.
## Linux Environment Usage Instructions
@@ -75,8 +75,8 @@ Detailed parameter descriptions are provided below.
| `--fp16=` | Not | Set whether the weights in float32 data format need to be stored in float16 data format during model serialization. | on, off | off | Not supported at the moment|
| `--inputDataType=` | Not | Set the data type of the quantized model input tensor. Only if the quantization parameters (scale and zero point) of the model input tensor are available. The default is to keep the same data type as the original model input tensor. | FLOAT32, INT8, UINT8, DEFAULT | DEFAULT | Not supported at the moment |
| `--outputDataType=` | Not | Set the data type of the quantized model output tensor. Only if the quantization parameters (scale and zero point) of the model output tensor are available. The default is to keep the same data type as the original model output tensor. | FLOAT32, INT8, UINT8, DEFAULT | DEFAULT | Not supported at the moment |
-| `--device=` | Not | Set target device when converter model. The use case is when on the Ascend device, if you need to the converted model to have the ability to use Ascend backend to perform inference, you can set the parameter. If it is not set, the converted model will use CPU backend to perform inference by default. | This option will be deprecated. It is replaced by setting `optimize` option to `ascend_oriented` |
-| `--optimizeTransformer=` | Not | Set whether to do transformer-fursion or not. | true, false | false | only support tensorrt |
+| `--device=` | Not | Set target device when converting the model. The use case is when on the Ascend device, if you need the converted model to have the ability to use the Ascend backend to perform inference, you can set this parameter. If it is not set, the converted model will use the CPU backend to perform inference by default. | This option will be deprecated. It is replaced by setting the `optimize` option to `ascend_oriented` |
+| `--optimizeTransformer=` | Not | Set whether to do transformer-fusion or not. | true, false | false | only support tensorrt |
Notes:
@@ -84,7 +84,7 @@ Notes:
- Caffe models are generally divided into two files: `*.prototxt` model structure, corresponding to the `--modelFile` parameter, and `*.caffemodel` model weights, corresponding to the `--weightFile` parameter.
- The `configFile` configuration file uses the `key=value` approach to define the relevant parameters.
- `--optimize` parameter is used to set the mode of optimization during the offline conversion. If this parameter is set to none, no relevant graph optimization operations will be performed during the offline conversion phase of the model, and the relevant graph optimization operations will be done during the execution of the inference phase. The advantage of this parameter is that the converted model can be deployed directly to any CPU/GPU/Ascend hardware backend since it is not optimized in a specific way, while the disadvantage is that the initialization time of the model increases during inference execution. If this parameter is set to general, general optimization will be performed, such as constant folding and operator fusion (the converted model only supports CPU/GPU hardware backend, not Ascend backend). If this parameter is set to gpu_oriented, the general optimization and extra optimization for GPU hardware will be performed (the converted model only supports GPU hardware backend). If this parameter is set to ascend_oriented, the optimization for Ascend hardware will be performed (the converted model only supports Ascend hardware backend).
-- The encryption and decryption function only takes effect when `MSLITE_ENABLE_MODEL_ENCRYPTION=on` is set at [compile](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/build.html) time and only supports Linux x86 platforms. `decrypt_key` and `encrypt_key` are string expressed in hexadecimal. Linux platform users can use the' xxd 'tool to convert the key expressed in bytes into hexadecimal expressions.
+- The encryption and decryption function only takes effect when `MSLITE_ENABLE_MODEL_ENCRYPTION=on` is set at [compile](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/build.html) time and only supports Linux x86 platforms. `decrypt_key` and `encrypt_key` are strings expressed in hexadecimal. Linux platform users can use the `xxd` tool to convert the key expressed in bytes into hexadecimal expressions.
- For the MindSpore model, since it is already a `mindir` model, two approaches are suggested:
Inference is performed directly without offline conversion.
diff --git a/docs/lite/docs/source_en/mindir/runtime_cpp.md b/docs/lite/docs/source_en/mindir/runtime_cpp.md
index e0650b393adccc37d0413f5445549f15d56a866e..f38843f06b6aeedccd54f0713cc8190c84748cf7 100644
--- a/docs/lite/docs/source_en/mindir/runtime_cpp.md
+++ b/docs/lite/docs/source_en/mindir/runtime_cpp.md
@@ -320,7 +320,7 @@ int SpecifyInputDataExample(const std::string &model_path, const std::string &de
## Compilation and Execution
-Set the environment variables as described in the Environment Variables section in [quick start](https://www.mindspore.cn/lite/docs/zh-CN/r2.7.0/mindir/build.html#%E6%89%A7%E8%A1%8C%E7%BC%96%E8%AF%91), and then compile the prograom as follows:
+Set the environment variables as described in the Environment Variables section in [quick start](https://www.mindspore.cn/lite/docs/zh-CN/r2.7.0/mindir/build.html#%E6%89%A7%E8%A1%8C%E7%BC%96%E8%AF%91), and then compile the program as follows:
```bash
mkdir build && cd build
@@ -349,7 +349,7 @@ Lite cloud-side inference framework supports dynamic shape input for models. GPU
The configuration of dynamic input information is related to offline and online scenarios. For offline scenarios, the model conversion tool parameter `--optimize=general`, `--optimize=gpu_oriented` or `--optimize=ascend_oriented`, i.e. experiencing the hardware-related fusion and optimization.
The generated MindIR model can only run on the corresponding hardware backend. For example, in Atlas 200/300/500 inference product environment, if the model conversion tool specifies `--optimize=ascend_oriented`, the generated model will only support running on Atlas 200/300/500 inference product. If `--optimize=general` is specified, running on GPU and CPU is supported. For online scenarios, the loaded MindIR has not experienced hardware-related fusion and optimization, supports running on Ascend, GPU, and CPU. The model conversion tool parameter `--optimize=none`, or the MindSpore-exported MindIR model has not been processed by the conversion tool. -Ascend hardware backend offline scenarios require dynamic input information to be configured during the model conversion phase. Ascend hardware backend online scenarios, as well as GPU hardware backend offline and online scenarios, require dynamic input information to be configured during the model loading phase via the [LoadConfig](https://www.mindspore.cn/lite/api/en/r2.7.0/api_cpp/mindspore.html# loadconfig) interface. +Ascend hardware backend offline scenarios require dynamic input information to be configured during the model conversion phase. Ascend hardware backend online scenarios, as well as GPU hardware backend offline and online scenarios, require dynamic input information to be configured during the model loading phase via the [LoadConfig](https://www.mindspore.cn/lite/api/en/r2.7.0/api_cpp/mindspore.html#loadconfig) interface. An example configuration file loaded via `LoadConfig` is shown below: @@ -619,7 +619,7 @@ if (build_ret != mindspore::kSuccess) { } ``` -In the configuration file, the options from `[ge_global_options]`, `[ge_sesion_options]` and `[ge_graph_options]` will be used as global, session and model (graph) level options for the GE interface. For details, please refer to [GE Options](https://www.hiascend.com/document/detail/zh/canncommercial/700/inferapplicationdev/graphdevg/atlasgeapi_07_0119.html). For example: +In the configuration file, the options from `[ge_global_options]`, `[ge_session_options]` and `[ge_graph_options]` will be used as global, session and model (graph) level options for the GE interface. For details, please refer to [GE Options](https://www.hiascend.com/document/detail/zh/canncommercial/700/inferapplicationdev/graphdevg/atlasgeapi_07_0119.html). For example: ```ini [ge_global_options] @@ -643,7 +643,7 @@ When the backend is Ascend and the provider is the default, it supports loading In the Ascend device GE scenario, a single device can deploy multiple models, and models deployed in the same device can share weights, including constants and variables. -The same model script can export different models with the same weights for different conditional branches or input shapes. During the inference process, some weights can no longer be updated and are parsed as constants, where multiple models will have the same constant weights, while some weights may be updated and are parsed as variables. If one model updates one weight, the modified weight can be use and updated in the next inference or by other models. +The same model script can export different models with the same weights for different conditional branches or input shapes. During the inference process, some weights can no longer be updated and are parsed as constants, where multiple models will have the same constant weights, while some weights may be updated and are parsed as variables. 
If one model updates one weight, the modified weight can be used and updated in the next inference or by other models.
The relationship between multiple models sharing weights can be indicated through interface [ModelGroup](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.ModelGroup.html#mindspore_lite.ModelGroup).
@@ -799,4 +799,4 @@ AddModel, CalMaxSizeOfWorkspace, and model.build need to be executed in child th
### multi-backend runtime
-MindSpore Lite cloud inference is supporting multi-backend heterogeneous inference, which can be enabled by specifying the environment variable 'export ENABLE_MULTI_BACKEND_RUNTIME=on' during runtime, and other interfaces are used in the same way as the original cloud inference. At present, this feature is an experimental feature, and the correctness, stability and subsequent compatibility of the feature are not guaranteed.
+MindSpore Lite cloud inference supports multi-backend heterogeneous inference, which can be enabled by setting the environment variable `export ENABLE_MULTI_BACKEND_RUNTIME=on` during runtime, and other interfaces are used in the same way as the original cloud inference. At present, this is an experimental feature, and its correctness, stability and subsequent compatibility are not guaranteed.
diff --git a/docs/lite/docs/source_en/mindir/runtime_distributed_cpp.md b/docs/lite/docs/source_en/mindir/runtime_distributed_cpp.md
index a88d1628efeb6c0de995e787c64494a14dc38533..c765d0b4d55f3d32fcda7a29a2bc9a6e167c0146 100644
--- a/docs/lite/docs/source_en/mindir/runtime_distributed_cpp.md
+++ b/docs/lite/docs/source_en/mindir/runtime_distributed_cpp.md
@@ -188,7 +188,7 @@ cmake ..
make
```
-After successful compilation, the `{device_type}_{backend}_distributed_cpp` executable programs is obtained in the `build` directory, and the distributed inference is started in the following multi-process manner. Please refer to `run.sh` in the sample code directory for the complete run command. When run successfully, the name, data size, number of elements and the first 10 elements of each output `Tensor` will be printed.
+After successful compilation, the `{device_type}_{backend}_distributed_cpp` executable programs are obtained in the `build` directory, and the distributed inference is started in the following multi-process manner. Please refer to `run.sh` in the sample code directory for the complete run command. When run successfully, the name, data size, number of elements and the first 10 elements of each output `Tensor` will be printed.
```bash
# for Ascend, run the executable file for each rank using shell commands
diff --git a/docs/lite/docs/source_en/mindir/runtime_distributed_multicard_python.md b/docs/lite/docs/source_en/mindir/runtime_distributed_multicard_python.md
index 5e7f17e08165e76ca2a59aa86bfefb2ddfb83ab8..dfb074f1aee141e62715fb7442a620b8acf3bccc 100644
--- a/docs/lite/docs/source_en/mindir/runtime_distributed_multicard_python.md
+++ b/docs/lite/docs/source_en/mindir/runtime_distributed_multicard_python.md
@@ -6,7 +6,7 @@
In single-machine multi-card and single-card multi-core scenarios, in order to fully utilize the performance of the device, it is necessary to allow the chip or different cards to perform parallel inference directly. This scenario is more common in Ascend environments, such as the Atlas 300I Duo inference card which has a single card, dual-core specification is naturally better suited for parallel processing.
This tutorial describes how to perform MindSpore Lite multi-card/multi-core inference in the Ascend backend environment using the [Python interface](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite.html). The inference core process is roughly the same as the [cloud-side single-card inference](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/runtime_python.html) process, and the users can refer to it.
-MindSpore Lite cloud-side distributed inference is only supported to run in Linux environment deployment. The device type supported in this tutorial is Atlas training series products, and takes **Atlas 300I Duo inference card + open source Stable Diffusion (SD) ONNX model** as the case. This case calls Python multiprocess library to create sub-processes, where the user interacts with the master process that interacts with the sub-processes through a pipeline.
+MindSpore Lite cloud-side distributed inference is only supported to run in Linux environment deployment. The device type supported in this tutorial is Atlas training series products, and takes **Atlas 300I Duo inference card + open source Stable Diffusion (SD) ONNX model** as the case. This case uses the Python multiprocessing library to create sub-processes, where the user interacts with the master process that interacts with the sub-processes through a pipeline.
## Preparations
diff --git a/docs/lite/docs/source_en/mindir/runtime_distributed_python.md b/docs/lite/docs/source_en/mindir/runtime_distributed_python.md
index cb77ea69d5e1e1bf900904a82661d41c6e4202ce..c899ce711d54a8a58e587de9b71fb9f3f6bfc7d0 100644
--- a/docs/lite/docs/source_en/mindir/runtime_distributed_python.md
+++ b/docs/lite/docs/source_en/mindir/runtime_distributed_python.md
@@ -45,7 +45,7 @@ Ascend, Nvidia GPU are supported in distributed inference scenarios, and can be
### Configuring Ascend Device Context
-When the device type is Ascend (Atlas training series is currently supported by distributed inference), set [Context.target](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Context.html#mindspore_lite.Context.target) to `Ascend` and set `DeviceID`, `RankID` by the following way. Since Ascend provides multiple inference engine backends, currently only the `ge` backend supports distributed inference, and the Ascend inference engine backend is specified as `ge` by via `ascend.provider`.The sample code is as follows.
+When the device type is Ascend (Atlas training series is currently supported by distributed inference), set [Context.target](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Context.html#mindspore_lite.Context.target) to `Ascend` and set `DeviceID` and `RankID` in the following way. Since Ascend provides multiple inference engine backends, currently only the `ge` backend supports distributed inference, and the Ascend inference engine backend is specified as `ge` by `ascend.provider`. The sample code is as follows.
```python
# set Ascend target and distributed info
@@ -68,7 +68,7 @@ context.gpu.provider = "tensorrt"
## Model Creation, Loading and Compilation
-Consistent with [MindSpore Lite Cloud-side Single Card Inference](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/runtime_cpp.html), the main entry point for distributed inference is the [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/generate/classmindspore_Model.html) interface for model loading, compilation and execution.
Create [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Model.html#mindspore_lite.Model) and call the [Model.build_from_file](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Model.html#mindspore_lite.Model.build_from_file) interface to implement the model Loading and model compilation, the sample code is as follows.
+Consistent with [MindSpore Lite Cloud-side Single Card Inference](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/runtime_cpp.html), the main entry point for distributed inference is the [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/generate/classmindspore_Model.html) interface for model loading, compilation and execution. Create [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Model.html#mindspore_lite.Model) and call the [Model.build_from_file](https://www.mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.Model.html#mindspore_lite.Model.build_from_file) interface to implement model loading and compilation. The sample code is as follows.
```python
# create Model and build Model
diff --git a/docs/lite/docs/source_en/mindir/runtime_java.md b/docs/lite/docs/source_en/mindir/runtime_java.md
index 47efed7123422a7ffeed187391aa4d97f6d6e957..bd9dd2fdf361775b1808841a5216e0148d8a8487 100644
--- a/docs/lite/docs/source_en/mindir/runtime_java.md
+++ b/docs/lite/docs/source_en/mindir/runtime_java.md
@@ -9,7 +9,7 @@ After converting the `.mindir` model by [MindSpore Lite model conversion tool](h
Compared with C++ API, Java API can be called directly in Java Class, and users do not need to implement the code related to JNI layer, with better convenience. Running MindSpore Lite inference framework mainly consists of the following steps:
1. Model reading: Export MindIR model via MindSpore or get MindIR model by [model conversion tool](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/converter_tool.html).
-2. Create configuration context: Create a configuration context [MSContext](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mscontext.html#mscontext) and save some basic configuration parameters used to guide model compilation and model execution, including device type, number of threads, CPU pinning, and enabling fp16 mixed precision inference.
+2. Create configuration context: Create a configuration context [MSContext](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mscontext.html#mscontext) and save some basic configuration parameters used to guide model compilation and model execution, including device type, number of threads, CPU pinning, and enabling float16 mixed precision inference.
3. Model creation, loading and compilation: Before executing inference, you need to call [build](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#build) interface of [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#model) for model loading and model compilation. Both loading files and MappedByteBuffer are currently supported. The model loading phase parses the file or buffer into a runtime model.
4. Input data: The model needs to be padded with data from the input Tensor before execution.
5. Execute inference: Use [predict](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#predict) of [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#model) method for model inference.
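A minimal end-to-end sketch of these steps (the model path, thread count and core-binding choice below are illustrative placeholders; the classes and methods follow the `MSContext`, `Model` and `MSTensor` Java APIs referenced above, so adjust package names if your MindSpore Lite version differs):

```java
import com.mindspore.MSTensor;
import com.mindspore.Model;
import com.mindspore.config.CpuBindMode;
import com.mindspore.config.DeviceType;
import com.mindspore.config.MSContext;
import com.mindspore.config.ModelType;

import java.util.List;

public class QuickStartSketch {
    public static void main(String[] args) {
        // Step 1: path to a MindIR model exported or converted beforehand (placeholder path)
        String modelPath = args.length > 0 ? args[0] : "model/mobilenetv2.mindir";

        // Step 2: configuration context - CPU backend, 2 threads, no core binding, float16 off
        MSContext context = new MSContext();
        context.init(2, CpuBindMode.NO_BIND);
        context.addDeviceInfo(DeviceType.DT_CPU, false, 0);

        // Step 3: create the model and load/compile the MindIR file
        Model model = new Model();
        if (!model.build(modelPath, ModelType.MT_MINDIR, context)) {
            System.err.println("Build model failed.");
            return;
        }

        // Step 4: obtain the input tensors and fill them with preprocessed data
        // (data preparation omitted here; see MSTensor#setData in the API documentation)
        List<MSTensor> inputs = model.getInputs();
        System.out.println("Input tensor count: " + inputs.size());

        // Step 5: run inference and read the outputs
        if (!model.predict()) {
            System.err.println("Predict failed.");
            return;
        }
        for (MSTensor output : model.getOutputs()) {
            System.out.println("Output elements: " + output.getFloatData().length);
        }

        // Release native resources when done
        model.free();
    }
}
```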
@@ -84,7 +84,7 @@ context.addDeviceInfo(DeviceType.DT_ASCEND, false, 0);
## Model Creation, Loading and Compilation
-When using MindSpore Lite to perform inference, [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#model) is the main entry for inference. Model loading, compilation and execution are implemented through Model. Using the [MSContext](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mscontext.html#init) created in the previous step, call the compound [build](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#build) interface of Model to implement model loading and model compilation.
+When using MindSpore Lite to perform inference, [Model](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#model) is the main entry for inference, through which model loading, compilation and execution are implemented. Using the [MSContext](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mscontext.html#init) created in the previous step, call the compound [build](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#build) interface of Model to implement model loading and model compilation.
The following demonstrates the process of Model creation, loading and compilation:
@@ -129,13 +129,13 @@ boolean ret = model.predict();
MindSpore Lite can get the inference result by outputting Tensor after performing inference. MindSpore Lite provides three methods to get the output of the model [MSTensor](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html), and also supports [getByteData](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#getbytedata), [getFloatData](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#getfloatdata), [getIntData](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#getintdata), [getLongData](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#getlongdata) four methods to get the output data.
-1. Use the [getOutputs](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#getoutputs) method, get all the model to output list of [MSTensor](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#mstensor). The following demonstrates how to call `getOutputs` to get the list of output Tensor.
+1. Use the [getOutputs](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#getoutputs) method to get the list of all output [MSTensor](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#mstensor) of the model. The following demonstrates how to call `getOutputs` to get the list of output Tensor.
```java
List outTensors = model.getOutputs();
```
-2. Use the [getOutputsByNodeName](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#getoutputsbynodename) method and get the vector of the Tensor connected to that node in the model output [MSTensor](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#mstensor) according to the name of model output node. The following demonstrates how to call ` getOutputByTensorName` to get the output Tensor.
+2. Use the [getOutputsByNodeName](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model.html#getoutputsbynodename) method and get the vector of the Tensor connected to that node in the model output [MSTensor](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/mstensor.html#mstensor) according to the name of the model output node. The following demonstrates how to call `getOutputsByNodeName` to get the output Tensor.
```java
MSTensor outTensor = model.getOutputsByNodeName("Default/head-MobileNetV2Head/Softmax-op204");
diff --git a/docs/lite/docs/source_en/mindir/runtime_parallel_cpp.md b/docs/lite/docs/source_en/mindir/runtime_parallel_cpp.md
index 453c8c49d2a7063ea0528c7c6d16a93f503d2103..679ab63e93a1e57aa5a30aa911cce5c85e665df2 100644
--- a/docs/lite/docs/source_en/mindir/runtime_parallel_cpp.md
+++ b/docs/lite/docs/source_en/mindir/runtime_parallel_cpp.md
@@ -4,7 +4,7 @@
## Overview
-MindSpore Lite provides multi-model concurrent inference interface [ModelParallelRunner](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model_parallel_runner.html). Multi model concurrent inference now supports Atlas 200/300/500 inference product, Atlas inference series, Atlas training series, Nvidia GPU and CPU backends.
+MindSpore Lite provides multi-model concurrent inference interface [ModelParallelRunner](https://www.mindspore.cn/lite/api/en/r2.7.0/api_java/model_parallel_runner.html). Multi-model concurrent inference now supports Atlas 200/300/500 inference product, Atlas inference series, Atlas training series, Nvidia GPU and CPU backends.
After exporting the `mindir` model by MindSpore or converting it by [model conversion tool](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/converter_tool.html) to obtain the `mindir` model, the concurrent inference process of the model can be executed in Runtime. This tutorial describes how to perform concurrent inference with multiple modes by using the [C++ interface](https://www.mindspore.cn/lite/api/en/r2.7.0/index.html).
@@ -67,7 +67,7 @@ runner_config->SetWorkersNum(kNumWorkers);
>
> Multi-model concurrent inference does not support FP32-type data inference. Binding cores only supports no core binding or binding large cores. It does not support the parameter settings of the bound cores, and does not support configuring the binding core list.
>
-> For large models, when using the model buffer to load and compile, you need to set the path of the weight file separately, sets the model path through [SetConfigInfo](https://www.mindspore.cn/lite/api/en/r2.7.0/generate/classmindspore_RunnerConfig.html) interface, where `section` is `model_File` , `key` is `mindir_path`; When using the model path to load and compile, you do not need to set other parameters. The weight parameters will be automatically read.
+> For large models, when using the model buffer to load and compile, you need to set the path of the weight file separately, set the model path through the [SetConfigInfo](https://www.mindspore.cn/lite/api/en/r2.7.0/generate/classmindspore_RunnerConfig.html) interface, where `section` is `model_File`, `key` is `mindir_path`; when using the model path to load and compile, you do not need to set other parameters. The weight parameters will be automatically read.
## Initialization
@@ -103,7 +103,7 @@ if (predict_ret != mindspore::kSuccess) {
}
```
## Compiling And Executing
-Follow the [quick start](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/build.html#executing-compilation) environment variables, set the environment variables. Run the build.sh script in the `mindspore-lite/examples/cloud_infer/quick_start_parallel_cpp` directory to automatically download the MindSpore Lite inference framework library and model files and compile the demo.
+Set the environment variables as described in the [quick start](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/build.html#executing-compilation).
Run the build.sh script in the `mindspore-lite/examples/cloud_infer/quick_start_parallel_cpp` directory to automatically download the MindSpore Lite inference framework library and model files and compile the demo. ```bash bash build.sh diff --git a/docs/lite/docs/source_en/mindir/runtime_parallel_java.md b/docs/lite/docs/source_en/mindir/runtime_parallel_java.md index ffa57aaea0b4131be571139339d3bc1232c51258..f78ff6e28f8fda3b8742f6c4af3726dee664827b 100644 --- a/docs/lite/docs/source_en/mindir/runtime_parallel_java.md +++ b/docs/lite/docs/source_en/mindir/runtime_parallel_java.md @@ -87,7 +87,7 @@ if (!ret) { ### Build -Set environment variables, and Run the [build script](https://gitee.com/mindspore/mindspore-lite/blob/r2.7/mindspore-lite/examples/cloud_infer/quick_start_parallel_java/build.sh) in the `mindspore-lite/examples/quick_start_parallel_java` directory to automatically download the MindSpore Lite inference framework library and model files and build the Demo. +Set environment variables, and run the [build script](https://gitee.com/mindspore/mindspore-lite/blob/r2.7/mindspore-lite/examples/cloud_infer/quick_start_parallel_java/build.sh) in the `mindspore-lite/examples/quick_start_parallel_java` directory to automatically download the MindSpore Lite inference framework library and model files and build the Demo. ```bash export JAVA_HOME=/{path}/default-java @@ -106,7 +106,7 @@ After the build, go to the `mindspore-lite/examples/cloud_infer/quick_start_para java -classpath .:./quick_start_parallel_java.jar:../lib/runtime/lib/mindspore-lite-java.jar com.mindspore.lite.demo.Main ../model/mobilenetv2.mindir ``` -After the execution is completed, it will show the model concurrent inference success. +After the execution is completed, it will show that the model concurrent inference was successful. ## Memory release diff --git a/docs/lite/docs/source_en/mindir/runtime_parallel_python.md b/docs/lite/docs/source_en/mindir/runtime_parallel_python.md index 916fd3cdec0432c046ea27108d06b5b4aa668af4..0a8e72acd5e3ba96f258a15ee6340c920e92c6b1 100644 --- a/docs/lite/docs/source_en/mindir/runtime_parallel_python.md +++ b/docs/lite/docs/source_en/mindir/runtime_parallel_python.md @@ -6,7 +6,7 @@ MindSpore Lite provides a multi-model concurrent inference interface [ModelParallelRunner](https://mindspore.cn/lite/api/en/r2.7.0/mindspore_lite/mindspore_lite.ModelParallelRunner.html), multi-model concurrent inference now supports Atlas 200/300/500 inference product, Atlas inference series, Atlas training series, Nvidia GPU, CPU backends. -After exporting the `mindir` model by MindSpore or converting it by [model conversion tool](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/converter_tool.html) to obtain the ` mindir` model, the concurrent inference process of the model can be executed in Runtime. This tutorial describes how to use the [Python interface](https://mindspore.cn/lite/api/en/r2.7.0/mindspore_lite.html) to perform concurrent inference with multiple models. +After exporting the `mindir` model by MindSpore or converting it by [model conversion tool](https://www.mindspore.cn/lite/docs/en/r2.7.0/mindir/converter_tool.html) to obtain the `mindir` model, the concurrent inference process of the model can be executed in Runtime. This tutorial describes how to use the [Python interface](https://mindspore.cn/lite/api/en/r2.7.0/mindspore_lite.html) to perform concurrent inference with multiple models. 
Concurrent inference with MindSpore Lite consists of the following main steps:
diff --git a/docs/lite/docs/source_en/mindir/runtime_python.md b/docs/lite/docs/source_en/mindir/runtime_python.md
index 540ba11ad1ea36c4ee37e5d87e16b076590def4d..cd0de0fec261e2a7f15ef4a7c8a553195e07865a 100644
--- a/docs/lite/docs/source_en/mindir/runtime_python.md
+++ b/docs/lite/docs/source_en/mindir/runtime_python.md
@@ -6,7 +6,7 @@
This tutorial provides a sample program for MindSpore Lite to perform cloud-side inference, demonstrating the [Python interface](https://mindspore.cn/lite/api/en/r2.7.0/mindspore_lite.html) to perform the basic process of cloud-side inference through file input, inference execution, and inference result printing, and enables users to quickly understand the use of MindSpore Lite APIs related to cloud-side inference execution. The related files are put in the directory [mindspore-lite/examples/cloud_infer/quick_start_python](https://gitee.com/mindspore/mindspore-lite/tree/r2.7/mindspore-lite/examples/cloud_infer/quick_start_python).
-MindSpore Lite cloud-side inference is supported to run in Linux environment deployment only. Atlas 200/300/500 inference product, Atlas inference series, Atlas training series, Nvidia GPU and CPU hardware backends are supported.
+MindSpore Lite cloud-side inference is supported for running in Linux environment deployment only. Atlas 200/300/500 inference product, Atlas inference series, Atlas training series, Nvidia GPU and CPU hardware backends are supported.
The following is an example of how to use the Python Cloud-side Inference Demo on a Linux X86 operating system and a CPU hardware platform, using Ubuntu 18.04 as an example:
@@ -30,9 +30,9 @@ MINDSPORE_LITE_VERSION=2.0.0 bash ./lite-cpu-pip.sh
> If the MobileNetV2 model download fails, please manually download the relevant model file [mobilenetv2.mindir](https://download.mindspore.cn/model_zoo/official/lite/quick_start/mobilenetv2.mindir) and copy it to the `mindspore-lite/examples/cloud_infer/quick_start_python/model` directory.
>
-> If the input.bin input data file download fails, please manually download the relevant input data file [input.bin](https://download.mindspore.cn/model_zoo/official/lite/quick_start/input.bin) and copy it to the ` mindspore-lite/examples/cloud_infer/quick_start_python/model` directory.
+> If the input.bin input data file download fails, please manually download the relevant input data file [input.bin](https://download.mindspore.cn/model_zoo/official/lite/quick_start/input.bin) and copy it to the `mindspore-lite/examples/cloud_infer/quick_start_python/model` directory.
>
-> If MindSpore Lite inference framework by using the script download fails, please manually download [MindSpore Lite model cloud-side inference framework](https://www.mindspore.cn/lite/docs/en/r2.7.0/use/downloads.html) corresponding to the hardware platform of CPU and operating system of Linux-x86_64 or Linux-aarch64. Users can use the `uname -m` command to query the operating system in the terminal, and copy it to the `mindspore-lite/examples/cloud_infer/quick_start_python` directory.
+> If the MindSpore Lite inference framework fails to download by using the script, please manually download the [MindSpore Lite model cloud-side inference framework](https://www.mindspore.cn/lite/docs/en/r2.7.0/use/downloads.html) corresponding to the CPU hardware platform and the Linux-x86_64 or Linux-aarch64 operating system.
Users can run the `uname -m` command in the terminal to query the system architecture, and copy the downloaded package to the `mindspore-lite/examples/cloud_infer/quick_start_python` directory.
>
> If you need to use MindSpore Lite corresponding to Python 3.7 or above, please [compile](https://mindspore.cn/lite/docs/en/r2.7.0/mindir/build.html) locally. Note that the Python API module compilation depends on Python >= 3.7.0, NumPy >= 1.17.0, wheel >= 0.32.0. After successful compilation, copy the Whl installation package generated in the `output/` directory to the `mindspore-lite/examples/cloud_infer/quick_start_python` directory.
>
diff --git a/docs/lite/docs/source_zh_cn/mindir/runtime_cpp.md b/docs/lite/docs/source_zh_cn/mindir/runtime_cpp.md
index 06c2be84cc67f61ac82f6ca58351b27d72cce10b..60c5c4eff89ac6899e1a1bdc7691aeaaaa8c7b72 100644
--- a/docs/lite/docs/source_zh_cn/mindir/runtime_cpp.md
+++ b/docs/lite/docs/source_zh_cn/mindir/runtime_cpp.md
@@ -620,7 +620,7 @@ if (build_ret != mindspore::kSuccess) {
}
```
-在配置文件中,来自 `[ge_global_options]` 、 `[ge_sesion_options]` 和 `[ge_graph_options]` 中的选项将作为GE接口的全局、Session和模型(图)级别的选项,详情可参考[GE选项](https://www.hiascend.com/document/detail/zh/canncommercial/700/inferapplicationdev/graphdevg/atlasgeapi_07_0119.html)。比如:
+在配置文件中,来自 `[ge_global_options]` 、 `[ge_session_options]` 和 `[ge_graph_options]` 中的选项将作为GE接口的全局、Session和模型(图)级别的选项,详情可参考[GE选项](https://www.hiascend.com/document/detail/zh/canncommercial/700/inferapplicationdev/graphdevg/atlasgeapi_07_0119.html)。比如:
```ini
[ge_global_options]
diff --git a/docs/mindspore/source_en/faq/implement_problem.md b/docs/mindspore/source_en/faq/implement_problem.md
index cc68af669c5d9771ee4f28e951fc4190cff48aa9..90917fd5538257b9f0e5214e65cd88bb26eb0cd4 100644
--- a/docs/mindspore/source_en/faq/implement_problem.md
+++ b/docs/mindspore/source_en/faq/implement_problem.md
@@ -491,7 +491,7 @@ In addition, CANN may throw some Inner Errors, for example, the error code is "E
## Q: How to control the Tensor value printed by the `print` method?
-A: In PyNative dynamic graph mode, you can use numpy native methods such as ` set_ Printoptions ` to control the output value. In the Graph static graph mode, because the `print` method needs to be converted into an operator, the output value cannot be controlled temporarily. For specific usage of print operator, see [Reference](https://www.mindspore.cn/docs/en/r2.7.0/api_python/ops/mindspore.ops.Print.html).
+A: In PyNative mode, you can use native numpy methods such as `set_printoptions` to control the output value. In Graph (static graph) mode, because the `print` method needs to be converted into an operator, the output value cannot currently be controlled. For the specific usage of the Print operator, see [Reference](https://www.mindspore.cn/docs/en/r2.7.0/api_python/ops/mindspore.ops.Print.html).
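A minimal sketch illustrating the answer above (the tensor shape and print options are placeholders; the value is inspected through numpy, which is where `set_printoptions` takes effect):

```python
import numpy as np
import mindspore as ms
from mindspore import Tensor

# PyNative mode: values can be pulled back to numpy and printed,
# so numpy's print options control the formatting.
ms.set_context(mode=ms.PYNATIVE_MODE)
np.set_printoptions(precision=2, threshold=6)  # e.g. shorten the printed output

x = Tensor(np.random.rand(4, 4).astype(np.float32))
print(x.asnumpy())  # formatting follows np.set_printoptions

# In GRAPH_MODE, a print inside a compiled construct (e.g. an nn.Cell or @ms.jit
# function) is lowered to the mindspore.ops.Print operator, whose output
# formatting cannot be adjusted this way.
```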
## Q: How does `Tensor.asnumpy()` share the underlying storage with Tensor?