ArcGIS REST API - ArcGIS Services - Train Deep Learning Model

URL: https://<rasteranalysistools-url>/TrainDeepLearningModel
Related Resources: Add Image, Aggregate Multidimensional Raster, Build Multidimensional Transpose, Calculate Density, Calculate Distance, Calculate Travel Cost, Classify, Classify Object Using Deep Learning, Classify Pixels Using Deep Learning, Convert Feature to Raster, Convert Raster Function Template, Convert Raster to Feature, Copy Raster, Cost Path as Polyline, Create Image Collection, Create Viewshed, Delete Image, Delete Image Collection, Detect Objects Using Deep Learning, Determine Optimum Travel Cost Network, Determine Travel Cost Paths to Destinations, Determine Travel Cost Path as Polyline, Export Training Data for Deep Learning, Fill, Find Argument Statistics, Flow Accumulation, Flow Direction, Flow Distance, Generate Multidimensional Anomaly, Generate Raster, Generate Trend Raster, Install Deep Learning Model, Interpolate Points, Linear Spectral Unmixing, List Deep Learning Model Info, Nibble, Predict Using Trend Raster, Publish Deep Learning Model, Query Deep Learning Model Info, Segment, Stream Link, Subset Multidimensional Raster, Summarize Raster Within, Train Classifier, Train Deep Learning Model, Uninstall Deep Learning Model, Watershed
Version Introduced: 10.8

Description

The TrainDeepLearningModel task is used to train a deep learning model using the output from the Export Training Data for Deep Learning tool. It generates the deep learning model package (*.dlpk) and adds it to your enterprise portal. The tool also provides an option to write the deep learning model package to a file share data store location.

License:

As of 10.5, you must license your ArcGIS Server as an ArcGIS Image Server to use this resource.

Request parameters


Parameter	Details
in_folder (Required)	The input location of the training sample data. It can be a path on the file share raster data store or a shared file system path. The folder must be the output of the Export Training Data For Deep Learning tool and contain the images and labels folders as well as the JSON model definition file written by the tool. File share raster store path examples: `in_folder=/rasterStores/yourRasterStoreFolderName/trainingSampleData` and `in_folder=/fileShares/yourFileShareFolderName/trainingSampleData`. File share path example: `in_folder=\\serverName\deepLearning\trainingSampleData`
output_name (Required)	The output location for the trained deep learning model package (.dlpk). It can be a JSON object that names the .dlpk to be added as a portal item, or the folder path in a file share data store, given either as a string or as a JSON object with a uri property. The file share data store must be registered on your server. Portal item examples: `output_name={"name": "trainedModel"}` and `output_name={"name": "trainedModel","folderId":"dfwerfbd3ec25584d0d8f4"}`. File share data store path examples: `output_name="/fileShares/filesharename/folder"` and `output_name={"uri":"/fileShares/yourFileShareFolderName/trainedModel"}`
model_type (Required)	The model type to use for training the deep learning model. The single shot detector (SSD), FasterRCNN (FASTERRCNN), RetinaNet (RETINANET), MaskRCNN (MASKRCNN), and You Only Look Once v3 (YOLOV3) model types are used for object detection. The U-Net (UNET), DeepLab (DEEPLAB), and Pyramid Scene Parsing Network (PSPNET) model types are used for pixel classification. For object classification, use the Feature Classifier (FEATURE_CLASSIFIER) model type. Values: SSD \| UNET \| FEATURE_CLASSIFIER \| PSPNET \| RETINANET \| MASKRCNN \| YOLOV3 \| DEEPLAB \| FASTERRCNN
arguments (Optional)	Additional deep learning parameters and arguments for experiments and refinement, such as a confidence threshold for adjusting sensitivity. The argument names are populated by reading the Python module. When model_type is SSD, the following arguments are used. grids - The number of grids the image will be divided into for processing. Setting this argument to 4 means the image will be divided into 4 x 4 or 16 grid cells. If no value is specified, the optimal grid value will be calculated based on the input imagery. zooms - The number of zoom levels by which each grid cell will be scaled up or down. Setting this argument to 1 means all the grid cells will remain at the same size or zoom level. A zoom level of 2 means all the grid cells will become twice as large (zoomed in 100 percent). Providing a list of zoom levels means all the grid cells will be scaled using all the numbers in the list. The default is 1.0. ratios - The list of aspect ratios to use for the anchor boxes. In object detection, an anchor box represents the ideal location, shape, and size of the object being predicted. Setting this argument to [1.0,1.0], [1.0, 0.5] means the anchor box is a square (1:1) or a rectangle in which the horizontal side is half the size of the vertical side (1:0.5). The default is [1.0, 1.0]. When model_type is one of the pixel classification models (PSPNET, UNET, or DEEPLAB), the following arguments are used. USE_UNET - The U-Net decoder will be used to recover data once the pyramid pooling is complete. The default is True. This argument is specific to the PSPNET model. PYRAMID_SIZES - The number and size of convolution layers to be applied to the different subregions. The default is [1,2,3,6]. This argument is specific to the PSPNET model. MIXUP - Specifies whether to use mixup augmentation and mixup loss. The default is False. CLASS_BALANCING - Specifies whether to balance the cross-entropy loss inverse to the frequency of pixels per class. The default is False. FOCAL_LOSS - Specifies whether to use focal loss. The default is False. IGNORE_CLASSES - Contains the list of class values on which the model will not incur loss. When model_type is RETINANET, the following arguments are used. SCALES - The number of scale levels by which each cell will be scaled up or down. The default is [1, 0.8, 0.63]. RATIOS - The aspect ratio of the anchor box. The default is [0.5,1,2]. All model types support the chip_size argument, which is the chip size of the tiles in the training samples. The image chip size is extracted from the .emd file in the input folder. Syntax: A JSON object of argument names and their values. Example `arguments={"name1": "value1", "name2": "value2"}`
batch_size (Optional)	The number of training samples to be processed for training at one time. If the server has a powerful GPU, this number can be increased to 16, 36, 64, and so on. Example `batch_size=4`
max_epochs (Optional)	The maximum number of epochs for training the model. One epoch means the whole training dataset will be passed forward and backward through the deep neural network once. Example `max_epochs=20`
learning_rate (Optional)	The rate at which the weights are updated during training. It is a small positive value between 0.0 and 1.0. If the learning rate is set to 0, the task extracts the optimal learning rate from the learning curve during the training process. Example `learning_rate=0`
backbone_model (Optional)	Specifies the preconfigured neural network to be used as an architecture for training the new model. See the descriptions of the backbone models below. Values: DARKNET53 \| DENSENET121 \| DENSENET161 \| DENSENET169 \| DENSENET201 \| MOBILENET_V2 \| RESNET18 \| RESNET34 \| RESNET50 \| RESNET101 \| RESNET152 \| VGG11 \| VGG11_BN \| VGG13 \| VGG13_BN \| VGG16 \| VGG16_BN \| VGG19 \| VGG19_BN Example `backbone_model=RESNET34`
validation_percent (Optional)	The percentage of training sample data that will be used for validating the model. Example `validation_percent=10`
pretrained_model (Optional)	The pretrained model to be used for fine-tuning the new model. It is a .dlpk portal item. Example `pretrained_model={"itemId": "8cfbd3ec25584d0d8fed23b8ff7c43b"}`
stop_training (Optional)	Specifies whether early stopping will be implemented. If true, the model training will stop when the model is no longer improving, regardless of the maximum epochs specified. This is the default. If false, the model training will continue until the maximum epochs is reached. Values: true \| false
overwriteModel (Optional)	Overwrites an existing deep learning model package (.dlpk) portal item with the same name. If the output_name parameter uses the file share data store path, this overwriteModel parameter is not applied. True—The portal .dlpk item will be overwritten. False—The portal .dlpk item will not be overwritten. This is the default.
context (Optional)	Environment settings that affect task execution. This task has the following settings: extent—A bounding box that defines the analysis area. cellSize—The output raster will have the resolution specified by cell size. processorType—The specified processor (CPU or GPU) will be used for the analysis. Example `context={"cellSize": "20","processorType": "GPU"}`
freeze_Model	Specifies whether the backbone layers in the pretrained model are frozen so that their weights and biases remain unchanged. If true, the predefined weights and biases of the backbone_model will not be altered. This is the default. If false, the weights and biases of the backbone_model may be altered to better fit your training samples. This takes more time to process but usually produces better results. Values: true \| false
f	The response format. The default response format is html. Values: html \| json \| pjson
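
As a rough illustration of how the parameters above fit together, the following Python sketch assembles a form-encoded request body. It assumes that object-valued parameters (output_name, arguments, context) are sent as JSON text and that Boolean values are sent as lowercase true or false, as the examples in the table suggest. The helper name build_train_params and the sample values are hypothetical, and nothing is sent to a server.

```python
import json
from urllib.parse import urlencode


def build_train_params(in_folder, output_name, model_type, **optional):
    """Assemble TrainDeepLearningModel parameters as a flat dict of strings.

    Assumption: dict- and list-valued parameters are serialized as JSON
    text and booleans as lowercase true/false, mirroring the examples
    in the parameter table.
    """
    params = {
        "in_folder": in_folder,
        "output_name": output_name,
        "model_type": model_type,
        "f": "json",
    }
    params.update(optional)

    encoded = {}
    for key, value in params.items():
        if isinstance(value, bool):  # check bool before other types
            encoded[key] = "true" if value else "false"
        elif isinstance(value, (dict, list)):
            encoded[key] = json.dumps(value)
        else:
            encoded[key] = str(value)
    return encoded


params = build_train_params(
    in_folder="/rasterStores/yourRasterStoreFolderName/trainingSampleData",
    output_name={"name": "trainedModel"},
    model_type="SSD",
    batch_size=4,
    max_epochs=20,
    learning_rate=0,
    backbone_model="RESNET34",
    validation_percent=10,
    stop_training=True,
    context={"processorType": "GPU"},
)

# Form-encoded body that could be posted to the task URL.
body = urlencode(params)
print(body)
```

Keeping the encoding step separate from the HTTP call makes it easy to inspect the exact strings before submitting a long-running training job.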

Backbone model values

Below are the accepted preconfigured neural network values that can be submitted with the backbone_model parameter:


Value	Description
DARKNET53	The preconfigured model will be a convolutional neural network that is 53 layers deep, trained on the ImageNet dataset, which contains more than a million images.
DENSENET121	The preconfigured model will be a dense network that is 121 layers deep, trained on the ImageNet dataset, which contains more than a million images. Unlike RESNET, which combines the layers using summation, DenseNet combines the layers using concatenation.
DENSENET161	The preconfigured model will be a dense network that is 161 layers deep, trained on the ImageNet dataset, which contains more than a million images. Unlike RESNET, which combines the layers using summation, DenseNet combines the layers using concatenation.
DENSENET169	The preconfigured model will be a dense network that is 169 layers deep, trained on the ImageNet dataset, which contains more than a million images. Unlike RESNET, which combines the layers using summation, DenseNet combines the layers using concatenation.
DENSENET201	The preconfigured model will be a dense network that is 201 layers deep, trained on the ImageNet dataset, which contains more than a million images. Unlike RESNET, which combines the layers using summation, DenseNet combines the layers using concatenation.
MOBILENET_V2	This preconfigured model is trained on the ImageNet dataset and is 54 layers deep. It is geared toward edge device computing because it uses less memory.
RESNET18	The preconfigured model will be a residual network that is 18 layers deep, trained on the ImageNet dataset, which contains more than a million images.
RESNET34	The preconfigured model will be a residual network that is 34 layers deep, trained on the ImageNet dataset, which contains more than a million images. This is the default.
RESNET50	The preconfigured model will be a residual network that is 50 layers deep, trained on the ImageNet dataset, which contains more than a million images.
RESNET101	The preconfigured model will be a residual network that is 101 layers deep, trained on the ImageNet dataset, which contains more than a million images.
RESNET152	The preconfigured model will be a residual network that is 152 layers deep, trained on the ImageNet dataset, which contains more than a million images.
VGG11	The preconfigured model will be a convolutional neural network that is 11 layers deep, trained on the ImageNet dataset, which contains more than a million images, to classify images into 1,000 object categories.
VGG11_BN	This preconfigured model is based on the VGG network but uses batch normalization, in which each layer in the network is normalized. It was trained on the ImageNet dataset and has 11 layers.
VGG13	The preconfigured model will be a convolutional neural network that is 13 layers deep, trained on the ImageNet dataset, which contains more than a million images, to classify images into 1,000 object categories.
VGG13_BN	This preconfigured model is based on the VGG network but uses batch normalization, in which each layer in the network is normalized. It was trained on the ImageNet dataset and has 13 layers.
VGG16	The preconfigured model will be a convolutional neural network that is 16 layers deep, trained on the ImageNet dataset, which contains more than a million images, to classify images into 1,000 object categories.
VGG16_BN	This preconfigured model is based on the VGG network but uses batch normalization, in which each layer in the network is normalized. It was trained on the ImageNet dataset and has 16 layers.
VGG19	The preconfigured model will be a convolutional neural network that is 19 layers deep, trained on the ImageNet dataset, which contains more than a million images, to classify images into 1,000 object categories.
VGG19_BN	This preconfigured model is based on the VGG network but uses batch normalization, in which each layer in the network is normalized. It was trained on the ImageNet dataset and has 19 layers.
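
Because a misspelled backbone value would only surface after the job is submitted, a client may want to check it up front. The sketch below is one possible client-side check in Python, built from the values in the table above. The function name normalize_backbone is hypothetical, and uppercasing the input is a convenience of this sketch rather than documented server behavior.

```python
# Accepted backbone_model values, copied from the table above.
BACKBONE_MODELS = frozenset({
    "DARKNET53",
    "DENSENET121", "DENSENET161", "DENSENET169", "DENSENET201",
    "MOBILENET_V2",
    "RESNET18", "RESNET34", "RESNET50", "RESNET101", "RESNET152",
    "VGG11", "VGG11_BN", "VGG13", "VGG13_BN",
    "VGG16", "VGG16_BN", "VGG19", "VGG19_BN",
})

# The table marks RESNET34 as the default.
DEFAULT_BACKBONE = "RESNET34"


def normalize_backbone(value=None):
    """Return a valid backbone_model value or raise ValueError.

    Assumption: trimming and uppercasing the input is a client-side
    convenience only.
    """
    if value is None:
        return DEFAULT_BACKBONE
    candidate = value.strip().upper()
    if candidate not in BACKBONE_MODELS:
        raise ValueError(f"Unsupported backbone_model: {value!r}")
    return candidate


print(normalize_backbone())
print(normalize_backbone(" vgg16_bn "))
```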

Example usage

The following is a sample request URL for TrainDeepLearningModel:

https://services.myserver.com/arcgis/rest/services/System/RasterAnalysisTools/GPServer/TrainDeepLearningModel

Response

When you submit a request, the task assigns a unique job ID for the transaction.

Syntax:

{ "jobId": "<unique job identifier>", "jobStatus": "<job status>" }

After the initial request is submitted, you can use the jobId to periodically check the status of the job and messages, as described in Check job status. Once the job has successfully completed, use the jobId to retrieve the results. To track the status, you can make a request of the following form:

https://<rasterAnalysisTools-url>/TrainDeepLearningModel/jobs/<jobId>

When the status of the job request is esriJobSucceeded, you can access the results of the analysis by making a request of the following form:

https://<rasterAnalysisTools-url>/TrainDeepLearningModel/jobs/<jobId>/results/out_item
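
The status-then-results sequence described above can be sketched in Python as a small polling loop. This is a minimal sketch under stated assumptions: fetch_json is a hypothetical callable, supplied by the caller, that requests a URL with f=json and returns the parsed response; the failure status names are assumed to follow the usual geoprocessing job statuses, since this page only names esriJobSucceeded. The demonstration uses canned responses instead of a real server.

```python
import time

# Assumed terminal failure states; only esriJobSucceeded is named on this page.
FAILED_STATES = {"esriJobFailed", "esriJobCancelled", "esriJobTimedOut"}


def job_urls(task_url, job_id):
    """Return the (status URL, result URL) pair for a submitted job."""
    status_url = f"{task_url.rstrip('/')}/jobs/{job_id}"
    return status_url, f"{status_url}/results/out_item"


def wait_for_job(fetch_json, task_url, job_id, interval=5.0, max_polls=120,
                 sleep=time.sleep):
    """Poll the job status until it succeeds, then fetch out_item."""
    status_url, result_url = job_urls(task_url, job_id)
    for _ in range(max_polls):
        status = fetch_json(status_url).get("jobStatus")
        if status == "esriJobSucceeded":
            return fetch_json(result_url)
        if status in FAILED_STATES:
            raise RuntimeError(f"Job {job_id} ended with status {status}")
        sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Demonstration with canned responses in place of real HTTP calls.
TASK_URL = ("https://services.myserver.com/arcgis/rest/services/System/"
            "RasterAnalysisTools/GPServer/TrainDeepLearningModel")
canned = iter([
    {"jobId": "j1", "jobStatus": "esriJobExecuting"},
    {"jobId": "j1", "jobStatus": "esriJobSucceeded"},
    {"paramName": "out_item", "dataType": "GPString",
     "value": {"uri": "/fileShares/yourFileShareFolderName/trainedModel/trainedModel.dlpk"}},
])
calls = []


def fake_fetch(url):
    calls.append(url)
    return next(canned)


result = wait_for_job(fake_fetch, TASK_URL, "j1", sleep=lambda seconds: None)
print(result["value"]["uri"])
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any particular HTTP library and lets it be exercised without waiting on a real training job.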

JSON Response example

The response returns the .dlpk portal item, which has properties for title, type, filename, file, id, and folderId.

{
  "title": "dlpk_name",
  "type": "Deep Learning Package",
  "multipart": true,
  "tags": "imagery",
  "typeKeywords": "Deep Learning, Raster",
  "filename": "dlpk_name",
  "file": "\\\\servername\\rasterstore\\mytrainedmodel.dlpk",
  "id": "f121390b85ef419790479fc75b493efd",
  "folderId": "dfwerfbd3ec25584d0d8f4"
}

However, if a data store path is specified as the value for output_name, the output is the data store location.

{
  "paramName": "out_item",
  "dataType": "GPString",
  "value": {"uri": "/fileShares/yourFileShareFolderName/trainedModel/trainedModel.dlpk"}
}
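
Since the result can take either of the two shapes shown above, client code may need to tell them apart. The following Python sketch is one way to do that. It assumes the portal item may arrive either bare or wrapped in a value property; the function name describe_output and the returned labels are hypothetical.

```python
def describe_output(result):
    """Classify a TrainDeepLearningModel result as a portal item or a path.

    Assumption: the payload is either the portal item itself, or an
    object whose "value" holds a portal item or a {"uri": ...} object.
    """
    value = result.get("value", result)
    if isinstance(value, dict) and "uri" in value:
        return ("datastore", value["uri"])
    if isinstance(value, dict) and "id" in value:
        return ("portal_item", value["id"])
    raise ValueError("Unrecognized out_item payload")


portal_result = {
    "title": "dlpk_name",
    "type": "Deep Learning Package",
    "id": "f121390b85ef419790479fc75b493efd",
    "folderId": "dfwerfbd3ec25584d0d8f4",
}
datastore_result = {
    "paramName": "out_item",
    "dataType": "GPString",
    "value": {"uri": "/fileShares/yourFileShareFolderName/trainedModel/trainedModel.dlpk"},
}

print(describe_output(portal_result))
print(describe_output(datastore_result))
```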