Class: Aws::SageMaker::Types::TransformInput

Inherits:
Struct
  • Object
Includes:
Aws::Structure
Defined in:
lib/aws-sdk-sagemaker/types.rb

Overview

Note:

When making an API call, you may pass TransformInput data as a hash:

{
  data_source: { # required
    s3_data_source: { # required
      s3_data_type: "ManifestFile", # required, accepts ManifestFile, S3Prefix, AugmentedManifestFile
      s3_uri: "S3Uri", # required
    },
  },
  content_type: "ContentType",
  compression_type: "None", # accepts None, Gzip
  split_type: "None", # accepts None, Line, RecordIO, TFRecord
}

Describes the input source of a transform job and the way the transform job consumes it.
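
As a minimal sketch, here is how this hash is typically passed as the transform_input of Aws::SageMaker::Client#create_transform_job; the job, model, and bucket names below are hypothetical:

sagemaker = Aws::SageMaker::Client.new

sagemaker.create_transform_job(
  transform_job_name: "example-transform-job", # hypothetical job name
  model_name: "example-model",                 # hypothetical model name
  transform_input: {
    data_source: {
      s3_data_source: {
        s3_data_type: "S3Prefix",
        s3_uri: "s3://example-bucket/input/",  # hypothetical bucket
      },
    },
    content_type: "text/csv",
    compression_type: "None",
    split_type: "Line",
  },
  transform_output: { s3_output_path: "s3://example-bucket/output/" },
  transform_resources: { instance_type: "ml.m4.xlarge", instance_count: 1 },
)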

Instance Attribute Summary

Instance Attribute Details

#compression_type ⇒ String

If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses the data for the transform job accordingly. The default value is `None`.

Returns:

  • (String)


# File 'lib/aws-sdk-sagemaker/types.rb', line 10768

class TransformInput < Struct.new(
  :data_source,
  :content_type,
  :compression_type,
  :split_type)
  include Aws::Structure
end
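
For example, a minimal sketch of pointing a transform job at gzip-compressed input (the bucket and prefix are hypothetical); SageMaker decompresses each object before invoking the model:

{
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/compressed-input/", # hypothetical prefix holding .gz objects
    },
  },
  compression_type: "Gzip", # SageMaker decompresses the objects for the job
}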

#content_type ⇒ String

The multipurpose internet mail extension (MIME) type of the data. Amazon SageMaker uses the MIME type with each HTTP call to transfer data to the transform job.

Returns:

  • (String)


# File 'lib/aws-sdk-sagemaker/types.rb', line 10768

class TransformInput < Struct.new(
  :data_source,
  :content_type,
  :compression_type,
  :split_type)
  include Aws::Structure
end
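
For example, a sketch of declaring CSV input (hypothetical bucket); the value accompanies each request SageMaker makes to the model container, in practice as the Content-Type header:

{
  data_source: {
    s3_data_source: {
      s3_data_type: "S3Prefix",
      s3_uri: "s3://example-bucket/csv-input/", # hypothetical bucket
    },
  },
  content_type: "text/csv", # MIME type sent with each request to the model
}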

#data_source ⇒ Types::TransformDataSource

Describes the location of the channel data, that is, the S3 location of the input data that the model can consume.



# File 'lib/aws-sdk-sagemaker/types.rb', line 10768

class TransformInput < Struct.new(
  :data_source,
  :content_type,
  :compression_type,
  :split_type)
  include Aws::Structure
end
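
As a sketch of the two most common shapes (bucket and key are hypothetical): with S3Prefix, every object under the prefix becomes input; with ManifestFile, the URI points to a manifest listing the input objects:

# Use every object stored under a prefix:
{ s3_data_source: { s3_data_type: "S3Prefix", s3_uri: "s3://example-bucket/input/" } }

# Or enumerate the input objects in a manifest file:
{ s3_data_source: { s3_data_type: "ManifestFile", s3_uri: "s3://example-bucket/manifests/input.manifest" } }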

#split_type ⇒ String

The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the total size of each object is too large to fit in a single request. You can also use data splitting to improve performance by processing multiple concurrent mini-batches. The default value for `SplitType` is `None`, which indicates that input data files are not split, and request payloads contain the entire contents of an input object. Set the value of this parameter to `Line` to split records on a newline character boundary. `SplitType` also supports a number of record-oriented binary data formats.

When splitting is enabled, the size of a mini-batch depends on the values of the `BatchStrategy` and `MaxPayloadInMB` parameters. When the value of `BatchStrategy` is `MultiRecord`, Amazon SageMaker sends the maximum number of records in each request, up to the `MaxPayloadInMB` limit. If the value of `BatchStrategy` is `SingleRecord`, Amazon SageMaker sends individual records in each request.

Note:

Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is applied to a binary data format, padding is removed if the value of `BatchStrategy` is set to `SingleRecord`. Padding is not removed if the value of `BatchStrategy` is set to `MultiRecord`.

For more information about RecordIO, see [Data Format][1] in the MXNet documentation. For more information about TFRecord, see [Consuming TFRecord data][2] in the TensorFlow documentation.

[1]: https://mxnet.io/architecture/note_data_loading.html#data-format
[2]: https://www.tensorflow.org/guide/datasets#consuming_tfrecord_data

Returns:

  • (String)


# File 'lib/aws-sdk-sagemaker/types.rb', line 10768

class TransformInput < Struct.new(
  :data_source,
  :content_type,
  :compression_type,
  :split_type)
  include Aws::Structure
end
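
To tie the pieces together, a hedged sketch (names are hypothetical) that splits large CSV objects on line boundaries and lets SageMaker pack records into each request up to the payload limit. Note that batch_strategy and max_payload_in_mb are parameters of the create_transform_job request itself, not of TransformInput:

sagemaker = Aws::SageMaker::Client.new

sagemaker.create_transform_job(
  transform_job_name: "split-by-line-job", # hypothetical job name
  model_name: "example-model",             # hypothetical model name
  batch_strategy: "MultiRecord", # pack as many records per request as fit
  max_payload_in_mb: 6,          # upper bound on each request payload
  transform_input: {
    data_source: {
      s3_data_source: {
        s3_data_type: "S3Prefix",
        s3_uri: "s3://example-bucket/big-csv/", # hypothetical bucket
      },
    },
    content_type: "text/csv",
    split_type: "Line", # split input objects on newline boundaries
  },
  transform_output: { s3_output_path: "s3://example-bucket/output/" },
  transform_resources: { instance_type: "ml.m4.xlarge", instance_count: 1 },
)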