Class: Ikra::Translator::CommandTranslator::KernelLauncher

Inherits:
  • Object
Defined in:
lib/translator/kernel_launcher/kernel_launcher.rb

Overview

Builds the code that launches a kernel. This class is responsible for generating the kernel invocation.

For example: kernel<<<…, …>>>(env, result, d_a, …);

Constructor Details

#initialize(kernel_builder) ⇒ KernelLauncher

Returns a new instance of KernelLauncher.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 46

def initialize(kernel_builder)
    @kernel_builder = kernel_builder
    @additional_arguments = []
    @previous_kernel_input = []
    @reuse_memory = false
    @kernel_result_var_name = "_kernel_result_" + CommandTranslator.next_unique_id.to_s
    @cached_results = {}
    @previously_cached_results = {}
end

Class Attribute Details

.debug_free_previous_input_immediately ⇒ Object

Debug flag only: frees all input memory immediately after launching the kernel. This causes an error if data is used twice or kept (using the `keep` flag).



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 14

def debug_free_previous_input_immediately
  @debug_free_previous_input_immediately
end

Instance Attribute Details

#additional_arguments ⇒ Object

Additional parameters that this kernel should accept (to access the result of previous kernels)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 25

def additional_arguments
  @additional_arguments
end

#block_dim ⇒ Object

Returns the value of attribute block_dim.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 32

def block_dim
  @block_dim
end

#cached_results ⇒ Object

IDs and types of commands whose results are kept on the GPU



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 41

def cached_results
  @cached_results
end

#grid_dim ⇒ Object

Block/grid dimensions (should be 1D)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 31

def grid_dim
  @grid_dim
end

#kernel_builder ⇒ Object

Returns the value of attribute kernel_builder.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 17

def kernel_builder
  @kernel_builder
end

#kernel_result_var_name ⇒ Object (readonly)

Name of the variable that points to the resulting array (device memory)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 38

def kernel_result_var_name
  @kernel_result_var_name
end

#num_threads ⇒ Object

Number of threads (elements to be processed)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 28

def num_threads
  @num_threads
end

#previous_kernel_input ⇒ Object

Additional parameters that this kernel should accept (to access the result of previous kernels)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 21

def previous_kernel_input
  @previous_kernel_input
end

#previously_cached_results ⇒ Object (readonly)

IDs and types of commands that were previously computed and shall now be used in this kernel as input



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 44

def previously_cached_results
  @previously_cached_results
end

#reuse_memory ⇒ Object

Whether the launch reuses previously allocated memory (true) or allocates new device memory for the result beforehand (false, the default)



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 35

def reuse_memory
  @reuse_memory
end

Instance Method Details

#add_additional_arguments(*arguments) ⇒ Object

Add additional arguments to the kernel function that might be needed for some computations



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 101

def add_additional_arguments(*arguments)
    @additional_arguments.push(*arguments)
end

#add_cached_result(result_id, type) ⇒ Object

Adds a command whose result will be kept on the GPU



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 78

def add_cached_result(result_id, type)
    @cached_results[result_id] = type
end

#add_previous_kernel_parameter(parameter) ⇒ Object



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 96

def add_previous_kernel_parameter(parameter)
    kernel_builder.add_previous_kernel_parameter(parameter)
end

#assert_ready_to_build ⇒ Object

Raises an AssertionError unless num_threads, grid_dim, and block_dim are all set.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 139

def assert_ready_to_build
    required_values = [:num_threads, :grid_dim, :block_dim]

    for selector in required_values
        if send(selector) == nil
            raise AssertionError.new(
                "Not ready to build (KernelLauncher): #{selector} is not set")
        end
    end
end
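
The readiness check relies on `send` to read each required attribute by name. A minimal sketch of the same idea follows; the Struct is a hypothetical stand-in for a launcher, and unlike the method above it collects all unset attributes instead of raising on the first one.

```ruby
# Hypothetical stand-in for a launcher: a Struct with the three required attributes.
launcher = Struct.new(:num_threads, :grid_dim, :block_dim).new(1000, nil, nil)

required_values = [:num_threads, :grid_dim, :block_dim]

# Read each attribute dynamically via send and keep the ones that are still nil.
missing = required_values.select { |selector| launcher.send(selector).nil? }

p missing
```

In practice, calling configure_grid before any build_* method sets all three values at once.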

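#build_device_memory_free ⇒ Object

Builds the code that frees the device memory of the kernel result after the launch. Returns an empty string if the debug_free_previous_input_immediately flag is set.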



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 229

def build_device_memory_free
    Log.info("Building kernel post-launch CUDA free")

    assert_ready_to_build

    if KernelLauncher.debug_free_previous_input_immediately == true
        Log.warn("Debug flag set... Freeing input memory immediately and some memory not at all!")
        return ""
    end

    return Translator.read_file(file_name: "free_device_memory.cpp", replacements: {
        "name" => kernel_result_var_name})
end

#build_device_memory_free_in_host_section ⇒ Object

Same as #build_device_memory_free, but also removes the item from the list of allocated memory chunks.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 244

def build_device_memory_free_in_host_section
    Log.info("Building kernel post-launch CUDA free (host section)")

    assert_ready_to_build

    return Translator.read_file(file_name: "host_section_free_device_memory.cpp", replacements: {
        "name" => kernel_result_var_name})
end

#build_kernel_launcher ⇒ Object

Build the code that launches this kernel. The generated code performs the following steps:

  1. Allocate device memory for the result.

  2. If result should be written back: Allocate host memory for the result.

  3. Launch the kernel (+ error checking, synchronization)

  4. If result should be written back: Copy result back to host memory.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 158

def build_kernel_launcher
    
    Log.info("Building kernel launcher")

    assert_ready_to_build

    result = ""
    if !reuse_memory
        # Allocate device memory for kernel result
        result = result + Translator.read_file(file_name: "allocate_device_memory.cpp", replacements: {
            "name" => kernel_result_var_name,
            "bytes" => "(sizeof(#{kernel_builder.result_type.to_c_type}) * #{num_threads})",
            "type" => kernel_builder.result_type.to_c_type})
    end

    previously_cached_results.each do |result_id, type|
        result = result + "    #{type.to_c_type} *prev_" + result_id.to_s + " = (#{type.to_c_type} *) " + Constants::ENV_HOST_IDENTIFIER + "->prev_" + result_id.to_s + ";\n"
    end 

    # Allocate device memory for cached results
    cached_results.each do |result_id, type|
        result = result + Translator.read_file(file_name: "allocate_device_memory.cpp", replacements: {
            "name" => Constants::RESULT_IDENTIFIER + result_id,
            "bytes" => "(#{type.c_size} * #{num_threads})",
            "type" => type.to_c_type})
    end

    # Build arguments
    a_env = Constants::ENV_DEVICE_IDENTIFIER
    a_result = kernel_result_var_name

    previous_kernel_args = []
    for var in kernel_builder.previous_kernel_input
        previous_kernel_args.push(var.name.to_s)
    end

    a_cached_results = cached_results.map do |result_id, type|
        Constants::RESULT_IDENTIFIER + result_id
    end

    if reuse_memory
        previous_kernel_args[0] = a_result
    end

    arguments = ([a_env, num_threads, a_result] + a_cached_results + previous_kernel_args + additional_arguments).join(", ")

    # Launch kernel
    result = result + Translator.read_file(file_name: "launch_kernel.cpp", replacements: {
        "kernel_name" => kernel_builder.kernel_name,
        "arguments" => arguments,
        "grid_dim" => grid_dim,
        "block_dim" => block_dim})

    # ---- DEBUG ONLY: Free input after computation so that we can process larger
    #                  data sets in benchmarks without running out of memory
    # TODO: Implement analysis and do this automatically
    if KernelLauncher.debug_free_previous_input_immediately == true
        for var in kernel_builder.previous_kernel_input
            result = result + Translator.read_file(file_name: "free_device_memory.cpp", replacements: {
                "name" => var.name.to_s})
        end 
    end
    # ---- END DEBUG ONLY

    cached_results.each do |result_id, type|
        result = result + "    " + Constants::ENV_HOST_IDENTIFIER + "->prev_" + result_id + " = " + Constants::RESULT_IDENTIFIER + result_id + ";\n"
    end

    return result
end
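
The order in which the launch arguments are assembled can be sketched in isolation. The helper `launch_arguments` and all argument values below are hypothetical placeholders (the real identifiers come from `Constants` and the kernel builder); only the ordering and the `reuse_memory` substitution follow the method above.

```ruby
# Sketch of the argument order; every name and value here is a placeholder.
def launch_arguments(env:, num_threads:, result:, cached: [], previous: [],
                     additional: [], reuse_memory: false)
  previous = previous.dup
  # With reused memory, the first previous-kernel argument becomes the result.
  previous[0] = result if reuse_memory

  ([env, num_threads, result] + cached + previous + additional).join(", ")
end

puts launch_arguments(env: "dev_env", num_threads: 1000,
                      result: "_kernel_result_2",
                      cached: ["_cached_7"],
                      previous: ["_kernel_result_1"],
                      additional: ["d_a"])
```

The environment, thread count, and result pointer always come first, followed by cached results, previous kernel inputs, and any additional arguments.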

#configure_grid(size, block_size: 256) ⇒ Object

Configures grid size and block size. Also sets number of threads.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 117

def configure_grid(size, block_size: 256)
    if block_size == nil
        block_size = 256
    end
    
    if size.is_a?(Integer)
        # Precompute constants (Fixnum was merged into Integer in Ruby 2.4)
        @grid_dim = [size.fdiv(block_size).ceil, 1].max.to_s
        @block_dim = (size >= block_size ? block_size : size).to_s
        @num_threads = size
    else
        if !size.is_a?(String)
            raise AssertionError.new("Integer or String expected")
        end

        # Source code string determines the size
        @grid_dim = "max((int) ceil(((float) #{size}) / #{block_size}), 1)"
        @block_dim = "(#{size} >= #{block_size} ? #{block_size} : #{size})"
        @num_threads = size
    end
end
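
As a worked example of the arithmetic in the numeric branch, the following sketch recomputes the grid and block dimensions for a few sizes. The helper name `grid_for` is hypothetical and returns numbers rather than the strings stored by the method above.

```ruby
# Hypothetical helper mirroring the numeric branch of configure_grid.
def grid_for(size, block_size = 256)
  # Round the number of blocks up, but always launch at least one block.
  grid_dim = [size.fdiv(block_size).ceil, 1].max
  # Small inputs fit into a single, smaller block.
  block_dim = size >= block_size ? block_size : size
  [grid_dim, block_dim]
end

p grid_for(1000) # 4 blocks of 256 threads each
p grid_for(100)  # 1 block of 100 threads
p grid_for(256)  # 1 block of 256 threads
```

For a size of 1000 the grid covers 4 * 256 = 1024 threads, which is more than the number of elements; this is presumably why the launcher also passes num_threads to the kernel as an argument.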

#kernel_builders ⇒ Object



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 70

def kernel_builders
    # The program builder accesses kernel builders via kernel launchers through
    # this method, because some specialized launchers might have multiple kernel 
    # builders.
    return [kernel_builder]
end

#prepare_additional_args_for_launch(command) ⇒ Object

Some of the values stored in `@additional_arguments` might be blocks, because not all information was known when they were added to that list. This method evaluates those blocks and replaces them with actual strings, based on the command that is being launched.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 60

def prepare_additional_args_for_launch(command)
    @additional_arguments = @additional_arguments.map do |arg|
        if arg.is_a?(String)
            arg
        else
            arg.call(command)
        end
    end
end
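
The deferred-argument mechanism can be illustrated with a lambda that is resolved once the command is known. The `command` Struct and its `size` field are hypothetical; the real command classes may expose different information.

```ruby
# Hypothetical command object; only a size field is assumed here.
command = Struct.new(:size).new(16)

# One plain string and one deferred argument that needs the command.
additional_arguments = ["d_a", ->(cmd) { "dim_#{cmd.size}" }]

# Strings pass through unchanged; everything else is called with the command.
resolved = additional_arguments.map do |arg|
  arg.is_a?(String) ? arg : arg.call(command)
end

p resolved
```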

#result_size ⇒ Object

The size of the result array is the number of threads.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 112

def result_size
    return num_threads
end

#result_type ⇒ Object

The result type of this kernel launcher. Same as the result type of its kernel builder.



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 107

def result_type
    return kernel_builder.result_type
end

#reuse_memory!(parameter_name) ⇒ Object



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 87

def reuse_memory!(parameter_name)
    @reuse_memory = true
    @kernel_result_var_name = parameter_name
end

#reuse_memory? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 92

def reuse_memory?
    return @reuse_memory
end

#use_cached_result(result_id, type) ⇒ Object

Adds a previously computed result which will be used in this launch as input



# File 'lib/translator/kernel_launcher/kernel_launcher.rb', line 83

def use_cached_result(result_id, type)
    @previously_cached_results[result_id] = type
end
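
For each result registered this way, #build_kernel_launcher emits one line of host code that reads the cached pointer back from the environment. A minimal sketch of that string construction follows; the type object and the `host_env` identifier are hypothetical placeholders for the library's type classes and `Constants::ENV_HOST_IDENTIFIER`.

```ruby
# Hypothetical type object; it is only assumed to respond to to_c_type.
int_type = Struct.new(:to_c_type).new("int")
env_host = "host_env" # placeholder for Constants::ENV_HOST_IDENTIFIER

previously_cached_results = { 7 => int_type }

# One generated declaration per previously cached result.
lines = previously_cached_results.map do |result_id, type|
  "    #{type.to_c_type} *prev_#{result_id} = " \
    "(#{type.to_c_type} *) #{env_host}->prev_#{result_id};\n"
end

print lines.join
```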