Class: ComputeUnit::Gpu

Inherits:
ComputeBase
Defined in:
lib/compute_unit/gpu.rb

Direct Known Subclasses

AmdGpu, NvidiaGpu

Constant Summary

DEVICE_CLASS = '030000'
DEVICE_CLASS_NAME = 'GPU'

Constants inherited from ComputeBase

ComputeBase::CACHE_TIMEOUT

Constants inherited from Device

Device::PROC_PATH, Device::SYSFS_DEVICES_PATH

Instance Attribute Summary

Attributes inherited from ComputeBase

#index, #meta, #power_offset, #serial, #timestamp, #type, #uuid

Attributes inherited from Device

#device_class_id, #device_id, #device_path, #device_vendor_id, #make, #model, #subsystem_device_id, #subsystem_vendor_id, #vendor

Class Method Summary

Instance Method Summary

Methods inherited from ComputeBase

compute_classes, #device_class_name, #experimental_on?, #expired_metadata?, #micro_formatter, #top_processes

Methods included from Logger

color, log_file, log_level, logger, #logger

Methods included from Formatters

#micro_formatter, #value_micro_formatter

Methods inherited from Device

#base_hwmon_path, create_from_path, device, device_class, device_lookup, device_vendor, #expired_metadata?, #generic_model, #hwmon_path, #lock_rom, logger, manual_device_database, manual_device_lookup, manual_vendor_lookup, manual_vendors, name_map, name_translation, pci_database, #read_file, #read_hwmon_data, #read_kernel_setting, read_kernel_setting, #rom_data, #rom_path, subsystem_device, subsystem_device_lookup, subsystem_vendor, subsystem_vendor_lookup, #sysfs_model_name, system_checksum, #to_json, #unlock_rom, vendor_lookup, #write_hwmon_data, #write_kernel_setting, write_kernel_setting

Methods included from Utils

check_for_root, #root?, root?

Constructor Details

#initialize(device_path, opts = {}) ⇒ Gpu

Returns a new instance of Gpu.

Parameters:

  • device_path (String)
    • the PCI bus path to the device

  • opts (Hash) (defaults to: {})
    • a customizable set of options

Options Hash (opts):

  • :bios
    • BIOS identifier; upcased when present

  • :model
    • model name; also used as the device name

  • :serial
    • serial number; used as the uuid when :uuid is not given

  • :busid
    • PCI bus id, exposed as pci_loc

  • :meta
    • metadata for the device

  • :index
    • device index; converted with to_i

  • :uuid
    • unique identifier; falls back to :serial

  • :use_opencl
    • whether to query OpenCL for this device (defaults to false)



# File 'lib/compute_unit/gpu.rb', line 69

def initialize(device_path, opts = {})
  super(device_path, opts)
  @type = :GPU
  @bios = opts[:bios].upcase if opts[:bios]
  @model = opts[:model]
  @serial = opts[:serial]
  @pci_loc = opts[:busid]
  @meta = opts[:meta]
  @index = opts[:index].to_i
  @uuid = opts[:uuid] || opts[:serial]
  @name = model
  @power_offset = 0
  @use_opencl = opts[:use_opencl] || false
end
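
The option handling in the constructor can be tried without the gem installed. The following is a minimal sketch, not library code: `normalize_gpu_opts` is a hypothetical helper that only mirrors the assignments shown above, and the sample values are made up.

```ruby
# Hypothetical helper mirroring the option handling in Gpu#initialize.
# It is not part of the compute_unit gem.
def normalize_gpu_opts(opts = {})
  {
    bios: opts[:bios]&.upcase,          # BIOS string is upcased when present
    model: opts[:model],
    serial: opts[:serial],
    pci_loc: opts[:busid],              # :busid is stored as pci_loc
    meta: opts[:meta],
    index: opts[:index].to_i,           # nil.to_i is 0, so a missing index becomes 0
    uuid: opts[:uuid] || opts[:serial], # uuid falls back to the serial
    use_opencl: opts[:use_opencl] || false
  }
end

gpu = normalize_gpu_opts(bios: 'xxx-xxx-xxx', serial: 'ABC123',
                         busid: '0000:01:00.0', index: '2')
puts gpu.inspect
```

Two details worth noticing: a missing `:index` silently becomes 0, and a device created without `:uuid` reports its serial as its uuid.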

Instance Attribute Details

#bios ⇒ Object (readonly)

Returns the value of attribute bios.

# File 'lib/compute_unit/gpu.rb', line 7

def bios
  @bios
end

#name ⇒ Object (readonly)

Returns the value of attribute name.

# File 'lib/compute_unit/gpu.rb', line 7

def name
  @name
end

#pci_loc ⇒ Object (readonly)

Returns the value of attribute pci_loc.

# File 'lib/compute_unit/gpu.rb', line 7

def pci_loc
  @pci_loc
end

#power_limit ⇒ Object

Returns the value of attribute power_limit.

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 10

def power_limit
  @power_limit
end

#use_opencl ⇒ Object

Returns the value of attribute use_opencl.

# File 'lib/compute_unit/gpu.rb', line 10

def use_opencl
  @use_opencl
end

Class Method Details

.devices ⇒ Array

Note:

the devices are sorted by device path

Note:

this can include AMD, NVIDIA, Intel, or other embedded display devices

Returns a list of device paths for all devices considered for display.

Returns:

  • (Array)
    • a list of device paths for all devices considered for display

# File 'lib/compute_unit/gpu.rb', line 53

def self.devices
  @devices ||= ComputeUnit::ComputeBase.devices.find_all do |device|
    ComputeUnit::Device.device_class(device) == DEVICE_CLASS
  end.sort
end
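
The filter above keeps only devices whose PCI class equals DEVICE_CLASS. On Linux, sysfs exposes that class in each device's `class` file (for example `0x030000` for a VGA-compatible controller); how `ComputeUnit::Device.device_class` reads it is not shown here. The sketch below illustrates the same select-and-sort step with an in-memory table standing in for sysfs; `SAMPLE_CLASSES` and `display_devices` are hypothetical names.

```ruby
DISPLAY_CLASS = '030000' # same value as Gpu::DEVICE_CLASS

# Hypothetical stand-in for ComputeUnit::Device.device_class: the class ids
# come from a plain Hash here rather than from sysfs.
SAMPLE_CLASSES = {
  '/sys/bus/pci/devices/0000:03:00.0' => '030000',
  '/sys/bus/pci/devices/0000:00:1f.2' => '010601',
  '/sys/bus/pci/devices/0000:01:00.0' => '030000'
}.freeze

# Keep the paths whose class matches, then sort them by path.
def display_devices(classes)
  classes.keys.find_all { |path| classes[path] == DISPLAY_CLASS }.sort
end

puts display_devices(SAMPLE_CLASSES)
```

Because `.devices` memoizes its result in `@devices`, the list is computed once per process; devices that appear later are not picked up until the process restarts.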

.find_all(use_opencl = false) ⇒ Array

Returns an array of GPU objects, sorted by index.

Returns:

  • (Array)
    • an array of GPU objects, sorted by index

# File 'lib/compute_unit/gpu.rb', line 276

def self.find_all(use_opencl = false)
  require 'compute_unit/gpus/amd_gpu'
  require 'compute_unit/gpus/nvidia_gpu'
  g = compute_classes.map { |klass| klass.find_all(use_opencl) }.flatten
  g.sort_by(&:index)
end

.found_devices ⇒ Array

Returns an array of device paths from both AMD and NVIDIA devices.

Returns:

  • (Array)
    • an array of device paths from both AMD and NVIDIA devices

# File 'lib/compute_unit/gpu.rb', line 333

def self.found_devices
  @found_devices ||= ComputeUnit::AmdGpu.devices + ComputeUnit::NvidiaGpu.devices
end

.opencl_cache ⇒ CacheStore

Returns the CacheStore instance used for storing the OpenCL cache.

Returns:

  • (CacheStore)
    • the CacheStore instance used for storing the OpenCL cache

# File 'lib/compute_unit/gpu.rb', line 284

def self.opencl_cache
  @opencl_cache ||= ComputeUnit::CacheStore.new('opencl_cache')
end

.opencl_devices ⇒ Array

Returns the OpenCL devices, reading them from the cache when possible. The cache is overwritten if new devices are found. OpenCL should only be used when necessary, as it can sometimes freeze. OpenCL also indexes items differently.

Returns:

  • (Array)
    • an array of OpenCL devices

# File 'lib/compute_unit/gpu.rb', line 341

def self.opencl_devices
  @opencl_devices ||= opencl_devices_from_cache || begin
    items = opencl_devices_from_platform
    opencl_cache.write_cache('opencl_compute_units', ComputeUnit::Device.system_checksum.to_s => items)
    items
  end
end
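
The cache-or-enumerate flow above can be sketched in isolation. This is an illustration under assumptions: `MemoryCache` is a hypothetical in-memory stand-in for `ComputeUnit::CacheStore` (only the `read_cache` and `write_cache` calls seen above are mimicked), and `cached_devices` is a made-up helper that takes the expensive enumeration as a block.

```ruby
# In-memory stand-in for ComputeUnit::CacheStore (hypothetical).
class MemoryCache
  def initialize
    @data = {}
  end

  def read_cache(name, default = nil)
    @data.fetch(name, default)
  end

  def write_cache(name, value)
    @data[name] = value
  end
end

# Return the cached list for this checksum, or run the block once and store
# its result under the checksum.
def cached_devices(cache, checksum)
  cache.read_cache('opencl_compute_units', {})[checksum] || begin
    items = yield
    cache.write_cache('opencl_compute_units', checksum => items)
    items
  end
end

cache = MemoryCache.new
calls = 0
first  = cached_devices(cache, 'abc') { calls += 1; ['gpu0'] }
second = cached_devices(cache, 'abc') { calls += 1; ['gpu0'] }
third  = cached_devices(cache, 'def') { calls += 1; ['gpu0', 'gpu1'] }
puts "enumerations: #{calls}"
```

Keying the entry by a system checksum means a hardware change produces a cache miss and forces a fresh enumeration. In this sketch each write replaces the whole entry, so only the most recent checksum is kept.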

.opencl_devices_from_cache ⇒ Array

Returns an array of OpenStruct objects, or nil if nothing is cached for the current system checksum.

Returns:

  • (Array)
    • an array of OpenStruct objects, or nil

# File 'lib/compute_unit/gpu.rb', line 289

def self.opencl_devices_from_cache
  data = opencl_cache.read_cache('opencl_compute_units', {})
  data[ComputeUnit::Device.system_checksum]
end

.opencl_devices_from_platform ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 295

def self.opencl_devices_from_platform
  require 'ostruct'
  # opencl takes a second to load so we cache later in the process
  # which is why we need the openstruct object here
  # opencl can also freeze the system if it tries to enumerate a dead GPU
  # opencl should be used sparingly as a result and only read when absolutely
  # necessary and no dead GPUs.
  # TODO: warn when dead gpus detected
  begin
    require 'opencl_ruby_ffi'
    ComputeUnit::Logger.logger.debug('Searching for openCL devices')
    OpenCL.platforms.map(&:devices).flatten.map do |d|
      type = d.platform.name.include?('AMD') ? 'AMD' : 'Nvidia'
      board_name = type == 'AMD' ? d.board_name_amd : ''
      max_computes = d.respond_to?(:max_compute_units) ? d.max_compute_units : 0
      OpenStruct.new(
        name: d.name,
        type: type,
        board_name: board_name,
        max_compute_units: max_computes
      )
    end
  rescue OpenCL::Error::DEVICE_NOT_FOUND => e
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.message}, are you root?")
    []
  rescue RuntimeError => e # OpenCL::Error::PLATFORM_NOT_FOUND_KHR,
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.message}")
    ComputeUnit::Logger.logger.debug("OpenCL error: #{e.backtrace}")
    []
  end
end

Instance Method Details

#asic_temp ⇒ Integer

Returns the temperature of the ASIC chip.

Returns:

  • (Integer)
    • the temperature of the ASIC chip

# File 'lib/compute_unit/gpu.rb', line 216

def asic_temp
  0
end

#attached_processes(field = :pctcpu) ⇒ Array

Returns an array of attached processes.

Parameters:

  • field (Symbol) (defaults to: :pctcpu)
    • the field to sort by

Returns:

  • (Array)
    • an array of attached processes

# File 'lib/compute_unit/gpu.rb', line 19

def attached_processes(field = :pctcpu)
  # looks for any fd device with dri or nvidia in the name
  p = Sys::ProcTable.ps(smaps: false).find_all do |p|
    p.fd.values.find { |f| f =~ %r{/dev/dri|nvidia\d+} }
  end
  p.sort_by(&field)
end
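
The selection logic above can be exercised without `Sys::ProcTable`. In this sketch `ProcRecord` and `gpu_processes` are hypothetical stand-ins and the process data is invented; only the regular expression and the `sort_by` call are taken from the method.

```ruby
GPU_FD_PATTERN = %r{/dev/dri|nvidia\d+} # same pattern as attached_processes

# Hypothetical process record; the real method receives these from
# Sys::ProcTable.ps.
ProcRecord = Struct.new(:pid, :pctcpu, :fd)

procs = [
  ProcRecord.new(100, 12.5, { '3' => '/dev/dri/renderD128' }),
  ProcRecord.new(200, 1.0,  { '4' => '/dev/nvidia0' }),
  ProcRecord.new(300, 50.0, { '0' => '/dev/null' })
]

# Keep processes holding a matching file descriptor, sorted by the given field.
def gpu_processes(procs, field = :pctcpu)
  procs.find_all { |pr| pr.fd.values.any? { |f| f =~ GPU_FD_PATTERN } }
       .sort_by(&field)
end

puts gpu_processes(procs).map(&:pid).inspect
```

`sort_by` orders ascending, so the process with the highest value for the chosen field comes last.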

#compute_type ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 12

def compute_type
  type
end

#configured_core_voltage ⇒ Numeric

Returns the configured core voltage in mV.

Returns:

  • (Numeric)
    • the configured core voltage in mV

# File 'lib/compute_unit/gpu.rb', line 163

def configured_core_voltage
  0
end

#core_clock ⇒ Integer

Returns the core clock speed.

Returns:

  • (Integer)
    • the core clock speed

# File 'lib/compute_unit/gpu.rb', line 153

def core_clock
  0
end

#core_voltage ⇒ Numeric

Returns the core voltage in mV.

Returns:

  • (Numeric)
    • the core voltage in mV

# File 'lib/compute_unit/gpu.rb', line 158

def core_voltage
  0
end

#fan ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 84

def fan
  raise NotImplementedError
end

#fan_limit ⇒ Integer

Returns the current fan limit as a percentage.

Returns:

  • (Integer)
    • the current fan limit as a percentage

# File 'lib/compute_unit/gpu.rb', line 104

def fan_limit
  fan
end

#fan_max_limit ⇒ Integer

Returns the maximum fan limit as a percentage.

Returns:

  • (Integer)
    • the maximum fan limit as a percentage

# File 'lib/compute_unit/gpu.rb', line 114

def fan_max_limit
  nil
end

#fan_min_limit ⇒ Integer

Returns the minimum fan limit as a percentage.

Returns:

  • (Integer)
    • the minimum fan limit as a percentage

# File 'lib/compute_unit/gpu.rb', line 109

def fan_min_limit
  nil
end

#hardware_info ⇒ Hash

Returns a hash of hardware information about the GPU.

Returns:

  • (Hash)
    • a hash of hardware information about the GPU

# File 'lib/compute_unit/gpu.rb', line 200

def hardware_info
  {
    uuid: uuid,
    gpuId: "GPU#{index}",
    syspath: device_path,
    pciLoc: pci_loc,
    name: name,
    bios: bios,
    subType: subtype,
    make: make,
    model: model,
    vendor: vendor
  }
end

#mem_info ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 167

def mem_info
  {
    index: "#{device_class_name}#{index}",
    name: name,
    volt: memory_volt,
    clock: memory_clock,
    memory_name: nil,
    memory_type: nil,
    memory_used: memory_used,
    memory_free: memory_free,
    memory_total: memory_total,
    mem_temp: mem_temp
  }
end

#mem_temp ⇒ Integer

Returns the temperature of the memory.

Returns:

  • (Integer)
    • the temperature of the memory

# File 'lib/compute_unit/gpu.rb', line 221

def mem_temp
  0
end

#memory_clock ⇒ Integer

Returns the memory clock speed.

Returns:

  • (Integer)
    • the memory clock speed

# File 'lib/compute_unit/gpu.rb', line 143

def memory_clock
  0
end

#memory_free ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 134

def memory_free
  raise NotImplementedError
end

#memory_total ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 126

def memory_total
  raise NotImplementedError
end

#memory_used ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 130

def memory_used
  raise NotImplementedError
end

#memory_volt ⇒ Integer

Returns the memory voltage.

Returns:

  • (Integer)
    • the memory voltage

# File 'lib/compute_unit/gpu.rb', line 148

def memory_volt
  0
end

#opencl_board_name ⇒ String

Returns the raw board name reported by OpenCL, or nil if there is no device.

Returns:

  • (String)
    • the raw board name reported by OpenCL, or nil if there is no device

# File 'lib/compute_unit/gpu.rb', line 33

def opencl_board_name
  @opencl_board_name ||= opencl_device&.board_name if use_opencl
end

#opencl_device ⇒ OpenCL_Device

Returns:

  • (OpenCL_Device)

# File 'lib/compute_unit/gpu.rb', line 28

def opencl_device
  @opencl_device ||= self.class.opencl_devices.find_all { |cu| cu[:type] == make }[index] if use_opencl
end
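
The lookup above first narrows the OpenCL list to entries whose type matches the GPU's make, then takes the entry at the GPU's index within that narrowed list. A small sketch of that selection, using a hypothetical `ClDevice` struct and invented sample entries in place of the cached OpenStruct objects:

```ruby
# Hypothetical record with the fields used by the lookup; a Struct supports
# the same cu[:type] access seen in opencl_device.
ClDevice = Struct.new(:name, :type, :max_compute_units)

# Invented sample data for illustration only.
devices = [
  ClDevice.new('Ellesmere', 'AMD', 36),
  ClDevice.new('GeForce GTX 1070', 'Nvidia', 15),
  ClDevice.new('gfx900', 'AMD', 64)
]

# Select the index-th device among those of the given make.
def pick_device(devices, make, index)
  devices.find_all { |cu| cu[:type] == make }[index]
end

puts pick_device(devices, 'AMD', 1).name
puts pick_device(devices, 'Nvidia', 1).inspect
```

An out-of-range index yields nil rather than an error. `#opencl_board_name` guards against that with `&.`, while `#opencl_name` and `#opencl_units` call methods on the result directly, so they would raise NoMethodError for a nil device when use_opencl is true.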

#opencl_name ⇒ String

Note:

not really needed for Nvidia types since nvidia-smi returns really complete information

e.g. GeForce GTX 1070 or RX 580

Returns:

  • (String)
    • the device name

# File 'lib/compute_unit/gpu.rb', line 46

def opencl_name
  @opencl_name ||= opencl_device.name if use_opencl
end

#opencl_units ⇒ Integer

Returns the number of compute units detected by OpenCL, not to be confused with stream processors. This can help determine which product a card is, e.g. Vega 56 or Vega 64.

Returns:

  • (Integer)
    • the number of compute units detected by OpenCL, not to be confused with stream processors

# File 'lib/compute_unit/gpu.rb', line 39

def opencl_units
  @opencl_units ||= opencl_device.max_compute_units.to_i if use_opencl
end

#power ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 95

def power
  raise NotImplementedError
end

#power_max_limit ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 122

def power_max_limit
  raise NotImplementedError
end

#pstate ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 99

def pstate
  raise NotImplementedError
end

#status ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 88

def status
  return 0 if utilization > 20 && power >= 50
  return 2 if power < 20

  1
end
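
The branching in `#status` is easier to follow with concrete numbers. The sketch below copies the thresholds into a hypothetical free function, `gpu_status`, that takes the two readings as arguments. The source does not document what the codes 0, 1 and 2 mean or which units the readings use, so none are assumed here.

```ruby
# Hypothetical helper with the same branching as Gpu#status.
def gpu_status(utilization, power)
  return 0 if utilization > 20 && power >= 50
  return 2 if power < 20

  1
end

puts gpu_status(90, 120) # high utilization and power
puts gpu_status(0, 10)   # power below 20
puts gpu_status(90, 30)  # high utilization, power between 20 and 50
```

On the base class, `#utilization` and `#power` raise NotImplementedError, so `#status` only returns a value on subclasses that implement both.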

#status_info ⇒ Hash

Returns a hash of hardware status information about the GPU.

Returns:

  • (Hash)
    • a hash of hardware status information about the GPU

# File 'lib/compute_unit/gpu.rb', line 183

def status_info
  {
    index: "#{device_class_name}#{index}",
    name: name,
    bios: bios,
    core_clock: core_clock,
    memory_clock: memory_clock,
    power: power,
    fan: fan,
    core_volt: core_voltage,
    temp: temp,
    mem_temp: mem_temp,
    status: status
  }
end

#temp ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 230

def temp
  0
end

#to_h ⇒ Object

# File 'lib/compute_unit/gpu.rb', line 234

def to_h
  {
    uuid: uuid,
    gpuId: "GPU#{index}",
    syspath: device_path,
    pciLoc: pci_loc,
    name: name,
    bios: bios,
    subType: subtype,
    make: make,
    model: model,
    vendor: vendor,
    # memory_name: nil,
    # memory_type: nil,
    # gpu_platform: nil,
    power: power,
    # power_limit: power_limit,
    # power_max_limit: power_max_limit,
    utilization: utilization,
    # memory_used: memory_used ,
    # memory_free: memory_free,
    # memory_total: memory_total,
    temperature: temp,
    status: status,
    pstate: pstate,
    fanSpeed: fan,
    type: compute_type,
    maxTemp: nil,
    mem: memory_clock,
    cor: core_clock,
    vlt: core_voltage,
    mem_temp: mem_temp,
    maxFan: nil,
    dpm: nil,
    vddci: nil,
    maxPower: nil,
    ocProfile: nil,
    opencl_enabled: use_opencl
  }
end
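
`#to_h` calls `power`, `utilization`, `pstate` and `fan`, all of which raise NotImplementedError on this class, so it is only usable on subclasses that supply those readings. The sketch below shows the pattern with two hypothetical classes, `SketchGpu` and `SketchVendorGpu`, trimmed to a few methods; they are stand-ins, not the real Gpu, AmdGpu or NvidiaGpu.

```ruby
# Stand-in for the abstract readings on ComputeUnit::Gpu (hypothetical).
class SketchGpu
  def power
    raise NotImplementedError
  end

  def utilization
    raise NotImplementedError
  end

  def temp
    0
  end

  def to_h
    { power: power, utilization: utilization, temperature: temp }
  end
end

# Hypothetical vendor subclass supplying fixed readings.
class SketchVendorGpu < SketchGpu
  def power
    75
  end

  def utilization
    40
  end
end

puts SketchVendorGpu.new.to_h.inspect

begin
  SketchGpu.new.to_h
rescue NotImplementedError => e
  puts "base class cannot be serialized: #{e.class}"
end
```

In Ruby, NotImplementedError descends from ScriptError rather than StandardError, so a bare `rescue` does not catch it; it has to be named explicitly as above.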

#utilization ⇒ Object

Raises:

  • (NotImplementedError)

# File 'lib/compute_unit/gpu.rb', line 138

def utilization
  raise NotImplementedError
end

#vddgfx ⇒ Integer

Returns the voltage reading of the card in mV (possibly AMD cards only).

Returns:

  • (Integer)
    • the voltage reading of the card in mV (possibly AMD cards only)

# File 'lib/compute_unit/gpu.rb', line 226

def vddgfx
  0
end

#voltage_table ⇒ Hash

Returns a hash of voltages per the voltage table, or nil if no table is available. Note that the base implementation shown here returns an empty array.

Returns:

  • (Hash)
    • a hash of voltages per the voltage table, or nil if no table is available

# File 'lib/compute_unit/gpu.rb', line 328

def voltage_table
  []
end