Class: NanoGPT::Layers::Block

Inherits:
Torch::NN::Module
  • Object
Defined in:
lib/nano_gpt/layers/block.rb

Overview

Pre-norm Transformer block: LayerNorm -> Attention -> LayerNorm -> MLP, with a residual connection around both the attention and MLP sub-layers.

Instance Method Summary

Constructor Details

#initialize(config) ⇒ Block

Returns a new instance of Block.



# File 'lib/nano_gpt/layers/block.rb', line 7

def initialize(config)
  super()
  @ln_1 = LayerNorm.new(config.n_embd, bias: config.bias)
  @attn = CausalSelfAttention.new(config)
  @ln_2 = LayerNorm.new(config.n_embd, bias: config.bias)
  @mlp = MLP.new(config)
end
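The constructor reads only `n_embd` and `bias` from the config object (the `CausalSelfAttention` and `MLP` sub-modules read their own fields). A minimal config stub, assuming a Struct-like attribute interface, might look like:

```ruby
# Hypothetical config stub exposing the two fields Block#initialize reads
# directly. The field names (n_embd, bias) come from the code above; the
# struct itself and its values are illustrative, not the gem's actual config.
BlockConfig = Struct.new(:n_embd, :bias, keyword_init: true)

config = BlockConfig.new(n_embd: 768, bias: false)
# A real block would then be built as: NanoGPT::Layers::Block.new(config)
# (omitted here, since that also requires the attention/MLP config fields).
```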

Instance Method Details

#forward(x) ⇒ Object



# File 'lib/nano_gpt/layers/block.rb', line 15

def forward(x)
  x = x + @attn.call(@ln_1.call(x))
  x = x + @mlp.call(@ln_2.call(x))
  # Trigger GC to free intermediate tensors (critical for torch.rb memory management)
  # Ruby's GC doesn't run frequently enough during forward pass, causing memory accumulation
  GC.start(full_mark: false, immediate_sweep: true)
  x
end
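The two residual additions above are the pre-norm pattern: each sub-layer sees a normalized input, but its output is added back onto the un-normalized stream. A minimal sketch of that data flow in plain Ruby, with lambdas standing in for the torch.rb sub-modules (the stand-in "attention" and "MLP" transforms are illustrative, not the real layers):

```ruby
# Layer norm over a plain array: zero mean, unit variance (plus epsilon).
norm = ->(x) {
  mean = x.sum / x.size.to_f
  var  = x.map { |v| (v - mean)**2 }.sum / x.size.to_f
  x.map { |v| (v - mean) / Math.sqrt(var + 1e-5) }
}

# Toy stand-ins for the attention and MLP sub-layers: any shape-preserving
# transform works for illustrating the residual wiring.
attn = ->(x) { x.map { |v| v * 0.5 } }
mlp  = ->(x) { x.map { |v| v + 0.1 } }

forward = ->(x) {
  x = x.zip(attn.(norm.(x))).map { |a, b| a + b }  # x = x + attn(ln_1(x))
  x.zip(mlp.(norm.(x))).map { |a, b| a + b }       # x = x + mlp(ln_2(x))
}

y = forward.([1.0, 2.0, 3.0])
```

Because the sub-layer outputs are *added* to `x` rather than replacing it, the input signal always has a direct path through the block, which is what makes deep stacks of these blocks trainable.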