Class: NanoGPT::Layers::Block
- Inherits: Torch::NN::Module
  - Object
  - Torch::NN::Module
  - NanoGPT::Layers::Block
- Defined in: lib/nano_gpt/layers/block.rb
Overview
Pre-norm Transformer block: LayerNorm -> Attention -> LayerNorm -> MLP, with a residual connection around each sublayer.
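The data flow above can be sketched with plain Ruby lambdas standing in for the real modules. This is only a structural illustration of the two residual sublayers; the stand-in "modules" operate on arrays of floats, whereas the real Block operates on Torch tensors, and the identity LayerNorm and toy Attention/MLP bodies are assumptions made up for the sketch:

```ruby
# Stand-ins for the real submodules (illustrative only).
ln   = ->(x) { x }                      # identity stand-in for LayerNorm
attn = ->(x) { x.map { |v| v * 0.5 } }  # toy stand-in for CausalSelfAttention
mlp  = ->(x) { x.map { |v| v + 1 } }    # toy stand-in for MLP

# Same shape as Block#forward: x + Attn(LN(x)), then x + MLP(LN(x)).
block = lambda do |x|
  x = x.zip(attn.call(ln.call(x))).map { |a, b| a + b }
  x.zip(mlp.call(ln.call(x))).map { |a, b| a + b }
end

p block.call([1.0, 2.0]) # => [4.0, 7.0]
```

Because the residual adds the sublayer output back onto its input, each sublayer learns a correction to x rather than a full replacement of it.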
Instance Method Summary collapse
- #forward(x) ⇒ Object
- #initialize(config) ⇒ Block (constructor)
  A new instance of Block.
Constructor Details
#initialize(config) ⇒ Block
Returns a new instance of Block.
# File 'lib/nano_gpt/layers/block.rb', line 7

def initialize(config)
  super()
  @ln_1 = LayerNorm.new(config.n_embd, bias: config.bias)
  @attn = CausalSelfAttention.new(config)
  @ln_2 = LayerNorm.new(config.n_embd, bias: config.bias)
  @mlp = MLP.new(config)
end
Instance Method Details
#forward(x) ⇒ Object
# File 'lib/nano_gpt/layers/block.rb', line 15

def forward(x)
  x = x + @attn.call(@ln_1.call(x))
  x = x + @mlp.call(@ln_2.call(x))
  # Trigger GC to free intermediate tensors (critical for torch.rb memory management):
  # Ruby's GC doesn't run frequently enough during the forward pass, causing memory accumulation
  GC.start(full_mark: false, immediate_sweep: true)
  x
end
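The GC.start arguments used in #forward are standard Ruby GC options: full_mark: false requests a minor (young-generation) collection rather than a full mark, and immediate_sweep: true sweeps freed objects right away instead of lazily. A minimal stand-alone demonstration on plain Ruby objects (no torch.rb required; the allocation loop is just churn to create short-lived garbage):

```ruby
# Minor GC demo: GC.start(full_mark: false, immediate_sweep: true) performs a
# minor collection, visible as an increase in GC.stat(:minor_gc_count).
before = GC.stat(:minor_gc_count)
1_000.times { Array.new(100) { rand } } # allocate short-lived objects
GC.start(full_mark: false, immediate_sweep: true)
after = GC.stat(:minor_gc_count)
puts after > before # => true
```

This matters for torch.rb because a Ruby tensor object is a small wrapper around a large native allocation: Ruby's heap pressure heuristics don't see the native memory, so without an explicit collection the wrappers (and the tensors they pin) can outlive the forward pass.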