Class: Leva::DatasetConverter
- Inherits: Object
  - Object
  - Leva::DatasetConverter
- Defined in: app/services/leva/dataset_converter.rb
Overview
Converts Leva datasets to DSPy example format.
This service transforms DatasetRecord objects into DSPy::Example objects suitable for use with DSPy optimizers and predictors.
Instance Method Summary
- #initialize(dataset) ⇒ DatasetConverter (constructor)
  A new instance of DatasetConverter.
- #split(train_ratio: 0.6, val_ratio: 0.2, seed: nil) ⇒ Hash
  Splits the dataset into train, validation, and test sets.
- #to_dspy_examples ⇒ Array<Hash>
  Converts all dataset records to DSPy example format.
- #valid_record_count ⇒ Integer
  Returns the count of valid records in the dataset.
Constructor Details
#initialize(dataset) ⇒ DatasetConverter
Returns a new instance of DatasetConverter.
    # File 'app/services/leva/dataset_converter.rb', line 19
    def initialize(dataset)
      @dataset = dataset
    end
Instance Method Details
#split(train_ratio: 0.6, val_ratio: 0.2, seed: nil) ⇒ Hash
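Splits the dataset into train, validation, and test sets. Examples are shuffled first (reproducibly when seed is given). The train and validation sizes are truncated to whole numbers, and the test set receives whatever remains. The returned Hash has the keys :train, :val, and :test.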
    # File 'app/services/leva/dataset_converter.rb', line 44
    def split(train_ratio: 0.6, val_ratio: 0.2, seed: nil)
      examples = to_dspy_examples
      # A seed makes the shuffle reproducible; without one the order is random.
      examples = seed ? examples.shuffle(random: Random.new(seed)) : examples.shuffle

      # Sizes are truncated, so any remainder ends up in the test set.
      train_size = (examples.size * train_ratio).to_i
      val_size = (examples.size * val_ratio).to_i

      {
        train: examples[0...train_size],
        val: examples[train_size...(train_size + val_size)],
        test: examples[(train_size + val_size)..]
      }
    end
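The slicing arithmetic can be tried outside Rails. The following is a minimal sketch, not the library's code: `split_examples` is a hypothetical standalone helper that copies the logic of #split but takes a plain array, and the sample data is made up for illustration.

```ruby
# Hypothetical standalone helper mirroring the slicing in DatasetConverter#split.
# It takes a plain array instead of reading from a Leva dataset.
def split_examples(examples, train_ratio: 0.6, val_ratio: 0.2, seed: nil)
  shuffled = seed ? examples.shuffle(random: Random.new(seed)) : examples.shuffle

  # Truncate to whole numbers; leftover examples fall into the test slice.
  train_size = (shuffled.size * train_ratio).to_i
  val_size = (shuffled.size * val_ratio).to_i

  {
    train: shuffled[0...train_size],
    val: shuffled[train_size...(train_size + val_size)],
    test: shuffled[(train_size + val_size)..]
  }
end

# Made-up examples in the same shape that #to_dspy_examples returns.
examples = (1..10).map { |i| { input: { id: i }, expected: { output: i.to_s } } }

parts = split_examples(examples, seed: 42)
sizes = parts.transform_values(&:size)
# sizes is { train: 6, val: 2, test: 2 } for 10 examples with the default ratios
```

Passing the same seed twice yields the same partition, which is useful when comparing optimizer runs. Note that nothing in the method checks that the two ratios sum to at most 1, so callers may want to validate them first.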
#to_dspy_examples ⇒ Array<Hash>
Converts all dataset records to DSPy example format. Uses to_dspy_context if available, otherwise falls back to to_llm_context.
    # File 'app/services/leva/dataset_converter.rb', line 27
    def to_dspy_examples
      @dataset.dataset_records.includes(:recordable).map do |record|
        next unless record.recordable

        {
          input: sanitize_context(context_for(record.recordable)),
          expected: { output: record.recordable.ground_truth.to_s }
        }
      end.compact
    end
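The shape of the output and the context fallback can be illustrated with plain Ruby objects. This is a sketch under stated assumptions: `Article`, `Review`, and `Record` are hypothetical stand-ins for ActiveRecord models, and `context_for` here is a guess at the private helper based only on the description above. The `sanitize_context` step is left out because its behavior is not shown on this page.

```ruby
# Hypothetical recordable that only implements to_llm_context.
Article = Struct.new(:title, :ground_truth) do
  def to_llm_context
    { title: title }
  end
end

# Hypothetical recordable that also implements to_dspy_context.
Review = Struct.new(:text, :ground_truth) do
  def to_llm_context
    { text: text }
  end

  def to_dspy_context
    { review: text }
  end
end

# Hypothetical stand-in for a dataset record.
Record = Struct.new(:recordable)

# Assumed behavior of the private helper: prefer to_dspy_context when defined.
def context_for(recordable)
  if recordable.respond_to?(:to_dspy_context)
    recordable.to_dspy_context
  else
    recordable.to_llm_context
  end
end

records = [
  Record.new(Article.new("Intro", "summary")),
  Record.new(nil), # no recordable, so this record is skipped
  Record.new(Review.new("Great", :positive))
]

# filter_map (Ruby 2.7+) is equivalent to the map + compact pair in the source.
dspy_examples = records.filter_map do |record|
  next unless record.recordable

  {
    input: context_for(record.recordable),
    expected: { output: record.recordable.ground_truth.to_s }
  }
end
```

Here the second record is dropped, the first falls back to to_llm_context, and the third uses to_dspy_context. Ground truth values are always stringified, so the symbol :positive becomes "positive".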
#valid_record_count ⇒ Integer
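Returns the count of valid records in the dataset. A record counts as valid when it has an associated recordable, so the result equals the size of #to_dspy_examples.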
    # File 'app/services/leva/dataset_converter.rb', line 61
    def valid_record_count
      to_dspy_examples.size
    end