nbtfile

NBTFile is a low-level library for reading and writing NBT-format files, as used by the popular game Minecraft.

Official Specification

The official (if somewhat confusing) specification for the NBT file format may be found at www.minecraft.net/docs/NBT.txt

Data Model

The NBT data model has ten different data types:

  • 8-bit signed integers (Tag_Byte)

  • 16-bit signed integers (Tag_Short)

  • 32-bit signed integers (Tag_Int)

  • 64-bit signed integers (Tag_Long)

  • 32-bit floating-point numbers (Tag_Float)

  • 64-bit floating-point numbers (Tag_Double)

  • UTF-8 strings (Tag_String)

  • raw byte sequences (Tag_Byte_Array)

  • homogenous lists (Tag_List)

  • compound structures (Tag_Compound)

Compound structures (Tag_Compound) are unordered collections where each item (“tag”) has a name associated with it. Compound structures are heterogenous; the elements of a compound structure may be any mixture of types.

Lists (Tag_List) are ordered collections of unnamed items. Lists are homogenous; every element of a particular list must have the same type. Note that all lists have the same type (Tag_List) regardless of the type of their elements.

Items of an eleventh “type” (Tag_End) serve to terminate compound structures and lists (of any type). The wording of the official specification could permit lists of Tag_End, but this is not supported in practice.

The top level of an NBT file must be a single-element compound structure.

Interface

Low-level API

The low-level API deals directly in NBT tokens, which are represented by eleven types:

  • NBTFile::Tokens::TAG_End

  • NBTFile::Tokens::TAG_Byte

  • NBTFile::Tokens::TAG_Short

  • NBTFile::Tokens::TAG_Long

  • NBTFile::Tokens::TAG_Float

  • NBTFile::Tokens::TAG_Double

  • NBTFile::Tokens::TAG_Byte_Array

  • NBTFile::Tokens::TAG_String

  • NBTFile::Tokens::TAG_List

  • NBTFile::Tokens::TAG_Compound

Their constructors each take two arguments: name and value, which are also exposed by similarly-named accessors. Most of these tokens represent simple values, but TAG_List and TAG_Compound introduce the beginning of a list and a compound structure, and TAG_End terminates either one.

The name field supplies the field name when the token is part of a compound structure, and is ignored when it is part of a list.

For most token types, value represents the value of the token. For TAG_List it represents the type of the list. It is ignored for compound structures.

NBTFile.tokenize

NBTFile.emit

High-level API

Much of the high-level API deals in higher-level data types from the NBTFile::Types module.

There are eight “primitive” types with value fields which reflect their native Ruby values:

  • NBTFile::Types::Byte

  • NBTFile::Types::Short

  • NBTFile::Types::Int

  • NBTFile::Types::Long

  • NBTFile::Types::Float

  • NBTFile::Types::Double

  • NBTFile::Types::ByteArray

  • NBTFile::Types::String

There are also two “complex” types which reflect NBT lists and compound structures:

  • NBTFile::Types::List - an Array-like class corresponding to an NBT list

  • NBTFile::Types::Compound - a Hash-like class correspodning to an NBT compound structure

NBTFile.read

NBTFile.write

NBTFile.load

NBTFile.transcode_to_yaml

Syntax

NBT files are gzip-compressed; the structure of the uncompressed data can be described by the following ABNF (see RFC 5234).

nbt-data = tag-compound

tag-compound = TAG-COMPOUND name compound-body

compound-body = *tag TAG-END

tag = TAG-BYTE name byte /
      TAG-SHORT name short /
      TAG-INT name int /
      TAG-LONG name long /
      TAG-FLOAT name float /
      TAG-DOUBLE name double /
      TAG-STRING name string /
      TAG-BYTE-ARRAY name byte-array /
      TAG-LIST name list /
      tag-compound

name = string

list = TAG-BYTE list-length *byte /
       TAG-SHORT list-length *short /
       TAG-INT list-length *int /
       TAG-LONG list-length *long /
       TAG-FLOAT list-length *float /
       TAG-DOUBLE list-length *double /
       TAG-STRING list-length *string /
       TAG-BYTE-ARRAY list-length *byte-array /
       TAG-LIST list-length *list /
       TAG-COMPOUND list-length *compound-body

list-length = int

; see RFC 3629 for definition of UTF8-octets
string = string-length UTF8-octets

string-length = short

byte-array = byte-array-length *byte

byte-array-length = int

byte = OCTET ; 8-bit signed integer

short = 2OCTET ; 16-bit signed integer, big-endian

int = 4OCTET ; 32-bit signed integer, big-endian

long = 8OCTET ; 64-bit signed integer, big-endian

float = 4OCTET ; 32-bit float, big-endian, IEEE 754-2008

double = 8OCTET ; 64-bit float, big-endian, IEEE 754-2008

TAG-END = %x00
TAG-BYTE = %x01
TAG-SHORT = %x02
TAG-INT = %x03
TAG-LONG = %x04
TAG-FLOAT = %x05
TAG-DOUBLE = %x06
TAG-BYTE-ARRAY = %x07
TAG-STRING = %x08
TAG-LIST = %x09
TAG-COMPOUND = %x0a

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2010-2011 MenTaLguY. See LICENSE for details.