MimeTyper

A pure Ruby MIME type detection library focused on accuracy and reliability. MimeTyper identifies file types from magic bytes, falls back to file extensions when needed, and has no external dependencies.

Features

  • Pure Ruby - No external dependencies or C extensions
  • Comprehensive - Supports 150+ file types including images, documents, audio, video, archives, and more
  • Accurate - Uses magic byte detection for reliable type identification
  • Fast - Reads only the first 4KB of a file for detection
  • Simple API - Easy to use with just three methods
  • Reliable - Extensive test coverage ensures accuracy

Installation

Add this line to your application's Gemfile:

gem 'mimetyper'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install mimetyper

Usage

MimeTyper provides three simple methods for MIME type detection:

Detect from file path

require 'mimetyper'

# Detects MIME type using magic bytes first, falls back to extension
MimeTyper.from_file('document.pdf')
# => "application/pdf"

MimeTyper.from_file('photo.jpg')
# => "image/jpeg"

MimeTyper.from_file('unknown.bin')
# => "application/octet-stream"

Detect from data

# Detect MIME type from binary data
data = File.read('image.png', mode: 'rb')
MimeTyper.from_data(data)
# => "image/png"

# Optionally provide a filename for fallback detection
unknown_data = "\x00\x01\x02"
MimeTyper.from_data(unknown_data, filename: 'file.txt')
# => "text/plain"

Detect from extension only

# Direct extension lookup (less accurate than magic byte detection)
MimeTyper.from_extension('mp4')
# => "video/mp4"

MimeTyper.from_extension('.docx')
# => "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

Supported Types

MimeTyper supports a comprehensive range of file types:

Images

  • JPEG, PNG, GIF, WebP, BMP, ICO, TIFF
  • SVG, HEIC, AVIF, JP2
  • PSD (Photoshop), XCF (GIMP)

Documents

  • PDF, RTF
  • Microsoft Office: DOC, DOCX, XLS, XLSX, PPT, PPTX
  • OpenDocument: ODT, ODS, ODP

Audio

  • MP3, WAV, FLAC, OGG, M4A
  • MIDI, AIFF, WMA, AAC, OPUS

Video

  • MP4, AVI, MOV, WebM, MKV
  • FLV, MPEG, 3GP, WMV

Archives

  • ZIP, RAR, 7Z, TAR
  • GZIP, BZIP2, XZ

Programming

  • Source code files (Ruby, Python, JavaScript, Java, Go, etc.)
  • JSON, XML, YAML, TOML
  • HTML, CSS

Fonts

  • TTF, OTF, WOFF, WOFF2, EOT

Executables

  • EXE, DLL, ELF, Mach-O
  • JAR, APK, DEX

Databases

  • SQLite

How It Works

MimeTyper uses a two-tier detection approach:

  1. Magic Byte Detection: Reads the start of a file (at most the first 4KB) and looks for known file signatures (magic bytes). This is the more reliable of the two methods.

  2. Extension Fallback: If magic byte detection fails or is inconclusive, MimeTyper falls back to a file extension lookup.

The library maintains a comprehensive database of:

  • Magic byte signatures with specific offsets
  • File extension mappings
  • Special detection logic for complex formats (Office documents, media containers, etc.)
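
The two tiers described above can be sketched roughly as follows. This is a minimal illustration, not MimeTyper's actual implementation: the table contents and the names SIGNATURES, EXTENSIONS, and sketch_detect are assumptions made up for this example.

```ruby
# Illustrative sketch only. The tables and method name are hypothetical
# and do not reflect MimeTyper's real internals.

# Each signature entry: [byte offset, magic bytes, MIME type].
SIGNATURES = [
  [0,   "\x89PNG\r\n\x1A\n".b, "image/png"],
  [0,   "%PDF-".b,             "application/pdf"],
  [0,   "\xFF\xD8\xFF".b,      "image/jpeg"],
  [257, "ustar".b,             "application/x-tar"] # TAR magic sits at offset 257
].freeze

# Extension fallback table (tiny subset for illustration).
EXTENSIONS = {
  "txt" => "text/plain",
  "mp4" => "video/mp4"
}.freeze

DEFAULT_TYPE = "application/octet-stream"

def sketch_detect(data, filename: nil)
  bytes = data.b # compare raw bytes regardless of the input's encoding

  # Tier 1: look for a known signature at its expected offset.
  SIGNATURES.each do |offset, magic, mime|
    return mime if bytes.byteslice(offset, magic.bytesize) == magic
  end

  # Tier 2: fall back to the file extension, then to the default type.
  ext = filename ? File.extname(filename).delete_prefix(".").downcase : ""
  EXTENSIONS.fetch(ext, DEFAULT_TYPE)
end
```

Storing the offset alongside each signature is what lets formats such as TAR, whose magic bytes are not at the start of the file, share the same lookup loop as PNG or PDF.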

Performance

MimeTyper is designed for performance:

  • Reads only the first 4KB of files for detection
  • Signatures are checked in order of popularity, so common formats tend to match early
  • Efficient binary string matching
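
As a rough illustration of the first point, a bounded read in plain Ruby might look like the sketch below. The helper name read_header and the constant HEADER_BYTES are hypothetical; MimeTyper.from_file presumably does something similar internally, but that is an assumption.

```ruby
# Hypothetical helper, not part of MimeTyper's API.
HEADER_BYTES = 4096 # 4KB, matching the limit described above

# Returns at most the first 4KB of the file as a binary string.
def read_header(path)
  # IO#read(length) returns nil when the file is empty, so fall back
  # to an empty binary string in that case.
  File.open(path, "rb") { |f| f.read(HEADER_BYTES) } || "".b
end
```

The result could then be handed to MimeTyper.from_data(header, filename: path), so that a multi-gigabyte video costs no more I/O than a small text file.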

Accuracy and Reliability

  • Extensive test coverage with real-world file samples
  • Handles edge cases like:
    • Office Open XML format detection (DOCX vs ZIP)
    • Different video container formats
    • Text encoding detection
    • Malformed or truncated files
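
To make the DOCX vs ZIP case concrete: Office Open XML files are ZIP archives, and ZIP local file headers store entry names uncompressed. One common heuristic, shown here as a sketch and not necessarily what MimeTyper does, is to scan the start of the data for well-known directory prefixes. The names ZIP_MAGIC, OOXML_PREFIXES, and sketch_zip_subtype are hypothetical.

```ruby
# Heuristic sketch only; not MimeTyper's actual detection logic.
ZIP_MAGIC = "PK\x03\x04".b

# Directory prefixes that typically appear in OOXML entry names.
OOXML_PREFIXES = {
  "word/" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "xl/"   => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "ppt/"  => "application/vnd.openxmlformats-officedocument.presentationml.presentation"
}.freeze

# Returns a MIME type for ZIP-based data, or nil if the data is not a ZIP.
def sketch_zip_subtype(data)
  head = data.b
  return nil unless head.start_with?(ZIP_MAGIC)

  OOXML_PREFIXES.each do |prefix, mime|
    return mime if head.include?(prefix)
  end
  "application/zip" # a ZIP archive with no recognizable OOXML entry
end
```

A short prefix such as "xl/" can also occur by chance in compressed bytes, and the relevant entry may lie beyond the bytes inspected, so a production detector would parse the entry names properly instead of substring matching.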

Default Behavior

When MimeTyper cannot determine a specific MIME type, it returns "application/octet-stream" as a safe default.
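
Because the default is an ordinary string, calling code usually needs to treat it as "unknown" explicitly. The sketch below shows one way an upload check might do that. The helper allowed_upload? and the allowlist are hypothetical and not part of MimeTyper; the mime argument stands for whatever a MimeTyper call returned.

```ruby
# Hypothetical caller-side helper, not part of MimeTyper.
UNKNOWN_TYPE = "application/octet-stream"

# Accepts a detected MIME type only if it is both known and allowed.
def allowed_upload?(mime, allowlist)
  return false if mime == UNKNOWN_TYPE # detection was inconclusive
  allowlist.include?(mime)
end

# Example allowlist for an image upload form (an assumption for illustration).
IMAGE_TYPES = ["image/png", "image/jpeg", "image/gif"].freeze
```

With this in place, allowed_upload?(MimeTyper.from_file(path), IMAGE_TYPES) rejects both disallowed types and files that could not be identified.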

Thread Safety

MimeTyper is thread-safe. All methods are stateless and can be called concurrently from multiple threads.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/vancuren/mimetyper.

License

The gem is available as open source under the terms of the MIT License.