Class: ChupaText::Data

Inherits:
Object
Defined in:
lib/chupa-text/data.rb

Direct Known Subclasses

InputData, TextData, VirtualFileData

Constructor Details

#initialize(options = {}) ⇒ Data

Returns a new instance of Data



# File 'lib/chupa-text/data.rb', line 54

def initialize(options={})
  @uri = nil
  @body = nil
  @size = nil
  @path = nil
  @mime_type = nil
  @attributes = Attributes.new
  @source = nil
  @options = options || {}
  source_data = @options[:source_data]
  merge!(source_data) if source_data
end
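
As a rough illustration of the :source_data option, the sketch below uses ConstructorSketch, a hypothetical stand-in that is not part of chupa-text. It keeps only the path attribute and a reduced merge!, and assumes the option handling matches the constructor listed above.

```ruby
# Hypothetical stand-in for ChupaText::Data; only #path is modeled.
class ConstructorSketch
  attr_accessor :path

  def initialize(options={})
    @path = nil
    # A nil options argument is tolerated, as in the listing above.
    @options = options || {}
    source_data = @options[:source_data]
    merge!(source_data) if source_data
  end

  # Reduced merge!: copies only the path from the source data.
  def merge!(data)
    self.path = data.path
  end
end

archive = ConstructorSketch.new
archive.path = "/tmp/hello.tar"

# Passing :source_data copies metadata from the source at construction time.
entry = ConstructorSketch.new(source_data: archive)
plain = ConstructorSketch.new(nil)
```

Here entry starts out with the path of archive, while plain is built without any source and keeps a nil path.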

Instance Attribute Details

#attributes ⇒ Attributes (readonly)

Returns the attributes of the data.

Returns:

  • (Attributes)

    The attributes of the data.



# File 'lib/chupa-text/data.rb', line 47

def attributes
  @attributes
end

#body ⇒ String?

Returns the content of the data, or nil if the data doesn't have any content.

Returns:

  • (String, nil)

    The content of the data, or nil if the data doesn't have any content.



# File 'lib/chupa-text/data.rb', line 29

def body
  @body
end

#path ⇒ String?

Returns the path associated with the content of the data, or nil if the data isn't associated with any file.

The path may not be related to the original content. For example, "/tmp/XXX.txt" may be returned for the data of "http://example.com/XXX.txt".

This value is useful when an external command is used to extract text and metadata.

Returns:

  • (String, nil)

    The path associated with the content of the data, or nil if the data isn't associated with any file.

    The path may not be related to the original content. For example, "/tmp/XXX.txt" may be returned for the data of "http://example.com/XXX.txt".

    This value is useful when an external command is used to extract text and metadata.



# File 'lib/chupa-text/data.rb', line 44

def path
  @path
end

#size ⇒ Integer?

Returns the byte size of the data, or nil if the data doesn't have any content.

Returns:

  • (Integer, nil)

    The byte size of the data, or nil if the data doesn't have any content.



# File 'lib/chupa-text/data.rb', line 33

def size
  @size
end

#source ⇒ Data?

Returns the source of the data. For example, text data (hello.txt) extracted from archive data (hello.tar) has the archive data in #source.

Returns:

  • (Data, nil)

    The source of the data. For example, text data (hello.txt) extracted from archive data (hello.tar) has the archive data in #source.



# File 'lib/chupa-text/data.rb', line 52

def source
  @source
end

#uri ⇒ URI?

Returns the URI of the data if the data is for a remote or local file, or nil if the data isn't associated with any URI.

Returns:

  • (URI, nil)

    The URI of the data if the data is for a remote or local file, or nil if the data isn't associated with any URI.



# File 'lib/chupa-text/data.rb', line 25

def uri
  @uri
end

Instance Method Details

#[](name) ⇒ Object



# File 'lib/chupa-text/data.rb', line 104

def [](name)
  @attributes[name]
end

#[]=(name, value) ⇒ Object



# File 'lib/chupa-text/data.rb', line 108

def []=(name, value)
  @attributes[name] = value
end

#extension ⇒ String?

Returns the normalized extension as a String if #uri is not nil, nil otherwise. The normalized extension is lower case, for example pdf rather than PDF.

Returns:

  • (String, nil)

    The normalized extension as a String if #uri is not nil, nil otherwise. The normalized extension is lower case, for example pdf rather than PDF.



# File 'lib/chupa-text/data.rb', line 130

def extension
  return nil if @uri.nil?
  File.extname(@uri.path).downcase.gsub(/\A\./, "")
end
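
The normalization can be tried in isolation. The sketch below copies the method body into a free-standing helper, normalized_extension, which is a hypothetical name and not part of chupa-text; the example URIs are placeholders.

```ruby
require "uri"

# Hypothetical helper mirroring the body of Data#extension.
def normalized_extension(uri)
  return nil if uri.nil?
  # Take the extension of the URI path, lower-case it, drop the leading dot.
  File.extname(uri.path).downcase.gsub(/\A\./, "")
end

pdf    = normalized_extension(URI.parse("http://example.com/docs/REPORT.PDF"))
no_ext = normalized_extension(URI.parse("http://example.com/README"))
no_uri = normalized_extension(nil)
```

With these inputs, pdf is "pdf", no_ext is an empty string (the path has no extension), and no_uri is nil. Only a nil URI yields nil; a URI without an extension yields "".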

#initialize_copy(object) ⇒ Object



# File 'lib/chupa-text/data.rb', line 67

def initialize_copy(object)
  super
  @attributes = @attributes.dup
  self
end
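
The effect of duplicating @attributes shows up when a copy is modified. The sketch below uses CopySketch, a hypothetical stand-in that stores attributes in a plain Hash instead of the Attributes class, so it only illustrates the copy semantics.

```ruby
# Hypothetical stand-in for Data, using a Hash in place of Attributes.
class CopySketch
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  # Same shape as Data#initialize_copy: duplicate the attribute container
  # so that the copy does not share it with the original.
  def initialize_copy(object)
    super
    @attributes = @attributes.dup
    self
  end
end

original = CopySketch.new
original.attributes["title"] = "Hello"

# dup calls initialize_copy on the new object.
copy = original.dup
copy.attributes["title"] = "Changed"
```

After this runs, original still has the title "Hello". The duplication is shallow: the container is separate, but the values it holds are the same objects in both copies.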

#merge!(data) ⇒ void

This method returns an undefined value.

Merges metadata from data.

Parameters:

  • data (Data)

    The data to be merged.



# File 'lib/chupa-text/data.rb', line 78

def merge!(data)
  self.uri = data.uri
  self.path = data.path
  data.attributes.each do |name, value|
    self[name] = value
  end
  if data.mime_type
    self["source-mime-types"] ||= []
    self["source-mime-types"].unshift(data.mime_type)
  end
end
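
To see what is and is not carried over by the merge, the sketch below repeats the same steps on MergeSketch, a hypothetical stand-in with Hash-backed attributes and no MIME type guessing. The file name and MIME type values are made up for illustration.

```ruby
# Hypothetical stand-in for Data: Hash attributes, explicit MIME type only.
class MergeSketch
  attr_accessor :uri, :path, :mime_type
  attr_reader :attributes

  def initialize
    @uri = nil
    @path = nil
    @mime_type = nil
    @attributes = {}
  end

  def [](name)
    @attributes[name]
  end

  def []=(name, value)
    @attributes[name] = value
  end

  # Same steps as the merge! listing above.
  def merge!(data)
    self.uri = data.uri
    self.path = data.path
    data.attributes.each do |name, value|
      self[name] = value
    end
    if data.mime_type
      # The source MIME type is recorded in an attribute, newest first.
      self["source-mime-types"] ||= []
      self["source-mime-types"].unshift(data.mime_type)
    end
  end
end

archive = MergeSketch.new
archive.path = "/tmp/hello.tar"
archive.mime_type = "application/x-tar"
archive["title"] = "hello"

entry = MergeSketch.new
entry.merge!(archive)
```

The URI, path and attributes are copied, while the MIME type of the source is not assigned to the receiver; it is only prepended to the "source-mime-types" attribute.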

#mime_type ⇒ String?

Returns:

  • (String)

    The MIME type of the data. If the MIME type isn't set, it is guessed from the path and body.

  • (nil)

    If the MIME type isn't set and can't be guessed from the path and body.



# File 'lib/chupa-text/data.rb', line 116

def mime_type
  @mime_type || guess_mime_type
end

#mime_type=(type) ⇒ Object

Parameters:

  • type (String, nil)

    The MIME type of the data. Pass nil to unset the MIME type. When it is unset, the MIME type is guessed from the path and body of the data.



# File 'lib/chupa-text/data.rb', line 123

def mime_type=(type)
  @mime_type = type
end
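
The interplay between the reader and the writer can be sketched as follows. The guess_mime_type implementation is not shown on this page, so the MimeSketch class below assumes a tiny extension table purely for illustration; both the class and the table are hypothetical.

```ruby
# Hypothetical stand-in. The real guessing logic is not shown on this page;
# a small extension table is assumed here for illustration only.
class MimeSketch
  GUESSES = { ".txt" => "text/plain", ".html" => "text/html" }.freeze

  attr_writer :mime_type

  def initialize(path)
    @path = path
    @mime_type = nil
  end

  # Same shape as Data#mime_type: an explicit value wins, otherwise guess.
  def mime_type
    @mime_type || guess_mime_type
  end

  private

  def guess_mime_type
    GUESSES[File.extname(@path.to_s)]
  end
end

data = MimeSketch.new("/tmp/hello.txt")
guessed = data.mime_type            # nothing set yet, so the guess is used

data.mime_type = "application/octet-stream"
explicit = data.mime_type           # the explicit value takes precedence

data.mime_type = nil
guessed_again = data.mime_type      # unsetting falls back to guessing
```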

#open {|StringIO.new(body)| ... } ⇒ Object

Yields:

  • (StringIO.new(body))


# File 'lib/chupa-text/data.rb', line 100

def open
  yield(StringIO.new(body))
end
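
A caller receives the body wrapped in a StringIO and can read it like any other IO object. The sketch below uses OpenSketch, a hypothetical stand-in that exposes only body and open.

```ruby
require "stringio"

# Hypothetical stand-in exposing only #body and #open.
class OpenSketch
  attr_reader :body

  def initialize(body)
    @body = body
  end

  # Same shape as Data#open: yield an in-memory IO over the body.
  def open
    yield(StringIO.new(body))
  end
end

data = OpenSketch.new("first line\nsecond line\n")

# The block's value is returned from open.
first_line = data.open { |input| input.gets }
all_lines = data.open { |input| input.readlines }
```

Each call to open builds a fresh StringIO, so reading in one block does not affect the position seen by the next.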

#text? ⇒ Bool

Returns true if MIME type is "text/XXX", false otherwise.

Returns:

  • (Bool)

    true if MIME type is "text/XXX", false otherwise.



# File 'lib/chupa-text/data.rb', line 137

def text?
  (mime_type || "").start_with?("text/")
end

#text_plain? ⇒ Bool

Returns true if MIME type is "text/plain", false otherwise.

Returns:

  • (Bool)

    true if MIME type is "text/plain", false otherwise.



# File 'lib/chupa-text/data.rb', line 143

def text_plain?
  mime_type == "text/plain"
end
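
The two predicates differ only in how strictly they match. The sketch below restates them as free-standing helpers that take the MIME type as an argument; the helper names text_type? and text_plain_type? are hypothetical.

```ruby
# Hypothetical helpers mirroring Data#text? and Data#text_plain?.
def text_type?(mime_type)
  # A nil MIME type is treated as an empty string, so the result is false.
  (mime_type || "").start_with?("text/")
end

def text_plain_type?(mime_type)
  mime_type == "text/plain"
end

html_is_text  = text_type?("text/html")
html_is_plain = text_plain_type?("text/html")
nil_is_text   = text_type?(nil)
```

With these inputs, "text/html" counts as text but not as plain text, and a missing MIME type is never text.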