Class: Linguist::Language

Inherits:
Object
Defined in:
lib/linguist/language.rb

Overview

Language names that are recognizable by GitHub. Defined languages can be highlighted, searched and listed under the Top Languages page.

Languages are defined in `lib/linguist/languages.yml`.

Constant Summary

TYPES =

Valid Language types

[:data, :markup, :programming, :prose]


Constructor Details

#initialize(attributes = {}) ⇒ Language

Internal: Initialize a new Language

attributes - A hash of attributes



# File 'lib/linguist/language.rb', line 274

def initialize(attributes = {})
  # @name is required
  @name = attributes[:name] || raise(ArgumentError, "missing name")

  # Set type
  @type = attributes[:type] ? attributes[:type].to_sym : nil
  if @type && !TYPES.include?(@type)
    raise ArgumentError, "invalid type: #{@type}"
  end

  @color = attributes[:color]

  # Set aliases
  @aliases = [default_alias_name] + (attributes[:aliases] || [])

  # Lookup Lexer object
  @lexer = Pygments::Lexer.find_by_name(attributes[:lexer] || name) ||
    raise(ArgumentError, "#{@name} is missing lexer")

  @ace_mode = attributes[:ace_mode]
  @wrap = attributes[:wrap] || false

  # Set legacy search term
  @search_term = attributes[:search_term] || default_alias_name

  # Set extensions or default to [].
  @extensions = attributes[:extensions] || []
  @interpreters = attributes[:interpreters]   || []
  @filenames  = attributes[:filenames]  || []

  # Set popular, and searchable flags
  @popular    = attributes.key?(:popular)    ? attributes[:popular]    : false
  @searchable = attributes.key?(:searchable) ? attributes[:searchable] : true

  # If group name is set, save the name so we can lazy load it later
  if attributes[:group_name]
    @group = nil
    @group_name = attributes[:group_name]

  # Otherwise we can set it to self now
  else
    @group = self
  end
end
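
The validation rules in the constructor can be tried in isolation. The sketch below is a hypothetical stand-in (`MiniLanguage` is not part of Linguist) that copies only the name, type, and alias handling and leaves out the Pygments lexer lookup and the remaining attributes.

```ruby
# Hypothetical stand-in for Linguist::Language. It keeps only the name, type,
# and alias handling from the constructor above (no Pygments lexer lookup).
class MiniLanguage
  TYPES = [:data, :markup, :programming, :prose]

  attr_reader :name, :type, :aliases

  def initialize(attributes = {})
    # A name is mandatory.
    @name = attributes[:name] || raise(ArgumentError, "missing name")

    # The type is optional, but if given it must be one of TYPES.
    @type = attributes[:type] ? attributes[:type].to_sym : nil
    if @type && !TYPES.include?(@type)
      raise ArgumentError, "invalid type: #{@type}"
    end

    # The default alias always comes first, followed by any explicit aliases.
    @aliases = [default_alias_name] + (attributes[:aliases] || [])
  end

  # Lowercase the name and replace each whitespace character with a hyphen.
  def default_alias_name
    name.downcase.gsub(/\s/, '-')
  end
end

ruby = MiniLanguage.new(name: "Ruby", type: "programming", aliases: ["rb"])
p ruby.type     # => :programming
p ruby.aliases  # => ["ruby", "rb"]

begin
  MiniLanguage.new(name: "Foo", type: :recipe)
rescue ArgumentError => e
  puts e.message  # => invalid type: recipe
end
```

Because the type is converted with `to_sym`, a String such as "programming" is accepted as well as the Symbol.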

Instance Attribute Details

#ace_mode ⇒ Object (readonly)

Public: Get Ace mode

Examples

# => "text"
# => "javascript"
# => "c_cpp"

Returns a String name or nil



# File 'lib/linguist/language.rb', line 375

def ace_mode
  @ace_mode
end

#aliases ⇒ Object (readonly)

Public: Get aliases

Examples

Language['C++'].aliases
# => ["cpp"]

Returns an Array of String names



# File 'lib/linguist/language.rb', line 348

def aliases
  @aliases
end

#color ⇒ Object (readonly)

Public: Get color.

Returns a hex color String.



# File 'lib/linguist/language.rb', line 338

def color
  @color
end

#extensions ⇒ Object (readonly)

Public: Get extensions

Examples

# => ['.rb', '.rake', ...]

Returns the extensions Array



# File 'lib/linguist/language.rb', line 389

def extensions
  @extensions
end

#filenames ⇒ Object (readonly)

Public: Get filenames

Examples

# => ['Rakefile', ...]

Returns the filenames Array



# File 'lib/linguist/language.rb', line 407

def filenames
  @filenames
end

#interpreters ⇒ Object (readonly)

Public: Get interpreters

Examples

# => ['awk', 'gawk', 'mawk' ...]

Returns the interpreters Array



# File 'lib/linguist/language.rb', line 398

def interpreters
  @interpreters
end

#lexer ⇒ Object (readonly)

Public: Get Lexer

Returns the Lexer



# File 'lib/linguist/language.rb', line 364

def lexer
  @lexer
end

#name ⇒ Object (readonly)

Public: Get proper name

Examples

# => "Ruby"
# => "Python"
# => "Perl"

Returns the name String



# File 'lib/linguist/language.rb', line 328

def name
  @name
end

#search_term ⇒ Object (readonly)

Deprecated: Get code search term

Examples

# => "ruby"
# => "python"
# => "perl"

Returns the search term String



# File 'lib/linguist/language.rb', line 359

def search_term
  @search_term
end

#type ⇒ Object (readonly)

Public: Get type.

Returns a type Symbol or nil.



# File 'lib/linguist/language.rb', line 333

def type
  @type
end

#wrap ⇒ Object (readonly)

Public: Should language lines be wrapped

Returns true or false



# File 'lib/linguist/language.rb', line 380

def wrap
  @wrap
end

Class Method Details

.[](name) ⇒ Object

Public: Look up Language by its proper name or one of its aliases.

name - The String name of the Language

Examples

Language['Ruby']
# => #<Language name="Ruby">

Language['ruby']
# => #<Language name="Ruby">

Returns the Language or nil if none was found.



# File 'lib/linguist/language.rb', line 229

def self.[](name)
  @index[name]
end

.ace_modes ⇒ Object

Public: A List of languages compatible with Ace.

Returns an Array of Languages.



# File 'lib/linguist/language.rb', line 267

def self.ace_modes
  @ace_modes ||= all.select(&:ace_mode).sort_by { |lang| lang.name.downcase }
end

.all ⇒ Object

Public: Get all Languages

Returns an Array of Languages



# File 'lib/linguist/language.rb', line 152

def self.all
  @languages
end

.by_type(type) ⇒ Object

Detect languages by a specific type

type - A symbol that exists within TYPES

Returns an array



# File 'lib/linguist/language.rb', line 45

def self.by_type(type)
  all.select { |h| h.type == type }
end

.colors ⇒ Object

Public: A List of languages with assigned colors.

Returns an Array of Languages.



# File 'lib/linguist/language.rb', line 260

def self.colors
  @colors ||= all.select(&:color).sort_by { |lang| lang.name.downcase }
end

.create(attributes = {}) ⇒ Object

Internal: Create a new Language object

attributes - A hash of attributes

Returns a Language object



# File 'lib/linguist/language.rb', line 54

def self.create(attributes = {})
  language = new(attributes)

  @languages << language

  # All Language names should be unique. Raise if there is a duplicate.
  if @name_index.key?(language.name)
    raise ArgumentError, "Duplicate language name: #{language.name}"
  end

  # Language name index
  @index[language.name] = @name_index[language.name] = language

  language.aliases.each do |name|
    # All Language aliases should be unique. Raise if there is a duplicate.
    if @alias_index.key?(name)
      raise ArgumentError, "Duplicate alias: #{name}"
    end

    @index[name] = @alias_index[name] = language
  end

  language.extensions.each do |extension|
    if extension !~ /^\./
      raise ArgumentError, "Extension is missing a '.': #{extension.inspect}"
    end

    @extension_index[extension] << language
  end

  language.interpreters.each do |interpreter|
    @interpreter_index[interpreter] << language
  end

  language.filenames.each do |filename|
    @filename_index[filename] << language
  end

  language
end
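
The index bookkeeping in `.create` can be illustrated with a small registry. This is a sketch under stated assumptions: `Registry`, `register`, and `by_extension` are hypothetical names, entries are plain Hashes rather than Language objects, and only the name, alias, and extension indexes are modelled.

```ruby
# Hypothetical registry mirroring the index bookkeeping in .create.
# Entries are plain Hashes here rather than Language objects.
class Registry
  def initialize
    @name_index  = {}
    @alias_index = {}
    # One extension may map to several languages, so each key holds an Array.
    @extension_index = Hash.new { |hash, key| hash[key] = [] }
  end

  def register(name:, aliases: [], extensions: [])
    # Names must be unique.
    raise ArgumentError, "Duplicate language name: #{name}" if @name_index.key?(name)
    entry = { name: name, aliases: aliases, extensions: extensions }
    @name_index[name] = entry

    # Aliases must be unique across all languages.
    aliases.each do |alias_name|
      raise ArgumentError, "Duplicate alias: #{alias_name}" if @alias_index.key?(alias_name)
      @alias_index[alias_name] = entry
    end

    # Extensions must start with a dot, but may be shared between languages.
    extensions.each do |extension|
      unless extension.start_with?(".")
        raise ArgumentError, "Extension is missing a '.': #{extension.inspect}"
      end
      @extension_index[extension] << entry
    end

    entry
  end

  # fetch avoids creating an empty entry for an unknown extension.
  def by_extension(extension)
    @extension_index.fetch(extension, [])
  end
end

registry = Registry.new
registry.register(name: "C",   aliases: ["c"],   extensions: [".c", ".h"])
registry.register(name: "C++", aliases: ["cpp"], extensions: [".cpp", ".h"])
p registry.by_extension(".h").map { |entry| entry[:name] }  # => ["C", "C++"]
```

Names and aliases map to a single entry and raise on duplicates, while extensions collect every language that claims them; that ambiguity is what `.detect` later has to resolve.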

.detect(blob) ⇒ Object

Public: Detects the Language of the blob.

blob - an object that includes the Linguist `BlobHelper` interface;
       see Linguist::LazyBlob and Linguist::FileBlob for examples

Returns Language or nil.



# File 'lib/linguist/language.rb', line 101

def self.detect(blob)
  name = blob.name.to_s

  # Check if the blob is possibly binary and bail early; this is a cheap
  # test that uses the extension name to guess a binary MIME type.
  #
  # We'll perform a more comprehensive test later which actually involves
  # looking for binary characters in the blob
  return nil if blob.likely_binary? || blob.binary?

  # A bit of an elegant hack. If the file is executable but extensionless,
  # append a "magic" extension so it can be classified with other
  # languages that have shebang scripts.
  extension = FileBlob.new(name).extension
  if extension.empty? && blob.mode && (blob.mode.to_i(8) & 05) == 05
    name += ".script!"
  end

  # First try to find languages that match based on filename.
  possible_languages = find_by_filename(name)

  # If there is more than one possible language with that extension (or no
  # extension at all, in the case of extensionless scripts), we need to continue
  # our detection work
  if possible_languages.length > 1
    data = blob.data
    possible_language_names = possible_languages.map(&:name)

    # Don't bother with binary contents or an empty file
    if data.nil? || data == ""
      nil
    # Check if there's a shebang line and use that as authoritative
    elsif (result = find_by_shebang(data)) && !result.empty?
      result.first
    # No shebang. Still more work to do. Try to find it with our heuristics.
    elsif (determined = Heuristics.find_by_heuristics(data, possible_language_names)) && !determined.empty?
      determined.first
    # Lastly, fall back to the probabilistic classifier.
    elsif classified = Classifier.classify(Samples.cache, data, possible_language_names).first
      # Return the actual Language object based on the language name String (i.e., the first element of `#classify`)
      Language[classified[0]]
    end
  else
    # Simplest and most common case, we can just return the one match based on extension
    possible_languages.first
  end
end
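
The decision order in `.detect` (single candidate, then shebang, then heuristics, then classifier) can be sketched without the rest of Linguist. Everything below is hypothetical: `world_executable?`, `pick_language`, and the toy strategies are invented for illustration, and the strategies stand in for `find_by_shebang`, `Heuristics.find_by_heuristics`, and `Classifier.classify`.

```ruby
# Hypothetical sketch of the decision order in .detect; none of these names
# exist in Linguist. Each strategy is a lambda that returns an Array of
# candidate language names (possibly empty).

# The executable test from .detect: the mode is an octal String such as
# "100755", and 05 masks the read and execute bits for "other" users.
def world_executable?(mode)
  (mode.to_i(8) & 05) == 05
end

def pick_language(candidates, data, strategies)
  # Zero or one candidate: nothing to disambiguate.
  return candidates.first if candidates.length <= 1

  # Missing or empty content cannot be analysed.
  return nil if data.nil? || data == ""

  # Try each strategy in order and take the first non-empty answer.
  strategies.each do |strategy|
    result = strategy.call(data, candidates)
    return result.first if result && !result.empty?
  end
  nil
end

# Toy strategies, invented for this example only.
shebang   = ->(data, _names) { data.start_with?("#!/usr/bin/perl") ? ["Perl"] : [] }
heuristic = ->(data, names)  { data.include?(":- ") ? names & ["Prolog"] : [] }

candidates = ["Perl", "Prolog"]  # two languages assumed to share an extension
p pick_language(candidates, "#!/usr/bin/perl\nprint 1;", [shebang, heuristic])  # => "Perl"
p pick_language(candidates, ":- initialization(main).", [shebang, heuristic])   # => "Prolog"
p pick_language(candidates, "", [shebang, heuristic])                           # => nil
p pick_language(["Ruby"], nil, [shebang, heuristic])                            # => "Ruby"

p world_executable?("100755")  # => true
p world_executable?("100644")  # => false
```

Ordering the strategies from cheapest and most authoritative to most expensive means the blob content is only read when the filename alone is ambiguous.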

.detectable_markup ⇒ Object

Names of non-programming languages that we will still detect

Returns an array



# File 'lib/linguist/language.rb', line 36

def self.detectable_markup
  ["CSS", "Less", "Sass", "SCSS", "Stylus", "TeX"]
end

.find_by_alias(name) ⇒ Object

Public: Look up Language by one of its aliases.

name - A String alias of the Language

Examples

Language.find_by_alias('cpp')
# => #<Language name="C++">

Returns the Language or nil if none was found.



# File 'lib/linguist/language.rb', line 180

def self.find_by_alias(name)
  @alias_index[name]
end

.find_by_filename(filename) ⇒ Object

Public: Look up Languages by filename.

filename - The path String.

Examples

Language.find_by_filename('foo.rb')
# => [#<Language name="Ruby">]

Returns all matching Languages or [] if none were found.



# File 'lib/linguist/language.rb', line 194

def self.find_by_filename(filename)
  basename = File.basename(filename)
  extname = FileBlob.new(filename).extension
  langs = @filename_index[basename] +
          @extension_index[extname]
  langs.compact.uniq
end
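
`.find_by_filename` combines an exact basename match with an extension match and removes duplicates. A minimal sketch, assuming made-up sample data in `FILENAMES` and `EXTENSIONS`, language names in place of Language objects, and `File.extname` as a stand-in for `FileBlob#extension`:

```ruby
# Invented sample indexes; the real ones are filled in by .create.
FILENAMES  = { "Rakefile" => ["Ruby"], "Makefile" => ["Makefile"] }
EXTENSIONS = { ".rb" => ["Ruby"], ".rake" => ["Ruby"], ".h" => ["C", "C++", "Objective-C"] }

# Hypothetical helper: returns language names instead of Language objects.
def candidates_for(path)
  basename = File.basename(path)
  extname  = File.extname(path)  # stand-in for FileBlob#extension
  # Union of both lookups; uniq drops a language matched by both indexes.
  (FILENAMES.fetch(basename, []) + EXTENSIONS.fetch(extname, [])).uniq
end

p candidates_for("lib/Rakefile")   # => ["Ruby"]
p candidates_for("src/vector.h")   # => ["C", "C++", "Objective-C"]
p candidates_for("notes.unknown")  # => []
```

A result with more than one entry, as for the `.h` file here, is the case where `.detect` goes on to inspect the file contents.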

.find_by_name(name) ⇒ Object

Public: Look up Language by its proper name.

name - The String name of the Language

Examples

Language.find_by_name('Ruby')
# => #<Language name="Ruby">

Returns the Language or nil if none was found.



# File 'lib/linguist/language.rb', line 166

def self.find_by_name(name)
  @name_index[name]
end

.find_by_shebang(data) ⇒ Object

Public: Look up Languages by shebang line.

data - Array of tokens or String data to analyze.

Examples

Language.find_by_shebang("#!/bin/bash\ndate;")
# => [#<Language name="Bash">]

Returns an Array of matching Languages



# File 'lib/linguist/language.rb', line 212

def self.find_by_shebang(data)
  @interpreter_index[Linguist.interpreter_from_shebang(data)]
end

.popular ⇒ Object

Public: A List of popular languages

Popular languages are sorted to the top of language chooser dropdowns.

This list is configured in “popular.yml”.

Returns an Array of Languages.



# File 'lib/linguist/language.rb', line 241

def self.popular
  @popular ||= all.select(&:popular?).sort_by { |lang| lang.name.downcase }
end

.unpopular ⇒ Object

Public: A List of non-popular languages

Unpopular languages appear below popular ones in language chooser dropdowns.

This list is created from all the languages not listed in “popular.yml”.

Returns an Array of Languages.



# File 'lib/linguist/language.rb', line 253

def self.unpopular
  @unpopular ||= all.select(&:unpopular?).sort_by { |lang| lang.name.downcase }
end
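
Both `.popular` and `.unpopular` filter with a predicate and then sort case-insensitively by name. The sketch below uses a hypothetical `SampleLang` Struct and invented popularity flags to show why the sort key is the downcased name.

```ruby
# Hypothetical stand-in for Language with only a name and a popular flag.
SampleLang = Struct.new(:name, :popular) do
  def popular?
    popular
  end

  def unpopular?
    !popular?
  end
end

# Invented flags, for illustration only.
SAMPLE = [
  SampleLang.new("PHP", true),
  SampleLang.new("eC",  false),
  SampleLang.new("C",   true),
  SampleLang.new("Ada", false)
]

popular   = SAMPLE.select(&:popular?).sort_by   { |lang| lang.name.downcase }
unpopular = SAMPLE.select(&:unpopular?).sort_by { |lang| lang.name.downcase }

p popular.map(&:name)    # => ["C", "PHP"]
p unpopular.map(&:name)  # => ["Ada", "eC"]

# A plain sort compares bytes, so a lowercase initial sorts after every
# capitalised name; sorting on the downcased name avoids that.
p SAMPLE.map(&:name).sort                # => ["Ada", "C", "PHP", "eC"]
p SAMPLE.map(&:name).sort_by(&:downcase) # => ["Ada", "C", "eC", "PHP"]
```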

Instance Method Details

#==(other) ⇒ Object



# File 'lib/linguist/language.rb', line 496

def ==(other)
  eql?(other)
end

#all_extensions ⇒ Object

Public: Return all possible extensions for language



# File 'lib/linguist/language.rb', line 410

def all_extensions
  (extensions + [primary_extension]).uniq
end

#colorize(text, options = {}) ⇒ Object

Public: Highlight syntax of text

text    - String of code to be highlighted
options - A Hash of options (defaults to {})

Returns an HTML String



# File 'lib/linguist/language.rb', line 487

def colorize(text, options = {})
  lexer.highlight(text, options)
end

#default_alias_name ⇒ Object

Internal: Get default alias name

Returns the alias name String



# File 'lib/linguist/language.rb', line 446

def default_alias_name
  name.downcase.gsub(/\s/, '-')
end

#eql?(other) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/linguist/language.rb', line 500

def eql?(other)
  equal?(other)
end

#escaped_name ⇒ Object

Public: Get URL escaped name.

Examples

"C%23"
"C%2B%2B"
"Common%20Lisp"

Returns the escaped String.



# File 'lib/linguist/language.rb', line 439

def escaped_name
  EscapeUtils.escape_url(name).gsub('+', '%20')
end
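
The escaping can be reproduced with the standard library alone. This sketch assumes `URI.encode_www_form_component` as a stand-in for `EscapeUtils.escape_url`; it is not the call Linguist makes, but it also encodes a space as "+", so the same `gsub` applies.

```ruby
require 'uri'

# Stand-in for EscapeUtils.escape_url using only the standard library.
# Spaces come back as "+", which is then rewritten to "%20".
def escaped_name(name)
  URI.encode_www_form_component(name).gsub('+', '%20')
end

p escaped_name("C#")           # => "C%23"
p escaped_name("C++")          # => "C%2B%2B"
p escaped_name("Common Lisp")  # => "Common%20Lisp"
```

A literal "+" in the name is already encoded as "%2B" before the `gsub` runs, so only the "+" characters that represent spaces are rewritten.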

#group ⇒ Object

Public: Get Language group

Returns a Language



# File 'lib/linguist/language.rb', line 453

def group
  @group ||= Language.find_by_name(@group_name)
end
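
The constructor stores only the group name, and `#group` resolves it on first use. A plausible reason, not stated in the source, is that a group may be registered after its members. The sketch below uses a hypothetical `GroupedEntry` class and a simple Hash registry in place of `Language.find_by_name`.

```ruby
# Hypothetical class showing the lazy group lookup; REGISTRY stands in for
# the name index behind Language.find_by_name.
class GroupedEntry
  REGISTRY = {}

  attr_reader :name

  def initialize(name, group_name = nil)
    @name = name
    if group_name
      # Remember the name only; the group may not be registered yet.
      @group = nil
      @group_name = group_name
    else
      # No group given: the entry is its own group.
      @group = self
    end
    REGISTRY[name] = self
  end

  # Resolve the group on first call and memoize the result.
  def group
    @group ||= REGISTRY[@group_name]
  end
end

child  = GroupedEntry.new("Child", "Parent")  # "Parent" is not registered yet
parent = GroupedEntry.new("Parent")

p child.group.equal?(parent)   # => true
p parent.group.equal?(parent)  # => true
```

Note that `||=` does not memoize a failed lookup: if the group name never resolves, every call repeats the search and returns nil.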

#hash ⇒ Object



# File 'lib/linguist/language.rb', line 504

def hash
  name.hash
end
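
Taken together, `#==`, `#eql?`, and `#hash` give identity-based equality with a name-based hash code. A minimal sketch with a hypothetical `NamedThing` class shows the consequence for Hash keys: two distinct objects with the same name hash alike but are not interchangeable.

```ruby
# Hypothetical class copying the ==, eql? and hash definitions.
class NamedThing
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    eql?(other)
  end

  # Equality is object identity, not name comparison.
  def eql?(other)
    equal?(other)
  end

  # The hash code depends only on the name.
  def hash
    name.hash
  end
end

a = NamedThing.new("Ruby")
b = NamedThing.new("Ruby")

p a == a            # => true
p a == b            # => false
p a.hash == b.hash  # => true

counts = { a => 1 }
p counts[a]  # => 1
p counts[b]  # => nil
```

This is consistent as long as each language name is registered once, which `.create` enforces by raising on duplicate names.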

#inspect ⇒ Object



# File 'lib/linguist/language.rb', line 508

def inspect
  "#<#{self.class} name=#{name}>"
end

#popular? ⇒ Boolean

Public: Is it popular?

Returns true or false

Returns:

  • (Boolean)


# File 'lib/linguist/language.rb', line 460

def popular?
  @popular
end

#primary_extension ⇒ Object

Deprecated: Get primary extension

Defaults to the first extension but can be overridden in the languages.yml.

The primary extension cannot be nil. Tests should verify this.

This method is only used by app/helpers/gists_helper.rb for creating the language dropdown. It should really use `name` instead; the goal is to drop the primary extension.

Returns the extension String.



# File 'lib/linguist/language.rb', line 426

def primary_extension
  extensions.first
end

#searchable? ⇒ Boolean

Public: Is it searchable?

Unsearchable languages won't be indexed by Solr and won't show up in the code search dropdown.

Returns true or false

Returns:

  • (Boolean)


# File 'lib/linguist/language.rb', line 477

def searchable?
  @searchable
end

#to_s ⇒ Object

Public: Return name as String representation



# File 'lib/linguist/language.rb', line 492

def to_s
  name
end

#unpopular? ⇒ Boolean

Public: Is it not popular?

Returns true or false

Returns:

  • (Boolean)


# File 'lib/linguist/language.rb', line 467

def unpopular?
  !popular?
end