Class: Linguist::Heuristics

Inherits:
Object
Defined in:
lib/linguist/heuristics.rb

Overview

A collection of simple heuristics for telling apart languages that share a file extension.

Constant Summary

ACTIVE = true

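Class Method Summary

  • .active? ⇒ Boolean
  • .disambiguate_c(data, languages) ⇒ Object
  • .disambiguate_cl(data, languages) ⇒ Object
  • .disambiguate_ecl(data, languages) ⇒ Object
  • .disambiguate_pl(data, languages) ⇒ Object
  • .disambiguate_r(data, languages) ⇒ Object
  • .disambiguate_ts(data, languages) ⇒ Object
  • .find_by_heuristics(data, languages) ⇒ Object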

Class Method Details

.active? ⇒ Boolean



# File 'lib/linguist/heuristics.rb', line 78

def self.active?
  !!ACTIVE
end

.disambiguate_c(data, languages) ⇒ Object

Files with a .h extension are ambiguous between C, C++, and Objective-C. As a shortcut, this checks for an Objective-C marker (@interface) and a C++ marker (#include <cstdint>).

Returns an Array of matching Languages, or [] if neither marker is found.



# File 'lib/linguist/heuristics.rb', line 33

def self.disambiguate_c(data, languages)
  matches = []
  matches << Language["Objective-C"] if data.include?("@interface")
  matches << Language["C++"] if data.include?("#include <cstdint>")
  matches
end
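
To illustrate the behavior described above, here is a minimal sketch that copies the method body into a free-standing method and runs it on three small headers. The `Language` module below is a hypothetical stand-in that simply returns the name it is given, so the snippet runs without the gem; the sample header contents are also invented for illustration.

```ruby
# Hypothetical stand-in for Linguist's Language lookup (assumption):
# it returns the language name instead of a Language object.
module Language
  def self.[](name)
    name
  end
end

# Same body as the documented method, copied as a top-level method.
def disambiguate_c(data, languages)
  matches = []
  matches << Language["Objective-C"] if data.include?("@interface")
  matches << Language["C++"] if data.include?("#include <cstdint>")
  matches
end

candidates = ["C", "C++", "Objective-C"]

# Invented sample headers, one per case.
objc_header = "#import <Foundation/Foundation.h>\n@interface Foo : NSObject\n@end\n"
cpp_header  = "#include <cstdint>\nstd::uint32_t checksum(const char *s);\n"
c_header    = "#include <stdint.h>\nuint32_t checksum(const char *s);\n"

objc_result = disambiguate_c(objc_header, candidates) # Objective-C marker found
cpp_result  = disambiguate_c(cpp_header, candidates)  # C++ marker found
c_result    = disambiguate_c(c_header, candidates)    # neither marker: empty array
```

A plain C header yields an empty array here, so the caller still has to fall back to some other strategy for that case.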

.disambiguate_cl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 64

def self.disambiguate_cl(data, languages)
  matches = []
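  # "(defun " marks a Common Lisp function definition.
  matches << Language["Common Lisp"] if data.include?("(defun ")
  # OpenCL: a C-style comment opener ("/* " or "// ") or a "}" at the start of a line.
  matches << Language["OpenCL"] if /\/\* |\/\/ |^\}/.match(data)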
  matches
end
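
The OpenCL check relies on a regular expression, so it may help to see what it does and does not match. The sketch below applies the same pattern to three invented snippets; the constant name `OPENCL_HINT` and the sample sources are assumptions made for illustration only.

```ruby
# Same pattern as in disambiguate_cl; the constant name is hypothetical.
OPENCL_HINT = /\/\* |\/\/ |^\}/

# Invented sample sources.
lisp_source   = "(defun square (x)\n  (* x x))\n"
opencl_source = "__kernel void noop(__global int *out)\n{\n}\n"
mixed_source  = "(defun banner ()\n  \"// not a comment\")\n"

# In Ruby, ^ anchors at the start of every line, so the closing brace
# on its own line in opencl_source is enough to match.
lisp_hit   = !OPENCL_HINT.match(lisp_source).nil?
opencl_hit = !OPENCL_HINT.match(opencl_source).nil?

# A Lisp file whose string literal contains "// " matches the OpenCL
# pattern and also contains "(defun ", so both checks would fire.
mixed_hit       = !OPENCL_HINT.match(mixed_source).nil?
mixed_has_defun = mixed_source.include?("(defun ")
```

The last case shows that the two checks are independent: the returned array can contain both languages, in which case the ambiguity is not resolved.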

.disambiguate_ecl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 47

def self.disambiguate_ecl(data, languages)
  matches = []
  matches << Language["Prolog"] if data.include?(":-")
  matches << Language["ECL"] if data.include?(":=")
  matches
end

.disambiguate_pl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 40

def self.disambiguate_pl(data, languages)
  matches = []
  matches << Language["Prolog"] if data.include?(":-")
  matches << Language["Perl"] if data.include?("use strict")
  matches
end

.disambiguate_r(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 71

def self.disambiguate_r(data, languages)
  matches = []
  # Case-insensitive whole-word match, so "REBOL" and "rebol" both count.
  matches << Language["Rebol"] if /\bRebol\b/i.match(data)
  matches << Language["R"] if data.include?("<-")
  matches
end
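
As a rough illustration of the two checks, the sketch below copies the method body into a free-standing method and runs it on invented inputs. The `Language` module is a hypothetical stand-in that returns the name it is given, so the snippet runs without the gem.

```ruby
# Hypothetical stand-in for Linguist's Language lookup (assumption).
module Language
  def self.[](name)
    name
  end
end

# Same body as the documented method, copied as a top-level method.
def disambiguate_r(data, languages)
  matches = []
  matches << Language["Rebol"] if /\bRebol\b/i.match(data)
  matches << Language["R"] if data.include?("<-")
  matches
end

candidates = ["R", "Rebol"]

# Invented sample sources.
rebol_source  = "REBOL [Title: \"Demo\"]\nprint \"hello\"\n"
r_source      = "x <- c(1, 2, 3)\nprint(mean(x))\n"
r_with_equals = "x = c(1, 2, 3)\nprint(mean(x))\n"

rebol_result  = disambiguate_r(rebol_source, candidates)
r_result      = disambiguate_r(r_source, candidates)
# R code that assigns only with "=" contains neither marker.
neither_result = disambiguate_r(r_with_equals, candidates)
```

The third input is valid R but returns an empty array, which shows that the "<-" check is a hint rather than a complete test.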

.disambiguate_ts(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 54

def self.disambiguate_ts(data, languages)
  matches = []
  if data.include?("</translation>")
    matches << Language["XML"]
  else
    matches << Language["TypeScript"]
  end
  matches
end
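
Unlike the other disambiguators, this one always returns exactly one language: anything without a closing translation tag is treated as TypeScript. The sketch below shows this with invented inputs; the `Language` module is a hypothetical stand-in that returns the name it is given, and the sample XML is only an assumed example of a translation file.

```ruby
# Hypothetical stand-in for Linguist's Language lookup (assumption).
module Language
  def self.[](name)
    name
  end
end

# Same logic as the documented method, copied as a top-level method.
def disambiguate_ts(data, languages)
  matches = []
  if data.include?("</translation>")
    matches << Language["XML"]
  else
    matches << Language["TypeScript"]
  end
  matches
end

candidates = ["TypeScript", "XML"]

# Invented sample of an XML translation file.
translation_file = <<~XML
  <TS version="2.0" language="fr">
    <context>
      <message>
        <source>Open</source>
        <translation>Ouvrir</translation>
      </message>
    </context>
  </TS>
XML

typescript_source = "const n: number = 1;\n"

xml_result   = disambiguate_ts(translation_file, candidates)
ts_result    = disambiguate_ts(typescript_source, candidates)
# Even empty input falls through to the TypeScript branch.
empty_result = disambiguate_ts("", candidates)
```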

.find_by_heuristics(data, languages) ⇒ Object

Public: Given an array of String language names, apply heuristics against the given data and return an array of matching languages, or nil.

data      - Array of tokens or String data to analyze.
languages - Array of language name Strings to restrict to.

Returns an Array of Languages (possibly empty), or nil if heuristics are inactive or no rule covers the given languages.



# File 'lib/linguist/heuristics.rb', line 14

def self.find_by_heuristics(data, languages)
  if active?
    if languages.all? { |l| ["Perl", "Prolog"].include?(l) }
      result = disambiguate_pl(data, languages)
    end
    if languages.all? { |l| ["ECL", "Prolog"].include?(l) }
      result = disambiguate_ecl(data, languages)
    end
    if languages.all? { |l| ["Common Lisp", "OpenCL"].include?(l) }
      result = disambiguate_cl(data, languages)
    end
    return result
  end
end
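
The return value depends on whether any rule covers the candidate list, which is easy to miss when reading the source. The sketch below reproduces the dispatch with a single rule in a hypothetical module named `HeuristicsSketch`; the `Language` module is a stand-in that returns the name it is given, and the Perl sample is invented.

```ruby
# Hypothetical stand-in for Linguist's Language lookup (assumption).
module Language
  def self.[](name)
    name
  end
end

# Cut-down copy of the documented logic with only the Perl/Prolog rule.
module HeuristicsSketch
  ACTIVE = true

  def self.active?
    !!ACTIVE
  end

  def self.disambiguate_pl(data, languages)
    matches = []
    matches << Language["Prolog"] if data.include?(":-")
    matches << Language["Perl"] if data.include?("use strict")
    matches
  end

  def self.find_by_heuristics(data, languages)
    if active?
      if languages.all? { |l| ["Perl", "Prolog"].include?(l) }
        result = disambiguate_pl(data, languages)
      end
      # result is nil when the condition above was false.
      return result
    end
  end
end

perl_source = "use strict;\nprint \"hello\\n\";\n"

# Candidates covered by the rule: the disambiguator runs.
handled = HeuristicsSketch.find_by_heuristics(perl_source, ["Perl", "Prolog"])
# all? only requires a subset, so a single candidate also triggers the rule.
subset = HeuristicsSketch.find_by_heuristics(perl_source, ["Perl"])
# Candidates no rule covers: nil, not [].
unhandled = HeuristicsSketch.find_by_heuristics(perl_source, ["C", "C++"])
# Callers that want an array in every case can normalize with Array().
normalized = Array(unhandled)
```

Because a covered candidate list can still produce an empty array while an uncovered one produces nil, callers may want to treat both as "no decision".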