Class: Linguist::Heuristics

Inherits:
Object
Defined in:
lib/linguist/heuristics.rb

Overview

A collection of simple heuristics that can be used to better analyze languages.

Constant Summary

ACTIVE = true

Class Method Details

.active? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/linguist/heuristics.rb', line 91

def self.active?
  !!ACTIVE
end

.disambiguate_c(data, languages) ⇒ Object

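Files with a .h extension are ambiguous between C, C++, and Objective-C. This heuristic short-circuits the decision by looking for an Objective-C marker (@interface) and a C++ marker (#include <cstdint>).

Returns an Array of matching Languages, or [] if neither marker is found.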



# File 'lib/linguist/heuristics.rb', line 36

def self.disambiguate_c(data, languages)
  matches = []
  matches << Language["Objective-C"] if data.include?("@interface")
  matches << Language["C++"] if data.include?("#include <cstdint>")
  matches
end
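
The two substring checks can be tried in isolation with a minimal sketch. It assumes a stand-in method, c_header_candidates (hypothetical, not part of Linguist), that returns language names as plain Strings instead of Language objects so that it runs without the gem.

```ruby
# Hypothetical stand-in for Linguist::Heuristics.disambiguate_c.
# Returns language names as Strings instead of Language objects.
def c_header_candidates(data)
  matches = []
  matches << "Objective-C" if data.include?("@interface")
  matches << "C++" if data.include?("#include <cstdint>")
  matches
end

objc_header = "@interface Foo : NSObject\n@end\n"
cpp_header  = "#include <cstdint>\nstd::uint8_t flags;\n"
c_header    = "#include <stdio.h>\nint main(void);\n"

p c_header_candidates(objc_header) # => ["Objective-C"]
p c_header_candidates(cpp_header)  # => ["C++"]
p c_header_candidates(c_header)    # => []
```

An empty result means that neither marker was found, so the header stays undecided; the checks never positively identify plain C.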

.disambiguate_cl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 77

def self.disambiguate_cl(data, languages)
  matches = []
  matches << Language["Common Lisp"] if data.include?("(defun ")
  matches << Language["OpenCL"] if /\/\* |\/\/ |^\}/.match(data)
  matches
end
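
The OpenCL check uses a regular expression rather than a substring test. The sketch below assumes a hypothetical helper, opencl_like?, that reuses the same pattern to show which inputs it accepts. In Ruby, ^ matches at the start of every line, not only at the start of the string.

```ruby
# Hypothetical helper reusing the pattern from disambiguate_cl.
# Matches a "/* " or "// " comment opener, or any line that starts with "}".
def opencl_like?(data)
  !!(/\/\* |\/\/ |^\}/.match(data))
end

kernel = "__kernel void f(__global int *a)\n{\n  a[0] = 1;\n}\n"
lisp   = "(defun square (x)\n  (* x x))\n"

p opencl_like?(kernel)             # => true  (a line starts with "}")
p opencl_like?("// init\nint x;")  # => true  ("// " comment opener)
p opencl_like?(lisp)               # => false
```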

.disambiguate_ecl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 50

def self.disambiguate_ecl(data, languages)
  matches = []
  matches << Language["Prolog"] if data.include?(":-")
  matches << Language["ECL"] if data.include?(":=")
  matches
end

.disambiguate_pl(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 43

def self.disambiguate_pl(data, languages)
  matches = []
  matches << Language["Prolog"] if data.include?(":-")
  matches << Language["Perl"] if data.include?("use strict")
  matches
end

.disambiguate_pro(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 57

def self.disambiguate_pro(data, languages)
  matches = []
  if (data.include?(":-"))
    matches << Language["Prolog"]
  else
    matches << Language["IDL"]
  end
  matches
end

.disambiguate_r(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 84

def self.disambiguate_r(data, languages)
  matches = []
  matches << Language["Rebol"] if /\bRebol\b/i.match(data)
  matches << Language["R"] if data.include?("<-")
  matches
end
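
Because the two checks are independent, a single file can match both languages. A minimal sketch, assuming a hypothetical stand-in r_candidates that returns names as Strings rather than Language objects:

```ruby
# Hypothetical stand-in for disambiguate_r, returning names as Strings.
def r_candidates(data)
  matches = []
  matches << "Rebol" if /\bRebol\b/i.match(data) # whole word, any case
  matches << "R" if data.include?("<-")          # R assignment arrow
  matches
end

p r_candidates("REBOL [Title: \"Demo\"]\nprint 1\n") # => ["Rebol"]
p r_candidates("x <- c(1, 2, 3)\n")                  # => ["R"]
p r_candidates("# rebol port\nx <- 1\n")             # => ["Rebol", "R"]
```

The last call shows the ambiguous case: an R script that merely mentions the word "rebol" in a comment yields both candidates.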

.disambiguate_ts(data, languages) ⇒ Object



# File 'lib/linguist/heuristics.rb', line 67

def self.disambiguate_ts(data, languages)
  matches = []
  if (data.include?("</translation>"))
    matches << Language["XML"]
  else
    matches << Language["TypeScript"]
  end
  matches
end

.find_by_heuristics(data, languages) ⇒ Object

Public: Given an array of String language names, apply heuristics against the given data and return an array of matching languages, or nil.

data - Array of tokens or String data to analyze.
languages - Array of language name Strings to restrict to.

Returns an Array of Languages, or nil when heuristics are inactive or no rule applies to the given languages.



# File 'lib/linguist/heuristics.rb', line 14

def self.find_by_heuristics(data, languages)
  if active?
    if languages.all? { |l| ["Perl", "Prolog"].include?(l) }
      result = disambiguate_pl(data, languages)
    end
    if languages.all? { |l| ["ECL", "Prolog"].include?(l) }
      result = disambiguate_ecl(data, languages)
    end
    if languages.all? { |l| ["IDL", "Prolog"].include?(l) }
      result = disambiguate_pro(data, languages)
    end
    if languages.all? { |l| ["Common Lisp", "OpenCL"].include?(l) }
      result = disambiguate_cl(data, languages)
    end
    return result
  end
end
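
The dispatch above can be modelled without the gem. The following is a sketch under stated assumptions: heuristic_rules and find_candidates are hypothetical names, language names are plain Strings rather than Language objects, and only two of the four rule pairs are reproduced.

```ruby
# Hypothetical, gem-free model of find_by_heuristics. Each lambda mirrors
# one disambiguate_* method from the source, using Strings for languages.
def heuristic_rules
  [
    [["Perl", "Prolog"], lambda { |data|
      matches = []
      matches << "Prolog" if data.include?(":-")
      matches << "Perl" if data.include?("use strict")
      matches
    }],
    [["ECL", "Prolog"], lambda { |data|
      matches = []
      matches << "Prolog" if data.include?(":-")
      matches << "ECL" if data.include?(":=")
      matches
    }]
  ]
end

def find_candidates(data, languages)
  result = nil
  heuristic_rules.each do |pair, rule|
    # Same guard as the original: every candidate must belong to the pair.
    result = rule.call(data) if languages.all? { |l| pair.include?(l) }
  end
  result
end

p find_candidates("use strict;\nprint 1;\n", ["Perl", "Prolog"]) # => ["Perl"]
p find_candidates("x := 1;\n", ["ECL", "Prolog"])                # => ["ECL"]
p find_candidates("puts 1\n", ["Ruby", "Python"])                # => nil
```

Because the guard uses all?, a candidate list such as ["Prolog"] satisfies more than one pair. In this sketch, as in the source, a later matching rule overwrites the result of an earlier one.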