Class: Swot
- Inherits:
- Object
  - Object
    - Swot
- Extended by:
- SwotCollectionMethods
- Includes:
- NaughtyOrNice
- Defined in:
- lib/swot.rb, lib/swot/academic_tlds.rb
Constant Summary
- VERSION =
"0.4.2"
- BLACKLIST =
These are domains that snuck into the edu registry but don’t pass the education sniff test. Note: a validated domain must not end with a blacklisted string.
%w( si.edu america.edu californiacolleges.edu australia.edu cet.edu ).freeze
- ACADEMIC_TLDS =
Domains under these top-level domains are guaranteed to belong to academic institutions.
%w( ac.ae ac.at ac.bd ac.be ac.cn ac.cr ac.cy ac.fj ac.gg ac.gn ac.id ac.il ac.in ac.ir ac.jp ac.ke ac.kr ac.ma ac.me ac.mu ac.mw ac.mz ac.ni ac.nz ac.om ac.pa ac.pg ac.pr ac.rs ac.ru ac.rw ac.sz ac.th ac.tz ac.ug ac.uk ac.yu ac.za ac.zm ac.zw cc.al.us cc.ar.us cc.az.us cc.ca.us cc.co.us cc.fl.us cc.ga.us cc.hi.us cc.ia.us cc.id.us cc.il.us cc.in.us cc.ks.us cc.ky.us cc.la.us cc.md.us cc.me.us cc.mi.us cc.mn.us cc.mo.us cc.ms.us cc.mt.us cc.nc.us cc.nd.us cc.ne.us cc.nj.us cc.nm.us cc.nv.us cc.ny.us cc.oh.us cc.ok.us cc.or.us cc.pa.us cc.ri.us cc.sc.us cc.sd.us cc.tx.us cc.va.us cc.vi.us cc.wa.us cc.wi.us cc.wv.us cc.wy.us ed.ao ed.cr ed.jp edu edu.af edu.al edu.ar edu.au edu.az edu.ba edu.bb edu.bd edu.bh edu.bi edu.bn edu.bo edu.br edu.bs edu.bt edu.bz edu.ck edu.cn edu.co edu.cu edu.do edu.dz edu.ec edu.ee edu.eg edu.er edu.es edu.et edu.ge edu.gh edu.gr edu.gt edu.hk edu.hn edu.ht edu.in edu.iq edu.jm edu.jo edu.kg edu.kh edu.kn edu.kw edu.ky edu.kz edu.la edu.lb edu.lr edu.lv edu.ly edu.me edu.mg edu.mk edu.ml edu.mm edu.mn edu.mo edu.mt edu.mv edu.mw edu.mx edu.my edu.ni edu.np edu.om edu.pa edu.pe edu.ph edu.pk edu.pl edu.pr edu.ps edu.pt edu.pw edu.py edu.qa edu.rs edu.ru edu.sa edu.sc edu.sd edu.sg edu.sh edu.sl edu.sv edu.sy edu.tr edu.tt edu.tw edu.ua edu.uy edu.ve edu.vn edu.ws edu.ye edu.zm es.kr g12.br hs.kr ms.kr sc.kr sc.ug sch.ae sch.gg sch.id sch.ir sch.je sch.jo sch.lk sch.ly sch.my sch.om sch.ps sch.sa sch.uk school.nz school.za tec.ar.us tec.az.us tec.co.us tec.fl.us tec.ga.us tec.ia.us tec.id.us tec.il.us tec.in.us tec.ks.us tec.ky.us tec.la.us tec.ma.us tec.md.us tec.me.us tec.mi.us tec.mn.us tec.mo.us tec.ms.us tec.mt.us tec.nc.us tec.nd.us tec.nh.us tec.nm.us tec.nv.us tec.ny.us tec.oh.us tec.ok.us tec.pa.us tec.sc.us tec.sd.us tec.tx.us tec.ut.us tec.vi.us tec.wa.us tec.wi.us tec.wv.us vic.edu.au ).to_set.freeze
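The BLACKLIST note above says a validated domain must not end with a blacklisted string. A minimal sketch of that suffix rule follows, using the same pattern that appears in #valid?. `SAMPLE_BLACKLIST` copies the constant above, and `blacklisted?` is a hypothetical helper name, not part of the Swot API.

```ruby
# Copy of the BLACKLIST entries listed above, so the sketch runs without the gem.
SAMPLE_BLACKLIST = %w( si.edu america.edu californiacolleges.edu australia.edu cet.edu ).freeze

# Hypothetical helper: true when host equals a blacklisted domain
# or is a subdomain of one (the match must start at a label boundary).
def blacklisted?(host)
  SAMPLE_BLACKLIST.any? { |d| host =~ /(\A|\.)#{Regexp.escape(d)}\z/ }
end

blacklisted?("si.edu")      # => true  (exact match)
blacklisted?("www.si.edu")  # => true  (subdomain of a blacklisted domain)
blacklisted?("psi.edu")     # => false ("si.edu" is not preceded by a dot)
```

Because the pattern is anchored at `\A` or a literal dot, a domain that merely shares trailing characters with a blacklisted entry is not rejected.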
Class Method Summary
- .academic? ⇒ Object
- .domains_path ⇒ Object
- .from_path(path_string_or_path) ⇒ Object
Returns a new Swot instance for the domain file at the given path.
- .get_institution_name(text) ⇒ Object (also: school_name)
- .is_academic? ⇒ Object
Instance Method Summary
- #academic_domain? ⇒ Boolean
Figure out if a domain name is a known academic institution.
- #institution_name ⇒ Object (also: #school_name, #name)
Figure out the institution name based on the email address/domain.
- #valid? ⇒ Boolean
Figure out if an email or domain belongs to an academic institution.
Methods included from SwotCollectionMethods
Class Method Details
.academic? ⇒ Object
    # File 'lib/swot.rb', line 25
    alias_method :academic?, :valid?
.domains_path ⇒ Object
    # File 'lib/swot.rb', line 32
    def domains_path
      @domains_path ||= File.expand_path "domains", File.dirname(__FILE__)
    end
.from_path(path_string_or_path) ⇒ Object
Returns a new Swot instance for the domain file at the given path.
Note that the path must be absolute.
Returns a Swot instance, or false if no domain is found at the given path.
    # File 'lib/swot.rb', line 40
    def from_path(path_string_or_path)
      path = Pathname.new(path_string_or_path)
      return false unless path.exist?
      path_dir, file = path.relative_path_from(Pathname.new(domains_path)).split
      backwards_path = path_dir.to_s.split('/').push(file.basename('.txt').to_s)
      domain = backwards_path.reverse.join('.')
      Swot.new(domain)
    end
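The path-to-domain conversion in .from_path can be tried in isolation. The sketch below repeats the same Pathname steps without the existence check or the final `Swot.new` call. `domain_for`, the `/opt/swot/lib/domains` root, and the sample file names are assumptions made for illustration; the real root is whatever .domains_path returns.

```ruby
require "pathname"

# Hypothetical root directory standing in for Swot.domains_path.
DOMAINS_ROOT = Pathname.new("/opt/swot/lib/domains")

# Hypothetical helper: turns ".../domains/uk/ac/example.txt" into "example.ac.uk".
def domain_for(path_string_or_path, root = DOMAINS_ROOT)
  path = Pathname.new(path_string_or_path)
  # Split the path relative to the root into its directory part and file name.
  path_dir, file = path.relative_path_from(root).split
  # Directory labels run from the TLD inward, so append the file stem and reverse.
  backwards_path = path_dir.to_s.split("/").push(file.basename(".txt").to_s)
  backwards_path.reverse.join(".")
end

domain_for("/opt/swot/lib/domains/edu/example.txt")    # => "example.edu"
domain_for("/opt/swot/lib/domains/uk/ac/example.txt")  # => "example.ac.uk"
```

`Pathname#relative_path_from` works lexically, so the sketch runs even though these sample paths do not exist on disk. Both paths must be absolute (or both relative), which matches the note that .from_path needs an absolute path.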
.get_institution_name(text) ⇒ Object Also known as: school_name
    # File 'lib/swot.rb', line 27
    def get_institution_name(text)
      Swot.new(text).institution_name
    end
.is_academic? ⇒ Object
    # File 'lib/swot.rb', line 24
    alias_method :is_academic?, :valid?
Instance Method Details
#academic_domain? ⇒ Boolean
Figure out if a domain name is a known academic institution.
Returns true if the domain name belongs to a known academic institution;
false otherwise.
    # File 'lib/swot.rb', line 83
    def academic_domain?
      @academic_domain ||= File.exist?(file_path)
    end
#institution_name ⇒ Object Also known as: school_name, name
Figure out the institution name based on the email address/domain.
Returns a string with the institution name; nil if nothing is found.
    # File 'lib/swot.rb', line 71
    def institution_name
      @institution_name ||= File.read(file_path, :mode => "rb", :external_encoding => "UTF-8").strip
    rescue
      nil
    end
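The read-or-nil behaviour of #institution_name can be reproduced with a temporary file. In this sketch, `read_institution_name` is a hypothetical standalone helper and "Example University" is made-up sample content. It assumes, as the listing suggests, that each domain file holds the institution name as plain text.

```ruby
require "tmpdir"

# Hypothetical helper mirroring the File.read call above: returns the stripped
# file contents, or nil when the file cannot be read.
def read_institution_name(file_path)
  File.read(file_path, :mode => "rb", :external_encoding => "UTF-8").strip
rescue
  nil
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "example.txt")
  File.write(path, "Example University\n")

  read_institution_name(path)                           # => "Example University"
  read_institution_name(File.join(dir, "missing.txt"))  # => nil
end
```

The bare `rescue` catches any StandardError, so a missing file and an unreadable file both yield nil.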
#valid? ⇒ Boolean
Figure out if an email or domain belongs to an academic institution.
Returns true if the domain name belongs to an academic institution;
false otherwise.
    # File 'lib/swot.rb', line 54
    def valid?
      if domain.nil?
        false
      elsif BLACKLIST.any? { |d| to_s =~ /(\A|\.)#{Regexp.escape(d)}\z/ }
        false
      elsif ACADEMIC_TLDS.include?(domain.tld)
        true
      elsif academic_domain?
        true
      else
        false
      end
    end
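The order of checks in #valid? matters: the blacklist is consulted before the TLD list, so a blacklisted domain under `edu` is still rejected. The sketch below illustrates that ordering with the standard library only. It is a simplification, not the library's implementation: `academic_sketch?`, the `SAMPLE_*` constants, and the suffix comparison (in place of `domain.tld`) and in-memory set (in place of the domain-file lookup) are all assumptions.

```ruby
require "set"

# Small samples taken from the constants documented above.
SAMPLE_BLOCKED = %w( si.edu america.edu ).freeze
SAMPLE_TLDS    = %w( edu ac.uk edu.au ).to_set.freeze
# Hypothetical stand-in for the per-domain files checked by #academic_domain?.
SAMPLE_KNOWN   = %w( university.example ).to_set.freeze

# True when host equals suffix or ends with ".suffix".
def label_suffix?(host, suffix)
  host == suffix || host.end_with?(".#{suffix}")
end

# Hypothetical helper following the same order as #valid?:
# nil check, blacklist, academic TLD, known-domain lookup.
def academic_sketch?(host)
  return false if host.nil? || host.empty?
  return false if SAMPLE_BLOCKED.any? { |d| label_suffix?(host, d) }
  return true  if SAMPLE_TLDS.any? { |t| label_suffix?(host, t) }
  SAMPLE_KNOWN.include?(host)
end

academic_sketch?("example.edu")         # => true  (academic TLD)
academic_sketch?("si.edu")              # => false (blacklisted despite the edu TLD)
academic_sketch?("university.example")  # => true  (known domain)
academic_sketch?("example.com")         # => false
```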