comment_extractor

Description

comment_extractor extracts comments from source code.

Installation

CommentExtractor has been tested with Ruby 2.1.

git clone https://github.com/alpaca-tc/comment_extractor
cd comment_extractor
rake install

Usage

Parser

Given a file path, Parser.for finds the matching Extractor and returns a Parser instance initialized with that extractor. Use the instance to get the comments from the file. The example guards the call with if in case no extractor is found.

require 'comment_extractor'

path = 'path/to/file'
if parser = CommentExtractor::Parser.for(path)
  comments = parser.parse
  comments.is_a?(CommentExtractor::CodeObjects) #=> true

  comment = comments.first
  comment.file  #=> 'path/to/file'
  comment.line  #=> 1
  comment.value #=> 'I am a comment'
end

Extractor

You can use Extractor directly.

require 'comment_extractor'

file_path = 'path/to/file.rb'
if extractor = CommentExtractor::Extractors.find_by_filetype('ruby')
  parser = CommentExtractor::Parser.initialize_with_extractor(file_path, extractor)
  comments = parser.extract_comments
end

# Other ways to find an extractor
extractor = CommentExtractor::Extractors.find_by_shebang('#! /usr/local/bin/ruby')
extractor = CommentExtractor::Extractors.find_by_filename('path/to/file.rb')
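
The three lookups above (by filetype, by shebang, by filename) can be pictured as searches over a list of registered extractors. The following is only a rough sketch of that idea, not the gem's actual implementation: ToyExtractor, TOY_EXTRACTORS and ToyLookup are hypothetical names made up for illustration, and the patterns are assumptions.

```ruby
# Hypothetical sketch: these names are NOT part of comment_extractor.
# Each entry carries the three keys an extractor can be looked up by.
ToyExtractor = Struct.new(:filetype, :filename, :shebang)

TOY_EXTRACTORS = [
  ToyExtractor.new('ruby',   /\.rb\z/, /ruby\z/),
  ToyExtractor.new('python', /\.py\z/, /python\d*\z/),
  ToyExtractor.new('d',      /\.d\z/,  nil) # no shebang pattern registered
].freeze

module ToyLookup
  module_function

  # Exact match on the file type name, e.g. 'ruby'.
  def find_by_filetype(name)
    TOY_EXTRACTORS.find { |e| e.filetype == name }
  end

  # Match the path against each extractor's filename pattern.
  def find_by_filename(path)
    TOY_EXTRACTORS.find { |e| e.filename =~ path }
  end

  # Match the first line of a script against each shebang pattern.
  def find_by_shebang(line)
    TOY_EXTRACTORS.find { |e| e.shebang && e.shebang =~ line.strip }
  end
end
```

Each method returns the first matching entry, or nil when nothing matches, which is why a guard such as `if extractor = ...` is a natural fit.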

To use the extractor for a specific filetype directly:

require 'comment_extractor/extractor/d'

# Read the content with the shebang and encoding lines removed
content = CommentExtractor::File.open('path/to/file.d', 'r') { |f| f.read_content }
comments = CommentExtractor::Extractor::D.new(content).extract_comments
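
To give an idea of what removing the shebang and encoding content could mean, here is a minimal sketch in plain Ruby. It assumes a `#!` first line and a Ruby/Python-style magic encoding comment. strip_header and both patterns are hypothetical and do not come from comment_extractor, so the gem may handle this differently.

```ruby
# Hypothetical helper, NOT part of comment_extractor.
# A shebang is a first line starting with "#!".
SHEBANG_LINE  = /\A#!.*(?:\r?\n|\z)/
# Assumed magic-comment form: "# coding: utf-8", "# -*- coding: utf-8 -*-", ...
ENCODING_LINE = /\A[ \t]*#.*coding[:=][ \t]*[\w.-]+.*(?:\r?\n|\z)/

# Returns the source without a leading shebang line and without an
# encoding comment that directly follows it (or starts the file).
def strip_header(source)
  body = source.sub(SHEBANG_LINE, '')
  body.sub(ENCODING_LINE, '')
end
```

Note that dropping lines this way shifts the line numbers of everything that follows, so a real implementation would have to account for that when reporting where a comment was found.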

Supported FileTypes

  • Bash / Zsh
  • C / C++
  • Class
  • C#
  • Clojure
  • CoffeeScript
  • D
  • EmacsLisp
  • Erlang
  • Fortran
  • Go
  • Haml
  • Haskell
  • HTML
  • Java
  • JavaScript
  • TeX
  • Lua
  • PHP
  • Perl
  • Python
  • Ruby
  • SASS
  • SCSS
  • SQF
  • SQL
  • Scala

TODO

  • Markdown
  • SQS: not implemented yet because I do not know the SQS syntax.

Create a new Extractor

If a file type you need is missing from the supported list, please file an issue or submit a pull request :) If you file an issue, I would appreciate it if you attached sample source code for the new file type. A new extractor is defined like this:

# lib/comment_extractor/extractor/file_type.rb
require 'comment_extractor/extractor'

class CommentExtractor::Extractor::FileType < CommentExtractor::Extractor
  include CommentExtractor::Extractor::Concerns::SimpleExtractor

  shebang /ruby$/            # (Optional)
  filename /\.(extension)$/  # (Required)
  filetype 'filetype'        # (Required) file type name, e.g. 'ruby', 'python'

  # define_ignore_patterns(*given regexp)

  # define_bracket('"')   #=> define_ignore_patterns(/".*?(?<!\\)"/)
  # define_regexp_bracket #=> define_ignore_patterns(%r!/(?=[^/])!, /(?<!\\)\//)

  # Define the comment rules
  comment start_with: /;+/
  comment start_with: /;--/, end_with: /--\|/, type: BLOCK_COMMENT
end
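
To show what rules like `start_with` and `end_with` amount to, here is a small standalone scanner for the same two rules (`;+` line comments and `;--` ... `--|` block comments), with double-quoted strings as an ignore pattern. This is only a sketch under my own assumptions: toy_extract and ToyComment are hypothetical names, and this is not how SimpleExtractor is implemented.

```ruby
require 'strscan'

# Hypothetical names, NOT part of comment_extractor.
ToyComment = Struct.new(:line, :value)

# Scans source once, skipping string literals and collecting comments
# together with the line on which each comment starts.
def toy_extract(source)
  scanner  = StringScanner.new(source)
  comments = []
  line     = 1

  until scanner.eos?
    if scanner.scan(/"(?:\\.|[^"\\])*"/m)
      # Ignore pattern: a ';' inside a string literal is not a comment.
      line += scanner.matched.count("\n")
    elsif scanner.scan(/;--/)
      # Block comment; tried before ';+' because ';--' also starts with ';'.
      start = line
      body = scanner.scan_until(/--\|/)
      unless body
        # Unterminated block: take everything up to the end of input.
        body = scanner.rest
        scanner.terminate
      end
      line += body.count("\n")
      comments << ToyComment.new(start, body.sub(/--\|\z/, '').strip)
    elsif scanner.scan(/;+/)
      # Line comment: runs to the end of the current line.
      comments << ToyComment.new(line, scanner.scan(/[^\n]*/).strip)
    else
      # Ordinary code: advance one character and keep the line count.
      line += 1 if scanner.getch == "\n"
    end
  end

  comments
end
```

The order of the branches matters in this sketch: the ignore pattern runs first so that quoted text is never mistaken for a comment, and the block rule runs before the line rule because both begin with `;`.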