ECFS

ECFS helps you download and parse filings from the FCC's Electronic Comment Filing System.

Installation

Add this line to your application's Gemfile:

gem 'ecfs'

And then execute:

$ bundle

Or install it yourself as:

$ gem install ecfs

Usage

Proceedings

Search for a proceeding

proceedings = ECFS::Proceeding.query.tap to |q|
  q.bureau_code = "WC"  # Wireline Competition Bureau
  q.per_page    = "100" # Defaults to 10, maximum is 100
  q.page_number = "1"
end.get
#=>
# returns an instance of `ECFS::Proceeding::ResultSet`, which is a subclass of `Hash`:
{
  "constraints"   => {
    "bureau_code"  => "WC", 
    "page_number"  => "1", 
    "per_page"     => "100"
  },
  "fcc_url"       => "http://apps.fcc.gov/ecfs/proceeding_search/execute?bureauCode=WC&pageNumber=1&pageSize=100",
  "current_page"  => 1,
  "total_pages"   => 16,
  "first_result"  => 1,
  "last_result"   => 100,
  "total_results" => 1504,
  "results"       => [
    {
      "docket_number" => "10-90",
      "bureau"        => "Wireline Competition Bureau",
      "subject"       => "In the Matter of Connect America Fund A National Brooadband Plan for Our Future High-Cost\r\nUniversal Service Support. .",
      "filings_in_last_30_days" => 182
    },
   {
      "docket_number" => "05-337",
      "bureau"        => "Wireline Competition Bureau",
      "subject"       =>
      "In the Matter of Federal -State Joint Board on Universal Service High-Cost Universal\r\nService Support.  .. .",
      "filings_in_last_30_days" => 102
    },
  #...
  ]
}

Get the next page of results:

next_page = proceedings.next
#=>
{
  "constraints" => {
    "bureau_code" => "WC",
    "per_page"    => "100",
    "page_number" => "2" # automagically incremented the page number
  },
 "fcc_url"       => "http://apps.fcc.gov/ecfs/proceeding_search/execute?bureauCode=WC&pageSize=100&pageNumber=2",
 "current_page"  => 2,
 "total_pages"   => 16,
 "first_result"  => 101,
 "last_result"   => 200,
 "total_results" => 1504,
 "results"       => [
  # ... 
  ]
}

See ECFS::ProceedingsQuery#constraints_dictionary for a list of query options.

Fetch info about a proceeding from the results:

proceeding = proceedings["results"].select {|p| p["docket_number"] == "12-375"}.first
proceeding.fetch_info!
pp proceeding
#=>
{
  "docket_number" => "12-375",
  "bureau"        => "Wireline Competition Bureau",
  "subject"       => "Implementation of the Pay Telephone Reclassification and Compensation Provisions of the Telecommunications Act of 1996 et al.",
  "bureau_name"   => "Wireline Competition Bureau",
  "prepared_by"   => "Aleta.Bowers",
  "date_created"  => "2012-12-26T00:00:00.000Z", # iso8601 string
  "status"        => "Open",
  "total_filings" => "292",
  "filings_in_last_30_days" => "58"
}

Find a proceeding by docket number

proceeding = ECFS::Proceeding.find("12-375")
#=>
{
  "docket_number" => "12-375",
  "bureau"        => "Wireline Competition Bureau",
  "subject"       => "Implementation of the Pay Telephone Reclassification and Compensation Provisions of the Telecommunications Act of 1996 et al.",
  "bureau_name"   => "Wireline Competition Bureau",
  "prepared_by"   => "Aleta.Bowers",
  "date_created"  => "2012-12-26T00:00:00.000Z",
  "status"        => "Open",
  "total_filings" => "292",
  "filings_in_last_30_days" => "58"
}

Fetch filings for a proceeding:

proceeding = ECFS::Proceeding.find("12-375")
proceeding.fetch_filings!
proceeding["filings"]   # See Filings section below for sample responses

Filings

Search for filings

filings = ECFS::Filing.query.tap do |q|
  q.docket_number = "12-375" 
end.get
#=> 
[
  # Each result is instance of `ECFS::Filing`, which is a subclass of `Hash`
  {
    "name_of_filer"  => "Leadership Conference on Civil and Human Rights",
    "docket_number"  => "12-375",
    "lawfirm_name"   => "",
    "date_received"  => "2013-05-14T00:00:00.000Z",  # iso8601 string
    "date_posted"    => "2013-05-14T00:00:00.000Z", 
    "exparte"        => true,
    "type_of_filing" => "NOTICE OF EXPARTE",
    "document_urls"  => [
      "http://apps.fcc.gov/ecfs/document/view?id=7022313561",
      "http://apps.fcc.gov/ecfs/document/view?id=7022313562",
      "http://apps.fcc.gov/ecfs/document/view?id=7022313563"
    ]
  },
 {
    "name_of_filer"  => "The Leadership Conference on Civil and Human Rights",
    "docket_number"  => "12-375",
    "lawfirm_name"   => "",
    "date_received"  => "2013-05-13T00:00:00.000Z",
    "date_posted"    => "2013-05-13T00:00:00.000Z",
    "exparte"        => true,
    "type_of_filing" => "NOTICE OF EXPARTE",
    "document_urls"  => [
      "http://apps.fcc.gov/ecfs/document/view?id=7022313134"
    ]
  },
 # ...
]

See ECFS::FilingsQuery#constraints_dictionary for a list of query options.

Working with filing documents

ECFS::Filing#documents returns an Array of ECFS::Document instances.

document = filings.first.documents.first
pp document
#=> 
#<ECFS::Document:0x007fed7c95bf48
 @filing=
  {
    "name_of_filer"  => "Leadership Conference on Civil and Human Rights",
    "docket_number"  => "12-375",
    "lawfirm_name"   => "",
    "date_received"  => "2013-05-14T00:00:00.000Z",
    "date_posted"    => "2013-05-14T00:00:00.000Z",
    "exparte"        => true,
    "type_of_filing" => "NOTICE OF EXPARTE",
    "document_urls"  => [
      "http://apps.fcc.gov/ecfs/document/view?id=7022313561",
      "http://apps.fcc.gov/ecfs/document/view?id=7022313562",
      "http://apps.fcc.gov/ecfs/document/view?id=7022313563"
    ]
  },
 @pages=[#<ECFS::Document::Page @text=String, @page_number=1>],
 @url="http://apps.fcc.gov/ecfs/document/view?id=7022313561">

To get the text from a given document, you can use ECFS::Document#full_text.

You can also keep track of page numbers with ECFS::Document#pages, which returns an Array of ECFS::Document::Page instances. ECFS::Document::Page#text and ECFS::Document::Page#page_number are self-explanatory.

Bulk Queries

None of this works (leaving here for posterity):

This has been a problem that's been bothering me for a while: ECFS filing pages won't create spreadsheets when a query returns more than 10,000 filings. A simple solution is to add date constraints to the query until you have a set of queries where each result set contains 10,000 or fewer filings.

I implemented a convenience method that make these queries for you:

docket_number = "11-109"
query = ECFS::BulkFilingsQuery.new(docket_number)
filings = query.get

In the background, ECFS::BulkFilingsQuery#get will perform as many queries as necessary to retrieve all the filings for the given proceeding.

SOLR Search

The FCC has a SOLR search page which is not limited to 10,000 results. The bad news is that each page of results is maxed out at twenty. So this is all scrapable, but every 20 results requires a new HTTP request. Nevertheless, here's how you can scrape it:

filings = ECFS::SolrScrapeQuery.new.tap do |q|
  q.docket_number = '12-83'
end.get

p filings.first
#=> 

{
  'proceeeding'=>"12-83",
  'name_of_filer'=>"Media Bureau Policy Division",
  'type_of_filing'=>"PUBLIC NOTICE",
  'url'=>"http://apps.fcc.gov/ecfs/comment/view?id=6017027798",
  'date_recieved'=>"03/30/2012",
  'pages'=>10
}

Daily Releases

This feature parses these types of pages: http://transition.fcc.gov/Daily_Releases/Daily_Business/2014/db0917/.

The documents listed are PDFs, text files, and .docx files.

releases = ECFS::DailyReasesQuery.new.tap do |q|
  q.month = '12'
  q.day   = '17'
  q.year  = '2014'
end.get

txt_urls = releases.txts
pdf_urls = releases.pdfs
docs_urls = releases.docxs

p txt_urls.first
#=> 
{
  title: "DA-14-1835A1.txt",
  url: "http://transition.fcc.gov/Daily_Releases/Daily_Business/2014/db1217//DA-14-1835A1.txt"
}

Testing

$ bundle exec m

Contact

If you've made it this far into the README/are using this gem, I'd like to hear from you! Email me at [github username] at [google's mail] dot com.

Contributing

Fork it
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request