S3Search
S3Search is an add-on for providing powerful document-based full-text indexing and search to your application.
Adding S3Search to your application allows you to search your documents based on the actual text within them, as well as any metadata fields you assign to them. Yes, that's right! S3Search will index the text inside your documents.
Do you already have documents stored and want to index them and make them searchable? No worries. You send S3Search a URL from which to fetch the document content, along with a hash of metadata attributes to record against it. You can then perform powerful queries against that indexed data, based on the rich features of Elasticsearch.
S3Search is a Heroku add-on.
Installation
Add this line to your application's Gemfile:
gem 's3search'
And then execute:
$ bundle
Provisioning the add-on
S3Search can be attached to a Heroku application via the CLI:
$ heroku addons:add s3search
-----> Adding s3search to sharp-mountain-4005... done, v18 (free)
Once S3Search has been added, an S3SEARCH_URL setting will be available in the app configuration and will contain your custom URL to access the newly provisioned S3Search service instance. This can be confirmed using the heroku config:get command.
$ heroku config:get S3SEARCH_URL
https://user:[email protected]
After installing S3Search the application should be configured to fully integrate with the add-on.
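As a minimal sketch of what that configuration might involve, the snippet below reads S3SEARCH_URL from the environment and splits it into host and credentials using Ruby's standard URI library. The helper name s3search_settings and the fallback URL are hypothetical. The s3search gem may well read this variable on its own, so treat this as an illustration rather than a required step.

```ruby
require 'uri'

# Hypothetical helper: split an S3SEARCH_URL-style value into its parts.
# Whether the s3search gem needs this is an assumption; it may read the
# environment variable itself.
def s3search_settings(url)
  uri = URI.parse(url)
  {
    scheme:   uri.scheme,
    host:     uri.host,
    user:     uri.user,
    password: uri.password
  }
end

# The fallback URL is a made-up placeholder for local experimentation.
url = ENV.fetch('S3SEARCH_URL', 'https://user:secret@s3search.example.com')
settings = s3search_settings(url)
puts "S3Search host: #{settings[:host]}"
```

Keeping the parsing in one helper makes it easy to check at boot time that the add-on has actually been provisioned.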
Using with Rails 3.x
Ruby on Rails applications will need to add the following entry to their Gemfile, specifying the S3Search client library.
gem 's3search'
Update application dependencies with Bundler.
$ bundle install
Write some application code to index some documents. Use the special field _content_url if you want to specify a location from which to download content. S3Search will download the content and make it searchable via the special field _document_content.
S3Search.create title: 'MyDocument', _content_url: 'https://s3-us-east-1.amazonaws.com/my_bucket/my_document.pdf'
S3Search.create name: 'Bob Lob Law', resume_id: 25, _content_url: 'https://s3-us-east-1.amazonaws.com/resumes.mycompany.com/bob.pdf'
The documents don't even really need to be in S3.
S3Search.create name: 'Bitcoin Pirate', resume_id: 42, _content_url: 'https://user:[email protected]/docs/jenny.pdf'
S3Search.create title: 'Bitcoin', author: '[email protected]', tags: ['bitcoin', 'manifesto'], _content_url: 'http://bitcoin.org/bitcoin.pdf'
The documents don't even really need to be documents! You can apply S3Search's powerful search capability to just your custom metadata.
S3Search.create customer_id: 32, first_name: 'His Holiness', last_name: 'The Dalai Lama', religion: 'Buddhist', twitter_handle: '@DalaiLama'
S3Search.create customer_id: 99, first_name: 'George', middle_name: 'R. R.', last_name: 'Martin', job_title: 'Author'
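The calls above all pass a flat hash of metadata, optionally with _content_url. One way to keep that consistent is a small builder. The sketch below is hypothetical: the document_attributes helper is not part of S3Search, and only the final, commented-out line refers to the S3Search.create call shown above.

```ruby
# Hypothetical helper: merge metadata with an optional content URL,
# dropping nil values so empty fields are not sent.
def document_attributes(metadata, content_url: nil)
  attrs = metadata.reject { |_key, value| value.nil? }
  attrs[:_content_url] = content_url if content_url
  attrs
end

attrs = document_attributes(
  { title: 'Bitcoin', tags: ['bitcoin', 'manifesto'], author: nil },
  content_url: 'http://bitcoin.org/bitcoin.pdf'
)
p attrs

# With the s3search gem loaded, the hash could then be indexed with
# something like:
# S3Search.create(attrs)
```

Leaving out content_url produces a metadata-only document, matching the last two examples above.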
Now retrieve some documents via the powerful query API.
Search by a single metadata field
results = S3Search.search('title:MyDocument')
Search all metadata fields AND the content of the documents.
results = S3Search.search('bitcoin')
Search only the content of the documents.
results = S3Search.search('_document_content:bitcoin')
Boost the search ranking of a certain field.
results = S3Search.search('bitcoin', boost: { title: 2.5 })
Find a single document based on its unique id.
document = S3Search.get '833FCA4EEEF2943AC2D8E0'
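The field-scoped query strings above follow a field:value pattern. The sketch below builds such strings programmatically. The field_query helper is hypothetical, and quoting multi-word values follows Elasticsearch query-string syntax; that S3Search passes query strings through to Elasticsearch unchanged is an assumption, not something this document confirms.

```ruby
# Hypothetical helper: build a "field:value" query string like the ones
# above. Values containing whitespace are wrapped in double quotes, with
# embedded quotes and backslashes escaped. Other reserved query
# characters are not handled in this sketch.
def field_query(field, value)
  text = value.to_s
  if text =~ /\s/
    escaped = text.gsub(/(["\\])/) { "\\#{$1}" }
    %(#{field}:"#{escaped}")
  else
    "#{field}:#{text}"
  end
end

puts field_query(:title, 'MyDocument')
puts field_query(:_document_content, 'digital cash')

# With the s3search gem loaded, the result could be passed on as:
# results = S3Search.search(field_query(:title, 'MyDocument'))
```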
Monitoring & Logging
Stats and the current state of S3Search can be displayed via the CLI.
$ heroku s3search:status
documents_indexed: 32842
index_size: 640MB
S3Search activity can be observed within the Heroku log-stream.
$ heroku logs -t | grep 's3search'
Dashboard
The S3Search dashboard allows you to view the current status of your S3Search cluster.
The dashboard can be accessed via the CLI:
$ heroku addons:open s3search
Opening s3search for sharp-mountain-4005…
or by visiting the Heroku apps web interface, selecting the application in question, and then selecting S3Search from the Add-ons menu.
Migrating between plans
Use the heroku addons:upgrade command to migrate to a new plan.
$ heroku addons:upgrade s3search:newplan
-----> Upgrading s3search:newplan to sharp-mountain-4005... done, v18 ($49/mo)
Your plan has been updated to: s3search:newplan
Removing the add-on
S3Search can be removed via the CLI.
$ heroku addons:remove s3search
-----> Removing s3search from sharp-mountain-4005... done, v20 (free)
Before removing S3Search a data export can be performed by contacting [email protected] directly.
Support
All S3Search support and runtime issues should be submitted via one of the Heroku Support channels. Non-support issues and product feedback are welcome at [email protected].
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request