Twroute - Route Twitter status updates over http to your web app

Twroute uses the Twitter Stream API or the more stable Twitter Search API to watch status updates and convert them to HTTP post requests.

Twroute uses Delayed Job as a message queue, ensuring that HTTP requests are retried if the targeted web app is out of service for a bit.

As a Rails developer, I wrote this gem because I wanted a super simple way to write a Twitter app: my app receives messages from Twitter over HTTP and replies to the sender by simply responding to the HTTP request.

In other words, an ordinary Rails/Merb/Sinatra controller can receive a tweet and then reply to the sender by simply rendering a response.

Install

sudo gem install bhauman-twroute

Making your first Twroute app

Let’s say we are going to create an app that rewards you for shooting someone in the ass and reprimands you for shooting someone in the arm, face, etc. It gives you a point for the bum shot and takes one away for the others.

twroute bad_shot

This will create the following directory structure:

bad_shot
  |- config
       config.yml        # the configuration file
       twroutes.rb       # the routing file with sample routes
  |- db                  # holds the sqlite3 database
  |- log                 # holds the database log files
  |- test
       test_helper.rb    # defines the should_route_to macro     
       twroutes_test.rb  # sample tests for the sample routes

App Configuration

The config.yml file has 3 sections:

“Submit to” section

This is where you define the host, port and HTTP Auth information for the web app that Twroute will be routing requests to.

submit_to:
  host:               localhost
  port:               3000
# http_auth_user:     ''
# http_auth_password: ''

The port field is not required. Be sure to set the host to the target web app. For instance if you have a web application that lives at tweetrecorder.com, the config would look like:

submit_to:
  host:               tweetrecorder.com
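
As a rough illustration of what an optional port means in practice, the sketch below loads both variants of the submit_to section and builds a base URL from each. The base_url helper and the choice to simply omit the port from the URL when none is configured are assumptions made for this example; they are not taken from Twroute's source.

```ruby
require 'yaml'

# Hypothetical helper: turn a submit_to section into a base URL.
# Omitting the port when none is configured is an assumption,
# not confirmed Twroute behaviour.
def base_url(config)
  submit_to = config.fetch('submit_to')
  host = submit_to.fetch('host')
  port = submit_to['port']
  port ? "http://#{host}:#{port}" : "http://#{host}"
end

local = YAML.load(<<~CONFIG)
  submit_to:
    host: localhost
    port: 3000
CONFIG

remote = YAML.load(<<~CONFIG)
  submit_to:
    host: tweetrecorder.com
CONFIG

puts base_url(local)
puts base_url(remote)
```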

Twitter section

This is where you specify the username and password for the twitter account that is making the requests to the Twitter Stream or Rest Search API and also authoring the replies to the senders.

Twitter Stream Api Configuration

This section also specifies which Twitter stream api to use. Most likely spritzer, follow or track unless you are privileged. See the Twitter Stream API docs for more information.

This section also describes the post query parameters that you want to send to the twitter-api call.

The following is an example configuration if you want to see all the tweets that have the word shoot in them.

twitter:
  user:            example_account_name
  password:        example_password
  stream_api:      track
  stream_api_args:  
    track: shoot

Twitter Search API Configuration

See the Twitter Search API for more information.

This section describes the query parameters that you want to send to the Twitter Search API call.

The following is an example configuration if you want to see all the tweets that have the word shoot in them.

twitter:
  user:            example_account_name
  password:        example_password
  search_params:
    q: 'shoot'  # search for updates with the text 'shoot'
    rpp: '99'     # return 99 results on one page : limit is 100
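
To make the link between search_params and an actual request more concrete, here is a small sketch that turns the section above into a query string using Ruby's standard library. How Twroute itself builds the request is not shown in this README, so treat this as an assumption about the general shape rather than a description of its internals.

```ruby
require 'yaml'
require 'uri'

config = YAML.load(<<~CONFIG)
  twitter:
    search_params:
      q: 'shoot'
      rpp: '99'
CONFIG

search_params = config.fetch('twitter').fetch('search_params')

# URI.encode_www_form escapes each key and value for use in a query string.
query = URI.encode_www_form(search_params)
puts query
```

Values containing spaces or characters such as @ are escaped by encode_www_form, so a multi-word q value can be written in config.yml as plain text.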

Database section

This section should be ready to go as is. If you want to use MySQL this configuration goes straight to ActiveRecord so set it up the way you would for a Rails project. I haven’t used it for MySQL so no guarantees that it will work.

Routing

Testing First

We are going to use test-driven development here. The system currently uses regexes, and even the best programmers have trouble getting regexes to work correctly.

For our example app bad_shot we have a set of Twitter updates that we would like to map to calls to our web app.

shoot @johnboy in the ass yeah buddy => /goodshot/create
shoot @johnboy in the arm            => /badshot/create
shoot @johnboy in the head           => /badshot/create
shoot @johnboy in the foot           => /badshot/create

Let’s make these into tests first.

Open test/twroutes_test.rb; it will have some sample tests in it. Delete them and replace them with these routing tests.

class TwroutesTest < Test::Unit::TestCase
  should_route_to "shoot @john_Boy3 in the ass yeah buddy", "/goodshot/create"
  should_route_to "shoot @john_Boy3 in the arm",            "/badshot/create"
  should_route_to "shoot @john_Boy3 in the head",           "/badshot/create"
  should_route_to "shoot @john_Boy3 in the foot",           "/badshot/create"
end

Now you can run rake test and see all of your tests fail. But that is a way better start than not having any tests at all.

Writing your routes

Let’s try to make some of our tests pass.

First comment out all the tests except for the first one.

class TwroutesTest < Test::Unit::TestCase
  should_route_to "shoot @john_Boy3 in the ass yeah buddy", "/goodshot/create"
  # should_route_to "shoot @john_Boy3 in the arm",            "/badshot/create"
  # should_route_to "shoot @john_Boy3 in the head",           "/badshot/create"
  # should_route_to "shoot @john_Boy3 in the foot",           "/badshot/create"
end

We are going to only work on this one and then repeat the process for the other routes.

Open the config/twroutes.rb file in your editor of choice and remove or comment out the existing routes. It should read something like this:

Twroute::Routes.draw do |map|

end

Then add this first route:

Twroute::Routes.draw do |map|
  map.regex( {:whole_tweet => /shoot @john_Boy3 in the ass/},
             '/goodshot/create' )
end

Then execute rake test

The test should now pass (if not, fix the route before moving on). This is a good base to start from. Now let’s refine the route one step at a time.

We want to match any username so add another test in twroutes_test.rb:

should_route_to "shoot @john_Boy3 in the ass yeah buddy", "/goodshot/create"
should_route_to "shoot @jannie_fly5 in the ass oh yeah", "/goodshot/create"

# the rest of the tests are commented out

Now the tests should fail. Change the Regex in the route like below:

map.regex( {:whole_tweet => /shoot @[\w\d_]+ in the ass/},
           '/goodshot/create' )

After each step check to see that the test passes. This should be passing now. We decided that white space shouldn’t matter so we add a new test.

should_route_to "shoot @john_Boy3 in the ass yeah buddy", "/goodshot/create"
should_route_to "shoot @jannie_fly5 in the ass oh yeah", "/goodshot/create"
should_route_to "shoot   @jannie_fly5   in    the   ass     oh    yeah", "/goodshot/create"

# the rest of the tests are commented out

Tests fail, adjust the route:

map.regex( {:whole_tweet => /shoot\s+@[\w\d_]+\s+in\s+the\s+ass/},
           '/goodshot/create' )

So this is a passable route but it will match all kinds of things you wouldn’t expect like “skeetshoot @marko in the ass” so more work could be done. Add a test:

should_route_to "shoot @john_Boy3 in the ass yeah buddy", "/goodshot/create"
should_route_to "shoot @jannie_fly5 in the ass oh yeah", "/goodshot/create"
should_route_to "shoot   @jannie_fly5   in    the   ass     oh    yeah", "/goodshot/create"
should_not_route_to "skeetshoot @marko in the ass oh yeah", "/goodshot/create"

Tests fail, adjust the routes:

map.regex( {:whole_tweet => /^shoot\s+@[\w\d_]+\s+in\s+the\s+ass/},
           '/goodshot/create' )

Alright you get the picture. This is the best way to go to ensure you are picking up the right tweets. As anomalies happen add tests and adjust the regex.
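
Two anomalies are worth turning into tests early. In Ruby, ^ matches at the start of any line, not only at the start of the string, and the pattern above has no word boundary after the last word. Also, \w already covers digits and the underscore, so [\w\d_] can be shortened to \w. The snippet below is a plain Ruby check you can paste into irb; the sample tweets and the stricter pattern are illustrative assumptions, not something Twroute ships.

```ruby
route  = /^shoot\s+@[\w\d_]+\s+in\s+the\s+ass/
strict = /\Ashoot\s+@\w+\s+in\s+the\s+ass\b/

tweets = [
  "shoot @john_Boy3 in the ass yeah buddy",
  "skeetshoot @marko in the ass oh yeah",
  "hello\nshoot @marko in the ass",     # second line starts with "shoot"
  "shoot @marko in the assembly hall"   # "ass" is only a prefix here
]

# Print how each pattern treats each sample tweet.
tweets.each do |tweet|
  puts format("%-45s route=%-5s strict=%s",
              tweet.inspect, route.match?(tweet), strict.match?(tweet))
end
```

The last two tweets match the original route but not the stricter one. Whether you want them routed is a product decision, so add a should_route_to or should_not_route_to line for each case before changing the regex.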

How the regex matcher works.

I am going to show you an advanced route.

map.regex( { :whole_tweet =>  /^shoot\s+@[\w\d_]+\s+in\s+the\s+(\w+).*/,
             :who_got_shot => [/^shoot\s+@([\w\d_]+)\s+/, 1],
             :shot_where =>   [/\s+in\s+the\s+(\w+).*/, 1]
            }, '/badshot/create' )

A few things are going on here. ALL the regexes defined have to be matched. The result of the match is a hash of values. :whole_tweet will be assigned the value of the whole tweet. The second match :who_got_shot has a back reference so it will only be assigned the name of the user who was shot. The third match :shot_where gets assigned the place where the person was shot.
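
The behaviour described above can be sketched in a few lines of plain Ruby. This is a hypothetical re-implementation for illustration only; the parse_tweet name and its internals are assumptions and do not mirror Twroute's actual code.

```ruby
# Hypothetical sketch, not Twroute's implementation.
# A matcher is either a Regexp (value = the whole match) or a
# [Regexp, group_index] pair (value = that capture group).
def parse_tweet(text, matchers)
  parsed = {}
  matchers.each do |key, matcher|
    regex, group = matcher.is_a?(Array) ? matcher : [matcher, 0]
    match = regex.match(text)
    return nil unless match   # every regex has to match
    parsed[key] = match[group]
  end
  parsed
end

matchers = {
  :whole_tweet  => /^shoot\s+@[\w\d_]+\s+in\s+the\s+(\w+).*/,
  :who_got_shot => [/^shoot\s+@([\w\d_]+)\s+/, 1],
  :shot_where   => [/\s+in\s+the\s+(\w+).*/, 1]
}

p parse_tweet("shoot @johnny in the wild", matchers)
p parse_tweet("hello world", matchers)
```

The first call yields a hash with all three keys filled in; the second yields nil because the first regex already fails to match.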

Now what happens to the hash of matches? Two things:

1. It gets posted along with the request as a hash named parsed[]. So for example in a Rails app you can refer to these parsed out goodies as

@name = params[:parsed][:who_got_shot]
@shot_where = params[:parsed][:shot_where]

2. The keys are available as substitutions in the target URL. So in the above example if we wanted to include some of the parsed items in the resulting URL we could do this:

map.regex( { :whole_tweet =>  /^shoot\s+@[\w\d_]+\s+in\s+the\s+(\w+).*/,
             :who_got_shot => [/^shoot\s+@([\w\d_]+)\s+/, 1],
             :shot_where =>   [/\s+in\s+the\s+(\w+).*/, 1]
            }, '/badshot/create/name/:who_got_shot/where/:shot_where' )
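
To picture the substitution step, here is a minimal sketch that fills :key placeholders in a path template from a parsed hash. The fill_in_path helper is hypothetical, and raising a KeyError for an unknown placeholder is an assumption about reasonable behaviour, not documented Twroute behaviour.

```ruby
# Hypothetical helper: replace each :name placeholder in the template
# with the matching value from the parsed hash.
def fill_in_path(template, parsed)
  template.gsub(/:(\w+)/) { parsed.fetch(Regexp.last_match(1).to_sym) }
end

parsed = { :who_got_shot => "johnny", :shot_where => "wild" }

puts fill_in_path('/badshot/create/name/:who_got_shot/where/:shot_where', parsed)
```

With the parsed values above, the tweet would be posted to /badshot/create/name/johnny/where/wild.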

Regex matcher and procs

We could rewrite the above example to use a proc as well.

map.regex( { :whole_tweet =>  /^shoot\s+@[\w\d_]+\s+in\s+the\s+(\w+).*/,
             :who_got_shot => [/^shoot\s+@([\w\d_]+)\s+/, 1],
             :shot_where =>   lambda { |tweet_text|
                match_data = tweet_text.match(/\s+in\s+the\s+(\w+).*/) 
                match_data ? match_data[1] : nil
              }
            },
            '/badshot/create/name/:who_got_shot/where/:shot_where')

If the proc returns nil it will not be considered a match and the next route will be tried.
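
A proc is handy because you can try it out on its own before putting it in a route. The sketch below shows one way the three matcher forms used in this README (regex, regex with group index, proc) could be evaluated; the extract helper is hypothetical and only meant to illustrate the nil-means-no-match rule.

```ruby
# Hypothetical dispatch over the three matcher forms shown in this README.
# Returns the extracted value, or nil when the matcher does not match.
def extract(matcher, text)
  case matcher
  when Proc   then matcher.call(text)
  when Array  then (m = matcher[0].match(text)) && m[matcher[1]]
  when Regexp then (m = matcher.match(text)) && m[0]
  end
end

shot_where = lambda { |tweet_text|
  match_data = tweet_text.match(/\s+in\s+the\s+(\w+).*/)
  match_data ? match_data[1] : nil
}

p extract(shot_where, "shoot @johnny in the wild")
p extract(shot_where, "shoot @johnny")
```

Because the proc is ordinary Ruby, it can also post-process the value (downcase it, look it up in a table, and so on) before it ends up in the parsed hash.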

Routing order

The routes are executed in a similar manner as Routes in Rails. They are tried in top down order. As soon as a match is found the rest of the routes are ignored.
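
The consequence of first-match-wins is that specific routes have to come before general ones. The following sketch models a route table as an ordered list; the ROUTES constant and route_for helper are hypothetical and exist only to show the ordering effect.

```ruby
# Hypothetical route table: [pattern, path] pairs tried from top to bottom.
ROUTES = [
  [/^shoot\s+@\w+\s+in\s+the\s+ass\b/, '/goodshot/create'],
  [/^shoot\s+@\w+\s+in\s+the\s+\w+/,   '/badshot/create']
]

# Return the path of the first route whose pattern matches, or nil.
def route_for(text, routes = ROUTES)
  match = routes.find { |pattern, _path| pattern.match?(text) }
  match && match.last
end

puts route_for("shoot @johnboy in the ass yeah buddy")
puts route_for("shoot @johnboy in the arm")
puts route_for("shoot @johnboy in the ass yeah buddy", ROUTES.reverse)
```

With the routes reversed, the general pattern is tried first and the good shot ends up at /badshot/create, which is why the order in twroutes.rb matters.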

What gets posted?

When a route is finally selected and a path is chosen three hashes get posted to the selected url:

The Parsed Hash

In our example this will be:

parsed[whole_tweet]:   shoot @johnny in the wild 
parsed[who_got_shot]:  johnny
parsed[shot_where]:    wild
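
The bracketed names are ordinary form fields. As a sketch of what such a request body might look like on the wire, the code below flattens a nested hash into bracketed keys and form-encodes it. The form_pairs helper is hypothetical and handles one level of nesting only; how Twroute builds its request body is an assumption here.

```ruby
require 'uri'

# Hypothetical helper: flatten {"parsed" => {"shot_where" => "wild"}}
# into [["parsed[shot_where]", "wild"]] pairs (one level of nesting).
def form_pairs(hashes)
  hashes.flat_map do |name, fields|
    fields.map { |key, value| ["#{name}[#{key}]", value.to_s] }
  end
end

pairs = form_pairs(
  'parsed' => {
    'whole_tweet'  => 'shoot @johnny in the wild',
    'who_got_shot' => 'johnny',
    'shot_where'   => 'wild'
  }
)

body = URI.encode_www_form(pairs)
puts body
p URI.decode_www_form(body)
```

Rails and other Rack-based frameworks parse bracketed field names back into nested params, which is why params[:parsed][:who_got_shot] works in a controller.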

The Twitter Tweet Hash - for Search API

tweet[id]:                       3317086732
tweet[source]:                web
tweet[profile_image_url]:     http://s.twimg.com/a/1252620925/images/default_profile_normal.png
tweet[to_user_id]: 
tweet[from_user]:             dailydid_dev
tweet[iso_language_code]:     en
tweet[text]:                  #test 123 #dailydid
tweet[from_user_id]:          50354397
tweet[created_at]:            Fri, 04 Sep 2009 17:03:56 +0000

The Twitter Tweet Hash - for Stream API

For example:

tweet[in_reply_to_screen_name]: 
tweet[id]:                       3317086732
tweet[created_at]:               Fri Aug 14 22:31:44 +0000 2009
tweet[in_reply_to_user_id]: 
tweet[favorited]:                false
tweet[truncated]:                false
tweet[source]:                   <a href="http://www.tweetdeck.com/" rel="nofollow">TweetDeck</a>
tweet[in_reply_to_status_id]: 
tweet[text]:                     shoot @johnny in the wild

The Twitter User Hash - for Stream API only

We pull this out of the tweet and post it as the sender[] hash:

sender[following]: 
sender[friends_count]: 86
sender[followers_count]: 113
sender[profile_link_color]: "990000"
sender[protected]: false
sender[profile_sidebar_border_color]: DFDFDF
sender[notifications]: 
sender[screen_name]: Sarahndipitea
sender[name]: Sarah
sender[profile_sidebar_fill_color]: F3F3F3
sender[created_at]: Tue Oct 21 03:06:15 +0000 2008
sender[id]: 16880192
sender[location]: Stumptown
sender[profile_image_url]: http://s3.amazonaws.com/twitter_production/profile_images/332336440/EllieUp_normal.jpg
sender[description]: You found me! Were you even *looking* for me?
sender[favourites_count]: 12
sender[profile_background_image_url]: http://static.twitter.com/images/themes/theme7/bg.gif
sender[statuses_count]: 9601
sender[profile_background_tile]: false
sender[verified]: false
sender[profile_background_color]: EBEBEB
sender[profile_text_color]: "333333"
sender[time_zone]: Pacific Time (US & Canada)
sender[utc_offset]: -28800
sender[url]: http://Sarahndipitea.wordpress.com

(This tweet was chosen completely at random. I didn’t want to take the time to come up with fake data. If you have some funny data, send me a pull request.)

Gentlemen start your servers.

So now we have set up the config file and we have tests and routes.

First initialize the Sqlite3 DB

Execute the following rake command:

rake twroute:init

Create a sample app to receive the post requests

You might want to set up a sample Rails app and tail the development log at this point.

rails sample_twroute_app
cd sample_twroute_app
./script/server

No need to create controllers that do anything. This is just so you can see that the requests are being made.

Start the twroute_runner daemon

This is the daemon that pulls down the tweets from the twitter stream api, routes them and stores them as a delayed job.

Change to the bad_shot directory and execute:

twroute_runner start

At this point IF there are tweets that match your stream api query AND these tweets also match your routes defined in twroutes.rb then you should see some delayed_jobs being created in log/database.log

You can stop this daemon by executing:

twroute_runner stop

Start the twroute_worker daemon

Seeing as we are creating Delayed Jobs we have to have a daemon that executes the jobs. You can start it in a similar fashion.

Change to the bad_shot directory and execute:

twroute_worker start

At this point IF there are jobs stored in the Sqlite3 database then you should see activity in the log/delayed.log

You can stop this daemon by executing:

twroute_worker stop
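
For background on why a separate worker gives you retries: Delayed Job runs any object that responds to perform, and a job whose perform raises is rescheduled for a later attempt. The class below is a sketch of what a submit job might look like. The class name, its fields, and the split into a separate post method are assumptions for illustration and do not mirror Twroute's internals.

```ruby
require 'net/http'
require 'uri'

# Hypothetical job class; Delayed Job only needs the object to respond to #perform.
SubmitTweetJob = Struct.new(:url, :form_fields) do
  def perform
    response = post(URI(url), form_fields)
    # Raising marks this run as failed, so Delayed Job retries it later.
    unless response.is_a?(Net::HTTPSuccess)
      raise "Target app returned #{response.code}"
    end
    response
  end

  # Kept separate so it can be stubbed out in tests.
  def post(uri, fields)
    Net::HTTP.post_form(uri, fields)
  end
end
```

If the target web app is down, the post fails or returns a non-2xx status, perform raises, and the job stays in the queue for another attempt.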

Start/Stop Short Cut

You can start and stop both the daemons using the rake twroute:start and rake twroute:stop commands.

Troubleshooting

No output

You can check that your Twitter stream request is functioning by temporarily making a catchall route at the end of your routes like so:

map.regex( { :match_all_tweets =>  /.*/ }, '/badshot/create' )

This will force all tweets from the stream to be sent to your app. This is very useful to verify that tweets are coming down and going through the system.

Responses will be Tweeted!

When your web app responds with a 200 status and with the Content-Type header set to text/twitter, Twroute will tweet the response.

It won’t respond to the sender. It will simply tweet the response. So if you want to respond you need to specify the @user at the beginning of the tweet.
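
On the web app side, that boils down to three things: a 200 status, the text/twitter content type, and a body that starts with the sender's @name. The helper below builds such a response as a Rack-style [status, headers, body] triplet. The twitter_reply name is hypothetical, and trimming to 140 characters assumes the tweet length limit in force when this gem was written.

```ruby
# Hypothetical helper returning a Rack-style [status, headers, body] triplet.
def twitter_reply(screen_name, message)
  text = "@#{screen_name} #{message}"
  # Trim to 140 characters (assumed tweet length limit).
  [200, { 'Content-Type' => 'text/twitter' }, [text[0, 140]]]
end

status, headers, body = twitter_reply('johnny', 'ouch, minus one point')
puts status
puts headers['Content-Type']
puts body.first
```

With the Stream API the screen name is available as sender[screen_name]; with the Search API it is tweet[from_user].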

Monit and God

Don’t forget to set up either Monit, God or some other server process monitor because Twitter reserves the right to close the connection whenever they want. I would say that it’s fairly safe to simply restart the daemons periodically as well.

License

(The MIT License)

Copyright © 2009 Bruce Hauman

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Copyright © 2009 Bruce Hauman, released under the MIT license