Build Status Dependency Status Coverage Status Code Climate Gem Version License Badges

Replacer Bot

Twitter bot that:

  • Searches Twitter for a phrase
  • Search-and-replaces phrases in the tweets
  • Tweets them
  • Makes a note of the last tweet found so it knows where to start from next time

Installation

gem install replacer_bot

or

git clone https://github.com/pikesley/replacer_bot
cd replacer_bot
bundle
rake install

Configuration

The default config is here, you'll want to create your own config at ~/.replacer_bot/config.yml to override some of these, something like:

search_term: David Cameron
replacements:
  - david cameron: "Satan's Little Helper"
  - cameron: Satan
save_file: /Users/sam/.replacer_bot/last.tweet
seen_tweets: /Users/sam/.replacer_bot/seen.tweets

Notes:

  • The search-and-replace terms will be applied in the order listed, which you may or may not care about
  • The search and replace has a number of passes based on some of my finger-in-the-air Reckons, which are probably easiest to understand via the specs. After it tries each of these search-and-replace operations, it does a case-insensitive regex search-and-replace using the raw terms from the YAML (in order to catch SHittily-typeD things, because Twitter is a mess)
  • Your takeaway from this should be that if your replacement is a name or whatever, put it in the YAML Titlecased, otherwise lowercase it, and it should do the Right Thing

Twitter credentials

You'll also need some Twitter credentials, store them in ~/.replacer_botrc like this:

CONSUMER_KEY: some_key
CONSUMER_SECRET: some_secret
TOKEN: oauth_token
SECRET: oauth_secret

(and see this for help on setting up Twitter bots)

Running it

You should now be able to run it like so:

➔ replacer tweet
Tweeting: Satan's Little Helper sets out academy 'vision' for every school http://t.co/S6yFWRf7pD
Sleeping 60 seconds
Tweeting: Swarm warning: Satan's Little Helper accuses migrants of 'breaking in' to UK http://t.co/1sB5J8Alwi

Notes:

  • Direct replies and manual retweets are excluded

There's also

 replacer dry_run

which does the search and shows what it would have tweeted, without actually tweeting anything

Reducing the noise

It turns out that a lot of Twitter is people (or bots) retweeting the same stuff with minimal changes, like adding extra hashtags or using a different URL shortener (I don't really understand how this even happens, but whatever). (Actually, I wonder how much of Twitter is just bots yelling at each other in the void. But I digress.) This makes a crude 'search for this phrase' bot extremely noisy, so I have come up with some Opinions based on some very crude Reckons. Things that will make the bot consider tweets to be 'the same' as tweets we've seen before, and therefore ignorable, are:

  • They match, save for any URLs they contain being different
  • They match, save for different hashtags at the start and end of the tweet (hashtags in the body of the tweet appear to be more relevant, based on my Reckons)

The above reduced the noise a bit, but not enough to make a substantial difference. So I came up with this:

  • If there is an overlap of 4 consecutive words between this tweet and one we've seen before, we ignore it

The 4 words thing is tunable in config.yml:

similarity_weighting: 4

but 4 seems about right for my current use case; it will clearly depend on the popularity of your search term