Arthropod
Arthropod is an easy way to run remote ruby code synchronously, using Amazon SQS.
Do not use it yet, the API isn't stable at all and it wasn't tested enough on production
Installation
gem install arthropod
Or in your Gemfile
gem 'arthropod', '~> 0.0.2'
Configuration
You will need the following environment variables AWS_ACCESS_KEY_ID
, AWS_SECRET_ACCESS_KEY
and AWS_REGION
. Optionally the Arthropod::Client.push
and Arthropod::Server.pull
methods can take a client
argument with your own instance of Aws::SQS::Client
, see https://docs.aws.amazon.com/en_en/sdk-for-ruby/v3/developer-guide/sqs-examples.html for more information.
Usage
A simple use case first, let's say you want to push a video encoding task to another server:
url_of_the_video = "https://my_storage.my_service.com/my_video_file.mp4"
response = Arthropod::Client.push(queue_name: "video_encoding", body: { url: url_of_the_video })
puts response.body
# => "https://my_storage.my_service.com/my_reencoded_video_file.mp4"
On the "server" side:
Arthropod::Server.pull(queue_name: "video_encoding") do |request|
video_url = request.body["url"]
# Do the encoding stuff
encoded_video_url = VideoEncoder.encode(video_url)
encoded_video_url # Everything evaluated here will be sent back to the client
end
As you see, it's all synchronous and, since SQS will save your messages until they are consumed, your server doesn't even have to be listening right when you push the task (more on that later).
It is also possible to push updates from the server:
Arthropod::Server.pull(queue_name: "video_encoding") do |request|
video_url = request.body.url
# Do the encoding stuff but this time the VideoEncoder class will give you a percentage of completion
VideoEncoder.encode(video_url) do |percentage_of_completion|
request.respond { percentage_of_completion: percentage_of_completion }
end
encoded_video_url # Everything evaluated here will be sent back to the client
end
And on the client side:
url_of_the_video = "https://my_storage.my_service.com/my_video_file.mp4"
response = Arthropod::Client.push(queue_name: "video_encoding", body: { url: url_of_the_video }) do |response|
puts response.body.percentage_of_completion # => 10, 20, 30, etc
end
puts response.body
# => "https://my_storage.my_service.com/my_reencoded_video_file.mp4"
Errors
Any exception raised on server side will cause the client side to close immediately and raise a Arthropod::Client::ServerError
exception.
API
response = Arthropod::Client.push(queue_name: "video_encoding", body: { url: url_of_the_video }) do |response|
puts response.body
end
This method pushes a job to the SQS queue queue_name
and waits for the job completion, a block can be optionally provided if you expect the server to send you some updates along the way. The return value is the last value evaluated in the server block.
Arthropod::Server.pull(queue_name: "video_encoding") do |request|
request.respond "some_update"
"final_result"
end
This method will take a job from the queue and give it to the block, if no job are available the method will return immediately, it's your responsiblity to put this call in the loop if you want to. The last value from the block will be sent back to the client.
Why would you do take an asynchronous thing and make it synchronous?
This library is here to solve a few real use-cases we encoutered, most of them involves running heavy tasks on remote servers or on remote computers that are not accessible through the internet. For example:
- running some CUDA-related stuff from a cheap server, sometimes it's way cheaper to have a pretty beefy computers in house and run your heavy tasks on them instead of renting one for several hundred dollars each month.
- sometimes you need to access some data that are only accessible locally, thing about 3D rendering where your assets, cache, etc are all stored locally for better performance. Now your local computer can pull tasks from the SQS queue, run them and push the results.
Of course you can also achieve that by simply using SQS or any kind of message system, what Arthropod does is just to make it easier, however it's your responsibilty to run it in an asynchronous environment, think about an ActiveJob task for example. At its core Arthropod is just a thin layer around SQS.
Example: the poor man's video encoding service
If you're not concerned about latency, you can for example push some heavy video encoding task from and ActiveJob job in your Rails task and run a little cron job every minute on your uber-CUDA-powered computer at home to pull those jobs and reencode your videos. It should be reliable enough and it may be even be way faster than doing it with the CPU of a regular server.