
This gem was last updated on the 03.07.2022 (dd.mm.yyyy notation), at 22:08:42 o'clock.

The Universal Pipe Handler Project

Handling Pipes - via Ruby

„Because one pipe can rule them all ...“

Description

This project aims to implement pipes in Ruby. The idea is to provide a fully OOP-centric implementation of pipes, rather than "merely" a simulation of UNIX/Linux pipes. Thus, the project attempts to include ideas from other projects, such as PowerShell on Windows. It also attempts to include ideas gained from projects such as Avisynth/VirtualDub, which were really useful when it came to applying filters to image, video and audio files.

The actual implementation of the project may be split up into separate projects, but will be unified within the Universal Pipe Handler (UPH).

The general input-output follows this simple schematic:

How to require the project

Use this to require the project:

require 'universal_pipe_handler'

Usage examples

This subsection will try to show various usage examples that should be guaranteed to work, provided that you have installed various other projects (gems). For instance, for multimedia-related actions you will require the gem called multimedia_paradise.

Extract audio from a .mp4 file:

foobar.mp4 | extract_audio
foobar.mp4 | extract audio # both variants must work

Extract audio from a .mp4 file and convert it into a .wav file:

foobar.mp4 | extract audio | convert into .wav file
foobar.mp4 | extract audio | convert_into_wav_file
foobar.mp4 | extract audio | to_wav
foobar.mp4 | extract audio | towav
foobar.mp4 | extract audio | 2wav
foobar.mp4 | extract_audio | 2wav

Camelcase all files in the current working directory:

ls | camel_case
ls | camel case
ls | camelCase
ls | camel
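The requirement that several spellings of the same cmdlet "must work" could be met by normalizing the typed name and looking it up in an alias table. The following is only a minimal sketch: the constant CMDLET_ALIASES, both method names and the table layout are assumptions made for illustration, not the gem's actual API.

```ruby
# Hypothetical alias table: each canonical cmdlet name maps to the
# normalized spellings that should resolve to it. The spellings are
# taken from the usage examples; the structure is an assumption.
CMDLET_ALIASES = {
  'extract_audio' => %w[extract_audio],
  'to_wav'        => %w[to_wav towav 2wav convert_into_wav_file convert_into_.wav_file],
  'to_camel_case' => %w[to_camel_case camel_case camel]
}.freeze

# Reduce a user-typed cmdlet to a lookup key: trim it, turn camelCase
# into snake_case, lowercase it and replace whitespace with underscores.
def normalize_cmdlet_name(raw)
  raw.strip
     .gsub(/([a-z])([A-Z])/, '\1_\2')
     .downcase
     .gsub(/\s+/, '_')
end

# Return the canonical cmdlet name, or nil if no alias matches.
def canonical_cmdlet(raw)
  key = normalize_cmdlet_name(raw)
  found = CMDLET_ALIASES.find { |_name, aliases| aliases.include?(key) }
  found ? found.first : nil
end
```

With this table, "extract audio" and "extract_audio" resolve to the same entry, and "camel case", "camelCase" and "camel" all resolve to to_camel_case.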

The unimportant history of this project

I initially started this project back in 2007 or so, under another name - the Hermes pipe. The idea here was that Hermes, from Greek mythology, would deliver messages to others. In German the name is "Götterbote" (messenger of the gods).

After the Hermes pipe I created the Solar pipe, then the Master Pipe, then PipeHandler and eventually renamed the project to PipeParadise. Then it was idle for about 5 years or so, until June 2022. Quite a lot of history here, eh?

Most of my more successful projects end with the string paradise, such as ftp_paradise and so forth.

I will most likely stick to the name universal pipe handler, but who knows really. I thought I'd stick with PipeParadise as a name as well, largely because many of my other ruby-based projects have the suffix paradise, but then I changed my mind again. So who knows.

Why so many rewrites, aside from the various name changes, though?

The Hermes pipe ended up being way too hard to maintain, so I thought about simplifying it and using a better, simpler concept, including a full specification of the behaviour. This part is very important: I wanted to document what is available, so that if I need to rewrite anything again one day, I can just refer to that specification and continue from there.

Thus, this is one goal of the universal pipe handler: to really specify what it can do, at all times.

Current status of the Universal Pipe Handler project

The current status of the Universal Pipe Handler project is not complete at all, but it is still better than the prior status quo, which had no specification at all whatsoever.

Definition and Specification for the Universal Pipe Handler

This subsection will contain some definitions and specifications for the universal pipe handler project.

First, the definition of a cmdlet: the term is short for commandlet. An alternative name for a cmdlet would be a command snippet.

A cmdlet, as far as the universal pipe handler project is concerned, is everything before and after a pipe token (the | token). The pipe must be able to interpret these cmdlets.

If you look at the above example again:

foobar.mp4 | extract_audio | to_wav

Then you can see two | pipe tokens, and a total of three individual cmdlets.

Let's look at another example:

ls | nl | remove comments from mp3 files

This would list all entries from the current working directory, number them (from 1 to n), then remove all comments from mp3 files found, and finally display the result to the user.

Thus, in this case you have three individual cmdlets:

ls
nl
remove comments from mp3 files

So everything before and after any individual '|' token is an individual "cmdlet". The primary objective for the universal pipe handler is to interpret and evaluate each cmdlet correctly.
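The definition above can be sketched in a few lines of Ruby. The method name split_into_cmdlets is made up for this illustration and is not necessarily how the gem does it.

```ruby
# Split a pipe string into its cmdlets: everything before and after
# each '|' token, with surrounding whitespace removed.
def split_into_cmdlets(pipe)
  pipe.split('|').map(&:strip).reject(&:empty?)
end

cmdlets = split_into_cmdlets('ls | nl | remove comments from mp3 files')
# cmdlets is ["ls", "nl", "remove comments from mp3 files"]
```

Note that this simple split would also break on a '|' that appears inside a quoted argument; a real implementation may need a small tokenizer instead.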

We also have to store data that has to be exchanged between the individual command-snippets. This data will be stored in a toplevel variable called @dataset.

If there is an unknown instruction given, that is, an unknown cmdlet, then this must be reported to the user. Then the pipe will either exit (and thus terminate) or continue nonetheless, depending on the behaviour used for this pipe instruction.
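One way to model the two behaviours is a policy flag that is checked whenever an unknown cmdlet is met. In this sketch the names check_cmdlets, UnknownCmdletError and the on_unknown: option are all assumptions, and an exception is raised instead of calling exit so that the caller can decide how to terminate.

```ruby
# Raised when an unknown cmdlet is met and the pipe is set to stop.
class UnknownCmdletError < StandardError; end

# cmdlets:    Array of cmdlet strings, in pipe order.
# known:      Array of registered cmdlet names.
# on_unknown: :stop aborts the pipe, :continue skips the unknown cmdlet.
# Returns the cmdlets that may be run.
def check_cmdlets(cmdlets, known:, on_unknown: :stop)
  cmdlets.select do |cmdlet|
    name = cmdlet.split.first
    next true if known.include?(name)

    warn "Unknown cmdlet: #{cmdlet}" # always report to the user
    raise UnknownCmdletError, cmdlet if on_unknown == :stop

    false # :continue drops the unknown cmdlet and carries on
  end
end
```

For example, check_cmdlets(['ls', 'frobnicate', 'nl'], known: %w[ls nl], on_unknown: :continue) reports the unknown entry on stderr and returns the two known cmdlets.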

Individual cmdlets described

This subsection will describe individual cmdlets. Ideally every cmdlet should be described, so that we can evaluate the behaviour and check it against the main specification.

Note that each individual cmdlet must be registered in the Array that is stored in the file called allowed_cmdlets.yml.
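Checking a cmdlet against that registry could look roughly like this. The sketch assumes that allowed_cmdlets.yml holds a plain YAML Array of Strings; both method names are hypothetical, and the path has to be supplied by the caller because the file's location is not stated here.

```ruby
require 'yaml'

# Load the registered cmdlet names from a YAML file that is assumed
# to contain a plain Array, e.g.:
#   - add_audio
#   - all
def load_allowed_cmdlets(path)
  entries = YAML.safe_load(File.read(path))
  unless entries.is_a?(Array)
    raise ArgumentError, "#{path} must contain a YAML Array"
  end
  entries.map(&:to_s)
end

# True if the given name appears in the loaded registry.
def registered_cmdlet?(name, allowed)
  allowed.include?(name.to_s)
end
```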

Keep the following list sorted alphabetically.

add_audio: This cmdlet attempts to add audio to a video file. It also supports shortcuts via BeautifulUrl.

all: This is a general wrapper towards doing an action on "all" files or directories. It allows you to pattern-match towards a specific pattern. For instance, if you do "all rb files" then we will fetch all .rb files of a given directory.

all_images_from: this cmdlet will obtain all images from a subdirectory. It also responds to :symbols, such as in the following example: all_images_from :njoy_dir. The symbols specify from where something should be read.

any: This is a commandlet for convenience. For instance, if you do "any avi" then we will fetch any avi file (randomly). This can be used in pipes like: "any mp3 | stat" to report the size of that file. Not too terribly useful, unless you want to quickly test something.

ascii_video: This commandlet allows the user to play a video, in mplayer, as ASCII Art, using the aalib library.

assign: this cmdlet allows the user to assign to a specific file. It is the default for the first pipe processing task, to allow the user to operate with existing local files.

copy_directories: will copy all directories found within a directory into another target.

colourize: This little cmdlet allows the user to colourize string input in any way they see fit. Right now it is a bit limited though, you have to extend it as you use that.

crop: Crop can be used to crop (make it smaller) any given image or a video, such as from 1024px to 800px.

cut: cut audio ... like cut 30%. This would take from the beginning at 0% up to 30% of the audio file. We start from the left side by default, which is where audio and video files are normally started. If you provide a negative number here, like -30%, then we instead count from 100% to the left, in other words, -30% would mean 70% up to 100%. To recapitulate, if you use -30%, we start from the right side instead. (This allows you to clip away only the last part of an audio file.)

Here are two examples to highlight this:

assign foo.mp3 | cut -30% # cut away the last 30% of the audio file
assign foo.mp3 | cut -30 # cut away the last 30 seconds of the audio file

Note though that the general action "cut" could also mean to "cut this video file", so since December 2011 we assume that cutting a video file means that we want to just cut away some parts of it.

You can also use percentages, such as "30%-80%". This will start cutting at 30% and proceed up to 80% of the file length.
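The argument forms described for cut can be turned into a start and stop position as in the sketch below. The method name cut_range is made up, the duration has to be supplied by the caller, and the handling of a plain positive number such as "30" (taken as the first 30 seconds) is an assumption, since only the negative form is described above. Whether the resulting range is kept or cut away is left to the cmdlet.

```ruby
# Translate a cut argument into [start, stop], both in seconds.
# duration is the length of the audio or video file in seconds.
def cut_range(spec, duration)
  case spec.strip
  when /\A(\d+(?:\.\d+)?)%-(\d+(?:\.\d+)?)%\z/ # "30%-80%"
    [duration * $1.to_f / 100, duration * $2.to_f / 100]
  when /\A-(\d+(?:\.\d+)?)%\z/                 # "-30%": the last 30 percent
    [duration * (100 - $1.to_f) / 100, duration.to_f]
  when /\A(\d+(?:\.\d+)?)%\z/                  # "30%": the first 30 percent
    [0.0, duration * $1.to_f / 100]
  when /\A-(\d+(?:\.\d+)?)\z/                  # "-30": the last 30 seconds
    [[duration - $1.to_f, 0.0].max, duration.to_f]
  when /\A(\d+(?:\.\d+)?)\z/                   # "30": first 30 seconds (assumed)
    [0.0, [$1.to_f, duration.to_f].min]
  else
    raise ArgumentError, "cannot parse cut argument: #{spec.inspect}"
  end
end
```

For a file that is 200 seconds long, cut_range('-30%', 200) covers 140 to 200 seconds and cut_range('30%-80%', 200) covers 60 to 160 seconds.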

decolourize: this cmdlet gets rid of colours, specifically from a video file.

download: Like wget, this allows the user to download something. This uses a pure ruby solution, so the user is independent of wget actually.

extract: this is a meta instruction. Applied on video data, it will try to extract the audio and the video separately. Applied to an archive, such as a .zip file, it will extract said archive.

extract_all: this cmdlet allows the user to extract all packages in a given location.

find_all: this cmdlet allows the user to find a substring. Grep is an alias to this, so this is mostly a simulation of grep.

generate_string: this cmdlet allows us to generate a string, in particular a random string.

get_all_files: This commandlet will obtain all files of a given directory. It should equal the UNIX command "ls", more or less.

get_last_characters: How many characters we can get from a string or a file.

help?: lists all available commands that are supported by the universal pipe handler.

identify: This cmdlet can be used to identify a multimedia component, be it audio or video related data.

install: We try to install something with this cmdlet. This can be used like in "install | htop" to install htop. The latter will make use of the rbt project.

match_regex: This cmdlet allows the user to match to regexes found in a given file.

n_words: report how many words are in a given file, or somewhere else.

play: use this cmdlet to play an audio or a video file. Some aliases exist to this, i.e. play_video_file.

random line: fetch a random line from a file. This is usually done to show (display) that line.

read_file: This cmdlet will read the content of a file, similar to the unix "cat" command. If you use a : then we will use BeautifulUrl here. This allows us to access registered files more readily using symbols as shortcuts. Specific example for this functionality: read_file :pc

read_line: Read a specific line in a file. Like "read_line 9" allows you to read the ninth line. (Show line number x). Note that the keyword "last" will fetch the last line of a file.

remove_audio: This action attempts to remove the audio found in a video container or somewhere else. It will then create a new file, for instance called output.avi

remove_comments: this commandlet will remove comments from a file. The default comment specifier is the '#' character. Anything after that token will be removed, including that token itself.

remove_directories: removes all directories. Use with care. Note that / will never be assumed to be a valid directory for removal via this functionality.

remove_html: This cmdlet will remove all tags found in a given String. It can be used to "sanitize" a downloaded .html page, for instance, if the user needs the raw text.

remove_newlines: This cmdlet will simply remove newlines. In pure ruby code, this would be equal to .delete("\n"). strip_newlines is an alias to this cmdlet.

repackage_to: allows us to fill in the @result into a method and repackage that. A typical example for this may be repackage_to .tar.xz.

resize_image: This cmdlet allows the user to resize a given image. Right now, resize is an alias to resize_image but in the future we may wish to use resize on videos as well.

search: as of June 2011, the search-cmdlet is primarily used to search for torrents that can be downloaded. However, it may eventually be extended towards a more comprehensive search functionality.

shuffle_csv: This commandlet can be used to shuffle CSV values. shuffle_csv 1,5,6 means that the entries at 5 and 6 will be moved right next to 1. It thus reorganizes the entries in a csv file.

Specific Example:

assign /Depot/j/test.csv | shuffle_csv 1,5,6 | save_as /Depot/j/output.csv

starts_with?: This queries whether a line starts with a specific substring or not. If yes, then we add it.

this_dir: This is not a commandlet-action as such, but a special alias. It is expanded into "assign Dir.pwd" for the most part (or, more accurately, into the method return_pwd, which is more or less equal to Dir.pwd, but with a trailing /).

to_camel_case: This will camelCase the given input. If a list of files is passed then this will be applied onto each file. Thus, use with caution.

to_pdf: This cmdlet can be used to create a .pdf file.

word_count: This will simply count the word frequency of a file or all files in a given directory. In case a directory is provided here, the current behaviour is to find all files in that given directory and count the words found therein, in all of these files.

word_wrap: wraps words after a given number of characters. word_wrap 76 would wrap after 76 characters. Aliases like wrap at exist, with wrap at 30 wrapping at every 30th character.
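Several of the text-oriented cmdlets in this list boil down to short String operations. The following plain-Ruby sketches follow the descriptions above; they are illustrations only and not the gem's actual code, and the method signatures are assumptions.

```ruby
# remove_comments: drop everything from the comment token to the end
# of each line, including the token itself. The default token is '#'.
def remove_comments(text, token = '#')
  pattern = /#{Regexp.escape(token)}.*$/
  text.each_line.map { |line| line.sub(pattern, '') }.join
end

# remove_newlines: described above as equal to .delete("\n").
def remove_newlines(text)
  text.delete("\n")
end

# word_wrap: break the text at whitespace so that, where possible,
# no line is longer than width characters.
def word_wrap(text, width = 76)
  text.gsub(/(.{1,#{width}})(\s+|\z)/, "\\1\n").rstrip
end
```

A word that is longer than the given width cannot be broken at whitespace, so this word_wrap sketch splits it in the middle.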

How to use several cmdlets

You can combine these cmdlets like in the following manner:

handler = UniversalPipeHandler.new 'assign foo.avi | filter huffyman | save to /Depot/bla.avi'

This will run the passed cmdlets - first assign to foo.avi, then run through a huffyman filter, and then save the result of that into a new file. The whole line is called a pipe.

Rather than UniversalPipeHandler.new() you could also use the slightly shorter UniversalPipeHandler[] method call.
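A [] shortcut of this kind is usually just a class-level method that delegates to new. The stand-in class below, PipeSketch, is hypothetical and only shows the pattern; the real UniversalPipeHandler is provided by the gem and will differ.

```ruby
# Stand-in class for illustration only, not the gem's implementation.
class PipeSketch
  attr_reader :cmdlets

  def initialize(pipe)
    # Keep the individual cmdlets of the pipe, without padding.
    @cmdlets = pipe.split('|').map(&:strip)
  end

  # Lets callers write PipeSketch['a | b'] instead of
  # PipeSketch.new('a | b').
  def self.[](pipe)
    new(pipe)
  end
end

handler = PipeSketch['assign foo.avi | filter huffyman | save to /Depot/bla.avi']
# handler.cmdlets holds the three cmdlets of the pipe.
```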

class UniversalPipeHandler::CmdletsHandler

This class will handle individual cmdlets. Individual cmdlets will, in turn, be handled by class UniversalPipeHandler::Cmdlet, so class UniversalPipeHandler::CmdletsHandler is the kind of cmdlet-"controller".

Readline support

The universal_pipe_handler should be able to make use of Readline, in particular for completion of all registered cmdlets. This should also work with aliases towards cmdlets.

Readline can be used for tab completion. The current behaviour is to tab-complete on every available cmdlet snippet, including the aliases to it. This should allow the user to complete individual instructions towards a cmdlet.
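Tab completion of this kind can be wired up through Readline.completion_proc, which is part of Ruby's readline library. The list of cmdlets below is a hypothetical excerpt, and both constant names are assumptions; in the project the list would presumably come from the registered cmdlets and their aliases.

```ruby
# Hypothetical excerpt of registered cmdlets plus some aliases.
COMPLETABLE_CMDLETS = %w[
  extract extract_all extract_audio read_file read_line
  remove_audio remove_comments to_wav 2wav
].freeze

# Return every cmdlet or alias that starts with the typed fragment.
CMDLET_COMPLETION = lambda do |fragment|
  COMPLETABLE_CMDLETS.select { |name| name.start_with?(fragment) }
end

begin
  require 'readline'
  # Readline calls this object with the word that is being completed.
  Readline.completion_proc = CMDLET_COMPLETION
rescue LoadError
  # readline is not available in this Ruby installation
end
```

After this setup, a prompt opened with Readline.readline('> ', true) would offer extract, extract_all and extract_audio when the user types "extract" and presses tab.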

The variable @result

Data that is processed between two different cmdlets is always stored in the toplevel instance variable @result. Access to this toplevel instance variable should occur via the method .result?.
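The hand-over between cmdlets can be pictured as follows. This is a simplified sketch: the class ResultPipeSketch and its run method are made up, each cmdlet is modelled as a lambda, and only the names @result and result? are taken from the description above.

```ruby
# Simplified model of how data travels through a pipe.
class ResultPipeSketch
  def initialize
    @result = nil
  end

  # Query method for the data handed from one cmdlet to the next.
  def result?
    @result
  end

  # Run each step in order. Every step receives the previous result
  # and its return value becomes the new @result.
  def run(*steps)
    steps.each { |step| @result = step.call(@result) }
    self
  end
end

pipe = ResultPipeSketch.new
pipe.run(
  ->(_)    { "b.rb\na.rb\n" },          # stands in for a listing cmdlet
  ->(text) { text.lines.map(&:chomp) }, # split the listing into entries
  ->(list) { list.sort }                # sort the entries
)
pipe.result? # ["a.rb", "b.rb"]
```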

Compatibility to Avisynth

Avisynth was a really interesting program that would allow the user to do various video-related manipulations in a pipe-oriented manner. You could do numerous things with it, including overlaying commercial ads on a TV show with some ad-hoc bar that would censor it away, programmatically. This focus on using avisynth like a script was what really was nifty. I am trying to add support for this into the universal pipe handler from the get go, via ffmpeg, but who knows how far I'll get with this.

The following subsection shall show some examples how avisynth could be used:

# Add a subtitle
AviSource("foobar.avi")
Subtitle("Hello World!", size=56, align=5)

# Flip the video vertically
DirectShowSource("foobar.avi")
FlipVertical

# resize the dimensions of the video frame to 320x240
LanczosResize(320, 240)

# fade-in the first 15 frames from black
FadeIn(15)
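Since the plan is to go through ffmpeg, one conceivable approach is to translate such calls into an ffmpeg video-filter chain. The mapping below is a rough, hypothetical sketch and not the project's implementation: vflip, scale and fade are existing ffmpeg filters, but they are only approximate counterparts of the Avisynth functions, and Subtitle is left out here.

```ruby
# Approximate ffmpeg filter counterparts for some Avisynth calls.
# The table and both method names are assumptions for illustration.
AVISYNTH_TO_FFMPEG = {
  'FlipVertical'  => -> { 'vflip' },
  'LanczosResize' => ->(width, height) { "scale=#{width}:#{height}:flags=lanczos" },
  'FadeIn'        => ->(frames) { "fade=t=in:s=0:n=#{frames}" }
}.freeze

# calls is an Array such as [['FlipVertical'], ['FadeIn', 15]].
# Returns the comma-separated filter chain for ffmpeg's -vf option.
def ffmpeg_filter_chain(calls)
  calls.map { |name, *args| AVISYNTH_TO_FFMPEG.fetch(name).call(*args) }.join(',')
end

# Build the argument list for an ffmpeg invocation. Nothing is run.
def ffmpeg_command(input, output, calls)
  ['ffmpeg', '-i', input, '-vf', ffmpeg_filter_chain(calls), output]
end

calls = [['FlipVertical'], ['LanczosResize', 320, 240], ['FadeIn', 15]]
ffmpeg_filter_chain(calls)
# "vflip,scale=320:240:flags=lanczos,fade=t=in:s=0:n=15"
```

The returned Array from ffmpeg_command could be passed to Kernel#system once ffmpeg is installed; an unknown Avisynth call raises a KeyError through fetch.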

Contact information and mandatory 2FA coming up in 2022

If your creative mind has ideas and specific suggestions to make this gem more useful in general, feel free to drop me an email at any time, via:

shevy@inbox.lt

Before that email I used an email account at Google gmail, but in 2021 I decided to slowly abandon gmail for various reasons. In order to limit this explanation here, allow me to just briefly state that I do not feel as if I want to promote any Google service anymore, for various reasons.

Do keep in mind that responding to emails may take some time, depending on the amount of work I may have at that moment.

In 2022 rubygems.org decided to make 2FA mandatory for every gem owner: see https://blog.rubygems.org/2022/06/13/making-packages-more-secure.html

As I can not use 2FA, for reasons I will skip explaining here (see various github issue discussions in the past), this effectively means that Betty Li and others who run the show at rubygems.org will perma-ban me from using rubygems as a developer.

As I disagree with that decision completely, this will mean that all my gems will be removed from rubygems.org prior to that sunset date, i.e. before they permanently lock me out from the code that I used to maintain. It is pointless to want to discuss this with them anymore - they have made up their minds and decided that you can only use the code if 2FA is in place, even though the code itself works just fine. If you don't use 2FA you are effectively locked out from your own code; I consider this a malicious attack. See also how they limited discussions on the ruby-bugtracker to people with mandatory 2FA, thus permanently banning everyone without 2FA:

https://bugs.ruby-lang.org/issues/18800

Guess it may indeed be time to finally abandon ruby - not because ruby is a bad language, but there are people now in charge who really should not be part of ruby in the first place.