An engine for capturing data from almost anywhere, parsing it into an almost universal data type, automatically finding specific sub-types by matching the data, and passing the result to SQL or R (statistical analysis). Its current development status is Proof-of-Concept, and it is likely to remain at this level since the project is excessively ambitious and I am still learning (aren't we all?). Proof-of-Concept means some parts are still research projects while others are incompletely implemented. With this in mind, the last year has focused on developing a git workflow and tools for sorting code into git branches along a spectrum from deliverable to experimental. Interested parties are also warned that one of my current research topics is the appropriate use of git rebase, so the latest development may not be visible (awaiting rebase), or published branches may have been rebased in ways that complicate merging.
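To make the idea of "finding specific sub-types by matching data" concrete, here is a minimal sketch in Python. It is not the project's implementation: the names `MATCHERS`, `infer_subtype`, and `SQL_TYPES`, the choice of sub-types, and the date format are all assumptions made for illustration. The sketch treats every captured value as a string (standing in for the near-universal type) and picks the most specific sub-type that every non-empty value matches.

```python
import re
from datetime import datetime


def _is_int(s):
    # Optional sign followed by digits only.
    return re.fullmatch(r"[+-]?\d+", s) is not None


def _is_float(s):
    # Anything Python's float() accepts.
    try:
        float(s)
        return True
    except ValueError:
        return False


def _is_date(s):
    # Assumption: only ISO-style dates (YYYY-MM-DD) are recognized.
    try:
        datetime.strptime(s, "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Hypothetical matchers, tried in order from most to least specific.
MATCHERS = [("integer", _is_int), ("real", _is_float), ("date", _is_date)]

# Hypothetical mapping from an inferred sub-type to a SQL column type.
SQL_TYPES = {"integer": "INTEGER", "real": "REAL", "date": "DATE", "text": "TEXT"}


def infer_subtype(values):
    """Return the first sub-type that all non-empty values match, else 'text'."""
    present = [v.strip() for v in values if v.strip()]
    if not present:
        return "text"
    for name, matches in MATCHERS:
        if all(matches(v) for v in present):
            return name
    return "text"
```

For example, `infer_subtype(["1", "2.5"])` falls through the integer matcher and settles on `"real"`, which `SQL_TYPES` would map to a `REAL` column. Ordering the matchers from specific to general is a design choice of this sketch; a real engine would likely need many more sub-types and some tolerance for dirty data.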
Currently the project is developed as a single repository, as this is simplest for a single developer making frequent changes in architecture. If a developer were interested in working on a subset, sub-repositories could be created. The following is an incomplete outline of the structure of the project and likely sub-repositories:
Domain Specific Language
|Class / Module||branch|