MovieDB
MovieDB is a Ruby wrapper for fetching raw movie or TV data from IMDb and performing a variety of statistical analyses and computations. Its goal is to help media producers make high-level, structured decisions based on realistic figures.
The fetched data is stored in memory using Redis. An expiration time of 1800 seconds (30 minutes) is set for every load.
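To illustrate the 1800-second expiry, here is a minimal sketch of a read-through cache. It is not the gem's internal code: `cached_fetch`, `InMemoryStore`, and the `movie:` key prefix are hypothetical names chosen for this example. `InMemoryStore` only imitates the `setex(key, ttl, value)` and `get(key)` calls of a Redis client so the sketch runs without a server.

```ruby
# Sketch of a 1800-second cache in front of a fetch.
# `cached_fetch` and `InMemoryStore` are hypothetical, not part of movieDB.
CACHE_TTL_SECONDS = 1800

# Stand-in for a Redis client so the example runs without a server.
# It mimics only the two calls used below: setex(key, ttl, value) and get(key).
class InMemoryStore
  def initialize(clock = -> { Time.now })
    @clock = clock
    @data = {}
  end

  # Store a value together with the time at which it expires.
  def setex(key, ttl, value)
    @data[key] = [value, @clock.call + ttl]
    'OK'
  end

  # Return the value, or nil when the key is missing or expired.
  def get(key)
    value, expires_at = @data[key]
    return nil if value.nil? || @clock.call >= expires_at

    value
  end
end

# Return the cached value for an IMDb id, or run the block and cache its result.
def cached_fetch(store, imdb_id)
  key = "movie:#{imdb_id}"
  cached = store.get(key)
  return cached if cached

  fresh = yield(imdb_id)
  store.setex(key, CACHE_TTL_SECONDS, fresh)
  fresh
end
```

With the redis gem installed and a server running, a `Redis.new` client could presumably be passed in place of `InMemoryStore`, since it responds to the same two calls. Redis stores strings, so structured data would need to be serialized (for example as JSON) first.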
Basic functions and Data Analysis:
- Data Analysis
- Exploratory Data Analysis
- Confirmatory Data Analysis
Installation
Add this line to your application’s Gemfile:
gem 'movieDB'
And then execute:
$> bundle install
Or install it yourself as:
$> gem install movieDB
Require - loading the libraries
$> irb
$> require 'movieDB'
Usage - Fetch Raw Movie Data From IMDb
$> imdb_ids = ["0369610", "3079380"]
$> MovieDB::Movie.get_data(imdb_ids)
Note: you can add as many IMDb IDs as you like, but do not exceed the maximum request rate.
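One way to stay under a request limit is to fetch IDs in small batches with a pause between them. The sketch below is an assumption about how that could be done, not a feature of the gem: `fetch_in_batches`, the batch size, and the pause length are all hypothetical, and the actual rate limit is not stated here.

```ruby
# Hypothetical helper: split the ids into batches and pause between batches.
# The caller supplies the actual fetch as a block.
def fetch_in_batches(imdb_ids, batch_size: 5, pause_seconds: 1.0,
                     sleeper: ->(seconds) { sleep(seconds) })
  batches = imdb_ids.each_slice(batch_size).to_a
  batches.each_with_index.map do |batch, index|
    # No pause before the first batch, one pause before every later batch.
    sleeper.call(pause_seconds) if index > 0
    yield(batch)
  end
end
```

Usage with the call shown above might look like `fetch_in_batches(imdb_ids, batch_size: 2) { |batch| MovieDB::Movie.get_data(batch) }`. The `sleeper` parameter exists only so the pause can be replaced in tests.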
IMDb Data
When IMDb data is fetched, two things happen.
First, a reports folder is created in the movieDB gem.
Second, the fetched data is written to an .xls file and stored in the reports directory.
The file name is ‘imdb_’ + the titles of the films you requested + today’s date.
For example, the data fetched above can be opened with:
$ open /reports/imdb_JurassicWorld_Spy_20150611.xls
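The naming pattern can be sketched in a few lines of Ruby. `report_file_name` is a hypothetical helper, not a method of the gem. Stripping spaces from each title and formatting the date as `YYYYMMDD` are assumptions inferred from the example file name above.

```ruby
require 'date'

# Hypothetical helper mirroring the naming pattern described above:
# 'imdb_' + titles (spaces removed, joined by '_') + '_' + date + '.xls'.
def report_file_name(titles, date = Date.today)
  squashed = titles.map { |title| title.gsub(/\s+/, '') }
  "imdb_#{squashed.join('_')}_#{date.strftime('%Y%m%d')}.xls"
end

puts report_file_name(['Jurassic World', 'Spy'], Date.new(2015, 6, 11))
# prints "imdb_JurassicWorld_Spy_20150611.xls"
```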
Usage - Analyze Raw Data and Generate Statistical Results (4 Steps)
$ irb
> require 'MovieDB/data_analysis'
> require 'MovieDB/data_process'
> MovieDB::DataProcess.send(:basic_statistic, 'imdb_JurassicWorld_Spy_20150611.xls')
Exported - Analyzed Data
The exported analyzed data is stored in your reports directory.
$ open /reports/basic_statistic_20150611.xls
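The columns written by `basic_statistic` are not listed here. As an illustration only, the kind of descriptive summary such a report might contain can be sketched as follows. `describe` is a hypothetical helper and is not the gem's implementation.

```ruby
# Hypothetical helper; not the gem's basic_statistic implementation.
# Computes a few common descriptive statistics for a list of numbers.
def describe(values)
  raise ArgumentError, 'values must not be empty' if values.empty?

  n = values.size
  sorted = values.sort
  mean = values.sum.to_f / n
  mid = n / 2
  median = n.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  # Sample variance (divides by n - 1); defined as zero for a single value.
  variance = n > 1 ? values.sum { |v| (v - mean)**2 } / (n - 1) : 0.0

  { count: n, mean: mean, median: median,
    std_dev: Math.sqrt(variance), min: sorted.first, max: sorted.last }
end
```

For example, `describe([2, 4, 4, 4, 5, 5, 7, 9])` returns a hash with a mean of 5.0 and a median of 4.5.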
What’s Next
More statistical computations coming soon:
:GaussNewtonAlgorithm
> Iteratively_Reweighted_Least_Squares
> Lack_Of_Fit_Sum_Of_Squares
> Least_Squares_Support_Vector_Machine
> Mean_Squared_Error
> Moving_Least_Squares
> Non_Linear_Iterative_Partial_Least_Squares
> Non_Linear_Least_Squares
> Ordinary_Least_Squares
> Partial_Least_Squares_Regression
> Partition_Of_Sums_Of_Squares
> Proofs_Involving_Ordinary_Least_Squares
> Residual_Sum_Of_Squares
> Total_Least_Squares
> Total_Sum_Of_Squares
:EstimationOfDensity
> Cluster_Weighted_Modeling
> Density_Estimation
> Discretization_Of_Continuous_Features
> Mean_Integrated_Squared_Error
> Multivariate_Kernel_Density_Estimation
> Variable_Kernel_Density_Estimation
:ExploratoryDataAnalysis
> Data_Reduction
> Table_Diagonalization
> Configural_Frequency_Analysis
> Median_Polish
> Stem_And_Leaf_Display
> Data_Mining
> Applied_DataMining
> Cluster_Analysis
> Dimension_Reduction
> RegressionAnalysis
> Choice_Modelling
> Generalized_Linear_Model
> Binomial_Regression
> Generalized_Additive_Model
> Linear_Probability_Model
> Poisson_Regression
> Zero_Inflated_Model
> Nonparametric_Regression
> Statistical_Outliers
> Regression_And_Curve_Fitting_Software
> Regression_Diagnostics
> Regression_Variable_Selection
> Regression_With_Time_Series_Structure
> Robust_Regression
> Choice_Modeling
> Resampling
> Bootstrapping_Population
> Sensitivity_Analysis
> Variance_Based_Sensitivity_Analysis
> Elementary_Effects_Method
> Experimental_Uncertainty_Analysis
> Fourier_Amplitude_Sensitivity_Testing
> Hyperparameter
> Time_Series_Analysis
> Frequency_Deviation
Contact me
If you’d like to collaborate, please feel free to fork the source code on GitHub.
You can also contact me at [email protected]