RockBuster and XGBoost

Modeling the most important features for rental profitability.

RockBuster SQL

The RockBuster data is a mock movie rental company database organized in a snowflake schema. To find ways to improve profitability, the analysis uses SQL, Python, and Tableau. A sample of the initial SQL queries is collected below, along with the data dictionary.

The subquery could be avoided altogether to get this average, but that would leave us with a less dynamic query, one that would need to be updated by hand as the database evolves.
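As a minimal sketch of why the subquery stays dynamic, the example below uses Python's built-in sqlite3 module with a made-up payment table. The table name, columns, and values are assumptions for illustration only and are not taken from the RockBuster schema or the original query.

```python
import sqlite3

# Hypothetical miniature payment table; the real RockBuster schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, 2.0), (1, 4.0), (2, 6.0), (3, 8.0), (3, 10.0)],
)

# The inner query totals each customer's payments; the outer query
# averages those totals at run time, so nothing is hard-coded.
dynamic_sql = """
SELECT AVG(total_paid)
FROM (
    SELECT customer_id, SUM(amount) AS total_paid
    FROM payment
    GROUP BY customer_id
) AS customer_totals
"""
avg_total = conn.execute(dynamic_sql).fetchone()[0]
print(avg_total)

# When the data changes, the same query picks up the new rows unchanged.
conn.execute("INSERT INTO payment VALUES (4, 30.0)")
avg_after_insert = conn.execute(dynamic_sql).fetchone()[0]
print(avg_after_insert)
```

A version with the intermediate values typed in by hand would return the first result in both cases until someone edited it.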

RockBuster XGBoost and SHAP

XGBoost, short for eXtreme Gradient Boosting, is a popular machine learning algorithm known for its scalability and efficiency in handling large datasets. It uses a decision tree ensemble technique called gradient boosting, in which each new tree is fit to correct the errors of the trees before it, combining many weak learners into a highly accurate model for classification and regression tasks.
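The core idea of gradient boosting can be sketched in plain Python. This is a simplified illustration for squared error with one-split trees (stumps) on a single feature, not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations. The data and function names are made up for the example.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep the right side non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict(base, stumps, learning_rate, x):
    """Sum the base prediction and every stump's shrunken correction."""
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if x <= threshold else right_mean)
    return total


def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Fit each new stump to the residuals left by the ensemble so far."""
    base = sum(ys) / len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - predict(base, stumps, learning_rate, x)
                     for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return base, stumps


def mse(base, stumps, learning_rate, xs, ys):
    return sum((y - predict(base, stumps, learning_rate, x)) ** 2
               for x, y in zip(xs, ys)) / len(ys)


# Toy data, invented for illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0]

base, stumps = boost(xs, ys)
initial_mse = mse(base, [], 0.3, xs, ys)   # mean-only prediction
final_mse = mse(base, stumps, 0.3, xs, ys)  # after 50 boosting rounds
print(initial_mse, final_mse)
```

Each stump on its own is a weak predictor; the accuracy comes from adding them one after another, each aimed at what the previous ones got wrong.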

SHAP (SHapley Additive exPlanations) values quantify the impact of individual features on model predictions to offer insights into the importance of each input variable.
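To make the idea concrete, here is a small sketch that computes exact Shapley values by enumerating every feature subset. It assumes a single baseline row stands in for "missing" features, which is a simplification; the SHAP library uses faster, tree-specific and sampling-based estimators over background data. The toy profit function and feature values are hypothetical and are not the RockBuster model.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction, relative to a baseline row."""
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Weight of this coalition in the Shapley average.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi += weight * (model(with_i) - model(without_i))
        values.append(phi)
    return values


def toy_profit(features):
    """Hypothetical model: (rental rate - cost per rental) * number of rentals."""
    rental_rate, cost, rentals = features
    return (rental_rate - cost) * rentals


x = [4.99, 1.00, 20]         # the row being explained (made-up values)
baseline = [2.99, 2.00, 10]  # reference row (made-up values)

phi = shapley_values(toy_profit, x, baseline)
for name, value in zip(["rental_rate", "cost", "rentals"], phi):
    print(name, round(value, 2))
```

The "additive" part of the name shows up here: the per-feature values sum to the difference between the prediction for `x` and the prediction for the baseline.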

Targeting gross profit, the XGBoost decision tree model has a mean squared error of 0.0087, which suggests strong predictive capability. Variable leakage is likely a key issue, however, so that low error should be read with caution. Even so, highlighting the underlying components provides direction toward some useful insights. Let's look at how the model interprets a simple business heuristic: "Buy low, sell high."
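A mean squared error is easiest to judge against the scale of the target. The sketch below compares a model's error with the error from always predicting the mean, which is one common sanity check. The numbers are invented for illustration and are not RockBuster results.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical scaled gross-profit targets and predictions.
actual = [0.10, 0.40, 0.35, 0.80, 0.60]
predicted = [0.15, 0.35, 0.40, 0.70, 0.65]

model_mse = mean_squared_error(actual, predicted)

# Baseline: always predict the mean of the target.
mean_target = sum(actual) / len(actual)
baseline_mse = mean_squared_error(actual, [mean_target] * len(actual))

# Share of the baseline error removed by the model (R-squared).
r_squared = 1 - model_mse / baseline_mse
print(model_mse, baseline_mse, r_squared)
```

If leakage is present, this comparison will still look good, so it complements rather than replaces a check of which features the model leans on.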

Conclusions

The main takeaway from this analysis is the importance of base costs and pricing. Interpretation of the model puts quantifiable strategy in focus. Here we can see the three levels of rates that RockBuster pays for each movie copy, and the

Full RockBuster PDF

More Projects

Economy of Women's Rights

A five-decade journey into women's rights and economic growth.


Influenza Staffing Strategy

CDC dataset analysis of flu season in the US.
