Backorder Prediction for Supply Chain: A Data-Driven Approach

For this project, I took on the challenge of developing a backorder prediction model for a supply chain using Python and several of its libraries, including pandas, Matplotlib, and Seaborn. The data set I worked with consisted of 10 million rows and 23 columns, so cleaning and preparing it for analysis called for substantial data wrangling.

I began by handling missing values and dropping unnecessary columns, as well as addressing any outliers present in the data. I then conducted an in-depth exploratory data analysis, utilizing various graphical techniques to understand the relationships between different features and the likelihood of a backorder occurring.
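The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names (`sku`, `lead_time`, `national_inv`, `went_on_backorder`), the median imputation, and the 1st/99th percentile clipping are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def clean_inventory(df, drop_cols, numeric_cols):
    """Drop unneeded columns, fill missing numerics, and clip outliers."""
    out = df.drop(columns=drop_cols)
    for col in numeric_cols:
        # Assumption: impute missing values with the column median.
        out[col] = out[col].fillna(out[col].median())
        # Assumption: treat values outside the 1st-99th percentile as outliers
        # and clip them to those bounds instead of dropping the rows.
        lower = out[col].quantile(0.01)
        upper = out[col].quantile(0.99)
        out[col] = out[col].clip(lower=lower, upper=upper)
    return out


# Tiny hypothetical sample standing in for the real data set.
raw = pd.DataFrame({
    "sku": [1, 2, 3, 4, 5],
    "lead_time": [8.0, np.nan, 2.0, 4.0, 900.0],
    "national_inv": [10.0, 0.0, 5.0, np.nan, 7.0],
    "went_on_backorder": ["No", "No", "Yes", "No", "No"],
})

cleaned = clean_inventory(
    raw,
    drop_cols=["sku"],
    numeric_cols=["lead_time", "national_inv"],
)
print(cleaned)
```

Clipping keeps every row, which matters when the positive class is rare. Dropping outlier rows instead is an equally reasonable choice, depending on what the exploratory analysis shows.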

With a solid understanding of the data and its patterns, I set out to build a prediction model using the Scikit-learn library. After experimenting with several models and comparing their performance on metrics such as accuracy and F1 score, I selected the best-performing model for this task.
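A model comparison of this kind might look like the sketch below. The write-up does not name the candidate models, so logistic regression and a random forest are stand-ins. The synthetic data from `make_classification`, the validation split, and the choice to rank by F1 are likewise assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the backorder data (about 10% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical candidates; the original project may have used others.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_val)
    scores[name] = {
        "accuracy": accuracy_score(y_val, pred),
        "f1": f1_score(y_val, pred),
    }

# Assumption: rank by F1, since accuracy alone can look high on imbalanced data.
best_name = max(scores, key=lambda n: scores[n]["f1"])
print(scores)
print("selected:", best_name)
```

Selecting on a validation split keeps the testing data untouched until the final check.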

Finally, I applied this model to the testing data to confirm its effectiveness and fine-tune its performance. This project allowed me to showcase my skills in data manipulation, analysis, and modeling, as well as my ability to work with large and complex data sets.

Click here or on the image to download the report

Click here to see the project on GitHub