Backorder Prediction for Supply Chain: A Data-Driven Approach
For this project, I developed a backorder prediction model for a supply chain using Python and several of its libraries, including pandas, Matplotlib, and Seaborn. The data set consisted of 10 million rows and 23 columns, so it took substantial data wrangling to clean and prepare the data for analysis.
I began by handling missing values and dropping unnecessary columns, as well as addressing any outliers present in the data. I then conducted an in-depth exploratory data analysis, utilizing various graphical techniques to understand the relationships between different features and the likelihood of a backorder occurring.
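To give a sense of what this cleaning step can look like, here is a minimal sketch in pandas. It is not the exact code from the project: the column names (`sku`, `national_inv`, `lead_time`, `went_on_backorder`), the median fill, and the 1st/99th percentile clipping are all assumptions chosen for illustration, and the tiny sample frame is made up.

```python
import numpy as np
import pandas as pd


def clean_backorder_data(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; column names and thresholds are assumptions."""
    df = df.copy()

    # Drop an identifier column that carries no predictive signal (hypothetical name).
    df = df.drop(columns=["sku"], errors="ignore")

    # Rows without a target label cannot be used for supervised learning.
    df = df.dropna(subset=["went_on_backorder"])

    # Fill remaining numeric gaps with each column's median.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # Tame outliers by clipping each numeric column to its 1st-99th percentile range.
    lower = df[numeric_cols].quantile(0.01)
    upper = df[numeric_cols].quantile(0.99)
    df[numeric_cols] = df[numeric_cols].clip(lower=lower, upper=upper, axis=1)

    return df


# Made-up sample rows, only to show the function running end to end.
sample = pd.DataFrame(
    {
        "sku": [1, 2, 3, 4, 5],
        "national_inv": [10.0, 12.0, np.nan, 11.0, 10000.0],
        "lead_time": [8.0, np.nan, 2.0, 4.0, 8.0],
        "went_on_backorder": ["No", "Yes", "No", None, "No"],
    }
)
cleaned = clean_backorder_data(sample)
```

Clipping rather than deleting outliers keeps every labeled row, which can matter when the event being predicted is rare; dropping the rows instead is an equally reasonable choice depending on the data.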
With a solid understanding of the data and its patterns, I set out to build a prediction model using the Scikit-learn library. After experimenting with various models and carefully evaluating their performance with metrics such as accuracy and F1 score, I selected the best model for this task.
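The comparison loop can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's actual code: the two candidate models, the synthetic imbalanced data from `make_classification`, and the choice to rank by F1 on a validation split are all stand-ins picked for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the real backorder data (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Hold out a validation split; stratify so both splits keep the class ratio.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models are illustrative; the original write-up does not name them.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_val)
    results[name] = {
        "accuracy": accuracy_score(y_val, preds),
        "f1": f1_score(y_val, preds),
    }

# Rank by F1 rather than accuracy, since the positive class is the minority.
best_name = max(results, key=lambda n: results[n]["f1"])

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, f1={scores['f1']:.3f}")
print("selected:", best_name)
```

Looking at F1 alongside accuracy matters when one class is rare: a model that always predicts the majority class can score high accuracy while never flagging a single positive case.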
Finally, I applied this model to the test data to confirm its effectiveness and fine-tune its performance. This project allowed me to showcase my skills in data manipulation, analysis, and modeling, as well as my ability to work with large and complex data sets.