Author: John W.S. Lee
This study was motivated by a simple question: "Can a trained machine learning model perform as well as an analytical solution?" To find out, the question was tested on data from the field of polymer extrusion processes.
First, a dataset for machine learning was prepared by generating throughput data using the extrucal library. The data included various extruder sizes, screw geometries, polymer melt density, and screw RPMs. Basic exploratory data analysis was then performed to examine the distribution of features and the target variable. Skewed features were subjected to a log transformation. Using the transformed data, cross-validation was carried out with multiple machine learning models. The best model was selected based on the cross-validation score, specifically the mean squared error. Once the best model was chosen, hyperparameter optimization was performed. The performance of the selected machine learning model was compared before and after optimization for extruders ranging in size from 25 mm to 250 mm.
The evaluation of the model's performance revealed good agreement between the throughputs predicted by the machine learning model and those from the analytical solution, extrucal. However, significant disparities were also observed for certain extruder sizes. The following is a summary report of this study, and the actual code used can be found in the notebook folder.
The extrusion throughput dataset was generated using the extrucal.throughput_cal() function and the following 7 parameters.
- extruder_size: Sizes of extruders ranging from 20mm to 250mm with an increment of 10mm.
- metering_depth_percent: Depths of the metering section of extrusion screws ranging from 2% to 10% of extruder sizes.
- polymer_density: Melt density of polymer materials ranging from 800 to 1500 kg/m^3.
- screw_pitch_percent: Screw pitch ranging from 0.6D to 2D.
- flight_width_percent: Flight width of screws ranging from 0.06D to 0.2D.
- number_flight: Number of flights with a choice of 1 or 2.
- rpm: Screw RPMs ranging from 0 to 90.
To add randomness to the throughputs in the dataset, a variation of +/- 5% was applied to the throughputs calculated by extrucal.throughput_cal().
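The generation step can be sketched roughly as follows. This is a minimal illustration, not the notebook code: random sampling of the parameter ranges, integer RPM values, and uniformly distributed variation are all assumptions, and throughput_fn is a hypothetical callable that, in real use, would wrap extrucal.throughput_cal() (its argument mapping is not given here, so it is left out).

```python
import random


def sample_row(rng):
    """Draw one parameter combination within the ranges listed above."""
    size = rng.choice(range(20, 251, 10))                   # extruder_size, mm
    return {
        "extruder_size": size,
        "metering_depth": size * rng.uniform(0.02, 0.10),   # 2% to 10% of size
        "polymer_density": rng.uniform(800, 1500),          # kg/m^3
        "screw_pitch": size * rng.uniform(0.6, 2.0),        # 0.6D to 2D
        "flight_width": size * rng.uniform(0.06, 0.2),      # 0.06D to 0.2D
        "number_flight": rng.choice([1, 2]),
        "rpm": rng.randint(0, 90),                          # assumed integer RPM
    }


def add_variation(value, rng, fraction=0.05):
    """Scale value by a random factor in [1 - fraction, 1 + fraction]."""
    return value * (1 + rng.uniform(-fraction, fraction))


def make_dataset(throughput_fn, n_rows, seed=0):
    """Build n_rows records; throughput_fn(row) returns the exact throughput."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        row = sample_row(rng)
        row["throughput"] = add_variation(throughput_fn(row), rng)
        rows.append(row)
    return rows
```

Passing the analytical function in as an argument keeps the sketch independent of any particular library version and makes the noise step easy to check in isolation.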
The following graphs show the distribution of features. metering_depth, screw_pitch, and flight_width show skewness to a certain degree.
Log-transformation was applied to the 3 features, and the following are the results.
The target, throughput, also showed a strong skewness as shown below. Therefore log-transformation was applied to it.
After log-transformation of the target, the skewness disappeared. However, since there were many zero-throughput data points at a screw RPM of zero, there was a sharp peak in the graph as shown below.
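The exact transform is not stated in this report. Because the target contains zeros, a plain logarithm would be undefined, so the sketch below assumes log(1 + x). Under that assumption every zero throughput maps to exactly zero, which would pile up as a single spike like the one described. The throughput values are made up for illustration.

```python
import math


def log_transform(values):
    """log(1 + x): defined at x = 0, unlike a plain logarithm."""
    return [math.log1p(v) for v in values]


def inverse_log_transform(values):
    """exp(x) - 1: maps model predictions back to the original scale."""
    return [math.expm1(v) for v in values]


throughputs = [0.0, 0.0, 0.5, 12.0, 850.0]   # made-up values
transformed = log_transform(throughputs)
restored = inverse_log_transform(transformed)
```

When the target is transformed this way, predictions have to be passed through the inverse before they are compared with the analytical throughputs.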
Cross-validation was carried out using 6 different machine learning models: Ridge, Lasso, RandomForestRegressor, XGBRegressor, LGBMRegressor, and CatBoostRegressor. mean_squared_error was used as the metric, and the following table shows the results.
CatBoostRegressor performed best among the models.
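For readers unfamiliar with the selection procedure, here is a minimal standard-library sketch of k-fold cross-validation scored by mean squared error. It is not the notebook code: the two toy candidates (a mean predictor and a one-feature least-squares line) are hypothetical stand-ins for the six models above, and the data are synthetic. In practice scikit-learn's cross_val_score with scoring="neg_mean_squared_error" does the same job.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def k_fold_splits(n, k, seed=0):
    """Yield (train_idx, valid_idx) pairs; each row is validated exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid


def cross_val_mse(fit, X, y, k=5):
    """fit(X_train, y_train) must return a predict(X) callable."""
    scores = []
    for train, valid in k_fold_splits(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = predict([X[i] for i in valid])
        scores.append(mse([y[i] for i in valid], y_pred))
    return mean(scores)


def fit_mean(X, y):
    """Toy candidate 1: always predict the training mean."""
    m = mean(y)
    return lambda X_new: [m for _ in X_new]


def fit_line(X, y):
    """Toy candidate 2: ordinary least squares on a single feature."""
    mx, my = mean(X), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, y))
    sxx = sum((a - mx) ** 2 for a in X)
    slope = sxy / sxx
    return lambda X_new: [my + slope * (a - mx) for a in X_new]


X = [float(i) for i in range(40)]      # synthetic single feature
y = [3.0 * x + 2.0 for x in X]         # synthetic linear target
scores = {name: cross_val_mse(fit, X, y)
          for name, fit in [("mean", fit_mean), ("line", fit_line)]}
best = min(scores, key=scores.get)     # lowest cross-validated MSE wins
```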
The Optuna library was used for the hyperparameter optimization of the CatBoostRegressor model. The following shows the throughput results predicted by the CatBoostRegressor models before and after optimization, together with the analytical solution (from the extrucal library).
For the 25mm extruder, the predictions were poor both before and after hyperparameter optimization.
The CatBoostRegressor model was trained with extruder_size in the range from 20mm to 250mm with a 10mm increment. The previous results showed that the model did not perform well for the 25mm extruder, a size that was not used for training the model. The model was therefore tested to see whether it would perform better for extruder sizes that were included in the train data.
There are clear disparities between the throughputs predicted by the model and those from the analytical solution (i.e. the extrucal library) for the extruder_size values that were not in the train data. The disparity was largest for the smallest extruder (i.e. 25mm), possibly because its throughputs were an order of magnitude smaller than those of the other sizes and mean_squared_error, which is dominated by large absolute errors, was used as the evaluation metric. On the other hand, the throughputs predicted for the extruder_size values that were in the train data were almost identical to those calculated by the analytical solution (i.e. the extrucal library).
When the two cases were compared using mean_absolute_percentage_error, the error was 1.14% for the extruder_size values present in the train data, whereas it was 6.92% for the extruder_size values that were not in the train data.
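The difference between the two metrics can be illustrated with a small sketch. The numbers are made up: a small and a large extruder are both predicted 10% too high. Skipping zero targets in the percentage error is an assumption made here, since this report does not say how zero-throughput rows were handled.

```python
def mse(y_true, y_pred):
    """Mean squared error: grows with the square of the absolute error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in %, skipping zero targets (assumption)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100 * sum(abs(t - p) / abs(t) for t, p in pairs) / len(pairs)


# Made-up throughputs, both predicted 10% too high.
small_true = [1.0, 2.0, 3.0]
large_true = [1000.0, 2000.0, 3000.0]
small_pred = [1.1 * t for t in small_true]
large_pred = [1.1 * t for t in large_true]

mse_ratio = mse(large_true, large_pred) / mse(small_true, small_pred)
mape_small = mape(small_true, small_pred)
mape_large = mape(large_true, large_pred)
```

With the same relative error, the squared error of the large extruder is about a million times that of the small one, while the percentage error is identical for both. A model tuned on mean squared error therefore has little incentive to fit small throughputs closely.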
To find out whether the machine learning model correctly learned the effect of each extrusion parameter on the throughput, the feature importances of a machine learning model were investigated using the shap library. To save computation time, the optimized LightGBM model (whose optimization process is shown in Appendix 2) was used to check the feature importances.
As in actual extrusion processes, rpm and extruder_size were the two most important processing parameters for the model. The ranking of the remaining processing parameters also made sense.
The effect of each processing parameter on the throughput was correctly displayed. For example, the throughput increased with increasing rpm, extruder_size, metering_depth, screw_pitch, and polymer_density, whereas it decreased with increasing number_flight and flight_width.
This study started with the simple purpose of demonstrating that a machine learning model can learn a very complicated pattern and can perform as well as an analytical solution. However, while I was working on the modeling, I found that the model did not perform well for the smallest extruder (i.e. 25mm). Initially, I thought this was because the throughputs at zero screw RPM were included in the train data. I also suspected that either the log transformation of the throughput had affected the performance of the model (because the distribution of throughputs after log transformation looked unusual) or the throughputs of the 25mm extruder were simply too small to be considered significant by the model. In the end, it was clear that, since CatBoostRegressor, which is a tree-based model, was used, the errors for the extruder_size values that were not included in the train data were higher than those for the sizes that were included. Moreover, the feature importances showed that the trained model correctly learned the effect of each processing parameter in extrusion. For example, the throughput increased with increasing rpm, extruder_size, metering_depth, screw_pitch, and polymer_density, whereas it decreased with increasing number_flight and flight_width.
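Why a tree-based model struggles between training sizes can be shown with a one-feature toy. The sketch below builds a fully grown one-dimensional regression tree, which is simply a step function with one leaf per training point. The cubic target is hypothetical and serves only to provide a curved function. A boosted ensemble such as CatBoost sums many trees and uses all seven features, so its real behavior is less extreme, but its prediction remains piecewise constant in each feature.

```python
import bisect


def fit_step_function(xs, ys):
    """Fully grown 1-D regression tree: one leaf per training point, with
    split thresholds at the midpoints between neighbouring training xs."""
    pairs = sorted(zip(xs, ys))
    xs_sorted = [x for x, _ in pairs]
    leaves = [y for _, y in pairs]
    thresholds = [(a + b) / 2 for a, b in zip(xs_sorted, xs_sorted[1:])]

    def predict(x):
        # x <= threshold falls into the left leaf, as in a decision tree.
        return leaves[bisect.bisect_left(thresholds, x)]

    return predict


def toy_throughput(size):
    # Hypothetical cubic scaling with size; NOT the extrucal formula.
    return (size / 10) ** 3


train_sizes = list(range(20, 251, 10))   # 20mm to 250mm in 10mm steps
predict = fit_step_function(train_sizes, [toy_throughput(s) for s in train_sizes])

exact_at_30 = predict(30) == toy_throughput(30)   # size seen in training
pred_25, true_25 = predict(25), toy_throughput(25)  # size never seen
rel_error_25 = abs(pred_25 - true_25) / true_25
```

At a training size the step function reproduces the target exactly, whereas at 25mm it can only return the leaf value learned for a neighbouring size, so the relative error is large wherever the target changes quickly between training points.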
In conclusion, this study showed that it is possible to train machine learning models on datasets generated by an analytical solution. It would also be interesting to apply machine learning to datasets generated by more sophisticated computational methods, which I plan to explore in future work.
To download the contents of this GitHub page onto your local machine, follow these steps:
- Clone the repository by running the following command in your terminal: git clone https://github.com/johnwslee/extrucal_machine-learning.git
- In your terminal, type: cd extrucal_machine-learning
- Create a conda environment by typing: conda env create -f env.yml
- Activate the environment by typing: conda activate extrucal_ml
- Run the notebooks in the notebook folder in order.