Polycystic Ovarian Syndrome (PCOS) classification and feature selection by machine learning techniques
Date
2020-12
Author
Satish, C. R Nandipati
Chew, XinYing
Khaw, Khai Wah
Abstract
Polycystic Ovarian Syndrome (PCOS) is one of the most common endocrine system disorders, affecting about 5 to 10% of adolescent women. Its symptoms and associated conditions include failure to ovulate, infertility, cardiovascular diseases, and type 2 diabetes. PCOS can be detected through biochemical, clinical and ultrasonography methods. It is known that early diagnosis and treatment could reduce the chance of PCOS. Hence, it is necessary to know which classification models and features play a significant role in predicting the disease. That is the objective of this study, which uses the Python scikit-learn package and RapidMiner. Across both tools, the highest accuracy is achieved by Random Forest (93.12%, RapidMiner) with the complete dataset. KNN and SVM achieve the same accuracy (90.83%, RapidMiner) with 10 selected features. Compared with the combined dataset, the average performance with 10 selected features does not differ significantly, whereas the average performance with 24 selected features does. This indicates that the 10 selected features could be used for the prediction of PCOS, while the 24 selected features cannot. A comparison of the two tools shows that RapidMiner performs better than Python. However, this depends on the performance of the classification model, which in turn depends on the nature of the dataset and the techniques used.
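The kind of comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' pipeline: the data below is synthetic (generated with `make_classification`, not the PCOS dataset), the total of 40 features is a hypothetical choice, and univariate selection with `SelectKBest` and `f_classif` is an assumed stand-in, since the abstract does not name the feature selection method. Hyperparameters and the 5-fold cross-validation are likewise assumptions.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): NOT the PCOS dataset from the study.
X, y = make_classification(
    n_samples=500, n_features=40, n_informative=10, random_state=0
)

# The three classifiers named in the abstract; hyperparameters are assumptions.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}


def evaluate(k=None):
    """Return mean 5-fold CV accuracy per model, keeping the k best features.

    With k=None, no feature selection is applied (complete feature set).
    """
    scores = {}
    for name, model in models.items():
        steps = [StandardScaler()]
        if k is not None:
            # Selection sits inside the pipeline so it is refit on each
            # training fold and never sees the held-out fold.
            steps.append(SelectKBest(f_classif, k=k))
        steps.append(clone(model))
        pipeline = make_pipeline(*steps)
        fold_scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


results = {
    "all features": evaluate(),
    "10 features": evaluate(10),
    "24 features": evaluate(24),
}

for setting, scores in results.items():
    for name, accuracy in scores.items():
        print(f"{setting:>12} | {name:<13} | accuracy = {accuracy:.4f}")
```

The accuracies printed by this sketch describe the synthetic data only and are not comparable to the figures reported in the abstract.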