Comparative Analysis of Parametric, Semi-Parametric and Non-Parametric Methods on Real-Estate Data of the Kathmandu Valley
Abstract
Identifying the factors that affect housing prices from online real-estate listings is a challenging task. Beyond property attributes such as land size, number of storeys, road length, and location, prices also depend on people's preferences, as indicated by the number of views a property receives on these websites. The main aim of this research is to collect relevant data (gathered over a period of one month) from the online real-estate website Ghar Jagga Bazaar and to apply different machine learning techniques to determine house prices in the Kathmandu Valley and the factors affecting them. After thorough preprocessing, initial tests were carried out to check whether the data satisfy the assumptions of a linear model. Several of these assumptions, including homoscedasticity, independence of errors, normality of residuals, and absence of multicollinearity, were found to be violated. Log and square-root (sqrt) transformations of the data brought no significant improvement. Therefore, more robust semi-parametric models with more lenient assumptions, such as Gradient Boosting, were tried and their results are reported. Similarly, non-parametric methods such as Gaussian Processes and Generative Adversarial Networks were explored. The best result was obtained with the state-of-the-art non-parametric Gaussian Processes model, with a mean absolute error of 0.522, a Jensen-Shannon distance of 12.79, and a negative log-likelihood of -1071.91.
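For readers unfamiliar with the three evaluation metrics named above, the following minimal Python sketch shows their textbook definitions. It is an illustration only, not the authors' evaluation code: the function names, the base-2 logarithm, the histogram-style inputs, and the assumption of independent Gaussian predictive distributions are all choices made here, and the abstract does not state how the reported values were binned or scaled.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def _kl_bits(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two histograms.

    Inputs are non-negative weights over the same bins; they are normalised
    to sum to 1. With base-2 logarithms (an assumption of this sketch) the
    result lies in [0, 1].
    """
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution
    divergence = 0.5 * _kl_bits(p, m) + 0.5 * _kl_bits(q, m)
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


def gaussian_nll(y_true, mean, std):
    """Total negative log-likelihood under independent Gaussian predictions.

    `mean` and `std` are the per-point predictive mean and standard deviation,
    as a Gaussian Process regressor would typically provide.
    """
    return sum(
        0.5 * math.log(2 * math.pi * s * s) + (t - m) ** 2 / (2 * s * s)
        for t, m, s in zip(y_true, mean, std)
    )
```

Under these definitions a negative log-likelihood can itself be negative, which happens when the predictive densities exceed 1 (for example, with small predictive standard deviations on a transformed price scale).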
License
Copyright (c) 2025 Sachin Kafle, Kiran Kandel, Nawaraj Paudel
This work is licensed under a Creative Commons Attribution 4.0 International License.