SPATIAL PLANNING TEXT INFORMATION PROCESSING WITH USE OF MACHINE LEARNING METHODS
- 1Institute of Spatial Economy, Wroclaw University of Environmental and Life Sciences, Poland
- 2Institute of Geodesy and Geoinformatics, Wroclaw University of Environmental and Life Sciences, Poland
- 3Institute of Automatic Control and Robotics, Poznan University of Technology, Poland
- 4Faculty of Computer Science and Management, Wrocław University of Science and Technology, Poland
- 5Department of Cartography and Visual Communication, Leibniz Institute for Regional Geography, Leipzig, Germany
- 6Wroclaw Institute of Spatial Information and Artificial Intelligence, Poland
Keywords: spatial planning documents, zoning plan, unsupervised machine learning, LSTM, neural networks, NLP
Abstract. Spatial development plans provide important information on future land development possibilities. Unfortunately, access to planning information in Poland is currently limited. Despite many initiatives to standardize planning documents, no standard for recording plans has yet been developed. Each planning area has a symbol and a land use category, and these differ from plan to plan. For this reason, it is very difficult to carry out analyses that aggregate all areas sharing the same development function.
In this article, the authors conduct experiments on using machine learning methods to process and classify the text part of plans. The main aim was to find the best method for grouping the texts of zones with the same land use. The experiment attempts to automatically classify the texts of regulations for individual areas into 10 defined land use categories. This makes it possible to predict the future land use function from the text regulation of a specific zone and to aggregate all zones with a specific land use type.
To solve the classification problem for heterogeneous planning information, the authors used the k-means algorithm and artificial neural networks. The main challenge, however, was not the design of the classification tool but the preprocessing of the text. This paper presents an approach to text preprocessing as well as selected methods of text classification. The results indicate that CNNs are better suited to the problem presented. K-means clustering produces clusters in which texts are not grouped according to land use function, which makes it unsuitable for zone aggregation.
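As an illustration of the clustering step mentioned above, the following is a minimal sketch of k-means applied to short zone texts. It is not the authors' pipeline: the TF-IDF weighting, the tokenizer, the deterministic seeding, and the English sample texts are all assumptions made for this example, since the abstract does not specify the vectorization or preprocessing used. On such clean toy input the clusters happen to separate by land use; the abstract reports that on real plan texts they do not.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Return L2-normalised TF-IDF vectors (dense lists) and the vocabulary."""
    docs = [Counter(tokenize(t)) for t in texts]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    # Smoothed inverse document frequency (assumed weighting scheme).
    idf = {w: math.log((1 + n) / (1 + sum(w in d for d in docs))) + 1 for w in vocab}
    vectors = []
    for d in docs:
        v = [d[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors, vocab


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical zone texts, invented for illustration only.
texts = [
    "single family residential housing with gardens",
    "industrial production and warehouses",
    "multi family residential housing",
    "warehouses and industrial storage",
]
vectors, vocab = tfidf_vectors(texts)
labels = kmeans(vectors, k=2)
print(labels)
```

Because k-means is unsupervised, the cluster indices carry no land use meaning by themselves; mapping clusters to the 10 defined categories would still require labelled examples, which is one reason a supervised classifier may fit the task better.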