CADASTRAL AND URBAN MAPS ENRICHMENTS USING SMART SPATIAL DATA FUSION

Cadastral and urban map enrichment and upgrading are essential requirements for smart urban management. The high pace of development and change in megacities makes it challenging for urban organizations to reproduce their maps according to their needs. New urban management aims and plans need new cadastral and urban maps with different standards and elements, which may already exist in other urban organizations. Producing an original map, or visually checking the maps of different organizations, is very costly and time-consuming in a megacity. Overcoming these challenges requires an advanced integration approach. Enriching maps with the maps of the organizations concerned, and intelligently and automatically identifying and applying the changes in urban and cadastral maps, will therefore save time and cost for informed urban decision-making. This paper employs the data of the third zone of District six of the Municipality of Tehran, the capital of Iran, and identifies changes in the parcel geometry of the cadastral maps in comparison with the recently produced maps of the Municipality of Tehran. After pre-processing the data, spatial and attribute information is added to each feature, and the land parcels are enriched. After matching the parcels and comparing their geometry and attributes, suspicious parcels are identified by the logistic regression algorithm. The Accuracy and F1-Score of this model were 0.845 and 0.780, respectively. Finally, the suspicious parcels are checked and the parcels are located, deleted, merged, split and geometrically modified in the base map, and the base map is enriched. This paper has successfully proposed a new framework for intelligent cadastral and urban map enrichment.


INTRODUCTION
Up-to-date, accurate and enriched urban maps are a basic need for urban management organizations. With the growth of urban life, the complexity of cities, and the pace of change and construction in metropolitan areas, responsible organizations must update and enrich their cadastral and urban maps and detect changes in them in order to make better, more informed and more intelligent decisions. Important aspects of urban management are proper property tax collection, the reduction of land conflicts and the coordination of the land and property management organizations concerned. Producing a new map is very costly and time-consuming due to the size and density of metropolises. Visual inspection and comparison of different maps and of satellite and aerial images require experts and impose huge costs in money and time on urban management organizations. On the other hand, different urban organizations have produced maps of cities for different purposes and periods. As a result, using existing resources to find urban changes reduces urban management costs and increases urban organizational productivity. Organizations should first find the differences between their maps and more up-to-date and enriched maps and then, if necessary, survey and update their information, locate, change (split, merge) or upgrade parcel boundaries, or modify the land parcels identified as changed in the change detection process.
Integrating different cadastral and urban maps poses several challenges. Hajiheidari et al. (2022) have described some of the important challenges in integrating cadastral and urban maps from different sources. They compared three datasets from organizations in Tehran, Iran, and found challenges such as heterogeneous map production methods, inconsistencies in the employed maps, different map scales in different urban organizations, lack of cooperation and coordination in map data formats, and interoperability and standardization issues. One of the important challenges of urban management with cadastral maps is the frequent change in urban land use. Land use plays an important role in different parts of urban management and planning and is an important factor in calculating tax, reducing land conflicts, mitigating the damage of natural and man-made disasters, and so on. Hence, updating land use in cadastral maps is an essential step in the sustainable development of countries (Cienciała et al., 2021). Safra et al. (2006) have presented a complete process for integrating data from maps on the Web. They provide three algorithms that use spatial and additional information, stored as feature attributes, in the matching and integration processes. They tested the matching algorithm with three datasets for integrating hotel information in the Soho area of Manhattan, New York, using Google Earth and Yahoo Maps data. They showed that additional information can improve map updating and upgrading results, provided that the imported information is as suitable as possible. Jafari et al. (2022) have integrated 31 descriptive variables of residential properties to estimate the price of residential parcels in urban maps. They employed four statistical and machine learning models, Random Forest (RF), Ordinary Least Squares (OLS), Weighted K-Nearest Neighbor (WKNN) and Support Vector Regression (SVR), as base models. The outputs of these models were integrated by the stacking method to model the uncertainty. The Mean Absolute Percentage Error (MAPE) of the result was reduced from 10.18 to 9.81 in comparison with the voting method.
Change detection in spatial data and land use has been studied in South Korea to update urban maps. Park and Song (2020) have implemented two steps to update the existing cadastral map from aerial images. First, they created an up-to-date land cover map from unmanned aerial vehicle (UAV) images, which were classified by a combined two- and three-dimensional convolutional neural network. Second, they used a three-step incompatibility comparison to produce a discrepancy map, which includes the ratio of the area that differs from the registered land use in each plot. Finally, the incompatible components, which represent the objects that need updating, were detected automatically in a contrast map between the land cover maps and the existing cadastral maps. Pullar and Donaldson (2022) have examined the possibility of spatially updating digital cadastral maps with other datasets such as orthophotos and evaluated the resulting accuracy. They used available data and checked the accuracy of the data updated with aerial images and orthophotos in two different regions (a test study of approximately 15 hectares for the University of Queensland and a hypothetical grid for simplified testing in Australia). Finally, they modified parcel boundaries using some ground control points. Song et al. (2019) have classified land information and compared the classification results with cadastral maps in South Korea, employing hyperspectral images and a deep learning technique. This research extracted the areas that need updating and modified some feature information. Automatic building extraction from remote sensing data is another method that can be used to update cadastral maps intelligently. Wierzbicki et al. (2021) have implemented a modified fully convolutional U-Shape Network (U-Net) for the segmentation of high-resolution aerial imagery and dense LiDAR data to automatically extract general building lines. The proposed method extracted the buildings automatically with an overall accuracy of 89.5% and a completeness of 80.7%, and played an important role in the modernization of the Polish cadastre.
The rest of the paper is organized as follows. Section 2 discusses the research methodology, including the research conceptual model. Section 3 presents the case study and implementation, covering the study area, the employed data, the analysis undertaken, the results achieved, and their evaluation method. Section 4 presents the discussion and Section 5 concludes the paper.

RESEARCH SCENARIO
The focus of this research is on identifying and collecting cadastral and urban maps in a district of Tehran as a case study, and on employing different strategies to enrich these maps. The input data consist of cadastral maps produced by the Cadastre office of the Iran Deeds and Property Registration Organization. In addition, some urban maps have been produced by the Municipality of Tehran. The output of the urban and cadastral map integration is the enriched cadastral and urban maps, to be used for better informed decision-making in urban management.
The input data in this research are the data of the government organizations that are responsible for urban management. The main objective is to identify and apply urban changes in an urban spatial database with the help of more up-to-date data, and to enhance the information content of the urban maps so as to enrich them. The pre-processing steps include checking the quality of the data, deleting errors and preparing the data for cadastral and urban integration and enrichment. The processes include comparing the datasets, identifying changes in the cadastral and urban maps and evaluating the output. Locating suspicious land parcels in their true locations and changing their geometry is the next step. A further objective of this research is to provide a framework for detecting and applying changes in cadastral and urban maps by updating and enriching the maps. The research scenario is illustrated in Figure (1).

Figure 1. Research scenario
The objective of this research is to enrich an integrated cadastral and urban map, to detect the changed parcels in it, and to evaluate the detected changes, which are treated as suspicious parcels.

PROPOSED METHODOLOGY
The framework of this investigation enriches an urban and cadastral map by intelligently and automatically finding changed polygons with the help of a more up-to-date map. First, the datasets are prepared, quality controlled and pre-processed to make the data ready for the data fusion steps; in this stage geometric, topologic and thematic errors are corrected. The data are harmonized in terms of projection system, coordinate system, datum, scale and format. In the next step, some geometric and attribute information, such as area and node count, is added to each parcel, and each polygon is converted to its centre point. The points are then compared and matched with each other to find changes, using buffers around the centre points of the base map parcels. Then, the output is evaluated. After that, the suspicious parcels are checked visually and located, deleted, merged or separated to enrich the base map. Finally, the output is a framework to detect changes in cadastral maps intelligently and automatically, which results in urban and cadastral map enrichment.
Inputs:
- A base cadastral map that needs to be enriched.
- Production of an updated map for enriching the base map.
Processes:
- Data quality control.
- Eliminating the errors and converting the map data to the same projection system, coordinate system, datum and format.

Study Area
The study area is the third zone of District six of Tehran Municipality. This region has a high population density and a large number of parcels, and it is located in the middle of the Tehran megacity. The study area has been shown in

Urban Cadastral Datasets
Cadastral maps are used as a geospatial layer for land administration information and land use (Enemark and Williamson, 1996). Each urban organization has produced maps with specific standards depending on its goals. Therefore, map accuracy varies with the purpose (Grant et al., 2020). Cadastral maps have an essential role in each country. They can affect the central government's sustainable development and the security of tenure and property ownership rights, increase investment and transactions in the real estate market, and support transparent and efficient property evaluation. The continuous development of the cadastre and its integration with other urban spatial databases is vital for providing reliable, updated data on real estate (Mehrassa et al., 2017).
In the Tehran metropolis, various organizations require updated and enriched urban cadastral and topographic maps. Some of these organizations, such as the cadastral office and the municipality, produce their own maps; others, such as utility organizations, use other organizations' maps as base maps and add their own attributes and features to them. Both groups need to update and enrich their maps and detect changes in them. In this research, two datasets have been used: the cadastral maps produced by the Iran Deed and Property Registration Organization and the urban maps produced by Tehran Municipality. The Tehran Municipality maps have been used to enrich the cadastre office map and to detect changes in the cadastral maps.

Iranian Cadastral Organization Dataset:
The Iranian Cadastral Organization is a subset of the Deed and Property Registration Organization, which prepares judicial cadastral maps and documents property registration records. This organization's cadastral maps include registration information. The map produced by the National Cartography Centre of Iran (NCC) is one of the basic maps of the Cadastre organization. The Cadastre organization enriches its map with data produced within the organization as well as with other organizations' maps.

Tehran Municipality Dataset:
Tehran Municipality is the main organization charged with managing the city and with designing and implementing new plans for urban development. Hence, it must hold all the changes and the latest updated urban maps in order to control, develop and manage the city in the most effective way. In addition, any change to a building, such as a geometric and/or land use change, needs the approval of Tehran Municipality.

Data Pre-processing
After obtaining the data from the Cadastre Organization and Tehran Municipality, the data should be quality controlled and pre-processed to be ready for the integration phase. A comparison of the two datasets is shown in Table (1). As mentioned in Table (1), the Tehran Municipality maps are 12 years newer than those of the Cadastre Organization and are produced at a larger scale. To integrate and compare the two datasets, it is essential to convert the Cadastre maps from DWG (AutoCAD format) to Shapefile (Esri format) and to check the parcel topology for errors such as dangles. After that, both datasets should be brought to the same scale, which requires generalizing the Tehran Municipality map. Finally, the parcel geometry needs to be checked and modified. The differences in the numbers of parcels and blocks make it clear that there are changes in the cadastral maps, which therefore need to be enriched.

Map Integrating Model
The map integration process has several steps. The three main steps are extracting geospatial objects from the maps, matching similar objects among the maps, and presenting the result to users and updating the map features (Crommelinck et al., 2016; Heipke et al., 2008; Safra et al., 2006). In this research, after the pre-processing step, the area and the number of nodes have been calculated for each parcel in each dataset. The polygons have then been converted to their centre points, after which the data are ready for matching by a buffer distance and for finding suspicious polygons with the machine learning algorithm (logistic regression).
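The attribute step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper performed this step in ArcGIS, the function name `polygon_attributes` is hypothetical, and the "centre point" is assumed here to be the area centroid of a simple (non-self-intersecting) polygon, computed with the shoelace formula.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def polygon_attributes(ring: List[Point]) -> Dict[str, object]:
    """Return area, node count and centroid of a simple polygon.

    `ring` lists the (x, y) vertices once each, without repeating the
    first vertex at the end. Hypothetical helper, not the authors' code.
    """
    n = len(ring)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0  # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return {
        "area": abs(twice_area) / 2.0,
        "node_count": n,
        "centroid": centroid,
    }


# A made-up 10 m x 20 m rectangular parcel.
parcel = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0), (0.0, 20.0)]
attrs = polygon_attributes(parcel)
```

For the rectangle above, the area is 200 square metres, the node count is 4 and the centroid is (5, 10). A GIS package may define the centre point differently (for example, forcing it to lie inside the polygon), so results on concave parcels can differ.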

Buffer Distance and Matching Process:
Each polygon centre point in the cadastral maps is compared with the centre points within a specified buffer distance in the Tehran Municipality map to find the polygon or polygons that match it. Considering the scale of the cadastre maps, the buffer distance is set by Equation (1) (National Cartographic Centre (NCC) Specifications, 2007). According to Figure (4), after a buffer is placed around the centre point of each parcel in the cadastral maps, any municipality parcel whose centre point falls within that buffer distance is considered a matched parcel.
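A minimal sketch of the buffer-based matching follows. It is not the ArcGIS workflow used in the paper: the function name `match_by_buffer`, the parcel identifiers and the coordinates are hypothetical, coordinates are assumed to be in a projected system with metre units, and the default buffer of 0.4 m is the value reported in the Discussion.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def match_by_buffer(
    base_points: Dict[str, Point],
    other_points: Dict[str, Point],
    buffer_m: float = 0.4,
) -> Dict[str, List[str]]:
    """For each base-map centre point, list the centre points of the
    other map that fall inside a circular buffer of radius `buffer_m`."""
    matches: Dict[str, List[str]] = {}
    for base_id, (bx, by) in base_points.items():
        matches[base_id] = [
            other_id
            for other_id, (ox, oy) in other_points.items()
            if math.hypot(ox - bx, oy - by) <= buffer_m
        ]
    return matches


# Made-up centre points (metres) for two cadastral and two municipal parcels.
cadastre = {"C1": (100.0, 200.0), "C2": (150.0, 250.0)}
municipality = {"M1": (100.1, 200.2), "M2": (160.0, 250.0)}
matches = match_by_buffer(cadastre, municipality)
```

Here `C1` is matched to `M1` (about 0.22 m apart), while `C2` has no match. The loop compares every pair of points; for a full city dataset a spatial index would normally replace this brute-force search.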

Spatial Data Fusion for Finding Suspicious Parcels by Logistic Regression:
Each polygon selected in the matching process is compared with the source polygon in terms of area and node count. If the differences between the two polygons are too large, the polygon in the base map is marked as a suspicious parcel. If a parcel has no matched parcel, it is marked as a non-suspicious parcel. Logistic Regression (LR) has been employed as a machine learning method to classify the data into two classes (suspicious and non-suspicious parcels). The logistic regression hypothesis is defined as Equation (2).
These steps have been run again with the Tehran Municipality map as the base map and the cadastre maps as the comparison maps, to find parcels that are not available in the cadastre maps.
The pre-processing steps, the conversion of parcels to centre points, and the buffer operation on the parcel centre points of the cadastral map to find the corresponding polygon in the municipality map have been implemented in the ArcGIS software (ESRI Development Team, 2014). The machine learning algorithm has been programmed in the MATLAB environment (MathWorks Development Team, 2018).
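To make the classification step concrete, the following is a small Python sketch of binary logistic regression trained by batch gradient descent. It is not the authors' MATLAB implementation. The two features (relative area difference and absolute node count difference between matched parcels), the toy training rows, the learning rate and the number of epochs are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function g(z) = 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """h_theta(x) = g(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def train(X: List[List[float]], y: List[int],
          lr: float = 0.5, epochs: int = 5000) -> List[float]:
    """Estimate theta by batch gradient descent on the log-loss."""
    theta = [0.0] * len(X[0])
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = hypothesis(theta, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        theta = [t - lr * g / m for t, g in zip(theta, grad)]
    return theta


def predict(theta: Sequence[float], x: Sequence[float]) -> int:
    """Return 1 (suspicious) or 0 (non-suspicious)."""
    return 1 if hypothesis(theta, x) >= 0.5 else 0


# Hypothetical rows: [bias term, relative area difference, node count difference]
X = [
    [1.0, 0.01, 0.0], [1.0, 0.02, 0.0], [1.0, 0.03, 1.0], [1.0, 0.05, 0.0],
    [1.0, 0.40, 3.0], [1.0, 0.55, 2.0], [1.0, 0.30, 4.0], [1.0, 0.60, 5.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = suspicious, 0 = non-suspicious

theta = train(X, y)
predicted = [predict(theta, row) for row in X]
```

On this small, linearly separable toy set the fitted model reproduces the training labels. Real parcel data would need feature scaling and a held-out test set before any performance figure could be quoted.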

Model Evaluation
Accuracy, precision and recall are the statistical measures used for evaluation. The F1-Score, which is based on precision and recall, is an important measure used to evaluate the outputs.
To calculate each of the measures, a confusion matrix should be constructed.

Confusion Matrix:
This matrix has four main elements, by which the outputs are classified into four groups as follows:
• Parcels that are correctly detected as positive are called true positives (TP). These are polygons that have changed and are correctly found as suspicious parcels.
• Parcels that are wrongly detected as negative are called false negatives (FN). These are polygons that have changed but are wrongly not found as suspicious parcels.
• Parcels that are correctly detected as negative are called true negatives (TN). These are polygons that have not changed and are correctly not found as suspicious parcels.
• Parcels that are wrongly detected as positive are called false positives (FP). These are polygons that have not changed but are wrongly found as suspicious parcels.
The standard confusion matrix is shown in Table (2) (Chicco and Jurman, 2020).

| | Detected positive | Detected negative |
|---|---|---|
| Actual positive | True positives (TP) | False negatives (FN) |
| Actual negative | False positives (FP) | True negatives (TN) |

Table 2. The standard confusion matrix
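The four cells of Table (2) can be tallied directly from two label lists, as in the short Python sketch below. The function name `confusion_counts` and the example labels are hypothetical; 1 stands for a changed (or suspicious) parcel and 0 for an unchanged (or non-suspicious) one.

```python
from typing import Dict, Sequence


def confusion_counts(actual: Sequence[int], detected: Sequence[int]) -> Dict[str, int]:
    """Count TP, FN, FP and TN for binary labels (1 = positive, 0 = negative)."""
    if len(actual) != len(detected):
        raise ValueError("label lists must have the same length")
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, d in zip(actual, detected):
        if a == 1 and d == 1:
            counts["TP"] += 1  # changed and flagged
        elif a == 1 and d == 0:
            counts["FN"] += 1  # changed but missed
        elif a == 0 and d == 1:
            counts["FP"] += 1  # unchanged but flagged
        else:
            counts["TN"] += 1  # unchanged and not flagged
    return counts


# Made-up labels for eight test parcels.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
detected = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, detected)
```

For these eight made-up parcels the tally is TP = 3, FN = 1, FP = 1 and TN = 3.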

Accuracy:
The accuracy represents the ratio of the correctly detected instances to all the instances in the dataset (Chicco and Jurman, 2020). The accuracy has a value between zero and one. The accuracy is calculated using Equation (4).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$

where Accuracy = the accuracy of the model
TP = the number of true positive polygons
TN = the number of true negative polygons
FP = the number of false positive polygons
FN = the number of false negative polygons

Precision and Recall:
Precision represents the ratio of correctly detected positive instances to the total detected positive instances. Recall represents the ratio of correctly detected positive instances to all the observations in the actual positive class (Chicco and Jurman, 2020). Precision and recall are calculated from the confusion matrix elements as in Equations (5) and (6).

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (5)$$

where Precision = the precision of the model
TP = the number of true positive polygons
FP = the number of false positive polygons

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (6)$$

where Recall = the recall of the model
TP = the number of true positive polygons
FN = the number of false negative polygons

F1-Score:
The F1-Score is defined as the harmonic mean of precision and recall, as represented in Equation (7) (Chicco and Jurman, 2020).

$$F_1\text{-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (7)$$

where F1-Score = the F1-Score of the model
Precision = the precision of the model
Recall = the recall of the model

Model Evaluation:
For the evaluation of the results, 20% of the learning data, which are selected from the cadastre map parcels, are used as test data. These test parcels are selected randomly. The True Positives (TP), False Positives (FP), False Negatives (FN) and True Negatives (TN) are then calculated, and finally the F1-Score is determined to evaluate the model.
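Equations (4) to (7) can be checked with a short Python sketch. The function name `evaluate` is hypothetical, and the counts passed to it below are made-up round numbers used only to show the arithmetic; they are not the values of Table (3).

```python
from typing import Dict


def evaluate(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1-Score from confusion matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when no positives are detected or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (not the paper's Table 3).
scores = evaluate(tp=40, tn=40, fp=10, fn=10)
```

With these illustrative counts every measure evaluates to 0.8 (up to floating-point rounding), which makes it easy to verify each formula by hand.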

Urban Cadastral Map Enriching
In contrast to other geospatial layers, cadastral maps need to be updated frequently. Spatial upgrading of digital maps is achieved by increasing the spatial accuracy of all or part of the digital spatial map content (Effenberg et al., 1999). Although spatial layers are usually updated completely by replacement with a new dataset, in cadastral maps new parcels are added and adjusted to the maps (Rowe, 2003).
In this research, each suspicious polygon has been checked visually and enriched where needed. Based on the data, enriching had five different types, as listed below:
• Adding: If a parcel did not exist in the Cadastral Office map, it needs to be located on the map (such as Figure (5)).

DISCUSSION
The objective of this research was to find the changes in the urban and cadastral parcels and apply them in order to enrich the maps, which is a vital process for the cadastre and urban management organizations. The proposed method of automatically finding the changed parcels saves time and cost in determining which parcels need to be checked again or surveyed and then applied to the original maps as part of the map enriching and updating process. In addition, intelligently finding the polygons that have changed or need to be enriched reduces visual checking; given the continuous urban changes, this is a very important activity for urban management organizations in megacities with a large number of densely packed parcels.
Object matching is performed by determining a buffer distance for each object and finding the objects of the other source that lie within this distance. This distance has been calculated based on the smallest of the scales of the employed maps. Because the cadastre map has a smaller scale (1:2000) than the Municipality maps (1:1000), the buffer distance has been set to 0.4 metres. For simplification, parcels have been converted to their centre points, and the buffer has been set on these points and compared with the parcel centre points of the other source map. As the cadastral and urban maps contain parcels with noticeable areas, 0.4 metres is not a large enough distance to cover many features, so the number of matched objects is limited for each parcel in the base map. This property of cadastral and urban maps makes the enriching model fast and makes changes easier to find. The method is able to find different kinds of changes in parcel geometry, such as adding, deleting, merging, disintegration (splitting), and geometric changes. To find changes more accurately, different attributes, such as area and node count, have been compared for each parcel in the different sources.
To evaluate the result, the accuracy and F1-Score have been calculated based on the confusion matrix. 20% of the data has been selected randomly to calculate this matrix, the accuracy and the F1-Score. Accuracy and F1-Score values lie between zero and one, and greater values indicate better model performance. The accuracy of this model is 0.845 and the F1-Score is 0.780. These values can be valuable because, after the suspicious polygons have been found, a user can double-check them visually to ensure that they are correctly detected and then locate, delete or modify the polygons to enrich the base map.

CONCLUSION AND FUTURE DIRECTIONS
Updating and enriching maps is a vital process for all urban management organizations. Given the rapid pace of urbanization and the number of new constructions in megacities, cadastral maps should be enriched and upgraded frequently. However, because producing an urban cadastral map is a time-consuming process with exorbitant costs, integrating the map with a more recently produced one, which may have been produced by another organization, helps to find changes and enrich the maps. In this research, cadastral and urban maps have been integrated in order to apply the land parcel changes. The study area was in Tehran, Iran, and the datasets were from the Cadastre Organization of the Deed and Property Registration Organization and the Municipality of Tehran. To detect urban and cadastral map changes, the maps have been pre-processed, new fields have been calculated and corresponding polygons have been found in the maps. The logistic regression model has then been trained on the learning data and used to find suspicious parcels. If the matched polygons had different attributes, they were marked as suspicious polygons. The accuracy of the change detection is about 0.845 and the F1-Score has been calculated as 0.780. Finally, the suspicious polygons have been checked visually and modified to enrich the base map. This map modification includes adding, deleting, merging, disintegrating and geometric changing. This method can be used as a framework for other datasets as well.
The innovation of this paper is a novel framework for automatically finding geometric changes in the land parcels of a cadastral base map, using a more recently produced municipal urban map, in order to enrich the urban cadastral maps. It would be an effective method for enriching the cadastral map in terms of time and cost.
For future work, polygons can be combined by integration methods such as ordered weighted averaging (OWA), Dempster-Shafer theory (DST), Rough Set Theory (RST), and Granular Computing (GrC). The suspicious polygons can be moved to the correct position by finding transformation parameters and transferring polygon nodes from the more recent maps to the base map. Additional and descriptive information can be used to find the corresponding polygons or to detect the changes more accurately.
Figure 2. Research methodology. Processes: ... and attributes for data; converting polygons to centre points; matching polygons from different sources; detecting changes, undertaking the map evaluation and upgrading. Output: developing a framework for finding changes automatically and intelligently to enrich the urban and cadastral maps.
The research methodology is presented in Figure (2).

Figure 4. Buffer. Red point: cadastral parcel point; green point: Tehran Municipality parcel point. a) Point buffer. b) The green parcel is in the buffer and matched to the red one. c) The green parcel is out of the buffer and not matched to the red one.
Equation (2), the logistic regression hypothesis:

$$h_\theta(x) = g(\theta^{T} x) \qquad (2)$$

where $h_\theta(x)$ = the logistic function
$x$ = the input variable
$\theta$ = the unknown parameters to be estimated from data

and the function $g$ is the sigmoid function. The sigmoid function is defined as Equation (3):

$$g(z) = \frac{1}{1 + e^{-z}} \qquad (3)$$

where $e$ = the base of the natural logarithm

Here $x$ is the input and $h_\theta(x)$ is the sigmoid or logistic function output that is used to classify the input into the binary classes. $\theta$ is the parameter that is calculated in the machine learning process from the training data and is then used to classify new input into each class. The learning data, which include the training data and the test data, should be selected randomly with a good distribution over the cadastre map. After the learning process and the estimation of a proper $\theta$, the attributes of each parcel are put into the model, and the model returns the class of that parcel, which is either the suspicious or the non-suspicious class.

• Deleting: If a parcel has no matched parcel and has been deleted in the newer map.
• Merging: If two or more polygons have been joined into one parcel (such as Figure (5)).
• Splitting: If a parcel has been disintegrated into two or more parcels (such as Figure (5)).
• Changing in geometry: If a parcel has been changed in node count or geometry (such as Figure (5)).

Figure 5. Red polygon: Cadastre map (base map); green polygon: Tehran Municipality map. a) Enriching by adding the feature. b) Enriching by merging. c) Enriching by splitting. d) Enriching by modifying the feature.

Results
A total of 1000 parcels have been selected randomly as learning data for the logistic regression algorithm. 800 parcels are used as training data and 200 parcels (20% of the learning data) have been selected randomly as test data for evaluating the model by F1-Score. By running the algorithm, 1420 of the 4444 parcels in the cadastre map have been identified as suspicious polygons. The suspicious polygons found are shown in Figure (6).
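The random selection of learning, training and test parcels described in the Results can be sketched as follows. The parcel identifiers (1 to 4444), the fixed seed and the function name `split_learning_data` are assumptions for illustration; the paper does not state how the random selection was implemented.

```python
import random
from typing import Iterable, List, Tuple


def split_learning_data(
    parcel_ids: Iterable[int],
    n_learning: int = 1000,
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Draw `n_learning` parcels at random and split them into train and test."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    learning = rng.sample(list(parcel_ids), n_learning)
    n_test = int(round(n_learning * test_fraction))
    # `sample` returns the parcels in random order, so a slice is a random subset.
    test = learning[:n_test]
    train = learning[n_test:]
    return train, test


# Hypothetical identifiers for the 4444 cadastre parcels.
all_parcels = range(1, 4445)
train_ids, test_ids = split_learning_data(all_parcels)
```

With the default arguments this yields 800 training parcels and 200 test parcels with no overlap. A stratified or spatially balanced draw would be a reasonable alternative if a good distribution over the map is required.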

Figure 6. Red polygon: Cadastre Office map (base map); red polygons with blue border: suspicious polygons.
The standard confusion matrix has been calculated and is shown in Table (3).

Table 1. The metadata of the employed maps

Table 3. The standard confusion matrix
The F1-Score and accuracy are calculated based on the confusion matrix and are presented in Table (4). Accuracy is about 0.845 and F1-Score is about 0.780.

Table 4. Evaluation parameters