EFFECTIVE RAILROAD FRAGMENTATION AND INFRASTRUCTURE RECOGNITION BASED ON DENSE LIDAR POINT CLOUDS

Monitoring the condition of railway infrastructure is essential for maintaining safety standards and preventing accidents. In many countries, the regular inspections required for this task are still carried out as costly and time-consuming on-site human inspections. LiDAR point clouds collected by mobile laser scanning (MLS) have already proved suitable for recognizing important railroad infrastructure elements, such as cables and rail tracks. However, processing such large and often extremely dense point clouds remains computationally demanding, resulting in execution times longer than practically applicable. In our research, we have implemented and comparatively analyzed railroad fragmentation and object segmentation algorithms with a focus on robustness and high effectiveness, prioritizing automation and the reduction of prerequisites (e.g. the spatial relationship between the position of the railway track and the overhead contact line). These aspects also enable easy parallelization when processing larger railroad segments.


INTRODUCTION
Railroad transportation is one of the most popular methods for both passenger travel and cargo shipment. Public railroad transportation annually provides around 8 billion unlinked passenger trips and over 400 billion passenger-kilometers in the EU (EuroStat, 2021b), together with around 390 billion tonne-kilometers in railway freight transport (EuroStat, 2021a). Regular monitoring and surveillance of the railroad infrastructure is crucial for safety and accident prevention. In many countries, this task is still carried out by expensive and time-consuming manual visual inspections.
Automated detection of railroad infrastructure has been addressed based on LiDAR point clouds acquired either by mobile terrestrial laser scanning (MLS) (Arastounia, 2015), (Jwa and Sonh, 2015) or by low-altitude aerial laser scanning (ALS), usually obtained from helicopters (Zhu and Hyyppa, 2014), (Jeon and Kim, 2019). Besides the generalized approaches, algorithms specialized for some characteristic of the surrounding environment have also been developed, optimizing their results for rural environments (Arastounia, 2015), (Cserép et al., 2018) or for urban environments (Arastounia and Oude Elberink, 2016). In these studies, either the powerline cable recognition or the rail recognition depends on the previously computed position of the other object. Auxiliary data sources, such as laser pulse return intensity (Yang and Fang, 2014) or high-resolution ortho-imagery providing RGB data, could also be involved (Neubert et al., 2008), (Beger et al., 2011).
These state-of-the-art methods can provide precise results, but due to the heavy computational load their evaluation time is usually considerable (on the order of 5-10 minutes), even for relatively small railroad segments of a few hundred meters. Developing concurrent algorithms can reduce the long evaluation time caused by the extensive size of the datasets (Arastounia, 2017).
Our research compares existing algorithms and contributes to the development and comparison of automated data-driven methods based on LiDAR point clouds for railroad fragmentation and infrastructure recognition. The proposed solution focuses on the robustness and automation of the algorithm by minimizing the assumptions (spatial relations between the cables and the rail, flatness of the ground, known trajectory of the train and thus of the rail tracks, etc.). Results are evaluated by their computational efficiency and by the accuracy of the segmented objects.
The rest of the paper is organized as follows: Section 2 describes the utilized sample LiDAR datasets for the paper. Then Section 3 introduces the proposed methodology of our research and explains the processing steps. Section 4 presents the visual and numerical results and also contains a verification against a manually annotated point cloud. Finally, Section 5 concludes the paper and discusses the future work.

DATASET
The sample LiDAR datasets used in this study were collected by the Hungarian State Railways with a Riegl VMX-450 high-density mobile mapping system (MMS) mounted on a railroad vehicle (shown in Figure 1), operating at 60 km/h. The imaging unit was composed of two laser scanners with a 360° field of view and two high-resolution cameras. The navigation unit consisted of an integrated global navigation satellite system (GNSS) and an inertial navigation system (INS). The sensor was capable of recording 1.1 million points per second with an average three-dimensional range precision of 3 mm and a maximum threshold of 7 mm. The average positional accuracy was 3 cm with a maximum threshold of 5 cm. The acquired point clouds contain the georeferenced spatial information (3D coordinates) with intensity and RGB data attached to the points. Two datasets from different topographical regions of Hungary were selected and used in our research. Satellite views of the locations are depicted in Figure 2.
Dataset 1 is the Szabadszállás-Kiskőrös dataset, which covers an approximately 29 km long and 130 m wide rural railroad segment in Southern-Central Hungary and contains ca. 2.5 × 10^9 points. This area is generally flat, with minimal to no slopes on the rail tracks.
Dataset 2 is the Szentgotthárd neighborhood dataset, which covers an approximately 5 km long and 90 m wide, partially rural, partially suburban railroad segment in Western Hungary and contains ca. 0.8 × 10^9 points. Here, at the foothills of the Alps, the topography is more varied and the sample contains slopes.
The complete datasets used in this research are proprietary, but the selected segments used for results and verification are made publicly accessible in the Data availability section at the end of the paper.

METHODOLOGY
The proposed methodology of our research consists of three major processing steps. First, i) railroad fragmentation receives a single large input point cloud and fragments it at the curves of the rail track. Hence the later processing steps, ii) cable recognition and iii) rail recognition, receive multiple smaller inputs, each containing a mostly straight segment of the railroad. Cable and rail recognition can be evaluated independently and can be optimized to be executed in parallel. When required by the specific algorithm applied, cable recognition might depend on the result of rail recognition (or vice versa). This optional dependency prevents the direct parallel execution of the cable and rail detection algorithms for the same area. However, in the case of a large number of input fragments, where the complete dataset cannot be analyzed at once due to its size, this does not hinder the parallelization of the entire process, since different fragments can still be processed concurrently. A possible follow-up processing step, iv) the error analysis of the railroad infrastructure, is not addressed in detail in this paper. The described workflow of the methodology is depicted in Figure 3.
The following subsections 3.1, 3.2 and 3.3 will introduce these processing steps.

Railroad fragmentation
The fragmentation consists of the following parts:
1. A 2D projection of the input point cloud is generated. This 2D digital elevation model (DEM) is constructed from the point cloud along the Z axis; however, instead of the usual inverse distance weighting (IDW) algorithm, the maximal Z coordinate in each grid cell is used as its value.
2. Vegetation is filtered through contour detection, since vegetation at the edge of the railway track can make the curve detection inaccurate and may even result in false splitting points.
3. The curve of the rail track is detected using one of the following methods:
(a) Contour finding, performed by Otsu thresholding (Otsu, 1979) followed by Suzuki's contour finding algorithm (Suzuki and Abe, 1985).
(b) Hough transformation (Duda and Hart, 1972) preceded by Canny edge detection (Canny, 1986).
(c) Generalized Hough transformation, or more specifically its version defined by Ballard (Ballard, 1981). It is a modification of the normal Hough transformation that can recognize arbitrary shapes. However, this method is not completely automated: while it recognizes the precise occurrence of the searched shape, it is not able to rotate or resize the pattern during the search.
4. Finally, the point cloud is split based on the curve of the trajectory, resulting in one or more output point clouds.
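Parts 1 and 4 above can be illustrated with a minimal Python sketch. It is not taken from the published framework: the function names, the dictionary-based grid and the rule of splitting wherever the heading deviates from the starting heading of the current fragment by more than a maximum angle are assumptions made for illustration only.

```python
import math


def max_z_grid(points, cell_size):
    """Project 3D points onto the XY plane, keeping the highest Z per grid cell."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    grid = {}
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        if cell not in grid or z > grid[cell]:
            grid[cell] = z
    return grid


def split_indices(polyline, max_angle_deg):
    """Return the vertex indices at which a new fragment starts.

    Assumed rule: a split is placed where the heading of a segment differs
    from the starting heading of the current fragment by more than
    max_angle_deg.
    """
    splits = []
    reference = None  # heading of the first segment of the current fragment
    for i in range(len(polyline) - 1):
        (x0, y0), (x1, y1) = polyline[i], polyline[i + 1]
        heading = math.atan2(y1 - y0, x1 - x0)
        if reference is None:
            reference = heading
            continue
        # Signed heading difference wrapped into [-pi, pi].
        diff = math.atan2(math.sin(heading - reference),
                          math.cos(heading - reference))
        if abs(math.degrees(diff)) > max_angle_deg:
            splits.append(i)
            reference = heading
    return splits


dem = max_z_grid([(0.2, 0.3, 1.0), (0.4, 0.1, 5.5), (1.6, 0.2, 0.3)],
                 cell_size=1.0)
splits = split_indices([(0, 0), (1, 0), (2, 0), (3, 1), (4, 2)],
                       max_angle_deg=10.0)
```

Keeping only the maximum Z per cell avoids any interpolation, which is what makes this projection cheaper than an IDW-based DEM.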

Cable recognition
There are multiple types of cables to detect above the rail track (contact cables, catenary cables, return current cables), and international and national regulations govern their position relative to each other and to the rail tracks. In our study we aim to detect all kinds of cables, without attempting to distinguish between them. This subsection presents the three algorithms we have implemented and compared to achieve this goal.

Search from above with 2D Hough transform
The computational load usually grows with the dimension of the space, thus we used an algorithm that achieves point count reduction based on a two-dimensional projection of the original point cloud (Cserép et al., 2018). As in Section 3.1, a two-dimensional DEM is constructed from the point cloud along the Z axis, with the maximal Z coordinate in each grid cell as its value.
To reduce noise, the projection grid is filtered by clearing all cells whose value is less than half of the maximum value. Afterwards, a probabilistic Hough line detection is run on the projection, first with permissive and then with strict parameters. Finally, a cleaning phase goes through all cells and counts, for each cell, the surrounding cells with a similar value (a difference lower than a small threshold). If this count falls below a given threshold, the cell is removed, since the cells around a point of a continuous cable should have similar heights. The disadvantage of this approach is its inability to detect cables located below each other. To address this issue, the points selected in the first run are removed from the cloud and the algorithm is evaluated again to find the lower cables as well. The cables detected in consecutive runs can then be merged into a single result set.
Figure 4 shows the main steps of the algorithm. The first column displays the first run of the inner algorithm: the line detection initially finds both the cables and the trees, but the cleaning step then removes the false positive parts. Since our sample datasets contained three cables (two of them below each other), a second run of the algorithm was necessary. The second column of subimages presents these results and shows that the additional cable was located correctly.
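The noise filtering and the cleaning phase described above can be sketched as follows. This is an illustrative Python sketch rather than the implementation of the framework: the grid is assumed to be a dictionary mapping cell indices to non-negative height values, the 8-cell neighborhood is an assumption, and the parameter names `height_tol` and `min_similar` are hypothetical.

```python
def filter_low_cells(grid):
    """Clear all cells whose value is below half of the maximum cell value."""
    if not grid:
        return {}
    limit = max(grid.values()) / 2.0
    return {cell: z for cell, z in grid.items() if z >= limit}


def clean_isolated_cells(grid, height_tol, min_similar):
    """Keep a cell only if enough of its 8 neighbors have a similar height."""
    kept = {}
    for (cx, cy), z in grid.items():
        similar = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue  # skip the cell itself
                neighbor = grid.get((cx + dx, cy + dy))
                if neighbor is not None and abs(neighbor - z) < height_tol:
                    similar += 1
        if similar >= min_similar:
            kept[(cx, cy)] = z
    return kept


grid = {(0, 0): 6.0, (1, 0): 6.1, (2, 0): 6.05, (5, 5): 6.0, (0, 3): 1.0}
filtered = filter_low_cells(grid)
cleaned = clean_isolated_cells(filtered, height_tol=0.2, min_similar=1)
```

In this toy grid the low cell at `(0, 3)` is cleared by the first filter, and the isolated high cell at `(5, 5)`, which could stand for a treetop, is removed by the cleaning phase because no neighboring cell has a similar height.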

Hough transform for 3D line detection
This approach is based on the work of Dalitz and his colleagues (Dalitz et al., 2017). They proposed a new scheme based on Roberts' minimal and optimal line representation (Roberts, 1988) to discretize the Hough parameter space in 3D. The discretization uses the tessellation of Platonic solids (in 3D space these are regular, convex polyhedrons). They used the following iterative modification of the transform. The method works well in case of outliers.
4. Finding all points Y ⊆ X close (i.e., at a distance less than the cell width) to the line.
5. Determination of the optimal line going through Y with an orthogonal least squares fit.
6. Finding all points from X close to the fitted line and their removal from X and from the accumulator array.
7. Repetition of steps 2 to 6 until X contains too few points or the specified number of lines has been found.
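The proximity test used in steps 4 and 6 can be illustrated with a short Python sketch. The line is assumed to be given by an anchor point `a` and a direction vector `d`; the function names and the tuple-based point representation are hypothetical, and the Hough voting and the least squares fit themselves are not covered here.

```python
import math


def distance_to_line(p, a, unit_d):
    """Distance of point p from the line through a with unit direction unit_d."""
    v = (p[0] - a[0], p[1] - a[1], p[2] - a[2])
    # The norm of the cross product v x d equals the distance when |d| = 1.
    c = (v[1] * unit_d[2] - v[2] * unit_d[1],
         v[2] * unit_d[0] - v[0] * unit_d[2],
         v[0] * unit_d[1] - v[1] * unit_d[0])
    return math.sqrt(c[0] ** 2 + c[1] ** 2 + c[2] ** 2)


def split_by_line(points, a, d, max_dist):
    """Partition points into those closer than max_dist to the line and the rest."""
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        raise ValueError("direction vector must be non-zero")
    unit_d = tuple(c / norm for c in d)
    near, rest = [], []
    for p in points:
        if distance_to_line(p, a, unit_d) < max_dist:
            near.append(p)
        else:
            rest.append(p)
    return near, rest


cloud = [(0.0, 0.0, 5.0), (1.0, 0.01, 5.0), (2.0, -0.02, 5.01), (1.0, 3.0, 0.0)]
near, rest = split_by_line(cloud, a=(0.0, 0.0, 5.0), d=(2.0, 0.0, 0.0),
                           max_dist=0.1)
```

Here `near` plays the role of Y in step 4, while `rest` corresponds to the point set X after the removal in step 6.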

Region growing algorithm
Region growing algorithms are usually used for solving image segmentation problems, since segmentation is the first step of a variety of image analysis and visualization tasks. The algorithms start with a point that meets a detection criterion and grow it in all directions or in a specified direction to extend the region. These procedures are usually created for a specific task and thus do not have universal capability (Hojjat and Kittler, 1998).
The region growing approach is based on the method of Zhang and his colleagues (Zhang et al., 2016). The original paper assumes that the trajectory of the train, and thus the position of the rail track and the cables, is known. Since this information is not necessarily provided (e.g. for airborne laser scanning), we replaced it with a small seed point cloud of the powerline cable as a more robust solution, from which the trajectory can be calculated at the beginning of the algorithm with the RANSAC algorithm (Fischler and Bolles, 1981). Since the original paper does not describe every step in sufficient detail, some steps were changed in our implementation. Our version of the algorithm is summarized in Algorithms 1 and 2.
Algorithm 1 Self-adaptive region growing method, step 1
Func Find seeds (gridCount)
1: Find a line in the seed point cloud using RANSAC
2: Rotate the seed point cloud to be parallel with the Y axis, using the parameters of the found line
3: Project the seed dataset onto the Y axis
4: Create grids with the given number, gridCount
5: Select the grids which are not empty
6: Calculate the center of the points contained by the grids

Algorithm 2 Self-adaptive region growing method, step 2
Func Extract cables (boxLength, maxPointsPerBox)
Add content of the bounding box to the cable point array
24: end while
25: Create grids with given size
26: Select the grids which are not empty
27: Calculate the center of the points contained by the grids
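Lines 3 to 6 of Algorithm 1 can be sketched in Python as below. The sketch assumes that the seed point cloud has already been rotated to be parallel with the Y axis, so the RANSAC line fit and the rotation of lines 1 and 2 are omitted; the function name `find_seeds` and the equal-width binning along Y are illustrative choices.

```python
def find_seeds(seed_points, grid_count):
    """Bin Y-aligned seed points along the Y axis and return the center of
    the points in every non-empty bin."""
    if grid_count < 1:
        raise ValueError("grid_count must be at least 1")
    if not seed_points:
        return []
    ys = [p[1] for p in seed_points]
    y_min, y_max = min(ys), max(ys)
    length = (y_max - y_min) or 1.0  # avoid division by zero for a single Y value
    bins = [[] for _ in range(grid_count)]
    for p in seed_points:
        index = int((p[1] - y_min) / length * grid_count)
        bins[min(index, grid_count - 1)].append(p)  # y_max falls into the last bin
    centers = []
    for cell in bins:
        if cell:  # select only the non-empty grids
            n = len(cell)
            centers.append(tuple(sum(p[k] for p in cell) / n for k in range(3)))
    return centers


seeds = find_seeds([(0, 0, 5), (0, 1, 5), (0, 9, 5), (0, 10, 5)], grid_count=5)
```

With five bins over the Y range of 0 to 10, only the first and the last bin contain points, so two seed centers are returned.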

Rail recognition
Our solution is an adapted and optimized version of Arastounia's proposed algorithm (Arastounia, 2017). The original algorithm assumed that the trackbed is mainly flat, with very little variance, which we found not to be the case in our datasets. The developed algorithm was enhanced with proper slope detection and handling. The algorithm consists of three main parts.
1. Locating the trackbed within a small subset of the data
(a) First, the railway direction axis and the start coordinate are computed. The initial step of the algorithm requires cutting out a small portion of the dataset, in which the rail pairs are detected. Without directional data, which is not necessarily at our disposal, it would be undefined where to cut the dataset.
(b) A coarse classification is performed on a subset of the cloud, based on the heights of the points in that portion. In a railway environment, the object with the most points should always be the trackbed, so the height of the trackbed is determined by searching for the most common height in the subset within a tolerance threshold of 0.75 m.
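The coarse height classification of step 1(b) can be illustrated with the following Python sketch. Interpreting the most common height as the center of the most populated height bin, with the bin width equal to the tolerance, is an assumption of this sketch, as are the function names.

```python
import math
from collections import Counter


def dominant_height(heights, tolerance):
    """Return the center of the most populated height bin of width `tolerance`."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    if not heights:
        raise ValueError("heights must not be empty")
    counts = Counter(math.floor(h / tolerance) for h in heights)
    best_bin, _ = counts.most_common(1)[0]
    return (best_bin + 0.5) * tolerance


def trackbed_points(points, tolerance=0.75):
    """Select the points whose height lies within `tolerance` of the dominant height."""
    level = dominant_height([p[2] for p in points], tolerance)
    return [p for p in points if abs(p[2] - level) <= tolerance]


sample = [(0, 0, 100.1), (1, 0, 100.2), (2, 0, 100.3), (3, 0, 100.25),
          (1, 2, 103.0), (1, 3, 105.5)]
level = dominant_height([p[2] for p in sample], 0.75)
bed = trackbed_points(sample)
```

In the toy sample, the four points around 100 m form the dominant height level, while the two higher points, which could belong to poles or cables, are excluded.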

2. Detecting the rail pairs in that subset
(a) Candidate seed points for the rails are selected. Since rail tracks by definition are narrow and relatively high objects, our aim is to locate points of the trackbed which are outliers in their respective local neighborhoods. Given that p is a point of the trackbed, this task can be achieved with the following algorithm:
i. Calculate the local neighborhood of p, Np.
ii. Calculate the covariance matrix of Np, C.
iii. Apply eigendecomposition to C.
iv. Classify the candidate rail seed points. The smallest eigenvalue of a local neighborhood without a rail piece should be below a low threshold close to zero, as the trackbed is usually constructed to have the smallest height variation possible in the longitudinal direction due to safety regulations.
(b) Lines are detected with a 2D Hough transformation. First, the 3D point cloud of candidate rail seed points is converted into a 2D image. For this purpose we use the projection filter introduced and implemented in our previous work (Cserép et al., 2018). Since the Hough line transformation depends on the given threshold, there is a high chance that the same threshold will not provide appropriate results for two different datasets. To resolve this potential issue, the developed algorithm works as follows:
i. Set the threshold to a high number.
ii. Run the Hough transformation on the image.
iii. If the Hough transformation did not give at least two lines, lower the threshold.
iv. Repeat steps ii. and iii. until at least two lines are found or the threshold reaches zero.
Once the Hough transformation has been executed successfully, the 2D image is converted back to 3D.
(c) Rail pairs can be recognized through their matching direction and their predefined distance from each other, called the track gauge. Let d1 and d2 be the direction vectors of the lines, calculated from the start and end points given by the Hough transform; the pairing criteria are then constructed from the agreement of d1 and d2 and from the distance between the two lines.
3. Growing the rail pairs throughout the rest of the data
An iterative algorithm was implemented which fully grows its input rail pair. Each point has to meet two criteria in order to become a candidate rail point. The criteria are expressed with the following quantities: H_rail denotes the average height of the rail segment currently being grown, Hp is the height of the point, v_raildirection is the direction vector of the rail and vp is the vector connecting the point to the current rail segment. To grow a rail segment, first the local neighborhood Np is calculated for each point p, with the radius being the grow size, and then the candidate rail seed points are recognized from Np.
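The pairing of detected lines in step 2(c) can be sketched in Python as follows. The concrete form of the test and all tolerance values are assumptions of this sketch: two 2D lines are accepted as a rail pair when the angle between their directions is below `max_angle_deg` and the distance of the midpoint of the second line from the first line differs from the gauge by at most `gauge_tol`. The value 1.435 m in the example is the standard track gauge, used here only for illustration.

```python
import math


def unit_direction(line):
    """Unit direction vector of a 2D line given by its start and end points."""
    (x0, y0), (x1, y1) = line
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("line has zero length")
    return dx / length, dy / length


def is_rail_pair(line_a, line_b, gauge, gauge_tol, max_angle_deg):
    """Check the direction agreement and the mutual distance of two lines."""
    d1 = unit_direction(line_a)
    d2 = unit_direction(line_b)
    # Angle between undirected lines, hence the absolute value of the dot product.
    cos_angle = min(1.0, abs(d1[0] * d2[0] + d1[1] * d2[1]))
    if math.degrees(math.acos(cos_angle)) > max_angle_deg:
        return False
    (ax, ay), _ = line_a
    (bx0, by0), (bx1, by1) = line_b
    mx, my = (bx0 + bx1) / 2.0, (by0 + by1) / 2.0
    # Perpendicular distance of the midpoint of line_b from line_a.
    distance = abs((mx - ax) * d1[1] - (my - ay) * d1[0])
    return abs(distance - gauge) <= gauge_tol


left = ((0.0, 0.0), (10.0, 0.0))
right = ((0.0, 1.435), (10.0, 1.45))
far = ((0.0, 3.0), (10.0, 3.0))
crossing = ((5.0, -5.0), (5.0, 5.0))
pair_ok = is_rail_pair(left, right, gauge=1.435, gauge_tol=0.05, max_angle_deg=2.0)
```

Nearly parallel lines at roughly one gauge apart, such as `left` and `right`, are accepted, while `far` fails the distance test and `crossing` fails the direction test.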
The flowchart in Figure 5 depicts the main steps of the algorithm.
Figure 5. Flowchart of the developed rail recognition algorithm.
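The threshold-lowering loop of the line detection in step 2(b) can be sketched in Python as below. The line detector is passed in as a callable, so no particular Hough implementation is assumed; `detect_lines`, `start_threshold` and `step` are hypothetical names, and `fake_detector` is only a stand-in that mimics the vote threshold of a Hough transformation.

```python
def detect_with_adaptive_threshold(detect_lines, image, start_threshold, step,
                                   min_lines=2):
    """Lower the threshold until at least `min_lines` lines are found.

    Returns the found lines together with the threshold that produced them,
    or an empty list and 0 if the threshold reached zero without success.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    threshold = start_threshold
    while threshold > 0:
        lines = detect_lines(image, threshold)
        if len(lines) >= min_lines:
            return lines, threshold
        threshold -= step
    return [], 0


def fake_detector(image, threshold):
    """Stand-in detector: the image is a list of (line, votes) pairs."""
    return [line for line, votes in image if votes >= threshold]


image = [("rail A", 120), ("rail B", 95), ("noise", 20)]
lines, used_threshold = detect_with_adaptive_threshold(
    fake_detector, image, start_threshold=200, step=25)
```

Starting from a high threshold and lowering it gradually returns the strongest lines first, so weakly supported noise lines are only considered if the rails themselves cannot be found.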

Fragmentation results
A curved rail track segment was selected from both sample datasets described in Section 2 to evaluate the railroad fragmentation. These test areas are shown in Figures 6 and 7.
The maximum allowed path curve was set to 10°; with this value, the implemented methods of the framework worked properly. The execution time of each method for a given sample dataset can be found in Table 1.
The computed splitting locations of the methods are visualized in Figure 8. Each method is marked with a different color, as denoted in the caption. In addition, the manually determined locations where the trajectory reaches the maximum curve of 10° were also marked to assess the accuracy of the methods. In both cases, the Hough transformation produced the best splitting locations (closest to the manually determined locations).

Object recognition results and verification
To evaluate and verify the results and the accuracy of the object recognition algorithms, we manually annotated both the cables and the rails in a 100 m long segment of Dataset 1 consisting of 7,316,298 points, and tested the algorithms on it. This railroad segment is shown in Figure 9. The following metrics were examined: i) the runtime (without parallel execution), ii) the number of remaining points, iii) the number of false negative detections and iv) the number of false positive detections.
The results are shown in Table 2.
Among the cable detection algorithms, region growing produced the best accuracy; however, it had the benefit of receiving a small seed of the cable as an additional input, as discussed in Section 3.2.3. The 2D Hough transform algorithm for cable detection and the rail recognition method also produced fairly good accuracy. The execution times of all evaluated methods compare favorably, since other recent approaches such as (Arastounia, 2017) required over 5 minutes to process a 100 m railroad segment even with concurrency. Unfortunately, concrete source code implementations and tested datasets are rarely made publicly available in the related literature, which hinders a more precise comparison of the results. Figure 10 shows the combined visual output of the best cable and rail track detection results.

CONCLUSION AND FUTURE WORK
Both MMS and low-altitude ALS point clouds of the railroad infrastructure are typically dense and therefore large, in order to guarantee that enough points are located on the important objects (e.g. cables) to recognize them. Consequently, the automatic surveillance and monitoring of railroad infrastructure requires algorithms that are not only reliable, but also computationally efficient.
In our research we developed a software framework capable of detecting the most important railroad infrastructure elements, cables and rails, in a large input file through a series of three processing steps. First, the trajectory of the rail tracks is detected and the input point cloud is fragmented into parts, each containing a straight segment of the rail track. By dividing the original input file into multiple fragments, this step already provides a high-level parallelization for the subsequent steps. After the fragmentation, the cable and rail recognition steps are performed, which could also be parallelized with each other. The study considered multiple algorithms for these steps and carried out a comparative examination of their runtime and accuracy.
In our further work we will focus on the omitted fourth processing step mentioned in Section 3: the automated detection of possible errors and anomalies in the railroad infrastructure and its surroundings. Typical issues could be i) the improper height of the overhead contact cable, ii) the horizontal deviation of the cables, iii) dangerously close vegetation, iv) the deformation of the railway bedding or v) the sinking of the railway sleepers. We also aim to extend the involved algorithms with further available attributes of the points besides their position, such as laser pulse return intensity or RGB data.

COMPUTER CODE AVAILABILITY
An open source prototype implementation of the discussed and compared algorithms was developed in standard C++11 as part of our railroad infrastructure detection framework. The source code is available on GitHub, released under the BSD-3 license, at https://github.com/mcserep/railroad. The project was tested to build and run on Ubuntu Linux 20.04 LTS.

DATA AVAILABILITY
Datasets used in Sections 3 and 4 to reproduce results can be found at http://dx.doi.org/10.17632/ccxpzhx9dj.1, an open-source online data repository hosted at Mendeley Data (Cserép, 2022).