CALIBRATION AND VALIDATION OF A STEREO CAMERA SYSTEM AUGMENTED WITH A LONG-WAVE INFRARED MODULE TO MONITOR ULTRASONIC WELDING OF THERMOPLASTICS

Ultrasonic welding of thermoplastics has become an important industrial manufacturing process in the fields of aerospace and transportation. High quality standards are demanded, and a reliable quality assessment routine is fundamental. Most on-site and off-site approaches are time-consuming or require additional active illumination. Temperature models in particular have proven to be good indicators of welding quality. We present a methodology to measure the surface temperature in real time and simultaneously visualize it as a 3D model. By augmenting a stereo camera system with an additional passive thermal infrared camera, we are able to map the heat data of multiple successive welds of large, free-form structures into a common 3D data representation. A challenging calibration approach is used to derive the interior and exterior orientation of the trifocal camera system. Geometric and radiometric improvements to an aluminium chessboard allow the use of wide-angle optics for the thermal infrared camera. We then verify the quality of each camera by means of its resolving power. For this purpose, a Siemens star test pattern is used for the thermal camera as well. We demonstrate the effectiveness of our methodology on a robot-guided ultrasonic welding tool.


INTRODUCTION
Ultrasonic welding of thermoplastics (UWoT) is an industrial process of increasing importance in fields like aerospace and transportation manufacturing, which demand high quality standards to ensure robust and safe products. The process speed, automation capabilities and the absence of additional compound materials make UWoT an overall cost-effective process that can produce clean, precise, and solid joints (Gutowski, 1986, Grewell et al., 2003). On the other hand, weld quality depends strongly on the materials used (e.g., type, thickness, quality, and geometry) and the chosen process parameters (e.g., vibration amplitude/frequency, weld/solidification time/force, and contact area) (Jia et al., 2012, Villegas, 2015). Therefore, each setup needs to be prepared carefully and necessitates a reliable quality assessment routine. Previous studies approached quality assessment by investigating various sensors and models that link sensor data and weld quality. Sensors can be divided into on-site and off-site sensors. On-site sensors include, for instance, a micro-controller that measures the power and displacement of a welder (Villegas, 2015), a nanovoltmeter for the electrical resistance of the weld result (McGovern et al., 2019), or an optical microscope (Fischer et al., 2015) to inspect weld interface changes. The vast majority of researched off-site sensors involve active illumination sources (e.g., for shearography (Jia et al., 2012), pulsed thermography (Sharath et al., 2013, McGovern et al., 2019), lock-in thermography (Fischer et al., 2015) or x-ray scans (Fischer et al., 2015)) that emit radiation at different wavelengths and modulations and measure its interaction with a specimen.
Temperature models in particular have been demonstrated to be a good indicator of weld quality (Gutowski, 1989, Zhang et al., 2010, Levy et al., 2014). We follow the trend of employing temperature data as an informative quality indicator but propose a passive long-wave infrared (LWIR) camera setup that measures the surface temperatures arising during welding in real time. Moreover, we are convinced that a quality assessment process should concern the whole monitoring pipeline, from sensor quality to appropriate data post-processing, to safeguard the quality assessment. To this end, we focus our study on three aspects following a top-down line of thought (Figure 1, from right to left): First, we use a simple but effective welding verification that employs local peak temperatures during the UWoT process in order to decide on weld success or non-success. More specifically, we consider two types of non-success: failed bonding (too low temperature) and induced material degradation (too high temperature). Second, we propose a data pre-processing that visualizes the monitored temperatures in a real-time 3D model. This allows us to map data of multiple successive welds of large, free-form structures into a common data representation. To do this, we combine a panchromatic stereo camera with an LWIR camera into a trifocal camera system and geometrically calibrate the cameras to each other. In this context, we introduce an improved trifocal chessboard calibration approach that builds on (Choinowski et al., 2019). Lastly, we verify the quality of the individual cameras. To this end, we determine each camera's resolving power, one of the most important quality indicators, by means of the well-known Siemens star. We demonstrate this approach on the example of our LWIR camera, for the first time in the literature (to the best of our knowledge).

Figure 1. Proposed monitoring framework. Construction: hardware and software integration of a panchromatic stereo (VIS) and a long-wave infrared (IR) camera. Calibration/Validation: trifocal geometric calibration with an improved chessboard (left) and resolving power validation (right) of VIS and IR cameras. 3D-Model Generation: thermal-colored 3D point cloud model for each frame set of the synchronized cameras (10 Hz). The 3D model may be reinforced by a pre-created CAD model. Welding Assessment: two common types of defects can be inferred if the measured surface temperature is greater or smaller than pre-determined threshold values.
In the rest of this paper, we first provide a brief overview of the welding setup, our proposed trifocal camera system, and the data pre-processing, including 3D modelling and the thermal mapping (Sec. 2). We then introduce the improved geometric calibration routine for the trifocal system (Sec. 3) and describe how we determine the resolving power of our LWIR camera (Sec. 4). Next, we summarize our approach to deriving welding quality from passive surface temperature measurements during welding (Sec. 5). Finally, we present and discuss results from our experiments (Sec. 6) and conclude the work (Sec. 7).

MONITORING ULTRASONIC WELDING
Our experimental setup to perform and monitor the ultrasonic welding process is depicted in Figure 2. We first describe the composition of the welding system (Sec. 2.1), our camera system for data acquisition (Sec. 2.2) and our proposed data preprocessing steps (Secs. 2.3 and 2.4) in preparation for the data analysis.

Welding System
First, we describe the welding system following Figure 2. The welding process is carried out on a curved rear pressure bulkhead segment (1) with external doubler patches (2). For the process, we employ an ultrasonic welding end effector (3) on a multi-axis robot manipulator (4). The semi-finished continuous carbon fiber reinforced products are placed and fixed on a contour-following anvil from the bottom by a number of suction cups (5). This leaves enough space on top for the welding tool to pass over the parts and for optically measuring the surface temperature at every position above the joining area. The camera system (6) is mounted stationary at a distance of 1 m to ensure a safety gap to the robot manipulator while still maintaining high spatial resolution.
The rear pressure bulkhead segments are made of a carbon fiber reinforced thermoset (epoxy system) composite, produced in a vacuum infiltration and consolidation process. In the joining zone, functional layers of polyetherimide (PEI) foil were integrated. These enable thermoplastic welding. The 1.2 mm doubler patches were made of carbon fiber and thermoplastic (PEI) prepregs in a vacuum consolidation process.
The welding of the nearly 1.5 m weld seam is carried out in two steps: first, a multi spot welding process fixes the joining partners at twelve points along the overlap. In the second step, the end effector moves at constant speed along the contour, following a path curve, and generates a continuous welded joint.

Figure 2. Experimental setup: rear pressure bulkhead segment (1) with doubler patches (2), under the robot-guided (4) ultrasonic welding tool (3), fixed on an anvil with suction cups (5), and our camera system (6).

Camera System
We employ a multi-sensor system called Integrated Positioning System (IPS, see Figure 3) to monitor the welding process (Grießbach et al., 2012). The IPS stereo camera consists of robust monochromatic industrial-grade cameras with a global shutter. Image triggering and synchronisation with other sensors, e.g., an inertial measurement unit (IMU), are handled by a field-programmable gate array. Data grabbing and post-processing are done by dedicated IPS software. The thermal camera was mounted on the IPS sensor head (top right in Figure 3). It cannot be triggered externally; thermal image acquisition runs in free-run mode at 32 Hz. In addition, images are captured with a rolling shutter, and the camera frequently performs an automatic radiometric offset calibration during the recording session, which makes it blind for a short moment. Using an industrial process interface, the thermal camera's frame sync, flag and record status can be provided to the IPS trigger IN. Each frame from the thermal camera is thereby assigned in time to the IPS hardware recordings, with inherent time lags of up to several milliseconds. The specifications of the cameras used can be found in Table 1.

3D Model Generation
The IPS was originally developed for accurate ego-localization of the moving system based on reliable visual-inertial odometry estimation from accurately synchronized stereo images and IMU measurements. In the stationary welding case described here, the camera system does not move and the ego-localization capability is not used. Instead, the goal is to generate a time series of 3D models of the scene. We represent the 3D models as point clouds. The stereo images are only used to generate high-density depth maps, from which 3D point sets are subsequently extracted. For this purpose, a semi-global matching algorithm with a census cost function is used (Hirschmüller and Scharstein, 2009). The computationally intensive matching step and the necessary image rectification are implemented in OpenCL and can be executed on a graphics processing unit (GPU), allowing real-time processing on a capable laptop PC.
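To make the census cost function more concrete, the following minimal sketch computes a 3x3 census transform and compares two pixels by Hamming distance. It is an illustration under simplifying assumptions, not the IPS implementation: the function names are hypothetical, images are plain nested lists, and the disparity is picked by a simple winner-takes-all search rather than by semi-global aggregation.

```python
def census_transform(image, radius=1):
    """Encode each pixel as a bit string: 1 where a neighbour is darker than
    the centre pixel. Border pixels are left at 0 (no full neighbourhood)."""
    height, width = len(image), len(image[0])
    census = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            center = image[y][x]
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    bits = (bits << 1) | int(image[y + dy][x + dx] < center)
            census[y][x] = bits
    return census


def hamming_distance(a, b):
    """Matching cost: number of differing census bits."""
    return bin(a ^ b).count("1")


def best_disparity(census_left, census_right, y, x, max_disp):
    """Winner-takes-all disparity for one pixel of a rectified stereo pair.
    Simplification: no path-wise cost aggregation as in semi-global matching."""
    costs = [
        hamming_distance(census_left[y][x], census_right[y][x - d])
        for d in range(0, min(max_disp, x) + 1)
    ]
    return costs.index(min(costs))
```

Because the census code only stores intensity orderings, adding a constant brightness offset to an image leaves the codes unchanged, which is one reason such costs are popular for cameras with differing exposure.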
The resulting point cloud quality mainly depends on the camera setup used (e.g., spatial/radiometric resolution, geometric calibration quality and viewing angle), scene properties (e.g., illumination and texture) and the processing parameters. Due to the unfavorable flat viewing angle (which, however, ensures an unobstructed view of the work piece) and the weakly textured welding materials, we obtained relatively sparse point clouds (Figure 4(c)). We tackle this issue by incorporating prior knowledge: since the size of the work piece and its pose relative to the stationary camera system are known beforehand, we project a sampled point cloud from a simulated computer-aided design (CAD) model of the work piece into each point cloud frame (Figure 5). Finally, we filter each point cloud frame using the CAD model point cloud to exclude points of no interest that do

Infrared Mapping
The IPS can record the data of an additional (LWIR) sensor synchronized with image and IMU data simultaneously. To obtain a time series with sufficient temporal resolution, stereo and LWIR image sequences were recorded with a frame rate of 10 Hz (see Sec. 2.2).
If the data from the additional sensor is represented in an image-like format and the interior orientation, the distortion parameters of the LWIR sensor, and the relative orientation of the IPS stereo camera to the LWIR sensor are known, the thermal information can be mapped to the corresponding 3D points. Due to moving objects in the scene and LWIR values that change over time, the LWIR measurements of an image must be mapped immediately to the local 3D point cloud for that point in time. They can be added as label information to each 3D point or, if color-coded, be mixed with the point colors obtained from the stereo images. The slightly different viewing angles of the stereo cameras and the LWIR camera can lead to occluded object parts even in this static measurement setup. The resulting incorrect LWIR color assignment is counteracted by applying a voxel-based occlusion algorithm. Figures 4(a) and 4(b) show a left IPS image and a simultaneously acquired LWIR image. Figure 5(a) shows the 3D point cloud generated from one stereo pair with associated color-coded temperature values from the corresponding LWIR image. A point cloud reinforced with the CAD model point cloud is depicted in Figure 5(b).
In summary, the measurements of an additional sensor are referenced in space and time. A 3D point cloud time series with color-coded temperature values can be generated in real time, which can help to understand the temperature distribution in the observed scene in space and time.
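The mapping step can be sketched as follows. This is a simplified illustration, assuming a distortion-free pinhole model for the LWIR camera and no occlusion handling; the function names, the parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels) and the convention that `R`, `t` transform points from the stereo reference frame into the LWIR frame are assumptions made for this example.

```python
import math


def project_to_lwir(point, R, t, fx, fy, cx, cy):
    """Transform a 3D point from the stereo reference frame into the LWIR
    camera frame and project it with a pinhole model (distortion ignored)."""
    x, y, z = (
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )
    if z <= 0.0:  # point lies behind the LWIR camera
        return None
    return fx * x / z + cx, fy * y / z + cy


def map_temperatures(points, thermal_image, R, t, fx, fy, cx, cy):
    """Attach one temperature label per 3D point (None if not visible).
    Simplification: nearest-pixel lookup, no voxel-based occlusion test."""
    height, width = len(thermal_image), len(thermal_image[0])
    labels = []
    for point in points:
        uv = project_to_lwir(point, R, t, fx, fy, cx, cy)
        if uv is None:
            labels.append(None)
            continue
        col = math.floor(uv[0] + 0.5)
        row = math.floor(uv[1] + 0.5)
        inside = 0 <= row < height and 0 <= col < width
        labels.append(thermal_image[row][col] if inside else None)
    return labels
```

In a real pipeline the lens distortion would be applied before the pixel lookup, and occluded points would be rejected before a temperature is assigned.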
Figure 6. Synchronized calibration images of the trifocal camera system: left camera (a), right camera (b) and LWIR camera (c).

SYSTEM CALIBRATION
A prerequisite for the trifocal sensor integration is an accurate geometric calibration of the camera system. In principle, thermal imaging sensors can be geometrically calibrated like conventional cameras, but calibration in the mid-wave to long-wave infrared spectrum poses several challenges. One is the relatively low number of image pixels, which demands a good fit of target size and camera field of view. Another challenge is the creation of good contrast for the features on the calibration target. Suitable targets either contain self-emitting elements or reflect ambient radiation. Additionally, the visual cameras have to recognize the target as well for the alignment of both sensor types. For a trifocal camera system, two additional cameras are rigidly mounted with respect to the first camera. This is modeled by two more relative orientations $(R, t)^{c_2}_{c_1}$ and $(R, t)^{c_3}_{c_1}$, with a rotation part $R$, a translation part $t$, and the cameras $c_1$, $c_2$ and $c_3$. Capturing several images of a chessboard target is a common method in computer vision to derive intrinsic and extrinsic parameters at once. As in (Choinowski et al., 2019), we have chosen an aluminium chessboard target, which is portable and works well in the visual spectrum. Only the black parts are printed onto the board, while the rest remains blank. The blank chessboard parts are comparable to a mirror with high reflectance in the LWIR range. When the board is positioned facing the sky, the thermal gradient between sky and ambient temperature, combined with the different emissivity of the printed and blank pattern, yields high contrast on the chessboard. An exemplary calibration image triplet can be found in Figure 6. A setup for calibration and spatial alignment of multi-camera systems with planar reference targets can be found in (Luhmann et al., 2013). Corner detection and subsequent bundle adjustment are performed on all synchronized calibration image triplets.
For the automated detection of chessboard corners, as well as the following bundle adjustment, a solution presented in (Wohlfeil et al., 2019) is used.
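As an illustration of what a relative orientation expresses, the sketch below derives it in closed form from the chessboard poses seen by two cameras in one synchronized image set. This is not the bundle adjustment used in this work, only a single-view example; the function names and the convention that a board point $X_b$ maps to camera $i$ as $X_{c_i} = R_i X_b + t_i$ are assumptions for this example.

```python
def mat_mul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix (equals the inverse for a rotation)."""
    return [list(column) for column in zip(*a)]


def mat_vec(a, v):
    """3x3 matrix times 3-vector."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]


def relative_orientation(R1, t1, R2, t2):
    """Relative pose mapping points from camera 1 into camera 2:
    X_c2 = R_rel * X_c1 + t_rel, with R_rel = R2 * R1^T and
    t_rel = t2 - R_rel * t1."""
    R_rel = mat_mul(R2, transpose(R1))
    rotated_t1 = mat_vec(R_rel, t1)
    t_rel = [t2[i] - rotated_t1[i] for i in range(3)]
    return R_rel, t_rel
```

With many chessboard poses, each image set yields one such estimate; a bundle adjustment instead refines a single relative orientation jointly with the intrinsic parameters over all observations.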

DEPTH OF FIELD DETERMINATION
There are several lenses available for the LWIR camera module, varying in focal length and field of view (FOV, see Table 1). While the lens with FOV 33° nominally offers the best spatial resolution (assuming identical target distances), the 90° lens has almost the same FOV as the visual modules, which simplifies the co-registration. To identify and choose the best-fitting option, the effective spatial resolution has to be determined for every considered lens, as it can vary significantly with target distance and between the lenses themselves. In the following, we present the theory behind the determination process in Sec. 4.1 and our measurement procedure in Sec. 4.2.

Theory
Spatial resolution is an essential parameter of imaging systems (Meißner et al., 2018), as it defines a measure of imaged detail for every image taken by a sensor-lens configuration. Therefore, resolution estimation is important to quantify the potential of camera systems. Spatial resolution as an image quality parameter is part of the upcoming German standard DIN 18740-8 "Photogrammetric products - Part 8: Requirements for image quality (quality of optical remote sensing data)".
Resolving power can be defined mathematically as follows: A point-like input signal $U(x', y')$ with object space coordinates $x'$ and $y'$ is spread (or smeared) due to non-ideal imaging properties (Jahn and Reulke, 1995) and creates an output signal $V(x, y)$ with image coordinates $x$ and $y$:

$$V(x, y) = \iint H(x, y, x', y')\, U(x', y')\, dx'\, dy'. \quad (1)$$

The spread output signal depends on the system impulse response $H(r) = H(x, y, x', y')$ with $r = \sqrt{(x - x')^2 + (y - y')^2}$, which is therefore called the point spread function (PSF) (Williams and Becklund, 1989, Jahn and Reulke, 1995).
Furthermore, sharpness as an image property can be characterized by the modulation transfer function (MTF) $\tilde{H}(k)$, which is the spatial frequency response of an imaging system to a given illumination and is equal to the Fourier transform of the PSF $H(r)$:

$$\tilde{H}(k) = \mathcal{F}\{H(r)\}. \quad (2)$$

"High spatial frequencies correspond to fine image detail. The more extended the response, the finer the detail - the sharper the image." (Mix, 2005)

The effective resolution of an imaging device can be determined in different ways. A classic approach is the use of well-known test charts (e.g., the USAF resolution test chart with groups of bars) (USAF, 1959). There, the (subjectively) identified image resolution corresponds to the distance at which the smallest group is still discriminable. This is very similar to the Rayleigh resolution limit (Rayleigh, 1874), where the response of an imaging system illuminated by a point light source is approximated by a sine cardinal function. Rayleigh postulated the resolution limit as the minimal distance between two sources at which they are still discriminable. With point light sources approximated as sine cardinal functions, the resolution limit is reached if the position of the first maximum of one function coincides with the first minimum of the other function. Besides the subjective components of this process, the function values obtained from bar charts are discrete instead of continuous, depending on the resolution steps between groups of bars.
To reduce the subjective influence of bar charts during the determination process and to convert discrete function values to continuous ones, some approaches use signal processing techniques to calculate the effective image resolution. The method described by (Reulke et al., 2004, Reulke et al., 2006) is one of these approaches. There, the contrast transfer function (CTF) and subsequently the MTF are calculated for images of a designated test pattern (e.g., a Siemens star, see Figure 7). In line with the approaches mentioned above, the smallest recognizable detail or "the resolution limit is reached if the distance between two points leads to a certain contrast in image intensity between the two maxima." Using a priori knowledge of the original scene (the well-known Siemens star target), CTF, MTF and PSF can be approximated, e.g., by a Gaussian shape function (Honkavaara et al., 2006) or a polynomial function.
The abscissa of the CTF and MTF is the spatial frequency $k$, calculated as the quotient of the target frequency $k_s$ and the current scan radius $r$ multiplied by $\pi$:

$$k = \frac{k_s}{\pi r}. \quad (3)$$

The target frequency $k_s$ is constant and equivalent to the number of black-white Siemens star segments. The related (initially discrete) values of the contrast transfer function $C_d(k)$ are derived from the intensity maximum $I_{max}$ and minimum $I_{min}$ of every scanned circle, and the function value is simultaneously normalized to the contrast level $C_0$ at spatial frequency 0 (infinite radius):

$$C_d(k) = \frac{1}{C_0}\,\frac{I_{max}(k) - I_{min}(k)}{I_{max}(k) + I_{min}(k)}. \quad (4)$$

Continuous function values $C$ are derived by fitting either a Gaussian function or, e.g., a fifth-order polynomial to the discrete input data. According to (Coltman, 1954), the obtained CTF describes the system response to a square wave input, while the MTF is the system response to a sine wave input. The proposed solution is a normalization with $\pi/4$ followed by a series expansion using odd frequency multiples:

$$\tilde{H}(k) = \frac{\pi}{4}\left[C(k) + \frac{C(3k)}{3} - \frac{C(5k)}{5} + \frac{C(7k)}{7} + \dots\right]. \quad (5)$$
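The Siemens star evaluation described above can be sketched in a few lines. The sketch assumes that the contrast of a scanned circle is the Michelson contrast normalized by $C_0$ and that only the first four terms of the Coltman series are evaluated; the function names are hypothetical, and `ctf` stands for any fitted continuous CTF (e.g., a Gaussian or polynomial fit) passed in as a callable.

```python
import math


def spatial_frequency(k_s, radius):
    """Spatial frequency on a scan circle: target frequency k_s divided by
    (pi * radius), as described in the text."""
    return k_s / (math.pi * radius)


def contrast(i_max, i_min, c0=1.0):
    """Discrete CTF value of one scan circle: Michelson contrast,
    normalized to the contrast level c0 at zero spatial frequency."""
    return (i_max - i_min) / (i_max + i_min) / c0


def coltman_mtf(ctf, k):
    """Square-wave (CTF) to sine-wave (MTF) conversion after Coltman,
    truncated after the first four odd frequency multiples."""
    terms = ((1, 1.0), (3, 1.0 / 3.0), (5, -1.0 / 5.0), (7, 1.0 / 7.0))
    return math.pi / 4.0 * sum(weight * ctf(m * k) for m, weight in terms)
```

For example, with a hypothetical linear CTF `lambda k: max(0.0, 1.0 - 2.0 * k)`, only the terms at `k` and `3k` contribute at `k = 0.1`, because the higher multiples lie beyond the cut-off where the CTF is zero.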
There are several criteria specifying the resolving power of camera systems. The parameter σ (standard deviation) of the PSF (assuming a Gaussian shape) is one criterion. It directly relates to the image space and can be seen as an objective measure to compare the performance of different cameras. Another criterion is the width of the PSF at half the height of its maximum (full width at half maximum, FWHM).
The value of the MTF at 10 % modulation contrast is often referred to as the resolution limit or cut-off frequency, i.e., $\tilde{H}(k) = 0.10$ at spatial frequency $k_{MTF10}$, whose reciprocal, in the domain of the PSF $H(r)$, corresponds to the least resolved scale in the image domain. This scale factor multiplied by the nominal ground sample distance then delivers the least resolved distance, named ground resolved distance (GRD) (Kharfi et al., 2012, Artmann and Wueller, 2012, Valenzuela and Reyes, 2019, Nakamura, 2016).
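For the Gaussian-shape assumption, the three criteria above (σ, FWHM and MTF10) can be related in closed form. The following short derivation is a worked example, assuming a normalized, rotationally symmetric Gaussian PSF with σ in pixels and $k$ in cycles per pixel; it is not taken from the cited references.

```latex
% Gaussian PSF and its Fourier transform (MTF)
H(r) = \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{r^2}{2\sigma^2}\right),
\qquad
\tilde{H}(k) = \exp\!\left(-2\pi^2\sigma^2 k^2\right)

% Full width at half maximum of the PSF
\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma

% Cut-off frequency at 10 % modulation contrast
\tilde{H}(k_{MTF10}) = 0.1
\;\Longrightarrow\;
k_{MTF10} = \frac{1}{\pi\sigma}\sqrt{\frac{\ln 10}{2}} \approx \frac{0.342}{\sigma}
```

Under these assumptions, a smaller σ (a sharper PSF) directly translates into a higher cut-off frequency, so either quantity can serve as the comparison criterion.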

Measurement Procedure
Using the introduced methodology, every lens system has been focused for the designated operating distance (i.e., 100 cm), and subsequently MTF10 values have been determined for a series of target distances from 20 cm up to 150 cm in 10 cm steps. This procedure delivers a depth of field estimate for every lens system. Using the focal length and pixel size, the effective resolution (GRD) can then be determined and compared to find the best-suited lens for the task at hand.
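The conversion from an MTF10 measurement to a GRD can be sketched as follows. The sketch assumes a pinhole relation for the nominal ground sample distance and that $k_{MTF10}$ is expressed such that its reciprocal is a dimensionless scale factor in pixels; the function names and the numeric values in the usage note are hypothetical and not taken from Table 1.

```python
def ground_sample_distance(pixel_pitch_m, focal_length_m, distance_m):
    """Nominal footprint of one pixel at the target distance (pinhole model)."""
    return pixel_pitch_m * distance_m / focal_length_m


def ground_resolved_distance(k_mtf10, gsd_m):
    """Least resolved distance: reciprocal of the MTF10 cut-off frequency
    (scale factor in pixels) multiplied by the nominal GSD."""
    return gsd_m / k_mtf10
```

For instance, a hypothetical lens with 10 µm pixel pitch and 10 mm focal length has a nominal GSD of 1 mm at 1 m distance; with an assumed cut-off of 0.5, the GRD would be 2 mm. Repeating this for every measured target distance yields GRD curves that can be compared across lenses.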
The Siemens star target used is manufactured in the same way as the chessboard described in Sec. 3 and faces the sky under clear weather conditions (see Figure 8). This way, the imaged infrared radiation is almost constant and homogeneously distributed. The infrared radiation emitted by the black patches and collected by the LWIR module can be considered the maximum values when imaged. This means that a Siemens star image obtained this way is a photographic negative of a visual black-and-white star, which does not affect the methodology described in the previous section.

WELDING VERIFICATION
The quality of the weld seam can be derived from the resulting temperature in the joining zone (Sec. 1), where the maximum temperature is expected during welding. This is because there is a pure matrix layer and/or a geometric shape of the interface where the mechanical energy is converted into thermal energy. The aim is to achieve a temperature $T_J$ in the joining zone just high enough for the matrix material to melt in the entire overlap area. If the temperature is too low ($T_J < T_{J,min}$), there is no or only a local connection. If the temperature is too high ($T_J > T_{J,max}$), the matrix material degrades, and its properties and the strength of the connection are significantly reduced (Fischer et al., 2015). Hence, we aim for $T_J \in [T_{J,min}, T_{J,max}]$.
In ultrasonic welding, as with all common joining processes for composites, there is no optical access to monitor the temperature $T_J$ in the joining zone. Therefore, in this approach the surface temperature $T_S$ above the joining zone is recorded in order to derive the melting temperature $T_J$ in the seam from it, using a model that describes the relation $T_S \sim T_J$. The locally resolved maximum temperature is of particular interest because it determines the state of the matrix in the interface.
We follow the work of (Levy et al., 2014) for modelling $T_S \sim T_J$. Specifically, the authors show the relationship between the local heat-affected zone temperature and the associated maximum melt temperature during ultrasonic welding of CF/PEI laminates in analyses and simulations.

RESULTS
We first evaluate results of our camera calibration experiments including the improved trifocal geometric calibration routine (Sec. 6.1) and the resolving power determination of the LWIR camera (Sec. 6.2). Finally, we present processed weld recordings for welding quality assessment (Sec. 6.3).

System Calibration
The chessboard pattern is captured in 28 different poses under open sky (see Figure 9). Note that the poses form a balanced set, with several rotations and distances, to decorrelate intrinsic and extrinsic calibration parameters as well as possible.
All remaining re-projection errors are in the sub-pixel range and enable the system to generate metric point clouds, colored with thermal intensities in real time. Compared to (Choinowski et al., 2019), our chessboard contains one geometric and one radiometric improvement. On the one hand, the size of the chessboard patches is increased from 36 cm² to 72 cm². Although this means roughly half the number of corner points on a comparable area, the detection of the corners itself becomes much more robust. On the other hand, only the black patches are printed on custom static cling, while the formerly white patches now remain blank aluminium. Under open sky, the faint contrast is enhanced from 1 K to 10 K and more. All these advantages are used for the printed Siemens star target as well (see Figure 8). Another difference to (Choinowski et al., 2019) is the use of a thermal camera with four times the pixel resolution. The corner detection in particular benefits from these improvements, since it relies on good contrast and well-defined edges along the pattern. As a result, thermal optics with wider fields of view can be used without losing much spectral and geometric information on the calibration chart. A comparison between both chessboard versions can be found in Figure 10. In practice, it is now possible to use the same field of view for the thermal and visual cameras.

Depth of Field
Following the measurement procedure described in Sec. 4.2, the MTF10 value (10 % of the maximum modulation contrast) has been determined for every lens system (focus fixed at 100 cm) at several target distances from 20 cm to 150 cm in 10 cm steps. Figure 11 provides a visual impression for all measured distances. The FOV 33° lens system evidently has a very narrow depth of field: MTF10 rises significantly only from 80 cm onwards and already declines noticeably at 140 cm. Its peak, on the other hand, at a distance of 110 cm, is almost identical to that of the other lenses. The FOV 90° lens has the widest depth of field, with the FOV 60° lens only slightly behind. To determine which lens has the best (effective) spatial resolution, the MTF10 measurements can be translated to ground resolved distance (GRD, see Sec. 4.1) using the focal length and pixel size. Results are given in Figure 12. Even though the FOV 33° lens has the largest focal length and therefore the best (theoretical) ground sample distance, its narrow depth of field limits this advantage, and its GRD is better only in the close vicinity of the operating distance. The GRDs of the FOV 60° and FOV 90° lenses are only slightly worse and do not change rapidly when moving away from the operating distance.
The FOV 90° lens system has the advantage of a field of view almost identical to that of the visual system (ideal for co-registration) but at the same time the lowest effective spatial resolution. In contrast, the FOV 33° lens has the best resolution, but only in close range around the designated operating distance. If this distance changes while monitoring the welding process, the FOV 60° lens can be considered the sweet spot, with a rather wide depth of field and only a slight disadvantage in terms of GRD. Nevertheless, we chose the FOV 90° lens, as we favor the larger field of view in exchange for a slightly lower spatial resolution at our working distance compared to the FOV 60° lens. Thus, we can record a complete welding pass (nearly 150 cm) without moving the camera, with a GRD of around 2.3 mm per pixel.

Welding Verification
We first describe the process parameters used for the multi spot welding and the continuous welding process stages. The multi spot welding stage was carried out with a sonotrode amplitude of 100 % (in the setup used, peak to peak: 90 µm) and a welding force of 700 N (sonotrode diameter of 22 mm, spherical work surface, thus up to 1.8 MPa). Each of the two clamping devices fixed the components with 1750 N. The welding time was 1 s, and the holding and cooling time was 5 s. The continuous welding stage was set up with an amplitude of 90 %, a welding force of 400 N, a cooling device pressure of 0.7 MPa, and a welding speed of 40 mm/s.
Next, we compare our measurements with the results of (Levy et al., 2014). In contrast to our measurements, the authors calculated significantly lower surface temperatures. While we recorded local maximum surface temperatures $T_S$ between 113°C and 150°C (Figure 13(c)), the authors determined a surface temperature $T_S < 30$°C and a joint temperature of $T_J = 240$°C in the simulation (the target temperature for welding PEI is around $T_J = 300$°C). This is certainly due to different process and infrastructure conditions. Due to the large temperature deviations, we do not apply the model and hence do not estimate $T_{J,min}$ and $T_{J,max}$ in this paper. A verification of the temperature correlation is therefore required in subsequent examinations. Figure 13 shows the final thermal-colored 3D model of the continuous welding process stage. We choose $T_{S,min} = 75$°C and $T_{S,max} = 110$°C for demonstration purposes and hence consider temperatures below $T_{S,min}$ as "too low" and above $T_{S,max}$ as "too high". Figure 13(a) depicts the raw maximum surface temperatures $T_S$ for each pixel projected onto the point cloud of the CAD bulkhead segment model, Figure 13(b) the thresholded temperatures with $T_S \in [75$°C$, 110$°C$]$, and Figure 13(c) the temperatures $T_S > 110$°C with area-wise maxima labelled.
We make three major observations in Figure 13(a). First, point-wise low-temperature gaps divide the continuous welding path into stripes. These areas were already welded in the preceding multi spot welding stage and thus remain cold in any further welding attempt. Second, we see circles above the welding path, which indicate heat flow into the suction cups that causes wear. Third, comparably lower temperatures are recorded in the first two stripes, which indicates a partially failed bonding.
Since we still detect point-wise higher temperatures and a heat flow into the suction cups above, we attribute this observation to a point-wise reduced connection between the welding materials at welding time (e.g., due to dust on the PEI foil). With Figures 13(b) and 13(c), we demonstrate easily interpretable visualizations obtained by filtering the surface temperatures with the chosen threshold values. This way, we can directly determine areas of welding success (color-coded) or non-success (grey) in Figure 13(b), and areas with induced material degradation in Figure 13(c).
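The thresholding used for these visualizations can be sketched as follows. The threshold values are the demonstration values chosen above (75°C and 110°C); the function names, the nested-list image format and the label strings are assumptions for this example.

```python
TS_MIN_C = 75.0   # demonstration threshold: below this, bonding is "too low"
TS_MAX_C = 110.0  # demonstration threshold: above this, material is "too high"


def peak_temperature_map(frames):
    """Per-pixel maximum surface temperature over a sequence of LWIR frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [max(frame[row][col] for frame in frames) for col in range(width)]
        for row in range(height)
    ]


def classify(temperature_c, ts_min=TS_MIN_C, ts_max=TS_MAX_C):
    """Label one peak temperature; both thresholds count as success."""
    if temperature_c < ts_min:
        return "too low"
    if temperature_c > ts_max:
        return "too high"
    return "ok"


def classify_map(peak_map):
    """Apply the threshold classification to every pixel of a peak map."""
    return [[classify(value) for value in row] for row in peak_map]
```

The same labels can be attached to the 3D points instead of image pixels once the temperatures have been mapped to the point cloud, so that welds from several passes end up in one common representation.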

CONCLUSION
We have proposed a real-time capable quality assessment routine for ultrasonic welding of thermoplastics using a carefully prepared stereo camera system augmented with a long-wave infrared module. The camera system follows a passive sensing approach and scans the surface temperatures of a region of interest during the welding process without the need of specific illumination sources.
The main idea of our routine, however, is that a robust quality assessment considers the whole data pipeline, from ensuring high sensor quality to preparing a data analysis appropriate to the application. To this end, we first evaluated the camera system's quality during calibration: we improved the trifocal geometric calibration compared to previous works by adjusting the calibration chessboard, demonstrated the determination of the LWIR camera's resolving power using Siemens stars for the first time, and experimentally emphasized the significance of both factors with respect to the camera setup. We have further enabled the examination of large free-form components that might entail multiple welding runs by projecting all scans into a common 3D representation. Finally, this 3D model is processed to directly highlight defects resulting from absent or weak bonding (due to too low temperatures) or from matrix material degradation (due to too high temperatures).
A possible extension of this work is the experimental investigation of the critical temperatures that indicate welding defects, as they depend strongly on the specific welding setup (e.g., material and process parameters). A trustworthy quality assessment also demands automatic sensor quality monitoring during data acquisition (Wischow et al., 2021), which may include examinations of the LWIR camera's radiometric characteristics (e.g., noise behavior) to indicate the need for a radiometric re-calibration. The literature basis for both is still sparse.