INTERACTING WITH 3D MODELS – 3D-CAD VS. HOLOGRAPHIC MODELS

A problem with 3D models is that the devices used to display them are typically two-dimensional, e.g., computer monitors or printed maps. User interfaces of computer software are based on mouse, touchscreen, keyboard, etc. and are optimized for this dimensionality. This causes problems when working with 3D models, because the user must adapt her actions by interpreting the missing third dimension. While this might not pose a problem for frequent users, infrequent users may find it quite challenging. Holographic models, on the other hand, float in front of the user, providing a 3D perspective. Interaction with this kind of model may thus be more intuitive than traditional interaction. In this paper we present the results of a first user test. 15 participants tested interaction with a holographic model visualized using Augmented Reality (AR) technology. The results were compared to those of 15 participants using a traditional 3D-CAD. It was found that the holographic approach is more intuitive, leading to a lower frustration level, although it is still restricted by technical limitations.


INTRODUCTION
3D data models are created because there are situations where they provide more information than 2D models. An example of such a situation is a 3D cadastre, a 3D representation of ownership. Traditional 2D representation is of limited value for topics like condominium rights (see Stoter et al., 2006). On the other hand, users of such data are not necessarily technically adept persons. Condominium owners can be people with all kinds of backgrounds, and professions dealing with condominiums include lawyers, craftspersons, and real estate agents. They should be able to work with and understand the 3D model, too. Traditional 3D-CAD can be quite difficult to use because the system must translate one- or two-dimensional movements of the input devices into three-dimensional movements of the model. There are small variations between different systems (e.g., effects of the different mouse buttons), which makes sporadic use challenging and calls for alternative approaches. We assume that Augmented Reality (AR) technology provides a new approach to make 3D models accessible to anybody. This paper presents the results of usability tests comparing the two types of systems for a data set representing a condominium.
Creating and working with 3D computer models is a rather old topic. Research in the area of Human-Computer Interaction (HCI) often addresses the challenges that have to be overcome in order to enable effective interaction dialogues. Already in 1999, Zheng et al. pointed out that "Virtual Reality systems can offer a novel way for users to interact". They were using CyberGlove, a wearable system that measures the position and movement of the user's fingers. However, they were still using the computer screen as an output medium. Today, touchscreens are a tool to use gestures without the use of additional hardware (Radhakrishnan et al., 2013).
Interactive devices nowadays exceed the range of mouse and keyboard commands, or gestures. Shankar et al. (2014) performed user tests on creating 3D geometries with a Brain-Computer Interface based on an EEG headset. Although there are problems concerning reliability, the work shows that communication channels are not restricted to what is currently used. However, although the tested channel itself is innovative, the test environment still used the standard computer screen to provide visual feedback. This could be changed by adopting holographic visualizations provided by AR technology. In this work we focus on whether current interaction methods of AR systems are more effective than those of standard 3D-CAD systems for one specific application scenario.
To address the question of efficiency of interaction, a first experiment was conducted comparing a traditional screen-based approach for the visualization of a 3D model with an AR-based approach. The AR-based approach uses the Meta2 AR-headset, a head-worn display. The glasses of the headset are semitransparent, allowing the user to see both reality and the virtual augmentations. The model in Figure 1 was used as a proof-of-concept to evaluate whether a holographic model is a suitable alternative for interaction with 3D models.

HOLOGRAPHIC MODELS USING AUGMENTED REALITY TECHNOLOGY
In mixed reality, real and virtual environments are combined. Virtual reality completely blocks the real environment and is thus the counterpart to the real environment. AR typically refers to the concept of enhancing (augmenting) the real world with virtual objects (Milgram and Kishino, 1994). The digital removal of real objects from the perceived environment of the user falls under the term AR as well. A virtual object may have a connection with the real environment, e.g., if virtual arrows support the user in a navigation task (see Cron et al., 2019). In other cases, the absolute location of the visualization is irrelevant, e.g., when visualizing data. The graphical representation is then augmented at an arbitrary but fixed location. This allows the user to inspect the data from different perspectives by walking around it or even by stepping into the visualization. In the context of this paper we call this kind of visualization a holographic representation of data.
AR systems (Carmigniani et al., 2011) comprise components for the localization of the user and for the visualization of the data. Localization is usually achieved by a combination of cameras and depth sensors that detect objects. The position of the user is determined with respect to the surrounding objects or patterns, and the position of the augmentation can then be defined with respect to the user's position. This enables a rendering engine to compute correct images for the perspective of each eye. The visualization is done by displays that show the images in the user's field of vision. Since each eye sees a slightly different image, the user perceives the representation in real 3D.
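As a minimal sketch of the per-eye perspective mentioned above (not the Meta2 implementation), the following Python snippet derives two eye positions from a tracked head position, a unit vector pointing to the user's right, and an interpupillary distance. The function name, the vector convention, and the default distance of 0.063 m are illustrative assumptions; a real system would obtain the eye distance from calibration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def eye_positions(head: Vec3, right: Vec3, ipd_m: float = 0.063) -> Tuple[Vec3, Vec3]:
    """Return (left_eye, right_eye), each offset by half the IPD along `right`.

    `head` is the tracked head position in metres, `right` a unit vector
    pointing to the user's right. Both are hypothetical inputs here.
    """
    half = ipd_m / 2.0
    left_eye = tuple(h - half * r for h, r in zip(head, right))
    right_eye = tuple(h + half * r for h, r in zip(head, right))
    return left_eye, right_eye


# Example: head 1.7 m above the floor, looking along the z-axis.
left, right_eye = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0), ipd_m=0.06)
```

A renderer would then compute one image per eye from these two viewpoints, which is what produces the depth impression.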
One main goal is for the virtual objects to blend in with the real world in a seamless way. Therefore, data needs to be exchanged between the localization device, the image rendering, and the displays used for visualization. Rendering uses positioning data and the digital model that is to be visualized. The computer used for the rendering may be either included in the headset or external. The advantage of an included computer is flexibility concerning mobility, but weight restrictions currently limit computing power, storage capacity, and battery lifetime. External computers can be connected to the AR device by cable or by a wireless connection. A wireless connection has to deal with a restricted transfer rate and with power demand. A cable connection solves both problems but limits the range of movement and thus the system's flexibility. Microsoft's Hololens is an example of a system with an integrated computer and the additional possibility of a wireless connection. The Meta2 AR-headset adopts a cable-bound approach.
One of the advantages of holographic models is that metaphors like "take a step back to get an overview" or "point at a problem" work better than with traditional user interfaces (Azuma, 1997). Once a holographic model is placed, it stays at this location. Thus walking around it changes the perspective, taking a step closer allows the user to focus on a detail, and taking a step back provides the necessary overview. Gestures like pointing or touching can be used to select parts of the model for further analysis or manipulation (Giannopoulos et al., 2019).

APPLICATION SCENARIO: 3D CADASTRE
An application area where simplicity of the user interface is of eminent relevance is 3D cadastre. Real estate cadastre provides information on rights to land, e.g., who the owner of a specific piece of land is and where the boundaries are. While many people can understand 2D cadastral maps, matters become more complicated when 3D cadastres are discussed. Modern cities frequently need to subdivide space in three dimensions and create rights for the resulting volumes (Paasch et al., 2016; Karabin et al., 2018). Typical examples are underground lines, bridges connecting buildings on different sides of the road, or condominiums. In contrast to working with purely technical 3D geometries, as in technical design, people with a broad spectrum of education should be able to visualize and understand the geometries represented in a 3D cadastre. Lawyers and real estate agents typically do not have training in CAD usage, and people interested in the purchase of an apartment may have any kind of background. Thus, the user interface for visualizing data should be intuitive. Pouliot et al. (2018) point out a number of problems associated with visualization:
- Title of the paper
- Presenting a solid value proposition
- Barriers to legal and institutional adoption
- 3D visualization for other applications
- Multipurpose cadastral systems
Pouliot et al. (2018) also identified AR as one of the emerging trends in 3D visualization.
It has already been shown that 3D representations of apartment buildings can be produced and used as a basis for 3D cadastres. Figure 1 shows the 3D model used for the proof-of-concept and in the subsequent tests. The basis for the 3D model generation is a set of standard floor plans. These plans are all the geometrical evidence on the boundaries of the 3D property that exists under Austrian legislation. The modelling was performed using ArchiCAD, and the model is compatible with Building Information Models (BIM)/Industry Foundation Classes (IFC).
Figure 1. CAD-visualization of the sample data set
The model was then imported into Unity3D, a game engine suitable for the implementation of AR applications. Through this transformation, such models can be visualized as holographic models in an AR environment.

EXPERIMENT SETUP
The experiment was designed to answer the question whether the user experience differs between a standard 3D-CAD and a holographic model based on AR-technology. Two test conditions were formed for this purpose. The first one consisted of a standard (2D) computer screen combined with the BIM software ArchiCAD in 3D mode. ArchiCAD was chosen since many of the participants already had practical experience with standard CAD products. The second condition was a combination of the Unity3D model and the Meta2 AR-headset. Two aspects had to be considered when designing the experiment. First, the interaction capabilities of the two systems are not directly comparable. ArchiCAD is a very mature BIM system that allows the user to create, modify, and visualize 3D models. Figure 2 shows the user interface of the software. It is a standard interface, where the floors can be selected and the graphics can be rotated by using the mouse. The Meta2 environment is more complex since it needs to merge reality with a correct visualization of the virtual objects.
Since, for example, eye distance varies between persons, calibration of the system is necessary. This is done before the actual usage of the headset and consists of several steps. Figure 3 shows the visualization of the same model, this time using the Meta2 headset. The implemented menu in the middle allows the user to show or hide the various floors and can itself be hidden in order to minimize visual clutter when interacting with the model. The interaction with the model and the menu is done with the hands, which are recognized automatically by the headset. The test procedure had to consider these differences, and thus only the functionality for visualization was used. Second, the familiarity of a participant with a specific software and hardware environment can influence performance. Since more participants had used ArchiCAD or another CAD system than had experience with AR-technology, it is difficult to completely avoid a bias towards 3D-CAD. Before the test, in addition to providing demographic data, each participant had to fill in the Santa Barbara Sense-of-Direction (SBSOD) Scale (Hegarty et al., 2002) in order to determine their spatial abilities. The SBSOD is a self-assessment of spatial abilities. It is a measure of environmental spatial ability that correlates with objective measures of performance in several environmental spatial cognition tasks (Hegarty et al., 2002).
Nine tasks were prepared for the participants. To preclude learning effects in the collected data, each participant received the tasks in a different order. The time each participant needed to complete each single task was recorded. Completion meant providing the correct answer or achieving a specific configuration (see list of tasks below). After completing all tasks, the participants had to fill out the System Usability Scale (SUS) (Brooke, 1996), the User Experience Questionnaire (UEQ) (Schrepp, 2019), and the raw NASA Task Load Index (TLX) (Hart, 2006). The SUS is a very common questionnaire to test the usability of a system and has been validated through many studies. Since the first language of all participants was German, the word "cumbersome" in the SUS had to be translated for some participants. It was not tested whether this had an effect on the results. The UEQ is a questionnaire with the goal of measuring user experience. It aims to do so in a simple and immediate way and to provide a comprehensive impression of the experience the user had with the tested product (Laugwitz et al., 2008). The TLX is a subjective workload assessment tool. The addendum "raw" indicates that the pairwise comparison between the different subscales (mental demand, physical demand, temporal demand, overall performance, effort, and frustration level) is dropped.
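The scoring of two of these questionnaires can be sketched in a few lines of Python. The SUS function follows the standard scoring rule (ten items rated 1 to 5; odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating; the sum is multiplied by 2.5). The raw TLX function takes the unweighted mean of the six subscale ratings, which is one common way to aggregate the raw variant; whether the authors aggregated or only compared single subscales is not stated, so this part is an assumption, as are all function names.

```python
from statistics import mean
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Standard SUS score (0-100) from ten ratings on a 1-5 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:      # items 1, 3, 5, 7, 9 (positively worded)
            total += rating - 1
        else:                   # items 2, 4, 6, 8, 10 (negatively worded)
            total += 5 - rating
    return total * 2.5


def raw_tlx(ratings: Sequence[float]) -> float:
    """Unweighted mean of the six TLX subscale ratings (assumed aggregation)."""
    if len(ratings) != 6:
        raise ValueError("raw TLX needs exactly six subscale ratings")
    return mean(ratings)


best_possible = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
neutral = sus_score([3] * 10)
```

Dropping the pairwise weighting is what makes the raw variant a plain average rather than a weighted one.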
A sample size of 30 participants was chosen in order to perform meaningful statistical analyses. Due to the between-subject study design, half of the participants worked with the 3D-CAD and the other half with the holographic model. The mean age was 24.6 years (with a standard deviation of 1.8 years) for group 1 (working with ArchiCAD) and 31.8 years (with a standard deviation of 12.4 years) for group 2 (working with the holographic model). Thus the group working with the holographic model was older and less homogeneous in age than the group working with ArchiCAD. This results from the fact that the approach with the AR-headset was more appealing to potential participants. The group working with ArchiCAD consisted mainly of students, while the other group had a larger variation in background. Therefore, the test results for ArchiCAD may be biased towards acceptance of and familiarity with the technology.
Eight of the tasks were each centered around one question. The steps the participants had to perform in order to answer the questions were not defined, so each participant could use the approach that looked most promising. The ninth task requested a specific change of the visualization. The tasks were dictated by the model and are not closely linked to the usage of a 3D cadastre. The task questions were the following:
- How many rooms are there on the first floor?
- How many rooms are there in total?
- How many rooms does apartment "Top5" contain?
- How many rooms of apartment "Top3" adjoin the hallway?
- How many apartments are there on the ground floor?
- How many apartments are there in total?
- How many square meters does apartment "Top4" have?
- On which floor(s) are rooms of apartment "Top7"?
- Hide all floors except the basement!
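The setup states that each participant received the nine tasks in a different order. How the orders were generated is not described; the following Python sketch shows one simple way to do it, drawing random permutations and rejecting duplicates. The function name and the fixed seed are illustrative assumptions.

```python
import random
from typing import List, Tuple


def task_orders(n_tasks: int, n_participants: int, seed: int = 0) -> List[Tuple[int, ...]]:
    """Return one distinct random task order per participant."""
    rng = random.Random(seed)          # fixed seed keeps the plan reproducible
    seen = set()
    orders: List[Tuple[int, ...]] = []
    while len(orders) < n_participants:
        order = tuple(rng.sample(range(1, n_tasks + 1), n_tasks))
        if order not in seen:          # reject an order that was already used
            seen.add(order)
            orders.append(order)
    return orders


orders = task_orders(n_tasks=9, n_participants=30)
```

With nine tasks there are 362,880 possible orders, so duplicates among 30 participants are unlikely even without the rejection step.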

EXPERIMENT RESULTS
The results of the SBSOD were similar for both groups. Group 1 had a value of 5.33 with a standard deviation of 0.76 and group 2 had a value of 4.95 with a standard deviation of 0.81. No statistically significant difference could be detected (t(28)=1.352, p=0.187).
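As a plausibility check, the test statistic can be recomputed from the reported summary values. The sketch below assumes a pooled-variance (Student) two-sample t-test with 15 participants per group, which matches the reported 28 degrees of freedom; the function name is hypothetical. Because the inputs are the rounded means and standard deviations, the result (roughly 1.3) is only expected to be close to the reported 1.352, not identical.

```python
import math
from typing import Tuple


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# SBSOD summary values as reported for group 1 and group 2.
t_value, degrees_of_freedom = pooled_t(5.33, 0.76, 15, 4.95, 0.81, 15)
```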
The SUS results were 66.83 with a standard deviation of 18.46 for group 1 and 78.00 with a standard deviation of 18.93 for group 2. The scores were not normally distributed in either group. A statistically significant difference was found (p<0.05, Z=2.124).
Interesting conclusions resulted from the evaluation of the NASA TLX, where the aspects "physical effort" and "frustration" yielded statistically significant results. Neither was normally distributed. Physical effort had a mean value of 1.93 (with a standard deviation of 0.96) for ArchiCAD and a mean value of 5.27 (with a standard deviation of 4.48) for the holographic model (p<0.01, Z=-2.710). Frustration had a mean value of 7.33 (with a standard deviation of 4.59) for ArchiCAD and a mean value of 2.67 (with a standard deviation of 3.68) for the holographic model (p<0.05, Z=-3.457). The conclusion is that the physical effort is rated higher for the holographic model. There are probably several aspects that lead to this rating. Although the AR-glasses are not heavy, they do cause some inconvenience and their weight is perceivable. Sitting while working with the mouse in ArchiCAD is easier than standing and using large gestures with the holographic model. Conversely, the frustration level was rated higher for ArchiCAD. The results suggest that physical interaction, although the effort is higher, results in less frustration, possibly because of the natural interactions involved (e.g., physically grabbing) and the perceived novelty of the tested approach. An obvious example is grabbing and then rotating the model. While the error rate for inexperienced CAD users is quite high (rotating in the wrong direction or around the wrong axis), the hand movements needed to achieve a specific result in the holographic model are straightforward because they emulate the interaction with real objects. Contrary to expectations before the experiment, no increase in self-assessed performance from ArchiCAD to the holographic model was observed. Statistical analysis (t(28)=0.194, p=0.848) showed that the participants in both groups rated their performance in solving the given tasks approximately the same.
The UEQ provided interesting results as well (see Figure 4). The color scheme is based on benchmark data (Schrepp et al., 2017) from 246 studies with 9905 persons. "Excellent" refers to results comparable to the top 10% of the tested products. "Good" refers to results where 75% of the tested products were worse, and the other breaks are at 50% and 25%. In all categories, the holographic model received better values than ArchiCAD. The largest difference is visible for the "novelty" component. This is not unexpected since CAD has been used for decades whereas the first AR environments for the mass market were introduced in 2012 (Google Glass). The difference is also large in the categories "attractiveness" and "stimulation". In both categories, the holographic model is in the best group and ArchiCAD in the worst. The score for "attractiveness" shows that the participants did not like the CAD approach, whereas they were attracted by the idea of grabbing virtual objects to interact with them. The difference for "stimulation" suggests that the participants had more fun using the holographic model. The results for "efficiency" and "dependability" are not in the class "excellent" for the holographic model. The participants experienced problems with the response and predictability of the system (see also the discussion on reset times below). The smallest difference occurs for "perspicuity", where ArchiCAD receives its highest value and is classified as "good". This may be an effect of the time spent on user interface design for CAD in the last decades. All differences are significant except for perspicuity.
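The benchmark classes described above amount to a simple threshold rule on the share of benchmark products that scored worse. The Python sketch below encodes the breaks named in the text (90%, 75%, 50%, 25%). The labels "Above average", "Below average", and "Bad" for the three lower classes are assumptions taken from the usual UEQ benchmark wording and are not spelled out in the text; the function name is hypothetical.

```python
def benchmark_class(share_worse: float) -> str:
    """Map the share of benchmark products that scored worse (0..1) to a class."""
    if not 0.0 <= share_worse <= 1.0:
        raise ValueError("share_worse must lie between 0 and 1")
    if share_worse >= 0.90:
        return "Excellent"       # comparable to the top 10%
    if share_worse >= 0.75:
        return "Good"            # 75% of the products were worse
    if share_worse >= 0.50:
        return "Above average"   # assumed label for the 50% break
    if share_worse >= 0.25:
        return "Below average"   # assumed label for the 25% break
    return "Bad"                 # assumed label for the lowest 25%


labels = [benchmark_class(s) for s in (0.95, 0.80, 0.60, 0.30, 0.10)]
```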
Time measurements were also analyzed. The average time per task was 72.7 seconds (with a standard deviation of 15.9 seconds) for ArchiCAD and 72.9 seconds (with a standard deviation of 23.2 seconds) for the holographic model. There is no significant difference between the results. The only difference observed was that the data from ArchiCAD are normally distributed and those from the holographic model are not.
The questions differ in complexity. Counting the number of rooms in an apartment is much simpler than adding up the room sizes. Thus it is not possible to compare the time required for different tasks. However, after each task, the participants had to undo all their changes to the visualization, and this reset step is the same for every task. If the visibility of several floors had been switched off, these floors had to be shown again; if floors had been shifted, they had to be moved back to their original position. In ArchiCAD this is done in the layer visibility window, where each floor has to be marked as visible by clicking on a checkbox. In the holographic model, the menu shown in Figure 3 is opened by pushing a holographic button, the holographic button "Reset" is pushed for each layer, and then the menu is closed again. The average time to perform this step was 13.9 seconds for ArchiCAD and 19.1 seconds for the holographic model, with a large difference in the standard deviation (2.9 seconds for ArchiCAD and 13.3 seconds for the holographic model). However, the difference is not significant. Again, the data for ArchiCAD are normally distributed, while those for the holographic model are not.
It was originally assumed that the participants would improve their performance of this step when they had to perform it repeatedly. However, the holographic model showed no significant temporal correlation, i.e., no trend in reset time over successive repetitions, while ArchiCAD did show a correlation. An explanation could be that the user interface of ArchiCAD provides too much functionality to be simple to use, but once a specific series of clicks has been repeated, it can be executed efficiently. The user interface of the holographic model, on the other hand, is simple and straightforward to use, but technical limitations restrict the speed. Pushing the holographic button mentioned above requires collision detection between the real index finger of the user and the virtual button. Several problems can occur in this process: the index finger might not be identified correctly, the visual placement of the button may be slightly shifted, and the 3D perception of the user may deviate from the 3D model in the computer. The last problem occurred frequently and caused users to not extend their hand far enough.
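The button press described above can be illustrated with a generic collision test. The sketch below models the fingertip as a small sphere and the virtual button as an axis-aligned box and reports a hit when the sphere touches the box. This is a textbook sphere-versus-box test, not the algorithm used by the Meta2 software, and all names and dimensions are illustrative assumptions.

```python
from typing import Sequence


def fingertip_hits_button(tip: Sequence[float], radius: float,
                          box_min: Sequence[float], box_max: Sequence[float]) -> bool:
    """True if a sphere (fingertip) intersects an axis-aligned box (button)."""
    # Closest point of the box to the fingertip centre, found per axis by clamping.
    closest = [min(max(t, lo), hi) for t, lo, hi in zip(tip, box_min, box_max)]
    dist_sq = sum((t - c) ** 2 for t, c in zip(tip, closest))
    return dist_sq <= radius ** 2


# Hypothetical button: 4 cm x 4 cm, 1 cm deep, 50 cm in front of the user.
button_min = (-0.02, -0.02, 0.50)
button_max = (0.02, 0.02, 0.51)

# Finger stops 5 cm short of the button: no press is registered.
too_short = fingertip_hits_button((0.0, 0.0, 0.45), 0.01, button_min, button_max)
# Finger reaches to within 5 mm of the button surface: press is registered.
reached = fingertip_hits_button((0.0, 0.0, 0.495), 0.01, button_min, button_max)
```

The first call mirrors the frequent failure case: when the user's depth perception places the button closer than the model does, the hand stops short and no collision is detected.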
There was also a significant difference in the correctness of the answers. On average, 80% of the answers were correct when using ArchiCAD (with a standard deviation of 12.0%) and 65.9% were correct when using the holographic model (with a standard deviation of 21.2%). It is assumed that the novelty of the experience distracted the participants and prevented them from focusing on the task goals. Correctness was defined either as an exact match with the true result or as a function of the deviation from the result. The second case allowed, e.g., rounding the room sizes when calculating total areas. The numbers shown above are for this second case; the results for the first case are slightly worse for both systems, but the difference between the systems is still significant.
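The two correctness definitions can be sketched as follows. The exact function of the deviation used in the study is not given, so the relative tolerance below is only one possible reading, and the 5% value, the sample numbers, and the function names are hypothetical.

```python
from typing import Sequence


def is_correct(answer: float, truth: float, rel_tol: float = 0.0) -> bool:
    """Exact match when rel_tol is 0, otherwise accept small relative deviations."""
    return abs(answer - truth) <= rel_tol * abs(truth)


def share_correct(flags: Sequence[bool]) -> float:
    """Percentage of correct answers for one participant."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical area answer: 52 m2 given where the model yields 52.4 m2.
strict = is_correct(52.0, 52.4)                 # first definition: exact match
lenient = is_correct(52.0, 52.4, rel_tol=0.05)  # second definition: tolerance
```

Under the strict rule a rounded area counts as wrong, which is consistent with the slightly lower scores reported for the first definition.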

CONCLUSIONS
In the paper we present the results of an experiment on user experience. The systems compared in the experiment were a BIM product using a standard 3D CAD interface and a solution based on AR-technology. Since the interaction capabilities of AR-technology are not yet comparable with those of 3D-CAD, the tasks in the test are simple. Still, the experiment provided some interesting insights.
The experiment showed a number of different things:
- The user interface for a holographic model seems to be more intuitive than the user interface of traditional 3D CAD. A reason for this may be that we are used to interacting with 3D reality and the holographic model is a seamless extension of reality.
- The technical limitations of the current systems restrict the interaction in many respects. Problems with collision detection resulted in reduced interaction speed. This prevented users from improving their interaction speed with increased experience. The experiment did not focus on technical limitations of the system, so there may be more problems to be uncovered. A problem may occur, for example, with the size of hands when trying to detect a finger, since the resolution of the sensors may be insufficient.
- Physical demand is higher when working with the holographic model than with the 3D-CAD. Zooming is done in standard CAD interfaces by mouse interaction. The user is sitting in front of the computer and has to perform minimal movement. The holographic model offers two different methods for zooming: stepping closer or scaling the visualization. Both require more physical effort than the mouse-based interaction, and this could have an effect when using the system for an extended time.


The vision of fluid interaction as often shown in Hollywood movies is not yet possible with a reasonable technical effort. The system used in the experiment consists of the Meta2 AR-headset and a high-end desktop computer (Alienware Threadripper). This is far from being an affordable solution for a mass market. Even with this hardware, hand and gesture recognition and depth perception are not yet perfect. Sometimes several attempts were necessary to perform a specific interaction. It was also necessary to keep the model simple to avoid problems with the visualization.
The experiment also showed the limitations of the current system. The efficiency of the user interface is not yet on the same level as with 3D-CAD. Command recognition must become more reliable because it currently restricts the interaction speed. A user will usually not visually check whether a mouse click was registered, but the participants in the study always checked whether the grab interaction was recognized when relocating a floor of the model. Work on precision and reliability will help to improve this and will have a direct effect on interaction speed and efficiency. Another limitation is that the interaction dialogues were custom made for the specific application scenario. A toolbox is necessary to automatically create interaction dialogues similar to the standardized dialogues developed for graphical user interfaces on computers. Finally, work is necessary to investigate the range of functionality that needs to be included in the user interface. The principal question is what a user wants to do with a holographic model and when the user will prefer using the CAD. The answer to this question will affect the functions that need to be implemented.