Testing Interface Effectiveness




With the great strides made in the field of robotics, helping humans control robots has become increasingly important. I conducted tests to (a) discover the effectiveness of an interface I have developed and (b) determine under what circumstances its various features are most beneficial. The task was simple: moving a robot around a maze and picking up flags. Users did this while answering math questions. In one set of experiments, the users left the robot in order to answer the questions; in another, they did both simultaneously. In one set they saw the map; in the other they did not.

The Interface

This interface was designed for navigating robots in a game of capture the flag. In creating it, I tried to increase efficiency by making the robot more tolerant of neglect, increasing situational awareness, decreasing user workload, and improving performance. To achieve this, I added a path planner and a notification system.

The path planner allows the user to set up multiple waypoints. The user sets the first waypoint with a mouse click. Successive points are set up with a right mouse click. The computer draws a straight line between the points to show the order in which they were created and the path the robot will follow. A left click clears the path and creates a new path with just a single point at the location of the click. Once one waypoint has been reached, the robot heads toward the next one. The waypoints and the intermediate path can be changed dynamically: the user clicks on a waypoint, or between waypoints, and drags the path to the new location. Clicking and releasing a waypoint without dragging it deletes the waypoint. I hypothesize that the path planner will allow the user to neglect the robot for longer periods of time without needing to give further instructions. I also hypothesize that it will help the user recover from a distraction by helping the user remember what was being done before the distraction.
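The click behavior described above can be sketched as a small data structure. This is only an illustration, not the interface's actual code: the class name, method names, and the arrival tolerance are hypothetical, and treating a drag between two waypoints as inserting a new waypoint is an assumed reading of the description.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class WaypointPath:
    """Ordered waypoints that the robot visits one after another."""
    points: List[Point] = field(default_factory=list)

    def left_click(self, p: Point) -> None:
        # Clear the old path and start a new one with a single point.
        self.points = [p]

    def right_click(self, p: Point) -> None:
        # Append a successive waypoint to the end of the path.
        self.points.append(p)

    def drag_waypoint(self, index: int, p: Point) -> None:
        # Move an existing waypoint to a new location.
        self.points[index] = p

    def drag_segment(self, index: int, p: Point) -> None:
        # Assumption: dragging the line between waypoints index and
        # index + 1 inserts a new waypoint between them.
        self.points.insert(index + 1, p)

    def click_waypoint(self, index: int) -> None:
        # Click and release without dragging deletes the waypoint.
        del self.points[index]

    def advance(self, robot: Point, tolerance: float = 0.25) -> Optional[Point]:
        # Drop the current target once the robot is within `tolerance`
        # (a hypothetical value), then return the next target, if any.
        if self.points:
            tx, ty = self.points[0]
            if hypot(robot[0] - tx, robot[1] - ty) <= tolerance:
                self.points.pop(0)
        return self.points[0] if self.points else None
```

Because the remaining waypoints stay in the list, they double as a visible record of what the user had planned, which is the property the second hypothesis relies on.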

The notification system monitors the robot's progress and notifies the user when needed. When the robot has completed all of the user's instructions and is waiting for the next one, the computer beeps. The tab that indicates the robot begins to oscillate slowly between similar colors, gently reminding the user that the robot will do nothing until it receives the next instruction. When the robot is not progressing toward the waypoint, either because of technical difficulties or because the path is blocked, the computer plays a long bong, alerting the user that the problem needs to be fixed. The same tab also flashes, but faster and with more contrasting colors. To find out why the tab is flashing, the user can move the mouse over the tab, which then displays the most pressing problem. All of the problems are also printed in a history at the bottom of the screen. Obermayer's experiment showed that using multiple levels of attention grabbing is effective [1]. These features should help manage the user's attention better, so that the robot can be neglected longer without the user worrying about whether it needs attention. When the user's attention returns, the interface will direct it to where it is needed and help the user remember the current status of the robot.
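The two alert levels described above can be sketched as follows. This is a simplified model under stated assumptions, not the interface's implementation: the names Level and Notifier and the returned strings are hypothetical stand-ins for the real sounds and tab animations.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional, Tuple


class Level(IntEnum):
    IDLE = 1   # instructions finished: beep, slow oscillation of similar colors
    STUCK = 2  # no progress toward waypoint: long bong, fast contrasting flash


@dataclass
class Notifier:
    history: List[str] = field(default_factory=list)  # log at bottom of screen
    active: List[Tuple[Level, str]] = field(default_factory=list)

    def report(self, level: Level, message: str) -> str:
        # Record the problem and return the sound to play.
        self.active.append((level, message))
        self.history.append(message)
        return "bong" if level == Level.STUCK else "beep"

    def tab_style(self) -> str:
        # The tab animation follows the most severe active problem.
        if not self.active:
            return "steady"
        worst = max(level for level, _ in self.active)
        return "fast flash" if worst == Level.STUCK else "slow oscillation"

    def hover_text(self) -> Optional[str]:
        # Hovering over the tab shows the most pressing problem.
        if not self.active:
            return None
        return max(self.active, key=lambda item: item[0])[1]

    def clear(self) -> None:
        # Called when the user gives a new instruction; history is kept.
        self.active.clear()
```

Keeping the history separate from the active list means the hover text stays short while the full record remains available.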

Test Descriptions

I tested the efficiency of this interface by having the user run a single robot through a maze to pick up flags. Only one flag was shown at a time. When the robot picked up one flag, another was shown. The flags were placed at random, with no two flags trivially close together.

The physical environment, created by the Brigham Young University Magicc Lab for a capture the flag world, is 4.5 meters by 4.5 meters. The robots are omnidirectional. There are four flags for the user to search for.

I used a simulated environment modeled on the physical environment created by the Brigham Young University Magicc Lab for a capture the flag world. The simulated environment was 20 meters by 20 meters. The robots were omnidirectional. There were three flags for the user to search for. The complexity of the environment was fairly uniform across the whole maze: there were no hallways, forks, or traps in which a flag could be placed. As a result, a path between two flags had roughly the same complexity regardless of where the flags were located. This allowed the different paths to be compared by the same criteria without worrying about varying complexity or the chance that the user got stuck in a trap.

To add varying degrees of knowledge of the world, half of the tests had all of the walls displayed and the other half did not. Without the walls, the user depended completely on sonar readings for navigation. The sonar placed an orange circle where it detected an obstacle and left a black dot where obstacles had previously been detected. The most significant findings came from the tests with the walls displayed. Unless otherwise noted, all statistics reflect tests with walls.

Before the experiment, the subjects were asked their names and told how the experiment would be run. They were told about each of the environments and the features that they would be given. They were told what questions would be asked at the end of each test and at the end of the whole experiment. The script for this is included in Appendix A. They then proceeded to the first test.

At the beginning of each test, the features and conditions pertinent to that test were explained to the subjects. They had as much time as they wanted to get used to the specific world and controls. When they were ready, they pressed "Begin". Timing started when they gave the first direction. The script for this is included in Appendix B.

Altogether there were 32 tests, 16 for each user. The 16 tests covered all combinations of secondary task or mandatory distraction, walls displayed or not displayed, and no features, path planning, notification system, or both features (2 x 2 x 4 = 16). The tests were given in a random order so that every ordering was equally likely. This kept improvement from repeated runs through the maze from favoring any one condition. The whole test took about an hour.
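The test schedule can be sketched as a shuffled Cartesian product of the three factors. The labels and the function name below are hypothetical, and the optional seed exists only to make a given order reproducible; the original randomization procedure is not documented here.

```python
import itertools
import random

DISTRACTION = ["secondary task", "mandatory distraction"]
MAP = ["walls displayed", "walls hidden"]
FEATURES = ["no features", "path planner", "notification system", "both features"]


def shuffled_conditions(seed=None):
    """Return all 2 x 2 x 4 = 16 test conditions in a random order."""
    conditions = list(itertools.product(DISTRACTION, MAP, FEATURES))
    # A uniform shuffle makes every ordering equally likely.
    random.Random(seed).shuffle(conditions)
    return conditions


if __name__ == "__main__":
    for number, condition in enumerate(shuffled_conditions(), start=1):
        print(number, *condition, sep=" | ")
```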

Neglect Tolerance

Olsen and Goodrich defined neglect tolerance as a "measure of how the robot's current task effectiveness declines over time when the robot is neglected by the user." They hypothesized that "for a given robot and a given problem space there is a characteristic neglect curve" [2]. To determine the neglect tolerance for this environment, I created distractions and measured the change in performance during them. I asked the user to solve two-digit addition and subtraction problems. The user was presented with four options and asked to click on the correct answer. The math questions were placed at the side of the screen, allowing the user to answer them as often as desired. It was left to the user to decide when to answer the questions and when to direct the robot.

Olsen and Goodrich state that one way to increase neglect tolerance is to increase the autonomy of the robot [2]. I tested this by adding the ability to give the robot a sophisticated path. This allowed the robot to be more autonomous while the user was distracted, and hence more tolerant of the user's neglect. With the path planner, the robot was able to go five times longer while the user was distracted than without it. Even when the user could see the map, the robot went 30% longer with the path planner.

Another issue affecting neglect tolerance is trust. Olsen and Goodrich state, "If the user does not trust the robot they will intervene sooner" [2]. This can be measured by how long the user neglects the robot and by the quality of the neglect. The length of time the user spends in the math section is the length of neglect for the robot, and the number and accuracy of the math questions answered indicate the quality of neglect. With the path planner, users chose to spend 367% more time answering the math questions.

Although users spent significantly more time answering math questions, they answered only slightly more of them. This indicates that they had less trust in the path planner. During the time they spent on the math questions, they answered 20% fewer questions per minute. Part of the reason for the decrease is that the robot was more likely to be moving when the user had the path planner. The decrease was partially offset by the notification system, which indicates that the users trusted it to alert them when the robot needed attention. The chart below shows the number of math questions the user was able to answer per minute while answering math questions.

Situational Awareness

Situational awareness is determined by how well the user understands what is going on in the robot's environment. A key to global situational awareness is a good understanding of spatial relationships. Bailey found in his experiment that users spent between 5% and 40% longer on an interrupted task than on a non-interrupted task [6]. He found that the increase in time correlated positively with the memory load required by the original task. This increase may be because users take longer to regain awareness of the situation. For example, Bailey created a selection task that required the user to remember spatial locations in order to continue. When the users returned from the distraction, they had often forgotten the spatial locations and had to start the assignment over to regain the spatial relationships.

The path planner helps users retain spatial relationships by giving a visual representation of the path they have already planned, and it seemed to contribute to the users' global situational awareness. Users planned better paths, cutting the distance traveled by 40% and decreasing the frequency of running into walls by 60%. The robots were moving 18% more of the time. However, the increased global awareness came with a decrease in local awareness. Although the robots stopped less frequently, when they did, it took the user twice as long to respond. The notification system partially overcame this problem, helping users respond to a stuck robot in half the time it took without it. The first chart shows how many times a minute the robot got stuck, and the next chart shows the average number of seconds it took the user to respond to a robot getting stuck.

Workload

One of the hardest aspects of measuring the effectiveness of an interface is that much of what determines a successful interface depends on internal processes that may or may not result in measurable action [4]. The Human Performance Group at the NASA Ames Research Center developed a user rating system that provides a measure of workload. At the end of each experiment, the NASA TLX questions ask users to rate, on a scale from one to one hundred, the level of temporal demand (time pressure), mental demand (thinking, deciding and remembering), frustration (irritation and discouragement), performance (success in completing the task), and overall effort (how hard the user worked) [7]. Below is a chart of all the self-reported values for the NASA TLX system. The largest effect was a decrease in frustration through the use of the path planner.

I also asked subjects how much of their attention was taken up by (a) planning and giving directions to the robot, (b) monitoring the robot, and (c) answering the math questions. The biggest gain was that they felt they spent less time monitoring the robot and were able to spend more time answering the math questions. The following graph shows the self-reported percentage of time each activity took.

In Lin's research on determining effective and efficient interfaces, he measured eye movement and fixation. He found that these corresponded closely to performance and to the user-rated measure [8]. Instead of eye movement, I tracked where the mouse was located and how often it switched between the math section and the map section. As mentioned earlier, each time users with the path planner began the secondary task, they spent 367% more time on it, allowing them to focus on it more. Users had to move the mouse between the two sections only a third as often. I believe the ability to focus on each of the tasks separately contributed to the decrease in frustration.
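A minimal sketch of how such mouse tracking could be summarized, assuming a hypothetical log of (timestamp, region) samples in which each sample marks the moment the mouse was observed in the "math" or "map" section. The function name and log format are illustrative, not the logging code used in the experiment.

```python
from typing import Dict, List, Tuple


def summarize_mouse_log(samples: List[Tuple[float, str]], end_time: float):
    """Return (switches, seconds per region, visits per region)."""
    switches = 0
    dwell: Dict[str, float] = {}
    visits: Dict[str, int] = {}
    for i, (t, region) in enumerate(samples):
        # Each sample lasts until the next sample, or until the test ends.
        t_next = samples[i + 1][0] if i + 1 < len(samples) else end_time
        dwell[region] = dwell.get(region, 0.0) + (t_next - t)
        # A visit starts whenever the region differs from the previous sample.
        if i == 0 or samples[i - 1][1] != region:
            visits[region] = visits.get(region, 0) + 1
            if i > 0:
                switches += 1
    return switches, dwell, visits
```

Dividing the seconds in a region by the number of visits gives the average time per visit, which is the quantity compared across conditions above.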

Based on research done at Vanderbilt University, I had hypothesized that adding the path planner would decrease the number of clicks. In that research, Johnson measured click usage and found strong correlations with the results from the NASA TLX system [9]. My data also showed strong correlations, though not the ones I had hypothesized. Users gave twice as many instructions with the path planner as without. This could be because the path planner made it easier to give instructions. In my experiment, the number of clicks was inversely correlated with perceived effort.

However, a positive correlation could be found with the number of wasted clicks. Without the path planner, users could give only one instruction at a time. Often they would give the next instruction before the robot had reached the target of the current one. Without the path planner, around 40% of the instructions were never reached. The path planner reduced this to 8% of the instructions. So not only did the path planner increase the number of instructions given, but a much greater portion of them were carried out.
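As an illustration of how the share of wasted instructions can be counted, the sketch below assumes a hypothetical event log of ("instruct", id) and ("reached", id) pairs. The log format and function name are assumptions, and the sample data is invented only to reproduce a 40% figure.

```python
def wasted_fraction(events):
    """Fraction of issued instructions whose target was never reached."""
    issued = [ident for kind, ident in events if kind == "instruct"]
    reached = {ident for kind, ident in events if kind == "reached"}
    if not issued:
        return 0.0
    wasted = sum(1 for ident in issued if ident not in reached)
    return wasted / len(issued)


# Invented example: instructions 1 and 4 are replaced before completion.
example = [
    ("instruct", 1), ("instruct", 2), ("reached", 2),
    ("instruct", 3), ("reached", 3),
    ("instruct", 4), ("instruct", 5), ("reached", 5),
]
print(f"{wasted_fraction(example):.0%} of instructions were never reached")
```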

Performance

One of the most obvious ways to tell how well the interface works is by how well the user performs. The definition of performance varies from one experiment to another. In my experiment, I measured the forward progress of the robot, the number of questions the user answered, the amount of time needed to give a new instruction when the robot needed one, how well the user avoided the need for new instructions, the number of points the user planned ahead, and the amount of time spent in the math and map sections.

The forward progress of the robot is the speed at which it travels in the right direction. It is therefore possible to have negative forward progress if the robot is going in the wrong direction. Forward progress increased by about 8% with the path planner.
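One way to make this definition concrete is to measure how quickly the distance to the goal shrinks. The sketch below assumes that "the right direction" means toward the current flag and uses straight-line distance, which ignores walls; both are assumptions, and the measure used in the experiment may have been computed differently.

```python
from math import hypot


def forward_progress(prev, curr, goal, dt):
    """Rate at which the distance to the goal shrinks (meters per second).

    Positive when the robot moves toward the goal and negative when it
    moves away. Distances are straight-line, so walls are ignored.
    """
    d_prev = hypot(goal[0] - prev[0], goal[1] - prev[1])
    d_curr = hypot(goal[0] - curr[0], goal[1] - curr[1])
    return (d_prev - d_curr) / dt
```

For example, a robot that moves from (0, 0) to (1, 0) in two seconds with the flag at (3, 0) has a forward progress of 0.5 meters per second; the reverse move gives -0.5.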

One of the strong points of the path planner is that it practically eliminated wasted time between instructions. Without the path planner, the user had two choices: wait until the robot had finished the current instruction before giving the next one, or give the next instruction before the robot finished the current one. About seven times a minute, the user waited until the robot finished the current instruction. The user then took a second on average to give the robot the next instruction, so the robot spent about seven seconds of every minute waiting for instructions. With the path planner, this was greatly reduced. The robot ran out of instructions less than once a minute. The user did take twice as long to respond, partly because of planning and partly because of the loss in local situational awareness, but this amounted to less than two seconds a minute.
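The idle-time figures above follow from multiplying how often the robot ran out of instructions by how long the user took to respond. A short check of that arithmetic, using the rounded values from the text (the function name is hypothetical):

```python
def idle_seconds_per_minute(stops_per_minute, response_seconds):
    """Seconds per minute the robot spends waiting for a new instruction."""
    return stops_per_minute * response_seconds


# Without the path planner: about 7 stops a minute, about 1 second each.
without_planner = idle_seconds_per_minute(7, 1.0)

# With the path planner: fewer than 1 stop a minute, about 2 seconds each,
# so this value is an upper bound rather than a measurement.
with_planner_bound = idle_seconds_per_minute(1, 2.0)

print(without_planner, with_planner_bound)
```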
