
4 Preliminary studies

4.1 Introduction

Many of the assertions and questions introduced in Section 3.6 must be dealt with before more intensive research can proceed. These preliminary studies will also demonstrate some of the potential of the methods used to investigate them. Each preliminary study is based on a particular approach: rhetorical analysis, semiotic analysis, laboratory experiment and interview. The results of these studies are given below.

4.2 Analysis by trope of MS-DOS

4.2.1 Method

The first assertion put forward in the previous chapter was that a user interface is a sufficiently powerful semiotic system (even if not a true language) to naturally tend to develop through metaphor and metonymy. Computer literature shows few references to metaphor before the early 1980s, when the previously cited work by Carroll et al. and the Xerox team started to appear. Terms used in MS-DOS (Microsoft Disk Operating System) have almost all been taken from previous command languages such as UNIX shell commands or CP/M and therefore pre-date any strong conscious decisions to employ metaphor. Consequently, it is reasonable to suppose that, where metaphor has been used in this language, it has not been introduced deliberately. In other words, the MS-DOS vocabulary provides a good test of any 'natural' tendency for metaphor and metonymy to play a major role in the evolution of user interfaces.

MS-Windows has adopted many of the metaphor-based features of the Macintosh interface and this might be expected to influence the terms used in more recent versions of MS-DOS. I have therefore analysed the commands listed in a 1987 manual for MS-DOS version 3.3 (Compaq 1987). To carry out this analysis, I have consulted the definitions of the terms given in the complete Random House dictionary from the same year (Flexner 1987).

Many English words show elements of metaphor and/or metonymy in their evolution. For example, the word 'type' comes from the Greek τύπος (typos), the act of striking or making a mark, or the stamp used to make that mark. This has been applied, literally, to movable type used for printing and then to the use of a typewriter. The use of the verb 'type' when using a word processor is a simple metaphor which is now long dead. The MS-DOS command 'TYPE', however, makes a further metaphoric translation. The purpose of the command is defined as 'to display the contents of a file' on the screen (Compaq 1987). This is obviously not the literal meaning of 'type', even in its newest form, but a new metaphor meaning, "Put the contents of this file up on the screen as if someone were typing it at the keyboard."

Some computer terms have already become assimilated into the language. The words 'disk' and 'program' (particularly with U.S. spelling) are examples, even though their use in computing was originally metonymic and metaphoric respectively. As MS-DOS was primarily developed in the USA, I used an American dictionary from the same year to arbitrate on whether the metaphor or metonymy had become assimilated. Where the dictionary lists the word as a computer term and the word is used in that sense in MS-DOS, I have listed it as a dead metaphor or metonymy.

4.2.2 Results

The results are shown in the table below:

Table 4.1: Trope analysis of MS-DOS.

Commands  Metaphor  Metonymy  Literal  Other
72        19 live   13 live   22       6
          12 dead   5 dead
72        31        17        22       6

The total number of commands is less than the sum of the other categories. This is due to some commands, such as 'FASTOPEN', appearing in more than one category. In this case, the command copies file and directory locations into memory, in order to allow faster opening of them. Thus the command is named after a feature that is not the principal action of loading information into memory, but a feature resulting from that action (metonymy), that of allowing a file to be 'opened' (dead metaphor) faster. Some commands combine metaphor or metonymy with literal elements. In such cases, the literal element is ignored, as understanding of the command depends on the user's understanding of the metaphor or metonymy.

Commands listed as metonymy are of two types. In a command such as 'DISKCOPY', the term 'disk' is a metonym (though now assimilated into the language) for a particular type of data storage device. In most examples, however, it is the command itself which has a metonymic name. For example, the 'REPLACE' command can be used to replace the contents of one file with those of another but is equally likely to be used to add new information to an existing directory without replacing any existing files, which is in practical terms a diametrically opposed function. Thus the name 'REPLACE' describes only one of two equally important functions of the command.

In addition to these examples, limited metonymy (not shown in the table) is common. For example, the 'CHDIR' command is used for changing the current directory but, used with no parameters, can also provide information about the current directory. As it is named after only one of its uses, this could be classed as metonymy but changing directory is its major use and the name is predominantly a literal description of the command.

The category 'other' has been included to cover commands derived in other ways or of uncertain origin. Many are 'architectural' terms whose meaning was specific to the computer architecture at the time. Current use is more likely to be metaphoric. For example, most MS-DOS programs now run within windows and a command such as 'CLS' (clear screen) will not act on the entire screen. It will clear a window as if that window were a screen. However, the command was not created as a metaphor and is thus not included as a metaphor.

4.2.3 Conclusions

The original assertion to be tested was:

Assertion 1 A user interface is a sufficiently complex semiotic system (even if not a true language) to develop through metaphor and metonymy, as natural languages do.

The analysis found metaphor and metonymy present in the majority of MS-DOS commands. Only 22 of the 72 commands were purely literal, with 48 derived from metaphor and metonymy. By 1987, seventeen of these had already been assimilated into the language, giving some indication of the speed at which this process takes place.

Unlike the Macintosh, the use of metaphor in MS-DOS is not explicit and thus not structured around an underlying concept, such as the desktop. Literal expressions are used where a suitable context-free word or abstract noun has been identified, such as 'copy', but other adoptions have been in a haphazard manner. This analysis does not, of course, prove that a user interface is a semiotic system of a level of complexity comparable to a natural language, only that it is sufficiently complex to be capable of accumulating metaphoric and metonymic expressions.

4.3 A semiotic analysis of the Macintosh interface

4.3.1 Method

Barthes (1973b) and Eco (1987) have used the principles of semiotics to analyse various signs common in our society. Barthes and Eco do not offer methods for their analysis, although others, such as Chandler (1995), have suggested procedures. An analysis of this type is highly subjective and demands considerable skill if it is to be complete. I do not suggest that it is practical or desirable for all designers to become experts in semiotic analysis. However, it is worth considering whether this sort of semiotic analysis can be applied to computer interfaces. If it is easy to uncover many layers of potential signification, this is sufficient to demonstrate the potential for such techniques in the analysis of user interfaces.

This exercise tests Assertion 2, that there is a massive range of signification inherent in a sign-system such as a user interface. If this is the case, it would be impossible to uncover every signification generated by the interface but the assertion may be regarded as valid simply by uncovering potentially useful significations which might not otherwise be noticed.

4.3.2 Analysis

The method employed was to start with a small part of the Macintosh user interface, showing a dialogue in progress as an example:

Figure 4.1: Part of a Macintosh interface.

Semiotic analysis is a self-reflective technique in which the researcher repeatedly asks him or herself, "What does this signify?" Carrying this out myself I found that, for example, the words on the screen signify:

'This is a sentence in the English language.'

The vocabulary used in the sentence, its spelling and its syntax also imply:

'This is being written by someone who has received a reasonable education in this language.'

Note that the truth of the statement is not relevant to this form of analysis. Someone might have laboriously constructed a sentence in an unfamiliar language in order to falsely signify this signification. This does not make it a non-signification, only a false signification, possibly one the writer intended when writing the sentence. Eco has pointed out that the ability to lie is at the very heart of semiotics.

Every time there is a possibility of lying, there is a sign-function: which is to signify (and then to communicate) something to which no real state of things corresponds. A theory of codes must study everything that can be used in order to lie. The possibility of lying is the proprium of semiosis...
(Eco 1976, pp. 58-59, italics in original)

The potential of an interface sign to be misunderstood by the user is certainly of importance to the designer and must be allowed for in any analysis of its signification. In the case of my analysis, the statements are self-justifying in that I honestly believed that the sign could signify the levels I drew out in the analysis. Whether the sign was intended to carry this signification or it resulted from my mis-interpretation, other users could also make the same mistakes. It would be possible to draw a false analysis only by lying to oneself. If I claimed that the sign signified that 'the sun is hot', the statement would be dishonest, even though the sun is hot. There was nothing in the interface sign presented above to lead me to that signification, nor is there likely to be for any other analyst.

Continuing with this self-questioning technique, I uncovered 26 layers of signification. These started with simple statements, such as the fact that the interface is presented in the English language, or that it is a graphical interface. As the implications of these were considered, higher levels of signification were uncovered, such as the manner in which the Macintosh signifies 'this is a Macintosh' (not a PC), which in turn led to considerations of the relative images of the two architectures, the attitudes of their users towards them and even the potential political statements implied by them. The full list of significations uncovered by this exercise is given in Appendix B.

4.3.3 Conclusions

The analysis presents sufficient evidence that the original assertion was valid:

Assertion 2 Layers of signification are so numerous that it must be quite easy to uncover many of them in any interface.

It might be thought that the higher levels of signification are far removed from the Macintosh interface, but consider the advertising for the Macintosh. In 1997, advertisements were linked to the film 'Independence Day', in which the Earth is saved from an alien invasion by a scientist using a Macintosh. The slogan at the end was "Apple: The Power to Save the World." In a simpler way, the interface is 'selling' itself and the system to the user. This may be through a fiction, such as the Macintosh advertisement, or even through dishonesty – a false signification. One example of such dishonesty was examined in Chapter 2 (see figure 2.5), in which personification or user-friendliness can be used to signify a level of intelligence which the system does not possess.

Many of the individual points in this analysis are arguable, but it demonstrates that many layers of signification are likely to be present and even someone like me, who has never carried out a semiotic analysis before, can easily uncover many of them. Even the higher levels of signification have the potential to affect the ways in which users interact with systems and are thus factors that the designer might beneficially consider if the system is going to be used, and to be used effectively.

4.4 Comparing metaphor categories

In the previous chapter, an approach based on mental models was considered as the most promising alternative to a semiotic approach in investigating this field. In particular, an experiment by Anderson et al. (1994) was described in which they attempted to look at the match between the user's mental model of a system and the actual system functionality. This was based on the assumptions that the user forms a mental model of the system, and that the accuracy of the mental model affects usability. Their results showed that the greatest variation in accuracy between the interfaces occurred when examining concepts inherent in the vehicle but not present in the system. They described this as 'conceptual baggage', implying that too much conceptual baggage would hamper usage of the system. This obviously depends on accepting the underlying assumptions about the formation and accuracy of mental models.

It is easy to imagine that users can form mental models of systems as simple as those used in the experiment, but it is more difficult to imagine users forming a mental model of a commercial application which is far more complex. Model formation may well also be easier for computing students such as those who took part in their experiment than for the population in general. The question therefore must be asked as to whether the ability to form an accurate mental model plays any part at all in usability. I therefore developed a new experiment to test this assumption. Although based on the previous experiment, mine was not a development of it or a companion to it but a questioning of its underlying basis.

The experiment was based on the three categories of interface metaphor introduced in Section 2.4.4. In order to compare the results with the experiment reported in Chapter 3, described in Anderson et al (1994), the basic experimental method was identical. The experiment took place at BIBA (Bremer Institut für Betriebstechnik und angewandte Arbeitswissenschaft an der Universität Bremen). It should be noted that the experiments took place in German but, with help from Stephan Keuneke of BIBA, I have translated the instructions and other material into English for this thesis.

Various people at BIBA involved in the MITS project (including Hans Panse, Matthias Jankowiak and Stephan Keuneke) had developed ideas for metaphor-based interfaces. Based on an independent assessment of correspondence to the three metaphor classes by Christian Heath and Paul Luff from the University of Surrey, I took the three considered to correspond most closely to the categories and used them as the basis for working prototypes, developing each further where appropriate so that it corresponded more closely to its category. All metaphors were presented as direct manipulation graphical user interfaces:

The interfaces were not intended to be original and, indeed, represent three of the commonest metaphors used for CSCW systems, as mentioned in Chapter 2: rooms/offices, agents/guides and books/newspapers. Each interface was examined separately, presenting an identical scenario to the subjects carrying out the experiments. Care was taken to avoid the use of metaphor in the description of the task involved in the scenario.

4.5 The Three Systems

4.5.1 Functionality

Each of the systems had the same underlying functionality and the same communications protocols. The first of the systems to be built was MILAN (Multimedia Industrial LAN). An earlier version of MILAN is described in Condon (1990), and in Hämmäinen and Condon (1991) where the use of the room metaphor for real-time interaction is compared with that of the form metaphor for a non real-time CSCW system.

The following functions were available to the users in the experiment, though most were not required for the scenario:

Some functions were removed from the interface as too heavily embedded in the spatial metaphors used by MILAN and therefore not implementable in the other interfaces. These included maps for high level orientation and virtual reality facilities to 'walk round' a three-dimensional building, etc. The system was built in SuperCard 2.0, an object-oriented prototype development environment, though not a complete object oriented language (Allegiant 1997). This allowed the creation of the two new systems, Little People and Link-Journal, with very different user interfaces but identical underlying functionality. As they used exactly the same objects and methods as MILAN, they also possessed the same response times, allowing comparison of the interfaces alone.

The scenario required the subject to set up a multiway audio/video conference, send an email and use the shared pointer. The additional functions available might give the users better information on context, helping the user to find the right functions; alternatively, the additional functionality and resulting interface complexity might confuse the user.

4.5.2 The Task

The users were given a scenario with a series of tasks to carry out concerning the design of a chocolate box. This was chosen to reflect the type of activity which takes place in engineering design but was deliberately set as a non-manufacturing task to avoid technical issues, such as arguing about which machine tool to use, that might get in the way of the experiment. No time limit was given but it was suggested to subjects that the experiment should not take more than 'around half an hour'. Although there is no evidence that, for example, spatial metaphors are more useful for spatial tasks, to eliminate possible bias the main tasks were chosen to cover the three types of task equally:

Set up an audio/video conference (interactional), involving:

Use the shared drawing facility (spatial), involving:

Mail a report to a colleague (activity-based), involving:

4.5.3 Interfaces

Figure 4.2: The MILAN Room.

A new version of MILAN was created to emphasise the spatial appearance of the interface. This was most notable in the redesign of the room, developed in a 3D CAD package, Virtus Walkthru (Virtus 1997), and presented to the user in perspective 2D. Three features were employed in this experiment, each represented by an object in the room: the out-tray for e-mail, the whiteboard for shared drawings and the television for the video connection.

Figure 4.3: A group-page of the Link-Journal.

Looking like a DTP program, Link-Journal mixes the roles of editor and reader. It is divided into sections with different aims: a personal section accessible only by the local user; group sections which can be read and edited by members of a specific interest group; and public sections, usable by anyone who logs onto the system. The shared drawing was presented to the user as a group page of the company described in the subject's task. Leaving messages for other users (e-mail) was translated into a fill-in form for personal ads in the paper. The video connection was implemented by dragging the picture of the required person into the 'live' picture on the page.

Figure 4.4: Little People Main Screen.

Little People displays different characters, or agents, on the screen each representing a specific set of actions. The three functions required for the experimental task were each represented by a different person. The postwoman sends e-mail, the cameraman controls the live-video and the designer gives access to the shared drawing tools.

The degree to which the interface itself could be manipulated spatially varied significantly between the three interfaces. In MILAN, all spatial relationships are fixed, as these provide the underlying rationale for this spatial interface, and nothing can be moved. In Link-Journal, the newspaper page provides a fixed space within which users can do no more than re-arrange the existing text and picture areas, or create new ones within the sort of constraints typical of a DTP program. In Little People, however, users had complete freedom to drag the icons around the screen.

4.6 Experiment Design

4.6.1 General Principles

The main method of investigation was a questionnaire, with questions chosen to find how well the subjects had mapped out the functionality of the system, even where this deviated from the domain of the metaphor. Open-ended questions were also used in an attempt to gain some insight into the subjects' thinking about the metaphors.

4.6.2 The Subjects

33 subjects were chosen from the staff and associates at BIBA, 11 for each metaphor. The subjects were of both sexes, aged 17 to 60 and covered a wide range of experience. They were chosen to represent typical factory personnel, but biased towards future requirements based on current trends. They included managers, apprentices, shop floor workers, secretaries, CAD operators and students of manufacturing design. The subjects had varied experience with computing, ranging from people who had never used a computer to experts, but were biased towards experienced users, as computer literacy is generally growing more widespread. Attempts were made to match users across the three interfaces but it was impossible to exert full experimental control over this.

The spread of ages and computing experience is shown in the diagram below. Age is shown in years, computing experience on a scale of 0 (the subject has no previous experience of using computers) to 100 (computer use forms a major part of the subject's daily activities):

Figure 4.5: Age and computing experience of subjects.

Before starting the full experiment, a number of dummy runs were carried out with users who did not take part in the experiment itself. Subjects unfamiliar with the Macintosh interface had problems with some actions which were standard across all three interfaces. All subjects were therefore given written instructions on how to close windows, a short demonstration of dragging a mouse and a warning about waiting a few seconds for the system, which was sometimes rather slow, to respond to an action.

4.6.3 The Questionnaire

The questionnaire was presented to the subject immediately following successful completion of the scenario (no subjects failed to complete the tasks). Subjects were asked to sit in another part of the room for this so that they were unable to see the screen when answering the questions. Equal numbers of questions were chosen from the four categories based on the intersection of the user's model of the system domain and of the metaphor's domain. All questions were statements which the user was asked to judge as true or false. A 'correct' answer was one which showed that the user understood the functionality of the system, even where this deviated from the implied functionality of the metaphor. For example, the questions used for the MILAN system included the following, where S is the system and V the metaphor vehicle:

SUV Present in the system and implied by the metaphor vehicle: "You can see who else is in the room" (True).

SU~V Present in the system but not implied by the metaphor: "You can tell who is knocking on the door of a room you are in" (True).

~SUV Not implemented in the system but implied by the metaphor: "You can move the furniture" (False).

~SU~V Not implied by the metaphor nor implemented in the system: "You can make a connection using a person's phone number" (False).

Statements were phrased so that half of them required the answer 'false', half 'true' for each category. The questions were then randomly mixed up, so that the categories would not be apparent to the users.
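The four-way classification and the notion of a 'correct' answer can be sketched in code. This is an illustration only, not the software used in the experiment: the `Statement` class, the `score` function and the simulated responses are hypothetical, and only the four MILAN statements are taken from the examples above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    text: str
    in_system: bool   # S: the function is implemented in the system
    in_vehicle: bool  # V: the function is implied by the metaphor vehicle
    answer: bool      # the 'correct' true/false judgement

    @property
    def category(self) -> str:
        # Labels follow the notation used in the text (SUV, SU~V, ~SUV, ~SU~V).
        s = "S" if self.in_system else "~S"
        v = "V" if self.in_vehicle else "~V"
        return s + "U" + v


MILAN_EXAMPLES = [
    Statement("You can see who else is in the room", True, True, True),
    Statement("You can tell who is knocking on the door of a room you are in",
              True, False, True),
    Statement("You can move the furniture", False, True, False),
    Statement("You can make a connection using a person's phone number",
              False, False, False),
]


def score(statements, responses):
    """Return the fraction of correct responses in each category."""
    right = defaultdict(int)
    total = defaultdict(int)
    for statement, response in zip(statements, responses):
        total[statement.category] += 1
        right[statement.category] += int(response == statement.answer)
    return {category: right[category] / total[category] for category in total}


# Hypothetical subject who answers purely from the metaphor vehicle.
metaphor_led = [s.in_vehicle for s in MILAN_EXAMPLES]
result = score(MILAN_EXAMPLES, metaphor_led)
print(result)
```

A subject who reasons only from the metaphor, as simulated here, scores perfectly in the two categories where system and vehicle agree and fails in the two where they differ. It is this pattern that the category breakdown is designed to expose.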

4.7 Results

4.7.1 Quantitative results

Subjects completed the scenario in less time than expected from the previous experiment, where subjects took longer to complete a simpler scenario (Anderson et al. 1994). Mean times taken for each interface (minutes:seconds) were as follows:

MILAN 08:40

Link-Journal 07:38

Little People 09:06

The variations in time between the three interfaces were not tested for significance, given the very wide variation in time taken by individuals within each category (standard deviation approx. 3 minutes for each category). There was little variation in the numbers of correct answers to the questions:

MILAN 47% correct

Link-Journal 47% correct

Little People 57% correct

As random answers would have generated a score of 50%, it can be seen that, on average, the users did not form accurate mental models of the system functionality. No attempt was made to analyse these results any further. Even if the difference between Little People and the other interfaces is statistically significant, it is too marginal to be useful. This corresponds to the widely accepted distinction between statistical significance and clinical significance discussed in Sidman (1960) and Hersen and Barlow (1976). Robson (1993, p351-2; p367) discusses these views and those of Meehl 'who has claimed that reliance on statistical significance was one of the "worst things that ever happened in the history of psychology"' (Meehl 1978, quoted in Robson 1993, p351). It is not necessary to accept such an extreme view to see that the difference between the scores above is too small to provide useful guidance for an interface designer.

A more detailed analysis of the results is given in Appendix C. It should be noted that the results offer no support for the concept of conceptual baggage. This concept depends on accepting that users form mental models and that the accuracy of these models matters. It could be argued that this is because all three vehicles were conceptually richer than those used in the previous experiment. However, according to the concept of conceptual baggage, a conceptually rich vehicle should introduce more mismatches in the ~SUV case than in the others, an effect that was not found in this experiment.

4.7.2 Qualitative results

The questionnaire also included open-ended questions to ascertain how the users felt about the system. These were analysed to confirm the validity of the original metaphor classification. According to this, it would be expected that a user of MILAN would talk more about spatial relationships, a Link-Journal user in terms of human interaction and a Little People user in terms of activities:

Spatial These were mainly comments about the positions of objects within the world represented by the interface (rather than simply the on-screen position).

Interactional Interactional aspects were separated from pure communication (see below). These were only comments on people working together or collaborating.

Activities All mentions of 'activity', 'function' or 'Beruf' (this approximately translates as trade or profession but bears a stronger implication of a specific activity).

In order to strictly distinguish these categories, some additional categories were included in the analysis:

Metaphor Mentions of the specific metaphor chosen. It was important to exclude comments in this category from those above, otherwise the exercise would be self-justifying. For example, identification of the relevant icon as the 'postwoman' was placed in this category and not counted as a reference to a trade or profession.

Technical Comments on sound quality, system responsiveness, etc. As a prototype system, response times were poor and sound quality was not always very good. However, none of this was relevant to the issues under consideration.

Communication Technical communications, rather than comments on co-working, which would fall into the interactional category. For example, comparison with videotelephony or mention of computer, video and audio working together.

Task The task in the scenario, i.e. changing the design of the chocolate box, rather than the general comments about activities as classed above.

Interface Mainly comments on user-friendliness, etc. This category also included mention of the physical layout of menus and graphics, which needed to be separated from comments on spatial relationships.

To avoid bias and possible linguistic difficulties, the task was handed to a fellow researcher who spoke German as a first language and had not been involved in choosing the categories. Figures given are the total number of subjects making a statement falling into a category: some mentioned more than one aspect of it. No consideration was given to whether the mention was favourable towards the system or not, only whether the subject felt that aspect worth mentioning:

Table 4.2: Categories of responses.

System         Spatial  Interaction  Activity  Metaphor  Tech  Comms  Task  Interface
MILAN          5        1            0         10        0     4      0     9
Link-Journal   0        7            0         3         2     2      1     3
Little People  1        4            4         8         3     1      0     6

The three response categories that correspond directly to the metaphor categories are Spatial, Interaction and Activity. Apart from these three main comment categories, it is noticeable that fewer people commented on the Link-Journal interface or metaphor, despite the fact that this was more dramatically different from their general working environment than the other metaphors. This is considered in the conclusions below.

In each of the three main response categories the corresponding metaphor type scores much more highly than the other metaphor types. These can be compared to the statistically expected frequencies:

Table 4.3: Expected frequencies.

          Spatial  Interact  Activity  Total
MILAN     1.64     3.27      1.09      6
Link J    1.91     3.82      1.27      7
Little P  2.45     4.91      1.64      9
Total     6        12        4         22
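The expected frequencies follow from the observed counts in the three main response categories: each cell is its row total multiplied by its column total, divided by the grand total. A minimal sketch of that arithmetic is given here; the function and variable names are illustrative only.

```python
# Observed counts for the Spatial, Interaction and Activity categories,
# taken from Table 4.2.
observed = {
    "MILAN": [5, 1, 0],
    "Link-Journal": [0, 7, 0],
    "Little People": [1, 4, 4],
}


def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = {name: sum(row) for name, row in table.items()}
    column_totals = [sum(column) for column in zip(*table.values())]
    grand_total = sum(row_totals.values())
    return {
        name: [row_totals[name] * c / grand_total for c in column_totals]
        for name in table
    }


expected = expected_frequencies(observed)
for name, row in expected.items():
    print(name, [round(value, 2) for value in row])
```

Every expected value computed this way is below 5, which is why a test designed for larger expected frequencies is unsuitable for this table.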

A χ² test is the commonest measure of statistical significance for this type of table but is only valid where the expected frequency in every cell is above 5 (Siegel 1988, p.123). Siegel & Gallagher's recommendation for a smaller sample such as this is to use the Fisher exact test. However, the Fisher exact test can only be used on a 2×2 table. The experiment was therefore considered as three paired experiments. This was valid as each of the three experiments was carried out independently of the other two.

The Fisher exact test was applied to each pairing in turn:

Table 4.4: Fisher exact test results.

               Spatial  Activity-based  Total
MILAN          5        0               5
Little People  1        4               5
Total          6        4               10
p = 0.0238

               Spatial  Interactional   Total
MILAN          5        1               6
Link-Journal   0        7               7
Total          5        8               13
p = 0.0047

               Interactional  Activity-based  Total
Link-Journal   7              0               7
Little People  4              4               8
Total          11             4               15
p = 0.0513

Accepting the convention that p < 0.05 is significant and p < 0.01 is highly significant:

the distinction between MILAN and Link-Journal is highly significant;

that between MILAN and Little People is significant;

that between Link-Journal and Little People is not significant (though borderline).
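The three probabilities can be checked with a short calculation. The sketch below computes a one-tailed Fisher exact probability from the hypergeometric distribution, using only the Python standard library. The function name is illustrative, and the choice of a one-tailed test is an inference from the arithmetic, since it reproduces the three values above, rather than something stated in the text.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of the observed table and of every
    table with the same margins and a larger top-left cell.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denominator = comb(n, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        # comb returns 0 when col1 - x exceeds row2, so impossible tables add nothing.
        p += comb(row1, x) * comb(row2, col1 - x) / denominator
    return p


pairings = {
    "MILAN vs Little People": (5, 0, 1, 4),
    "MILAN vs Link-Journal": (5, 1, 0, 7),
    "Link-Journal vs Little People": (7, 0, 4, 4),
}
for label, cells in pairings.items():
    print(label, round(fisher_one_tailed(*cells), 4))
```

In each pairing the observed table is already the most extreme one possible in its direction, so the loop contributes a single term and the result equals the probability of the observed table itself.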

4.8 Conclusions

4.8.1 The quantitative responses

This experiment aimed to test one assertion and ask a number of questions posed in Chapter 3. The first three of these are dealt with by looking at the quantitative results obtained from the main body of the questionnaire:

Question 1 Does it matter whether users form accurate models of the system?
Question 2 Would 'real world' users behave differently?
Question 3 Are the results valid for more complex computer systems and interfaces?

The second question has been very simply answered. None of the users was unable to complete the tasks in the scenario, despite some of them never having used a computer before. In their comments, subjects also claimed to find the system easy to use. Given that they were provided with no training, manuals or help facilities, the short time they took to perform the tasks demonstrates this quite convincingly. It was therefore concluded that the make-up of the user group had no effect on their successful use of the systems.

No breakdown of the times taken according to experience or any other factor was attempted. Observation of users and the remarks they made during the experiment showed that their strategies varied considerably. Some expressed interest in the novelty of the interface and the facilities offered, exploring it thoroughly before starting to work through the scenario; others started the scenario tasks immediately. It is not clear whether this exploration time should be included in the time taken to complete the task. It is included in the times given above and accounts for much of the very high variance.

The first and third questions cannot be separated. It is obvious that forming coherent, overall mental models of the system was not a condition for successfully using the interface, but the reason why these were not formed is less clear. It could be because the system was functionally richer than that used in the study by Anderson et al (discussed in section 3.4), or because the interfaces were more complex. Certainly it is possible to say that inexperienced users working with a complex system were able to complete the tasks successfully without forming coherent mental models they could reason about. As a corollary to this, there was no evidence of conceptual baggage.

If we believe that all users always generate mental models of the system, we have to conclude that these models were sufficient for efficiently completing the tasks, but the poor responses to the questionnaires demonstrate that subjects could not use their models to reason about the system as a whole. There is one possible explanation of this that remains consistent with the mental model view. In the case of MILAN, even though all users discovered that the television controlled the video and that the whiteboard was for shared editing, there was no need for them to integrate these separate objects into a coherent functional model of the total system.

When answering the questions, the subjects reasoned only about what they had to do at any moment to accomplish their tasks. The aim of the experiment was presented to the users in a task-oriented manner (making the changes to the chocolate box), so that the users built action-oriented mental models (Young 1983) which were not amenable to reasoning. This could lead to misinterpretation of some questions. For example, all but one of the subjects marked "true" for "It is possible to leave a message for someone without entering a room" (MILAN questionnaire). When carrying out this task, the users had been 'inside' a room. Even though they used the out-tray to send the message and should have recognised that out-trays were only present on the desks in the rooms, this was not relevant to the task at hand and did not feature within that part of their models of the system.

Thus, it is possible to maintain a view of human-computer interaction based on the manipulation of mental models. However, to speak of the user forming a single, coherent model of the system is almost certainly wrong, and without such a model it is difficult to imagine how this approach could be used to develop a coherent model of the metaphor process. Conversely, the phenomenon of a user simultaneously holding a number of separate views of the system ties in well with the view of the system as a semiotic system leading to multiple signification.

4.8.2 The qualitative responses

Although the classification correlates significantly with the way that people perceived the systems, this does not mean that they identified the three classes in the same way. The responses of the MILAN users talked more of the metaphor and of the interface, and their spatial references were almost entirely about the layout of the room and the objects within it. The users had obviously formed a clear mental model of the interface, even if they failed to form one of the underlying system. For example, two users complained that the television was too far from the desk and one asked for a remote control (though this distance exists only within the perspective of the picture).

By contrast, the Little People users were more concerned with the functionality of the system, most notably the communication functions. Finally, the users of Link-Journal talked of interactional aspects in terms of the tasks that the system could support: cooperative working. In summary, the choice of metaphor did not only influence how the user saw the system; far more fundamentally, it affected what the user saw: the level of signification.

The distinctions were particularly noticeable in the answers to the question asking what users thought the Grundidee (basic idea) of the system was. In the case of MILAN, almost everyone mentioned the metaphor. With Little People a more typical answer was 'presenting the functions of the system in a user-friendly way', whereas with Link-Journal, people frequently wrote of Zusammenarbeit (working together). In other words, MILAN users were most conscious of the interface, the immediate signification of the images on the screen (you are in a room, the television is on the other side of the room, etc.). Little People users were more concerned with the next level of signification, what the interface is for, i.e. supporting the functionality of the system (sending mail with the postwoman, setting video views with the cameraman, etc.). The Link-Journal users were concerned with a level higher still, what the functionality is for, i.e. supporting people working together (Zusammenarbeit, distributed manufacturing, etc.).

This confirms assertion 3 and answers question 4 positively:

Assertion 3 Interface metaphors create many different forms of signification not accounted for by the mental model approach.
Question 4 Does the type of metaphor affect the forms and levels of signification?

Although Link-Journal led to 'higher' levels of signification than the others, care should be taken before describing one approach to the system as 'better'. The best choice of metaphor will depend largely on what one wishes to get across to the users. Although the interactional interface (Link-Journal) appeared to turn the users' attention towards a 'more important' signification (what the system is to be used for), this is not always the first concern of the interface designer. For a system that is to be used for a short time, for example to support work groups that come together for short tasks before disbanding, it may be that the immediate appeal of an interface such as MILAN, in which the interface and the metaphor are foremost, is more important.

Although this moves beyond the general argument of this thesis, Appendix E builds on this experiment to examine the potential economic impact of metaphor choice, considering which type of metaphor is likely to be most successful in different industry sectors. It suggests that an interface based on an interactional metaphor is likely to be best in most industry sectors but that a spatial metaphor might be more useful in some. The number of interfaces used in the experiment (one interface based on each type of metaphor) is certainly not large enough to state this as a general case and it does not form a significant part of the conclusions of this thesis.

There is a common assumption that if user requirements and usability criteria are both met then users will use the services provided. There is considerable evidence that this is not always true. For example, Hutchinson & Rosenberg (1993) show that expert systems which meet identified needs and which are initially used by the users (implying reasonable usability) are then abandoned. Although they suggest other reasons, the results of this experiment suggest that it could be because the interface failed to 'sell' the system in the most appropriate way to that user group.

4.9 Summary of conceptual model

Chapter 3 proposed a semiotic model of HCI based on Layers of Signification (LoS). The studies described in this chapter have checked the validity of the assumptions on which that model was based; the following chapter will examine whether the model is effective and appropriate for use by designers. Before doing so, I will briefly summarise the main features of the model as checked by these preliminary studies.

Use of trope analysis has established that metaphor is ubiquitous in the computer interface, even where there is no apparent intention on the designer's part to employ it. This is consistent with the fact that, using the definition proposed in Chapter 3, the interface can be regarded as a semiotic system, consisting of related signs which affect the signification of other signs according to context. For example, the file saved by the 'Save' command in Word or WordPerfect depends on which file is in the active window.

Semiotics proposes that a sign consists of a signifier (the observer's immediate perception of the sign) which carries many significations, each leading to a separate signified. The simplest signification is known as denotation but even this is dependent on the observer – in English 'Gift' means 'present', whereas in German the word 'Gift' means 'poison.' Higher levels of signification, also known as connotation, will be dependent on many other social and psychological factors. Analysis of the Macintosh interface established that it is possible to uncover examples of this multiple signification in a single interface element.

This multiple signification implies that the mental models proposed by some researchers will be inadequate to explain the user's interaction with the computer, in that they assume that the user should build a complete and consistent model of the system. A semiotic approach suggests that users will often be aware of contradictory significations within a single sign, making such consistency impossible. The use of metaphor then compounds this complexity by introducing all the layers of signification which the user associates with the metaphor vehicle.

If an interface element or the metaphor vehicle used in its construction leads the user into inappropriate signification, the user's understanding and acceptance of the interface can be severely compromised. This will be particularly important when the user and the designer come from very different social groups. Examples might include educational level, profession, sex or age, all of which will influence higher levels of signification. Section 3.6.6 proposed interviewing users of computer interfaces, continually asking, "What is that for?" in response to their answers. It is proposed that this 'What for?' interview might help the designer to uncover some of this signification. The following two chapters will pursue this further.