Protocol Analysis is one of the most effective methods for assessing the usability of an information system and for targeting aspects of the system which should be changed to improve usability. While "protocols" usually bring to mind the formal step-by-polite-step procedures used by diplomats in international negotiation, the first definition listed in Webster's is "an original draft, minute or record of a document or transaction." The "protocol" in protocol analysis is the complete recording (in written, audio, and/or video form) of the interaction of a user with a system, while that user "thinks out loud" in order to allow the recording of his or her perceptions, reasoning, and reactions to the system. The "analysis" is provided by the researcher, who examines a number (3-5 for each type of user) of these protocols and reaches conclusions about aspects of the system that cause problems for users.
Protocol Analysis as an evaluation method is most effective when combined with rapid prototyping as a development methodology, since the results of these analyses should be employed to improve the system's usability, in an iterative manner. It requires only six to eight such recorded observations in order to reach conclusions, and the procedure may be completed in a week or less, start to finish. This gives it great advantages over methods such as surveys and experiments, which generally require a hundred or more subjects, and can take weeks, months, and even years to complete. As a method, it is deceptively simple; this does not mean that it is "easy." As with other forms of "participant observation" used for qualitative research on human subjects, steps must be taken to assure the validity and reliability of the data collected, and the preparation of the transcripts requires many hours per observed subject.
Protocol Analysis was originally developed (Newell and Simon, 1972) as a method to study and learn about the cognitive processes that humans undertake to solve problems. Learning an interface is just one example of a problem solving task. Protocol analysis has seen extensive use in other areas of Computer and Information Science, such as discovering how an expert solves a problem in order to create an expert system. A potential area that has not yet received the attention it deserves is the use of the method by designers to better understand user requirements. There are many types of problem solving tasks where an individual can solve a problem but does not have a complete or detailed awareness of the mental process by which the problem is solved. The primary objective of protocol analysis is to aid in the discovery of what is taking place. However, our emphasis will be its use in evaluating an interface design.
Protocol Analysis is the most basic method that a software designer should master as a tool for incorporating user feedback in the development of the interface to any interactive system. For a relatively small amount of effort, a designer can quickly discover any basic flaws in the design and has considerable chance of exposing more subtle problems. Though the method is widely used, there is only limited published material on how to conduct a protocol analysis for software evaluation purposes. The objective of the information, instructions, and examples provided here is to prepare the reader to actually carry out this procedure.
4.1 Characteristics of the Method
Protocol Analysis is a tool that was developed in the field of cognitive psychology (Newell and Simon, 1972; Ericsson and Simon, 1980, 1984) to aid in discovering the process a person goes through in solving a problem. It is perhaps the only tool by which we can gain some understanding of how humans go about mentally solving a complex problem. While controlled experiments can give us insight into micro cognitive abilities that humans have, the process of putting those abilities together to deal with real problems is usually examined by Protocol Analysis.
In the area of software evaluation, many of the published studies using protocol analysis come from the IBM Watson laboratory (e.g., Lewis, 1982; Mack, Lewis, and Carroll, 1983; Carroll et al., 1985; Carroll and Mazur, 1986). Protocol Analysis is also a standard tool in the development of Expert Systems. It is used by Knowledge Engineers to discover the human mental processes used by experts. Clearly a designer could use the same approach to discover the processes users employ to carry out various tasks they wish to support with an interactive system. However, our treatment in this chapter focuses on the evaluation of the interactive system. We are looking at the process of learning and understanding an interactive system as a problem solving process. For those who want to apply the technique more generally, the Ericsson and Simon book (1984) is recommended.
The assumption behind the method is that the verbalizations made by the human problem solver are at least a meaningful subset of those mental processes that determine his or her behavior. People cannot verbalize all of their mental processes, or have complete awareness of them. For example, humans cannot perceive the inner workings of long term memory, but they can perceive to a great extent what is happening via what is passing through short term memory. Humans can clearly perceive the occurrence of difficulties in what they are trying to accomplish, and this is certainly one of the key areas of concern for interactive system designers. If they are not actually doing or simulating the doing of a problem solving task they cannot recall the details of the process at the level of short term memory dependent cognitive interactions. For designing systems that support human problem solving and tasks with a great deal of cognitive variability there is really no other way to accurately determine requirements for the functionality of the system.
In a 1980 paper and 1984 book on Protocol Analysis, Ericsson and Simon used the subtitle, "Verbal Reports as Data." They argued that, long-standing criticisms to the contrary, verbal self-reports can be a valid and reliable source of data about cognitive processes. The key is that people can directly express verbally the information that they are currently attending to while accomplishing a task. Thus, they are asked to "think out loud" as they are actually looking at a screen and deciding what to do.
Protocol Analysis is better than probing methods. "Concurrent probing," which asks questions of the user while the information is in short term memory, is somewhat less reliable. "Retrospective probing," which is questioning after the completion of a task to try to get a person to recreate cognitive processes, does not generally yield very valid or complete data. Clearly any delay in a subject responding will cause losses in the perception of what they were thinking. Furthermore, any questions asked by another person will contaminate the individual's perception of what he or she is thinking. So, a key is to get the subject to truly "think out loud," and not to interrupt or interfere with the user. In addition, users can most easily verbalize without interfering substantially with their cognitive processes when the information itself is in verbal form; if they have to translate a different medium (e.g., graphics, sound) into words, this will be more disruptive and less valid. Therefore, one does have to be concerned that characteristics of the graphical layout of screens may not be as well assessed as the semantic content of the screens.
Concerns about the accuracy of verbal reports (Nisbett and Wilson, 1977) include:
If verbalization is an inherent part of the task and the verbalization is successively stored in short term memory in verbal form throughout the task, it is believed that the results will be entirely accurate. If the verbalization or semantic content is a part of the problem solving process, but not normally heeded (in short term memory), then the results may well be incomplete or distorted. Finally, if the tasks are not semantic in nature, such as pattern recognition, the process of thinking aloud may significantly alter the problem solving process (Ericsson and Simon, 1980, 1984; Henry, 1934).
This does not mean that Protocol Analysis should not be used in, for instance, the study of Graphical User Interfaces, but that one must do so with a degree of caution in the interpretation of the results. Also, to go from the objective of identifying problems in the interface to the objective of gaining insights into the user task sequence for problem solving may require the correlation of protocol studies with passive monitoring of normal user interactive sessions. Protocol Analysis, even in a highly graphical environment, is still likely to turn up major conceptual problems in mastering the interface. However, it may not be evident from the user reports how to correct the situation or how deeply in the abstraction of the design concepts the problem exists.
The following are some conditions that apply when carrying out a Protocol Analysis:
Thus, one will not observe true "trial and error" learning behavior, which is a very common way that many users go about learning a new system. Furthermore, no insight can be gained on the time to carry out an interaction or the real error rate that might occur in a user's normal mode of interaction.
The subjects used in a protocol analysis should be representative of the intended users of the system. For example, if you are developing a system for managers in an organization, the subjects should be drawn from this pool. Usually, subjects are "first time" users of a system. However, if you are designing advanced features of an existing system, then the subjects should be experienced users of the system who would be candidates for employing the new functions.
The observer/recorder and analyst must be able to act in a neutral, supportive, non-directive and unbiased manner. Most designers are too biased to be able to play this role without training. Therefore, it would be best, if you are the designer, to plan the protocol analysis, but to get somebody else to actually carry it out. If this is not possible, have another person experienced in protocol analysis observe you for your first attempt, and point out any places where you forgot your role as a neutral, non-directive observer.
In order to succeed with the use of this method, there has to be a certain atmosphere established between the person carrying out the Protocol Analysis and the subject undergoing the process. The right atmosphere is set up by the following conditions:
Honesty
The evaluator has to convince the subject to be completely honest about his or her mental processes and reactions to the system. Part of establishing an atmosphere of honesty or trust is for the subject to know exactly what will be done, why, and how long it will take. Federally funded research projects require the use of a consent form whenever human subjects are used; it is recommended that this procedure be followed even when not legally required, since it is a good ethical practice that also helps to establish the right atmosphere for interaction between the evaluator and the subject.
No Evaluation of the Subject
It has to be clear to the subject that he or she is not being evaluated by this process; the objective is to evaluate the system they are using and not the capabilities of the user. For example, it has to be clear that you are not going to be reporting the subject's performance to co-workers or superiors.
No Pressure for Performance
It will always take a human longer to do something when verbalizing. The subject may also have to rethink what he or she was doing after interrupting the process to verbalize. Therefore, one cannot make any determination of how efficiently one can perform a task or how long it takes when doing a protocol analysis. The subject should understand that there is no pressure to act quickly, and he or she should feel free to take as long as needed to adequately verbalize while carrying out the problem solving process.
No Help or Introduction of Bias
The evaluator cannot provide any help to the user or any comments that might indicate to the user a possible approach to the task. If the evaluator is a designer it often takes a great deal of self control when the subject is perceived making a mistake with respect to something the designer thought was completely obvious. A subject who is hopelessly stuck may be extricated and placed back at the point in the interaction where the problem began, but no information should be conveyed to "help" him or her at this point. Say something like, "Okay, the system seems to have created a problem for you here. Let me put you back where you were and let you try once again."
Reciprocity and Respect
It is your ethical obligation to make the experience for the subject as pleasant and rewarding as possible. Do not allow the subject to feel humiliated. After two tries at a task, if the subject seems very upset, say something that blames it on the system, then smoothly go on to the next part of the planned exercise. Schedule one hour maximum sessions. Allow time at the end for the subject to ask you questions, and to see you demonstrate the use of some part of the system if this is desired by the subject.
The ground rules and expectations have to be conveyed to the subject before the exercise begins. They should be included in the initial instructions to the subject, an example of which will be given below.
The most important thing is for the user to constantly "think out loud;" otherwise, no data is being collected. A useful method for increasing verbalization is to demonstrate the thinking out loud technique while signing on the system and putting the user at the point where he or she is ready to take over and perform the requested series of tasks. During this demonstration, which should be scripted to be the same for each session, make a mistake, and verbalize about the possible reasons for and reactions to this problem. This helps put the subject at ease, shows that it is expected that mistakes will be made, and most importantly, demonstrates what is meant by "thinking out loud," an unfamiliar procedure for most people.
There are a number of things that are required to utilize Protocol Analysis for the evaluation of an operational interactive system. Some of these would not be necessary for evaluating a mock-up because of the limits of the available functionality. The following items need to be prepared.
4.2.2 Instructions for the User
It is necessary to write down a complete script and set of instructions for the user to read. This ensures that you forget nothing and that you are consistent with all the subjects. You should
When the process is on-going, and the subject is not adequately verbalizing, you can probe by asking things like:
However you should not ask things like:
The first implies that the subject may have done something wrong by executing that operation and calls into question their intelligence. The second focuses the subject's attention on a specific feature and makes it seem more important to the subject because the person observing him or her is paying more attention to that item. This latter action destroys the validity of trying to determine what the subject's mental process is because it contaminates that process with the preferences or biases of the observer. When the designers are the observers, there are probably strong biases as to what they think are "great" features that they have designed, and real surprise when a subject chooses not to use the feature.
When the user has completed some task or set of steps and has not given you a complete explanation of his or her thoughts you can ask questions that will produce what is called autoconfrontative verbalizations. Examples of these are:
Anything you say about the interface or the functionality, or any attention you pay to a particular item in the interface, can serve to contaminate the user's problem solving cognitive behavior with your own biases. The objective here is to discover what the user is thinking and not what you would like him or her to think. This is why you have to avoid giving any advice or aid unless the user is at a point where he or she is going to have to give up.
On a piece of paper write down a clear description of a series of tasks you would like the user to try to accomplish with the system. The task should be expressed in user terms and not reflect any specialized interface terminology. For example: "find an employee who can speak German" and not "look for an incidence of 'German' in the language skill field of the employee record."
This task description should be the statement of the objective you want the user to accomplish, and definitely not a step-by-step set of instructions on how to accomplish the task. For example, "send me a message," not "choose menu choice 'Mail' on the first screen, press enter, etc."
The set of tasks chosen should take less than one hour of the subject's time; many fewer tasks than you might think can be accomplished by a new user in this time frame. Generally, two or three related tasks in a single part of the system are all that can be studied in a single session. If there are other functions or screens that need analysis, they will have to be done as separate protocol analyses. Some subjects will begin to fatigue after a period of concentration, and you should be prepared to stop the process when fatigue seems to be setting in. Sometimes it might be desirable to give the users only one task at a time; this can be used to help slow them down when they are moving too fast. So you might have separate tasks on separate pieces of paper to hand to the subjects when they are ready for a new one.
Given that one cannot adequately use this method with tasks that are not familiar to the user or on advanced functionality that requires a great deal of learning, the focus should be on those tasks that are expected to be most valuable when the users begin using the system. For an existing system with an established user population, it is possible to find experienced users with the system to look at the utility of advanced functionality via Protocol Analysis. In addition, Protocol Analysis is an appropriate tool for evaluating new features that have not yet been released to the user population.
4.2.4 Recording and Transcribing
The data to be collected include a complete record of everything the subject says and everything the observer says; it is impossible to obtain an accurate and complete record of this without an audio recording. This needs to be synchronized with a record of what the screen showed at the point where the user was verbalizing, and what occurred as a result. If possible, a keystroke recording of user interactions is useful in obtaining this.
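The synchronization described above can be sketched as a merge of two timestamped logs. This is only an illustration under stated assumptions: the `Event` record, the `merge_session_logs` function, and the sample utterances and keystrokes are all hypothetical, and it assumes both logs carry time offsets measured from the same session start.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    seconds: float  # offset from the start of the session (assumed shared clock)
    source: str     # "subject", "observer", or "keys"
    text: str       # what was said, or which keys were pressed


def merge_session_logs(utterances: List[Event], keystrokes: List[Event]) -> List[Event]:
    """Interleave the spoken record and the keystroke record in time order."""
    return sorted(utterances + keystrokes, key=lambda e: e.seconds)


# Hypothetical sample data for one short stretch of a session.
utterances = [
    Event(2.0, "subject", "I think 'Mail' is where I send a message."),
    Event(9.5, "subject", "Hmm, it is asking for a name now."),
]
keystrokes = [
    Event(4.1, "keys", "M <Enter>"),
    Event(12.3, "keys", "smith <Enter>"),
]

merged = merge_session_logs(utterances, keystrokes)
for e in merged:
    print(f"{e.seconds:6.1f}s  {e.source:8s} {e.text}")
```

The merged list is a starting point for the transcript; what the screen showed at each point would still have to be filled in from notes or a screen recording.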
In some organizations that regularly perform Protocol Analysis, specialized software has been developed to allow one to specify what level of interaction functionality should be automatically recorded in a digital log (Smith, Smith, and Kupstas, 1991). Being able to automatically capture "action level protocols" (i.e., the level that corresponds to user strategic and reactive choices at the user task level) provides the opportunity to more extensively study the user problem solving process and possible improvements to providing better matches of functionality to the user mental model. Also, if one can develop a grammar for a specific class of problems, then the automation of the analysis of the data is also possible (Waterman and Newell, 1971, 1973; Lueke et al., 1987; Fisher, 1988; Sanderson et al., 1989; Walker, 1991).
Another possibility is videotape, although this often makes the user even more self-conscious. If a videotape is used, it should be focused only on the screen, and the user should be able to look through the camera and see that only the screen is being recorded, not his or her face or identity. There are some cases where specialized video systems have been developed to automatically synchronize data between the computer interaction and the video system (Mackay, 1989; Trigg, 1989).
It is useful to set up a coding form to be used for hand notes while observing. Columns would show parts of what is being said (the recording can be used to fill it in later), which keys are being depressed during each of these verbalizations, and what happens on the screen as a result. A coding form might have the following columns set up.
Later, the most tedious part of protocol analysis takes place: transforming these raw data into a complete, integrated transcript. This transcript needs to fill in every word that was spoken and every action that was taken, and may add the observer's notes or interpretations. Only the original observer is really able to produce such a transcript; perhaps a secretary can help to create a rough transcript from a recording, but the original observer will have to complete and correct it. The observer's memory can help to fill in the correct version of any words that are unclear on the recording. The sooner after the Protocol Analysis that the transcript can be completed, the better, since the observer's memory of any unrecorded observations will decay rather fast.
For a one hour session, it may take eight hours or even more to obtain the complete, correct transcript, ready for coding and analysis. A single session with an expert user can produce twenty pages of transcript. Therefore, the use of Protocol Analysis as a regular interface evaluation tool in an organization would warrant an investment in automating the capture of the information.
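To make the cost concrete, the arithmetic can be sketched as follows. The function name and its parameters are hypothetical; the eight-to-one ratio is the figure quoted above and should be read as a lower bound, since the text notes it may take "even more."

```python
def transcription_hours(subjects: int,
                        session_hours: float = 1.0,
                        hours_per_session_hour: float = 8.0) -> float:
    """Rough lower-bound estimate of total transcription effort in hours."""
    return subjects * session_hours * hours_per_session_hour


# Six to eight one-hour sessions, the range used in this chapter.
low = transcription_hours(6)
high = transcription_hours(8)
print(f"Estimated transcription effort: {low:.0f} to {high:.0f} hours")
```

Under these assumptions a single round of six to eight subjects implies roughly 48 to 64 hours of transcription, which is the kind of figure that motivates automated capture.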
When one is working with only a mockup, the task is considerably reduced, because what is being recorded is the specific terms for which there are unexpected interpretations or outright confusion on the part of the subject.
At the end of the session, it is useful to have a prepared interview guide or questionnaire with some open-ended questions for the subject to complete. One question should be: "What aspect of the system was the most confusing or most difficult for you?" This is also an opportunity to examine whether the user has achieved some degree of understanding and retention of some of the major concepts in the system. This retrospective questionnaire can also ask the subject to evaluate the utility of some of the features encountered in the system. Protocol analysis is useful not only in determining the performance of the interface, but also in evaluating how well the system supports the user's tasks.
To aid in understanding the results one should develop a category scheme by which one can classify what the subject is saying and which screen he or she is on when saying something. The exact nature of this coding scheme might reflect some categories that are appropriate to the particular application. However, the following categories are generally applicable:
If one has a prepared coding scheme already laid out on sheets of paper and a code for each screen, then one can quickly mark down significant items that occur. For example, just counting the number of positive and negative self evaluations that occur on a given screen can be revealing in some cases. Recording those terms that were misunderstood is certainly an obvious need.
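A minimal sketch of such a tally is given below. The screen codes ("MAIN", "MAIL"), the category names, and the `tally_by_screen` function are hypothetical placeholders, not the chapter's coding scheme; substitute whatever categories and screen codes the study actually uses.

```python
from collections import Counter

# Each coded item is a (screen code, category) pair marked during or after a session.
# All codes and categories here are illustrative assumptions.
coded_items = [
    ("MAIN", "negative_self_evaluation"),
    ("MAIN", "misunderstood_term"),
    ("MAIN", "negative_self_evaluation"),
    ("MAIL", "positive_self_evaluation"),
    ("MAIL", "negative_self_evaluation"),
]


def tally_by_screen(items):
    """Count how often each category was marked on each screen."""
    counts = {}
    for screen, category in items:
        counts.setdefault(screen, Counter())[category] += 1
    return counts


tallies = tally_by_screen(coded_items)
for screen, counter in tallies.items():
    print(screen, dict(counter))
```

A screen that collects several negative self evaluations and no positive ones is a reasonable place to start reading the transcripts more closely.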
The first step in analyzing and reporting the results is to read through each of the transcripts carefully, using highlighters or marginal notes to mark instances of difficulties or misunderstanding of the designer's intentions. Then go back and try to find patterns. Lewis (1982, pp. 5-6) explains:
We go through our notes and collect episodes in which users are having trouble or registering complaints. We make a listing of each such episode, coded by participant, and keyed to the original notes for checking. We then try to relate each episode to any aspect of the system or manual to which it may refer. By grouping the episodes, we now can collect all those that seem to be involved a given aspect of the design. For example, we collect problems of terminology, cursor control, menu flow, restarting, and many others. Episodes in these groups are then examined to determine what separate problems may be occurring in each area, and, if desired, what proportion of participants encountered a given problem. We also attempt to determine why a given problem is occurring.
If even two people (out of the six to eight observed for a particular set of tasks) have problems with the same aspect of the system, then a modification is advisable. One subject having a specific difficulty or misunderstanding may be a quirk; when it is replicated, it is a problem with the system. Summarize the areas needing improvement from those most frequently encountered to those that may possibly be problems. Be very specific; which menu terms are misunderstood, which error messages are misleading? Consider what changes to the system might be made to overcome these problems. Such recommendations are only hypotheses, however; the supposed "solutions" will need to be subjected to protocol analysis to see if they indeed improve the situation from the user's point of view.
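The grouping and the two-subject rule of thumb can be sketched in a few lines. The subject labels, the aspect descriptions, and the `rank_problems` function are hypothetical; the sketch assumes each episode has already been coded by hand to one aspect of the system.

```python
from collections import defaultdict

# Each episode is (subject label, aspect of the system it relates to).
# The data are invented for illustration.
episodes = [
    ("S1", "menu term 'Compose'"),
    ("S2", "menu term 'Compose'"),
    ("S2", "menu term 'Compose'"),  # a repeat by the same subject counts once
    ("S3", "error message on save"),
    ("S4", "menu term 'Compose'"),
    ("S5", "cursor control"),
    ("S6", "error message on save"),
]


def rank_problems(episodes, min_subjects=2):
    """Return (aspect, distinct subjects, replicated?) sorted by frequency."""
    subjects_by_aspect = defaultdict(set)
    for subject, aspect in episodes:
        subjects_by_aspect[aspect].add(subject)
    ranked = sorted(subjects_by_aspect.items(),
                    key=lambda kv: (-len(kv[1]), kv[0]))
    return [(aspect, len(subjects), len(subjects) >= min_subjects)
            for aspect, subjects in ranked]


for aspect, n, replicated in rank_problems(episodes):
    note = "modification advisable" if replicated else "possible quirk"
    print(f"{n} subject(s): {aspect} ({note})")
```

Counting distinct subjects rather than raw episodes keeps one subject who stumbles repeatedly from being mistaken for a replicated problem.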
When one encounters a recurrent problem, one also needs to consider whether the problem serves a learning function. If, for example, everyone is making the same error because of habit from a prior system and this error serves to make people recognize and learn a difference, we may have a situation where the problem is beneficial in terms of reinforcing the learning process. The real issue is not whether everyone encounters the same problem but whether or not that problem situation keeps recurring for the individuals going through the learning process. In other words, certain conditions that appear as problems can in fact be learning facilitators.
4.3 Objectives for Interface Design
There are many significant things a designer can learn from the results of Protocol Analysis, including:
These are just the sort of things a designer needs to discover to improve his or her design. Some of these results depend upon what the designer is using to conduct the protocol analysis. With a "static screen mock up" the designer can get very useful feedback on the terms and initial screen layout. With a "Wizard of Oz" mock up that allows the subjects to make meaningful choices and obtain dummy, but realistic data, one can get insight for those types of tasks that do not involve any creation or change process. Finally with a working system, or its prototype, all the above results can be obtained.
One reason that Protocol Analysis is so useful in improving systems design is that it can provide utility at various stages in the design and development process. The early results using only static mockups can potentially be quite significant in preventing fundamental design blunders in determining such things as strategic and reactive function tradeoffs. The use of a simple static mockup and Protocol Analysis is far more effective in getting users to generate suggestions for missing functionality than is a plain interview. After the software is complete, it can be quite costly to modify some of these fundamental design choices made in the early stages.
Protocol Analysis will not tell the designer about the value of new capabilities the users are currently not familiar with. Therefore, the aspects of the design related to the sort of tools that will be valuable to the users as they gain experience with the system cannot be addressed by Protocol Analysis. Terms representing these concepts may appear to subjects as unclear and relatively useless. Also, tools or capabilities that provide "leverage" for an experienced user may not appear useful to subjects unless they are already very computer literate from other applications. In essence, this approach will not provide any insight into the evolution of user behavior or any problems that might occur as a result of long term usage.
Protocol Analysis cannot be utilized to assess functionality with respect to capabilities the subject is not familiar with or which result from normative aspects of the design process. For example, if the user has never used a particular type of new data report in doing his or her work they might have considerable difficulty in recognizing its existence in the system and certainly not be able to ascribe meaningful estimates of utility to it. One can assess if the interface provides some comprehension of the existence of new functionality, but not its understandability and utility. This is also true of functionality that is dependent upon earlier mastery of other functionality which exists in the system. However, in operational systems one can select users who have acquired the level of experience necessary and utilize those to investigate the usability of these more advanced facilities.
The pages that follow provide examples of the various materials needed to conduct a protocol analysis and some examples of transcripts from the resulting process. Most of these materials were developed by Starr Roxanne Hiltz for use in her course on the Evaluation of Information Systems. Exhibit 4-1 is a typical assignment as the instructor would present it to the students. Exhibit 4-2 includes the interviewer's materials for conducting a protocol analysis, including the script for introducing the task to the subject. Following that are two transcripts for the same task (Exhibits 4-3 and 4-4), one with a user who has not used a computer before, and one with an experienced user of computers. The contrast between the two is quite startling. It makes quite clear that the amount of computer literacy of a subject is a critical factor in what you are observing in a Protocol Analysis.
Even when dealing with an early non working mock up of the screens the designer can obtain valuable feedback with a reduced protocol approach. Exhibits 4-5 and 4-6 are examples of protocol studies by students of their initial recipe system design. These are the summary reports prepared for the instructor as part of the project to design a recipe system. The students were asked to use three subjects where one was a novice computer user, one was an expert computer user, and one was an expert cook. This was intended to give the student an understanding of the wide range of different responses one can obtain from different types of users.
4.4.1 Example 1: Typical Assignment: Protocol Analysis
1. Form a team and pick a piece of software.
This team must consist of no fewer than two people and no more than eight. The number of people should be related to the complexity of the software to be studied (e.g., how many "parts" it has).
Your team may choose one of the following systems: (to be filled in). Teams that choose the same piece of software will then meet to decide who will work on which functions or "parts" of the system.
You must be very familiar with a piece of software before you try to design and carry out a protocol analysis of user sessions with that software.
You are to have your software chosen and group formed the first week we start on this unit and during that week, as a group, devise your task and instructions. These tasks and detailed procedures should be similar to examples you will be shown in class and which appear in some of the articles.
This involves developing a simple task for the person who is the subject and writing the sheet you will give them to let them know what the task is they are to try to do. You also have to consider whether you will prepare any other written materials for them (e.g., selected pages from a user's manual).
You should write a short (a page or less) follow up questionnaire to go to the user at the end of the test period to capture what you think might be important retrospectively. You SHOULD have a consent form; it should have a space for the subject's name, address, and telephone number.
2. Carry out a Protocol Analysis (week 2).
Each person in each pair or team should ACTUALLY CARRY OUT one protocol analysis using that guide and questionnaire. Then get together (by messages online, phone, face-to-face, any way) and decide if the instructions and procedures need to be changed in any way. Record changes made, and why.
You should use a tape recorder AND take notes on actions. Each person should write up a transcript of this session. (Typed is preferable, but neatly handwritten is OK.)
These activities should be completed by the end of the second week.
3. Carry out an additional Protocol Analysis (week 3)
Each person should then use ONE MORE SUBJECT for the protocol analysis. You will then have a total of four or more subjects per team.
For the second subject, for purposes of this exercise only, you may choose to do just a summary of actions and problems, rather than a complete transcript.
4. Analyze the results (week 3)
Each person should independently draft recommendations for changes in the software, based on his or her two subjects. Write these down and turn them in with your transcripts; these will be the individual part of your report. The transcripts and analysis should not use the full name of a subject, in order to preserve confidentiality. The individual reports and transcripts must be shared with other members of your team. It would be convenient to post them in a computer conference for the team or to exchange them via electronic mail. This will facilitate discussion and drafting of a group report.
5. Compare results (week 4)
As a team, compare results and come to a set of overall recommendations concerning the following questions:
a. What aspects of this software are most confusing or difficult for users?
b. What changes in the interface, help material, functionality, etc., might avoid these problems for users (e.g., constructive, specific suggestions for software modifications)?
c. Write a concluding statement in which you assess the usefulness of the protocol analysis procedure in software development, from your point of view. Is it worth all the time and trouble?
Include in this group report the initial protocol instructions, the final ones (if different), and why you made any changes. The separate parts of the project and the group report should be handed in together. To make your group report, you will have to have seen each other's transcripts first. If one member of the group is "late" in completing their part, the rest of the group should turn in their transcripts and report.
This final report on the Protocol analysis assignment is due at the end of the fourth week.
You need to find subjects to study. You could volunteer to be subjects for each other on the first "pretest," but it must be somebody not on your "team." The second subject should not be from this class. They should be subjects who would conceivably be users of such a system. For example, other employees where you work; other students not in this class. The second subject should not be somebody such as your husband or girlfriend, with whom it would be difficult for you to assume a "professional" role.
What should be included in your observations and recommendations? Things like:
1) Problems users had in using the system and why (differences in interpreting words, instructions, etc.).
2) Unexpected difficulties in doing the test, and how you might change it.
3) Insight into what the user expected or needed but did not find.
4) Points in the process where you were forced to help subjects and why.
5) Be sure to pinpoint where difficulties arose (e.g., screen layout, documentation, etc.).
Summary of what you turn in:
Your initial protocol analysis instruments and plans, including the questionnaire or interview guide; a READABLE account of the first protocol analysis; any revised instruments or procedures, with explanation of changes; a READABLE account of the second protocol analysis; your independent analysis; the summary of your group's analysis; and your concluding statement.
4.4.2 Example 2: Instructions/Script for Protocol Analysis
Message System on EIES 2 (Electronic Information Exchange System at NJIT)
Materials the interviewer needs:
Interviewer's Script:
"Hi, my name is _______. And you are ________?"
(Pronounce name. Make sure you have the right spelling.)
What we are doing today is trying to test a new computer system for sending messages. We want to find out how this system looks to you, to see through your eyes as a new user who has never seen it before.
First, let me familiarize you with the terminal you will be using. Every keyboard is a little different.
(Point out and demo the enter or "return" key, the + key, backspace, cursor control, whatever will be necessary to find during the test)
"The method we are using to try to find out how this system looks to you is called "thinking out loud." You will try to use the system to receive and send a message, and tell me what you see on the screen, what you understand or do not understand about what you see, what you decide to do based on what you see. I would also like you to verbalize any feelings you may have about how the system is acting.
Let me try to demonstrate what I mean by "thinking out loud."
(Do a demo at the "Welcome name or number?" and say)
"I see it is asking for your name. Let me type in your name" (check for spelling). DON'T DO RETURN. "It's not doing anything. Well, maybe it needs a carriage return to tell the computer that I'm ready for it to use this information." (Do CR). (Initial Choice screen comes up) "Yes, I guess I have to do a carriage return when I am through typing in an answer to a computer question. Now, I see that there seem to be menu choices across the bottom of the screen. The first says 'Notifications.' I wonder what that means, but I don't think it is what I want to be able to send a message. I am looking for Help and I don't see it on the menu. Oh, there it is over there, it's not a menu choice, I have to enter a question mark! Let me try that."
(enter ? and enter key; say)
"Oh, I see, the question mark is the way I can get help on any menu choice; it is telling me what these choices are. Now, how do I get out of help? There it says, use a minus sign. I'll do that."
(End demo of thinking out loud and look up)
"So, this is what I mean. Try to put into words what you think when you see the screen, and what you decide to do and why.
Before you try to figure out what is on this screen and what to do, let's go over some procedures. I am only here to record your perceptions, but if you forget to talk out loud about what you are seeing and doing, I will remind you. If you have any questions, I will answer them after you finish your session trying to use the system.
Do you have any questions about what I mean when I ask you to "think out loud?"
This is a university-based research project, so we need to get your formal written consent to test the system. Here is the form."
(As they look at the form, point out that)
"We are testing a new version of EIES, the Electronic Information Exchange System, which uses a computer to communicate with people. (Point to the confidentiality line) We will not use your name. Only the research team will see what you say in a way that can be identified with you. I will be recording what you say (point to recorder). After we are through, I will use the tape to help me make sure that I have complete and accurate notes and then the tape will be erased. Please read through the form and let me know if you have any questions. Then, when you sign it, we can begin."
(After they sign the consent form, seat them at the terminal. Turn on the recorder. Say:)
"OK, let's begin. You tell me what you think you need to do in order to receive the message waiting for you, and then try to do it. Some parts of the system that you do not need to receive and send a message are not finished yet. If you try to use them you will be told that they are not available. If that happens, just make another guess and try again. Don't forget to try to "think out loud" constantly to explain what you are thinking and doing and feeling. And also, please remember that we are testing the SYSTEM, not you.
From here on, I just listen while you try to use the system. We want you to receive the message that is waiting for you, and then follow the instructions for sending a message."
Note 1:
We will number all the possible screens the user can get, and refer to them by number in our notes.
Note 2:
If the user is really, truly, and absolutely stuck, and has gone around in circles for at least three minutes, you might intervene to move them to the next step. But do this only as a last resort, when they have apparently exhausted every idea they may possibly get about what to try or do.
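One way to keep notes in the form Notes 1 and 2 describe is sketched below. It is an illustration only, not part of the original script: the names `Observation`, `SessionLog`, and `may_intervene` are hypothetical, and "going around in circles" is approximated, as an assumption, by "no screen reached for the first time in the last three minutes."

```python
from dataclasses import dataclass, field

STUCK_LIMIT_SECONDS = 180  # "at least three minutes" from Note 2


@dataclass
class Observation:
    seconds: float  # time since the recorder was turned on
    screen: int     # screen number from the team's own numbering (Note 1)
    note: str       # what the subject did or said


@dataclass
class SessionLog:
    observations: list = field(default_factory=list)

    def record(self, seconds, screen, note):
        """Append one timestamped note tied to a numbered screen."""
        self.observations.append(Observation(seconds, screen, note))

    def seconds_without_new_screen(self, now):
        """Time elapsed since the subject last reached a screen for the first time."""
        seen = set()
        last_progress = 0.0
        for obs in self.observations:
            if obs.screen not in seen:
                seen.add(obs.screen)
                last_progress = obs.seconds
        return now - last_progress

    def may_intervene(self, now):
        """Assumed reading of Note 2: only revisited screens for three minutes or more."""
        return self.seconds_without_new_screen(now) >= STUCK_LIMIT_SECONDS


log = SessionLog()
log.record(0, 1, "reads welcome prompt")
log.record(40, 2, "initial choice screen")
log.record(95, 1, "back at welcome prompt")
log.record(150, 2, "initial choice screen again")
print(log.may_intervene(now=200))  # 160 seconds since screen 2 was first reached
print(log.may_intervene(now=230))  # 190 seconds since screen 2 was first reached
```

The timer is only a prompt to the interviewer. Note 2 still leaves the decision to intervene as a last-resort judgment.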
Possible probes:
When the subject has finished, or has asked to stop, thank them and then ask the Post-test questions.
After questions are answered, ask: "Do you have any questions for me now?" At this point, you may sign on and show how something should have worked, if the subject wants you to.
Fill in: Recording form for Alpha Protocol Analysis:
4.4.3 Example 3: Transcription for a Naive Computer User
A Voting/Polling Activity on EIES
This transcription begins with the subject starting the thinking aloud portion of the exercise. The subject has been logged in, and taken as far as the beginning of the polling Activity interaction.
Notes:
Sub = Subject
Int = Interviewer
Anything in ( ) was not verbalized
Transcript:
Int: OK, now we're ready to start.
Sub: Uh, OK, you mean I, (gestures toward keyboard)
Int: Yup, go for it.
Sub: OK...(Silence)
Int: Don't forget to talk.
Sub: OK. (Hits the number 9 key, hits at least two more times)
Int: Tell me what you're thinking, what you're trying to do.
Sub: Well you told me I had to do number 9, there it is, and nothing's happening. (Hits number nine a few more times)
Int: Why are you hitting 9?
Sub: That's the one I have to do, that's the number. I'm confused.
Int: OK, umm! Look at your choices at the bottom (point to bottom of screen). OK, What are you thinking?
Sub: OK, Well, I guess I should choose Do (Starts hitting the arrow keys). Wait, nothing's happening, (pause) did I just screw up. Nance, help me.
Int: OK, my fault, don't worry. You press the letter, arrow keys don't work. (Sub's confusion was definitely due to lack of EIES 2 exposure. I was confident that helping here would not affect the results of the analysis.)
Sub: OH! (Hits d, return) Oh, now I ... OK (hits 9), (Silence)
Int: OK, What are you thinking now?
Sub: Well, I guess I choose one of these (points to questions and types in 9.1). Nothing's happening. Why? (hits arrow keys and ends up at previous menu and yelps), Oh, (exhales) OK, I know, I'll just (hits Do, return, then 9 to get back to where we were), OK, cool. I have to Respond. It says enter item, so I'm doing it. (types 9.1) Choose number. (types 2) Why does it say choose, but I can't, it won't let. Nothing's happening. (tries again) I don't know what to do. Nance, what did I do wrong?
Int: It's OK, You're not doing anything wrong. Remember return.
Sub: Oh, oh, oh, OK (hits return) OK, now I can (types in 2), (Silence)
Int: What are you thinking?
Sub: What's the. Why is it? It says response none, but this thing. (points to status box that appeared) Why do they put this on here? This still says no response and this thing, I don't get it.
Int: That's OK.
Sub: OK (types 9.2)
Int: Don't forget to talk.
Sub: Shoot. (types R, return. Types 9.3) Oh geez.
Int: What's wrong?
Sub: I didn't want. I meant to type 2. Oh, I guess it's OK. (reads and answers question) (Back at choose question prompt) What would happen if I hit return here?
Int: What do you think?
Sub: It'll probably do all of them. This one. I did this. Yeah, there's my answer. OK (Silence)
Int: Keep talking.
Sub: (Gets to second question) I have to answer this question. (answers question) (Now at third question) I'll probably, yeah, here's the other one I answered. (Silence)
Int: Don't forget to talk.
Sub: OK, here's another, respond, number. (Answered #4) (Silence)
Int: What are you thinking?
Sub: Is this a trick question? (at question 5)
Int: What do you mean?
Sub: It says legal. I see legal and I get paranoid, it must be a trick.
Int: No, no tricks. Remember this is confidential anyway. The software police won't come after you.
Sub: OK, good. (Answers question 5) OK, I answered all the questions.
Int: OK, now see if you can get the results.
Sub: OK, um, Summary, that sounds right, not implemented, OK, now what? (Silence) It's not implemented, are we done?
Int: Well, maybe that's not the right one.
Sub: Oh. OH, Display. What, what's this? What's this supposed to show?
Int: Tell me what you think it shows.
Sub: Well, there's, there's five things across, that's the five questions, right? And these numbers, oh, this, um, uh, (exhales, silence) What's this proto4 thing?
Int: OK, uh, that's you. When, when I logged on, that's the ID, the logon, the, um, user ID.
Sub: OK, (Silence) I don't know, this is confusing. (Silence)
Int: OK, um, how about, look at the rest of the screen. Is there anything that helps?
Sub: Mmm, (long silence) Oh, this nine up, this, that's the question. OK, then what are these other numbers. (Silence) Oh, they must be, they're the choices, but this doesn't make sense.
Int: What's not making sense?
Sub: These other numbers here. They don't mean anything. This is stupid. They're confusing me.
Int: What would mean something?
Sub: Well, I don't know, maybe just use ones or, no, an x or check. That would help.
Int: Do you understand everything else on this screen?
Sub: Yeah.
Int: OK, now that you're done, I have a few questions to get your reaction, feelings to the system, well, the polling activity.
Sub: OK
Int: Did you find that the menu choices were descriptive of the actions?
Sub: Of the two used, yes, well no, I got confused on that one part, um, what was it?
Int: The summary?
Sub: Yeah, that one, summary and display.
Int: Did you feel that more help is necessary in responding to the poll?
Sub: Responding? No.
Int: Did this system allow you to give adequate responses to the polling questions?
Sub: Yeah.
Int: Were the results of the poll easy to interpret?
Sub: No. The proto thing confused me. So did the numbers under the numbers. Checks would be better.
Int: Did you have any objections to responding to a poll in an online environment?
Sub: No.
Int: Did you feel a sense of accomplishment from using the system?
Sub: Didn't move me.
Int: Was this task a good match to the capabilities of the system?
Sub: Yeah, I thought so.
Int: Do you have any additional questions or comments about the polling activity?
Sub: No.
Int: Thank you for your time and cooperation. You did great.
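The notation used in the transcript above (a "Sub:" or "Int:" prefix, with anything in parentheses not verbalized) is regular enough to process mechanically when summarizing a session. The following is a minimal sketch under that assumption. The function names `parse_turn` and `summarize` are hypothetical, and counting any parenthetical that contains the word "silence" as a silence is a simplification.

```python
import re
from collections import Counter

TURN = re.compile(r"^(Sub|Int):\s*(.*)$")   # speaker prefix as defined in the Notes
ASIDE = re.compile(r"\(([^()]*)\)")         # anything in ( ) was not verbalized


def parse_turn(line):
    """Return (speaker, spoken words, non-verbalized notes), or None if not a turn."""
    match = TURN.match(line.strip())
    if match is None:
        return None
    speaker, rest = match.groups()
    asides = [a.strip() for a in ASIDE.findall(rest)]
    spoken = " ".join(ASIDE.sub(" ", rest).split())
    return speaker, spoken, asides


def summarize(lines):
    """Count turns per speaker and the number of noted silences."""
    counts = Counter()
    for line in lines:
        turn = parse_turn(line)
        if turn is None:
            continue
        speaker, _, asides = turn
        counts[speaker] += 1
        counts["silences"] += sum(1 for a in asides if "silence" in a.lower())
    return counts


sample = [
    "Sub: OK...(Silence)",
    "Int: Don't forget to talk.",
    "Sub: OK. (Hits the number 9 key, hits at least two more times)",
    "Int: Tell me what you're thinking, what you're trying to do.",
]
print(parse_turn(sample[2]))
print(dict(summarize(sample)))
```

Counts like these are no substitute for reading the transcript, but they can help a team compare how often different subjects fell silent or needed prompting.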
4.4.4 Example 4: Partial Transcript for an Experienced Computer User
A Voting (Polling) Task on EIES
Note:
Sub = Subject
Int = Interviewer
Anything in ( ) was not verbalized
Transcript:
(Subject 2 has hit Do, return and is now at the question summary screen.)
Sub: (Reads labels out loud, at label # 9.4 says:), whatever it is, (subject says this because part of the word in the label is cut off), (Subject is now deciding what choice to make at the question summary screen), Next, Back, first I'll look at the abstract to see what this is about. (types A, reads Abstract) All right, now I'll view the questions. I'll take a look at them (types V, return). Type number, OK, I'll take a look at 9.1 and 9.2 (Types in 9.1 and 9.2, return) (Reads first question out loud) I can't answer from here like I thought. I'll try homebase to get back (types the double plus, return). This took me way back...I remember how to do this from watching you. (Sub gets back to question summary screen.). Now I understand, these are questions, not topics (referring to the labels). Oh, I see. I can hit carriage return and get all of them. (referring to answering all the questions), (at question 9.1). Ooh, I thought I had to view each one and there was an abstract for each topic. That's not the way it is. Oh, OK, I respond here. (answers questions 9.1 through 9.5)
Int: Now that you are done with answering the poll, how about taking a look at the results of the questions?
Sub: All right. Isn't the choice. I'll try Display. All right, here's the first question I answered. (Sub goes on to accurately describe the screen) (Sub pages through all results) (Sub is now at the last question results screen) This thing doesn't tell you how to get out. (tries next page, back page, several times.) There's gotta be something better than homebase. After trying homebase, I'm afraid of escape.
Int: Don't worry about trying anything.
Sub: I know. (pages back and next again) Well, I'll give it a shot. Oh, look at this, it took me back to this. (Sub is happy to be back at the question summary page.) (Sub then hits summary and sees it's not implemented.) Do you need me to do anything else?
Int: Is there anything else you want to try?
Sub: No.
Int: Now I have a few questions to get your feelings and reactions to the polling system.
Sub: OK.
Int: Did you find that menu choices were descriptive of the actions?
Sub: Menu choices descriptive. Yes. Yes and no. Because the labels were short. I couldn't tell the choices, which one to choose because I thought the labels were titles, and that threw me off. So yes and no. I had trouble with view and respond and display. I thought I had to view each question before I responded. And view and um, display, I didn't know what the difference was until I did it.
Int: Did you feel that more help in responding to the poll is necessary?
Sub: No, I guess not. I had trouble understanding the words and making it correlate to the screen.
Int: (still on question 2) Did you look at the highlighted portion of the screen when you typed a letter? (points to bottom of screen)
Sub: I got caught up with the rest of the screen; I never noticed it till now. Till you showed me it. The highlighted thing is good. I didn't even see it was there. It's adequate if you type the key you want to see.
Int: (still on question 2) So do you think you need any more help?
Sub: Well, online or what?
Int: On-line, anything.
Sub: No, not with that highlighted thing.
Int: (skips to question 4) Did this system allow you to give adequate responses to the polling questions?
Sub: Yes.
Int: Were the results of the poll easy to interpret?
Sub: Yeah. It told you who picked 1 as their answer, 2, everything. But it didn't give you the totals. If you wanted to know the total of what was picked, you had to do it yourself. There weren't totals across the bottom.
Int: Did you have any objections to responding to a poll in an online environment?
Sub: No.
Int: Did you feel a sense of accomplishment from using the system?
Sub: On figuring my way through the system? Since I work on computers every day, no. I mean, it's part of my job to figure out what's going on. Compared to that, compared to what I have to do, that was easy. No.
Int: Was this task a good match to the capabilities of the system?
Sub: (thoughtful) Yeah, it was OK.
Int: Do you have any additional questions or comments about the polling activity?
Sub: If they made more use of the space between label and comment and expanded label. They left all this space here and they could use it to add to the label.
Int: Anything else?
Sub: No, just the space. Look, see all of it? (points to screen) it's wasted space.
Int: Thanks, you did a good job.
4.4.5 Example 5: Meal Management System Protocol Analysis Report
Instructions for Evaluators
To: Volunteer Evaluator
Subject: Instructions for Interface Evaluation
Thank you for volunteering to assist in evaluating this prototype of a new software system. Your feedback will be used to make specific improvements to the system. This exercise should take about one hour. I hope it will be an interesting experience for you. The system you will be exploring is a Meal Management System that is used to assist in the planning and preparation of meals in the home.
Your main task will be to tell me what you are thinking as you look at the various screens of the system. Please be as specific and detailed as possible. The version of the software you will be using is a prototype. This means that you will not actually be controlling the operation of a working system. Instead you will be presented with a sequence of example screens. To move to the next screen just press the space bar. To move to the previous screen just press the backspace key.
For each screen, please tell me all of the actions that you think you would be able to accomplish based on what you see. These actions include highlighting various items on the screens and pressing keys on the keyboard. For each of the available actions, also indicate what you think would happen if the action was taken. In addition, please verbalize your interpretation of the meaning of all terms and phrases on each screen.
You should also indicate anything that you are uncertain of. However, I will not be able to provide you with any assistance. The goal of this exercise is to determine how easy this system is to use by someone without access to an experienced user.
You do not need to verbalize when you are reading or concentrating on making a decision.
While you are talking, I will be recording your response. This information will be used for later analysis to make improvements to the system based on your comments. In addition, a summary will be prepared based on the results of your evaluation. However, you will not be identified in that report.
Although I will not be able to answer any of your questions during this exercise, I will use hand signals in case there is some additional information I would like you to provide. Specifically, I will point to an item on the screen that I would like you to talk about further. Hand signals will be used to minimize distractions to your thinking process. If necessary, I may ask you the following questions: what are you thinking? or what does that term mean?
Please tell me before you press the space bar to go to the next screen so I have a chance to prompt you for more information if needed.
Please remember, I am evaluating the Meal Management System, not you. Be sure to tell me everything you are thinking and describe everything you see on the screen. Now get ready to have some fun!
To begin, double click on the Meal Management System icon.
At the end of this exercise, I would like you to complete a brief questionnaire about your experience.
Results for Evaluator One (Non-Computer Literate)
Background of Evaluator:
This evaluator (my wife) has very little experience with computers. She uses Microsoft Word once per week to create and print a very simple document by adding a few lines of text to a template that I created. She has very little idea of how a computer works. However, she enjoys cooking.
Summary of experience:
Overall most of the major features were understood quickly. Some of the terms were confusing. It took considerably more time to do this protocol analysis than with the other evaluators who are experienced computer users.
Difficulties Encountered:
Initially she was not sure why she would need to enter her name. Since she is not familiar with the concept of navigating through a menu structure, she was not sure what the System Tour feature would do for her. She initially thought that the Week's Menu object might only allow dinners to be planned. She thought that by selecting Ingredient from the Main Menu she would be able to find recipes that contain a certain ingredient. She didn't know what the Review would be for. She wasn't sure what the term softkey referred to. It wasn't clear what the difference was between the Done and Exit softkeys.
She was not familiar with the idea of a pop-up menu, and at first she did not realize that she was still on the same screen when the menu popped up. She was not sure of the meaning of the Completed Status indication. It wasn't clear what the Food Type and Keywords categories would consist of. She wasn't sure of the significance of the ellipsis used with some menu items. In the Find Recipe screen she thought she would have to fill in all the search conditions. In the specific Help screens, she was not sure why the F1 key was assigned to Done, when usually F8 was used for Done, and F1 was Help. In the Find Recipe screen, she wasn't sure of the significance of the X in the brackets.
In Find Recipe, she thought she would have to use the Mark Complete function before performing a search. She was not sure of the purpose of Save Search or Get Search. Since these selections followed Search Worldwide, she thought they would be used to save the recipes that were downloaded from the worldwide collection. The use of Accept was not clear. In Create Week's Meal, she thought she could just type in a meal name after highlighting a field without having to press ENTER first.
Things found to be useful:
The ability to have a collection of recipes grouped into a meal and saved as a separate object. The matching of names of objects that are similar when a name is typed incorrectly.
Changes made:
Change the location of the pop-up menu in the Main Menu screen so it was clearer that this was still the same Main Menu screen. Add the option to go directly to the Create Recipe screen when trying to find a recipe and the name that is typed in does not match any currently in the collection. Add access to the Create Review (or Evaluate) screen from the F2 Special pop-up menu in View or Create Recipe or Meal. In the pop-up menus for the categories in the Find Recipe screen, don't use X's for categories that have been selected, instead leave the selected items highlighted.
What was not liked:
The access to recipes from throughout the world that she wouldn't be able to understand.
Suggestions made:
Add a feature that suggests what other recipes to serve with the current recipe. Add a feature to modify recipes by reducing salt, reducing calories, etc.
Protocol Analysis Questionnaire
Evaluator One
Thank you for participating in this activity. Please answer the following questions:
1. Which feature do you think would be the most useful?
To be able to access the recipes by specific ingredients, not just main ingredients.
2. Which feature do you think would be the least useful?
The world wide recipe retrieval since most of the recipes would probably be in a foreign language.
3. What features should be added to this system?
Glossary of terms used in pop-up menus.
4. What features should be removed from this system?
The blinking seizure lights.
5. What changes would you recommend to make the system easier to use?
The X's and black boxes were a little confusing to me - maybe have the words stay highlighted if more than one category is selected.
6. What else would you want to see changed?
Nothing else.
7. What are your comments about this evaluation experience?
Overall it was pretty straightforward and a generally pleasant experience.
Results for Evaluator Two (Highly Computer Literate)
Background of Evaluator:
This evaluator (an electronics technician) has a great deal of experience as a computer user. He also has a good understanding of how computers operate and what they are capable of doing. He uses both MS-DOS/Windows applications and HP-UX systems. He is very knowledgeable about database systems. He does not do much cooking.
Summary of experience:
He was able to figure out most of the functions very quickly. Some of his misunderstanding was due to his lack of cooking experience. This protocol analysis took the least amount of time.
Difficulties Encountered:
He wasn't sure at first what the difference was between a meal and a recipe. He thought Review would be used to look at what recipes you have served in the past and when. He thought that Status: Completed meant that the system had finished processing the user's last request. In the Find Recipe screen, he wasn't sure how to enter the categories; he thought he might have to type them in the lower half of the screen. He wasn't sure what the Save Search and Get Search functions would do... possibly save the list of recipes retrieved from a search of the worldwide database. In the Special Actions pop-up menu in the List of Recipes Found screen, he was not sure what the Accept menu selection would do. He also wasn't sure if Delete would just remove a recipe from the list or if it would remove it from the collection. In Create Meal, he was not sure what the Categorize function would do.
Things found to be useful:
The Random Suggestion feature where the computer suggests what to serve.
Changes made based on experience:
Change the name of Review to Evaluation. Change the name of the Save Search/Get Search functions to Save Conditions/Get Conditions. Change name of Accept function to something else such as Choose, Use, Fill-in, or Write-in? Change name of Delete function in List of Recipes Found screen to Cut.
What was not liked:
The thought of having to type in a lot of detailed information for every ingredient that is added to the system.
Suggestions made:
Add the ability to define some additional meals in the Create Week's Menu screen, such as snack in afternoon or late at night. A feature that would automatically make a recipe milder or spicier.
Protocol Analysis Questionnaire
Evaluator Two
Thank you for participating in this activity. Please answer the following questions:
1. Which feature do you think would be the most useful?
Random and special keys that allow for more assistance from the software.
2. Which feature do you think would be the least useful?
Complete nutritional analysis on each ingredient.
3. What features should be added to this system?
Suggestions on how to make a recipe milder or spicier.
4. What features should be removed from this system?
None come to mind.
5. What changes would you recommend to make the system easier to use?
Ability to skip the entering of specific nutritional information - this could be very time consuming.
6. What else would you want to see changed?
None come to mind.
7. What are your comments about this evaluation experience?
It was okay.
Results for Evaluator Three (Expert Cook)
Background of Evaluator:
This evaluator (my administrative assistant - what used to be called a secretary) has a lot of experience as a computer user. She uses several MS Windows applications every day. She has some knowledge of how a computer works. She has a lot of cooking experience.
Summary of experience:
She had a relatively easy time figuring out how the system would function. She had the right amount of cooking and computer experience for the system as I designed it.
Difficulties Encountered:
In the General Help screen, she thought that the System Tour would be a demo of the system. In the Main Menu, she also thought that the Find Ingredient function would allow the user to locate recipes that use a certain ingredient. She was not sure what Review would be for. In View Recipe, she was not sure what Food Type and Food To Go categories would include. In Find Recipe, she was not sure what the Special Action functions would apply to, the current search conditions or the worldwide search. In Create Week's Menu, she was not sure at first how to just type a name into a field. In Create Recipe, she was not sure what Convert Units would do.
Things found to be useful:
The ability to search for meals and recipes based on many different categories.
Changes made based on experience:
Change the softkeys in the specific Help screens so that F1=General Help, and F8=Done. In the Help for Find Recipe, where it says to press the F2 softkey, change this to read: press the F2 Special softkey. In Find Recipe, move the Search Worldwide function from the Special Actions menu to softkey F6. Change F4=Search to F4=Search Local.
What was not liked:
The ability of other users of the system to make changes to the collection of recipes.
Suggestions made:
In View Recipe, add a function which recommends what recipes might go with the current recipe that you are viewing. Assign each user a password, so someone can't use the system with another person's name. Provide the capability to prevent some users from modifying things in the collection. Maybe kids could only enter meal and recipe evaluations.
Protocol Analysis Questionnaire
Evaluator Three
Thank you for participating in this activity. Please answer the following questions:
1. Which feature do you think would be the most useful?
The ability to search for recipes based on categories.
2. Which feature do you think would be the least useful?
The world wide search of recipes via the modem.
3. What features should be added to this system?
Password protection so a person can't use someone else's name. Also the ability to prevent some users, such as kids, from modifying the collection.
4. What features should be removed from this system?
None.
5. What changes would you recommend to make the system easier to use?
Change some of the terms, such as Review - maybe call it Evaluation.
6. What else would you want to see changed?
In one of the help screens it said to press F2 for a world wide search, although on the help screen, F2 was for General Help.
7. What are your comments about this evaluation experience?
It was fun.
4.4.6 Example 6: Guidelines for The Clever Chef Interface Inspection
To: John (Expert Computer)
Subject: Interface/Functionality User Inspection.
This memo contains the details for the Usability Inspection that you have agreed to participate in. The system that you will be reviewing is called The Clever Chef and I would like your opinion on how this system looks to you (someone who has never seen it before).
You have been selected as the "expert" computer user since you have spent a considerable amount of your career designing user interfaces. As you peruse through the system I would ask that you identify parts of the interface which are not consistent or which simply don't make sense. Please treat this inspection no differently than an inspection that we would perform on internal software products. (Although keep in mind that the design requirement specifies that the screens must be black and white and coded in DEMO II.)
It may be helpful for you to refer to the flowchart (flowchart attached) as you make comments and classify the severity of the defects that you mention. The flowchart is similar to what we use for internal defect classification, so I expect that you will be familiar with its usage (if you are unclear on any part of this defect classification flowchart, please let me know).
The review method that we will be using is called "thinking out loud". You will navigate through the system and verbalize what you see on the screen. The types of information that you might verbalize would be if the screen layouts and definitions make sense, if you understand the options that are listed, and also any feelings that you may have about how the system is acting. Ask me to give you an example of how to "think out loud" when you are finished reading this memo.
Objective of This Session:
By pressing the "F1" key you will be able to navigate through The Clever Chef. For this session, ALL tasks have been predetermined and are as follows:
Each task will require navigating (by using "F1") through several screens. It is very important that you verbalize your thoughts so that this working prototype can be improved. The inspector (Laura) is not allowed to talk during the session, except to answer logistical type questions; therefore, it is necessary that we agree on a gesture/hand movement which I can use as a signal to you. This signal (nudge on arm, tap on table, etc.) will be my means of letting you know that you are not verbalizing enough of your thoughts.
If you have any questions about this inspection, please ask now. Otherwise, we will continue with the experiment.
Expert User Profile
The person I chose as the expert user (John) has eight (8) years of programming experience at XXX. He plays a significant role in the printer driver user interface for YYY products. Having him analyze the structure of the program, the definitions, and the screen layout was very useful. John considers himself to be an excellent cook; therefore, he was able to add value in terms of the type of functionality he would like to see in such a system.
Experience, Difficulties, and Merits
In addition to taping the inspection sessions, I also created a Usability Inspection Concerns Log where I entered my perceptions and also logged information specific to each screen. Below I have entered the comments/issues from the session I had with John (I have listed only those of particular interest).
| Screen | Task | Problem | Severity | Solution |
| 2 | Help | Help screen hard to read | H | Inverse text on help screen |
| 15 | Add | Align items | M | Made sure all items in all screens were aligned |
| 33 | Add | Give ingredient list: nice to see entire list after have inserted recipe | L | Listed entire recipe |
| many | Category | Use arrow keys to move through list; make sure user understands this feature is available | H | Inserted arrows and PgUp and PgDn commands |
| 47 | Search | Last Accessed date confusing | H | Changed to "Last accessed recipe on mm/dd/yy" |
| 36, 37 | Search | Align #s between screens | M | Made sure all data lines up |
I found that this method of recording problems/thoughts made it easier to correct the problems that were uncovered. The severity of the given problem is rated High, Medium, or Low according to the following logic:
Is it likely the user will repeatedly experience the effects of this problem?
  If no:
    (A) Is the problem inherent in a task performed infrequently?
      If no: Severity is Low
      If yes: Severity is Medium
  If yes:
    Is the problem inherent in normal everyday use of the product?
      If yes: Severity is High
      If no:
        Is the problem inherent in likely tasks?
          If yes: Severity is Medium
          If no: Go to (A)
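The severity logic above can be sketched as a small function. This is only an illustration and was not part of the original inspection materials; the function name and the four boolean parameter names are assumptions introduced here, one for each question in the flowchart.

```python
def classify_severity(repeated: bool,
                      everyday_use: bool,
                      likely_task: bool,
                      infrequent_task: bool) -> str:
    """Return "High", "Medium", or "Low" following the flowchart questions.

    All parameter names are hypothetical labels for the flowchart questions:
      repeated        - will the user repeatedly experience the problem?
      everyday_use    - is the problem inherent in normal everyday use?
      likely_task     - is the problem inherent in likely tasks?
      infrequent_task - question (A): is it inherent in an infrequent task?
    """

    def question_a() -> str:
        # Question (A), reached from two different branches of the flowchart.
        return "Medium" if infrequent_task else "Low"

    if not repeated:
        return question_a()
    if everyday_use:
        return "High"
    if likely_task:
        return "Medium"
    return question_a()  # "If no: Go to (A)"


if __name__ == "__main__":
    # A problem met repeatedly in normal everyday use.
    print(classify_severity(repeated=True, everyday_use=True,
                            likely_task=False, infrequent_task=False))
```

Writing the questions down this way makes it easy to check that every path through the flowchart ends in exactly one rating, and that question (A) is shared by the "not repeated" branch and the final fall-through.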
Summary (Expert User):
One consistent message through the interviews was that all participants felt that a recipe system would be a terrific addition to their kitchens provided that the features were integrated and were easy to use. As an engineer, John was definitely the most critical regarding the format of the screens, alignment of text, and the "look and feel" of the UI.
Of all the people interviewed I made the most changes based upon his input. I was able to rearrange screens and provide a more standard look and feel for the application based upon the feedback John provided.
Each user got the following retrospective survey at the end of their session:
The Clever Chef Usability Survey
Now that you are done, I have a few questions to get your reactions and feelings about the Clever Chef recipe system.
1) What aspect of the system was the most difficult or confusing for you?
2) Did you find that the structure of the system made sense?
3) Did you find that the menu choices and names of the choices made sense?
4) What features have been left out that you would find useful/necessary as a potential consumer of this product?
5) Do you understand the menu item "Utilities"? If not, can you think of another name which would be more descriptive?
6) Did you feel some feeling of accomplishment from using this system?
7) Do the capabilities of the system match the requirements of the recipe task?
8) Would a system such as this prompt you to leave the 3x5" index card method of using recipe cards?
9) Do you have any questions for me?
Thank you so much for your time, I really appreciate your input!
To: Tricia (Expert Cook)
Subject: The Clever Chef recipe handler
This memo contains the details for the Usability Inspection that you have agreed to participate in. The system that you will be reviewing is called The Clever Chef and I would like your opinion on how this system looks to you (someone who has never seen it before). I know that you cook a great deal, and I would like your opinion from the perspective of an "expert" cook. I realize that this system must offer extraordinary value in order that you may be lured away from those 3x5 index cards, so your honest opinion is vital.
As you are reviewing the system, keep in mind those tasks which are repetitive while you cook or are getting ready to cook; moreover, it is feasible that this program could simplify or automate many of the processes that are currently huge time sinks (or the program may offer additional functionality you never dreamed possible).
The review method that we will be using is called "thinking out loud". You will navigate through the system and verbalize what you see on the screen. The types of information that you might verbalize would be if the screen layouts and definitions make sense, if you understand the options that are listed, and also any feelings that you may have about how the system is acting. (Ask me to give you an example of how to "think out loud" when you are finished reading this memo.)
Objective of This Session:
By pressing the "F1" key you will be able to navigate through The Clever Chef. For this session, ALL tasks have been predetermined and are as follows:
Each task will require navigating (by using "F1") through several screens. It is very important that you verbalize your thoughts so that this working prototype can be improved. The inspector (Laura) is not allowed to talk during the session, except to answer logistical type questions; therefore, it is necessary that we agree on a gesture/hand movement which I can use as a signal to you. This signal (nudge on arm, tap on table, etc.) will be my means of letting you know that you are not verbalizing enough of your thoughts. If you have any questions about this inspection, please ask now. Otherwise, we will continue with the experiment.
Profile Expert Cook (Tricia)
Tricia is an electrical engineer and the best cook I know. Her experience with PC-based programs is quite limited, yet I would consider her a good median user relative to the subject I chose as the expert and the one chosen as the novice. Tricia was ready to give up her current system of 3x5 index cards provided that the system would actually do what it said it would do. She indicated that she had actually purchased a couple of different programs, but they were either too easy to use (not enough functionality) or were written more for restaurant owners. She did indicate a bit of reluctance (in a joking manner) about moving toward an on-line recipe system because currently she can tell how good a recipe is based upon whether or not all of the pages of the cookbook are stuck together (she is a great cook but makes a huge mess!).
Experience, Difficulties, and Merits
Tricia really liked the system but thought that being able to break down foods into a finer level of detail would be interesting (not necessary, but interesting). She was enthusiastic over the system's ability to plan meals, search for recipes, import recipe files, and keep track of shopping lists and costs. She brought me back into the reality of how "real" cooks organize the whole recipe/shopping/inventory scheme. Below you will find a few of the concerns I noted in her log:
| Screen | Task | Problem | Severity | Solution |
| General | Equipment | Would like a splatter screen | L | "Ha, Ha" |
| Categories | Add a recipe | Wanted categories in a more detailed level. Other users felt categories were good | L | Stay with current categories |
| 1 | Utilities | Not descriptive enough as a menu choice | H | Change to "Recipe Tools" |
Summary (Expert Cook):
Tricia's real value was being able to provide the mental framework of a cook. I found that a few of her suggestions (i.e. breaking categories down even further), if implemented, would add marginal value and introduce system complexity which would make the novice user very frustrated. Tricia was able to provide some very useful suggestions regarding "cook lingo"; for example, what are the most commonly used measurement terms. The hard problem with system design is being able to weigh the advantages and the tradeoffs that each new user interface feature poses. Based upon Tricia's expert cook suggestions, I was able to fine tune the Clever Chef feature set and interface so that expert cooks could be persuaded to move away from the 3x5" index card method of record keeping.
To: Linda (Novice Computer User)
Subject: The Clever Chef Ease of Use
This memo contains the details for the Usability Inspection that you have agreed to participate in. The system that you will be reviewing is called The Clever Chef and I would like your opinion on how this system looks to you (someone who has never seen it before).
Although I know you are not very familiar with computers, you have been chosen as someone who could be a potential consumer of a recipe program like this. At this point you still may be a bit apprehensive about using a computer; just remember that no matter what you touch on the keyboard, there is no way that you can damage anything. This should be a fun experiment; there is no way that you can do anything "wrong".
The review method that we will be using is called "thinking out loud". You will navigate through the system and verbalize what you see on the screen. The types of information that you might verbalize would be if the screen layouts and definitions make sense, if you understand the options that are listed, and also any feelings that you may have about how the system is acting.
Ask me to give you an example of how to "think out loud" when you are finished reading this memo.
Objective of This Session:
By pressing the "F1" key you will be able to navigate through The Clever Chef (if you do not know where "F1" is on the keyboard, just ask). For this session, ALL tasks have been predetermined (all you do is press "F1") and are as follows:
Each task will require navigating (by using "F1") through several screens. It is very important that you verbalize your thoughts so that this working prototype can be improved.
The inspector (Laura) is not allowed to talk during the session, except to answer logistical type questions; therefore, it is necessary that we agree on a gesture/hand movement which I can use as a signal to you. This signal (nudge on arm, tap on table, etc.) will be my means of letting you know that you are not verbalizing enough of your thoughts.
If you have any questions about this inspection, please ask now. Otherwise, we will continue with the experiment.
Novice User Profile (Linda)
The person that I interviewed as a novice (Linda) gave me an entirely new perspective on the types of people who have little or no experience with computers. Linda is a dietitian at one of the local hospitals but she cooks at home very seldom. Part of her job is to help the hospital cafeteria do meal planning; moreover, all of the meal conversions and meal plans are done by hand. The dietitians pull the handwritten recipes out of boxes, ask the cooks for approximate head counts, and do all the meal standardization with a calculator. The process of maintaining costs and managing the inventory is a nightmare.
Linda indicated that the hospital had tried to use one of the software packages that are commercially available but found using it to be more laborious than the manual method. Her initial outlook on the possibility of implementing such a system was very pessimistic. Her number one "wants" from a recipe handling system are that it is easy to use, doesn't require a great deal of maintenance, and finally that the program just sticks to the basics--no fancy stuff!
Experience, Difficulties, and Merits
After working with computers, it is easy to see that designers really lose sight of what the "average" person's computer abilities are. Linda had to be reminded where the "F1" key was on the computer; moreover, for many beginning users I doubt this is atypical. I also kept a running concerns log, from which I will share the most revealing comments.
| Screen | Task | Problem | Severity | Solution |
| 1 | Help | Did not understand "utilities" | H | Change to "recipe tools" |
| 26 | Add | Did not understand how she would be able to scroll through the category boxes | H | Added Up and Down arrow keys |
| 54 | Add a meal | How to do meal standardization | M | Added menu item for standardization of an entry |
| General | Recipe tools | System should calculate a standard unit (e.g., 23 eggs require 2 dozen) | M | Would require a bit of an expert system. |
Summary (Novice User)
Linda felt that the basic add-a-recipe and modify/search/list/delete functions were straightforward and would be something she would use. She had a bit more trouble with the Recipe Tools section; I believe this is because it required a different mindset. After I made changes based upon her suggestions, she felt that the system was quite usable and proved more accurate than her manual calculations; even more important to her, the Clever Chef software would allow her to save time. I learned a great deal; moreover, it was a great reality check to be able to understand what frustrates and motivates the "average/novice" user.
Protocol Analysis has an extremely high payoff relative to the investment of effort. It readily turns up any glaring errors that may cause users to stumble around in their efforts to learn the system. As a form of "user participation" in system design, it demonstrates to the users that the designers and developers of the systems are seriously concerned with ease of learning. It also serves to humble designers, since no matter how experienced they are, no design is ever perfect the first time. Designing is a process of tradeoff and compromise; "perfect design" is not a term that should be considered an objective. Finding the right tradeoff is difficult even for experienced designers, and this can be aided tremendously by the feedback that Protocol Analysis provides.
There is no excuse for not employing Protocol Analysis. Organizations that are serious about producing good application software should require it as an integral part of the development process. Even if the organization does not recognize its importance, the designer can still choose to employ this method in a casual manner with those users willing to cooperate with him or her. In the long run, the users will come to appreciate those designers who actually provide them this form of structured feedback on their design choices. It shows a much higher level of professionalism on the part of the designer than walking into the user's office and asking for just "off the cuff" feedback.
1. Conduct a Protocol Analysis on an application system using at least three subjects. One subject should be relatively unfamiliar with computers or a complete novice, another should be at least a casual user of computers, and the third should be a power user of computer applications.
2. Conduct a Protocol Analysis of an operating system such as MS/DOS, UNIX, or WINDOWS. In this case, you can pick users with varying degrees of familiarity with the particular operating system, but choose tasks for each one that represent things they have not tried to utilize in that operating system environment.
3. Pick a relatively complex or rich system that you know (e.g., a popular word processor or spreadsheet). Lay out sets of tasks, each taking about an hour for a user, and determine how many different task sets one needs to cover the full scope of the system.
4. Interview users of a popular word processor or spreadsheet in your organization and determine what parts of the total functionality of the system they have learned and what parts they have not learned. Use this to evaluate your grouping of tasks in the prior question and to recommend the criteria for choosing individuals to conduct a protocol analysis on.
5. Pick an application package in your organization where the users can have considerable differences in their expertise with respect to the application. Using subjects with a reasonable amount of computer experience, pick a subject with little application knowledge, a moderate amount of application knowledge, and a great deal of expertise in the application area.
Carroll, et al. (1985). Exploring a Word Processor. Human-Computer Interaction, 283-307.
Carroll, John M., and Aaronson, Amy P. (1988). Learning by doing with simulated intelligent help. CACM, (31:9), Sept., 1064-1079.
Carroll, J. M., and Mazur, Sandra A., (1986), Lisa Learning, IEEE Computer, 19(11), November, 35-49.
Denning, S., et al. (1990). The Value of Thinking Aloud Protocols in Industry: A Case Study at Microsoft Corporation. Proceedings Human Factors Society 34th Ann. Meeting, Human Factors Society, Santa Monica, 1285-1289.
Ericsson, K. A., and Simon, H. A. (1980). Verbal reports as data. Psychological Review, (3), 215-251.
Ericsson, K. A., and Simon, H. A. (1984). Protocol Analysis: Verbal Reports as Data. Cambridge, MA: MIT Press.
Fisher, C. (1988). Advancing the study of programming with computer-aided protocol analysis. Empirical Studies of Programmers. (eds.), G. Olson, E. Soloway, and S. Sheppard. Norwood, NJ: Ablex Publishing.
Henry, L. K. (1934). The role of Insight in the analytic thinking of adolescents. Studies in Education, University of Iowa Studies, (9), 65-102.
Lewis, Clayton (1982). Using the "Thinking-aloud" method in cognitive interface design. IBM research report, (RC 9265), Yorktown Heights, NY, IBM Thomas J. Watson Research Center.
Lueke, E., Pagery, P. D., and Brown, C. R. (1987). User requirement gathering through verbal protocol analysis. Cognitive Engineering in the Design of Human-Computer Interaction and Expert Systems, (ed.) G. Salvendy. Amsterdam: Elsevier Science.
Mackay, W. E. (1989). EVA: An experimental video annotator for symbolic analysis of video data. SIGCHI Bulletin, (21:2), 68-71.
Newell, A., and Simon, H. A. (1973). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Nielsen, J. (1992). Evaluating the Thinking Aloud Technique for use by Computer Scientists. Advances in Human-Computer Interaction, Volume 3, (eds.) H. R. Hartson and D. Hix. Norwood, NJ: Ablex Press, 75-88.
Nisbett, R. E., and Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, (84), 231-259.
Sanderson, P. M., James, J. M., and Seidler, K. S. (1989). SHAPA: An interactive software environment for protocol analysis. Urbana-Champaign, IL: Engineering Psychology Research Laboratory Technical Report # 89-09.
Slator, Anderson, and Conley (1986). Pygmalion at the Interface. CACM, July, 599-604.
Smith, J. B., Smith, D. K., and Kupstas, E., (1991), Automated Protocol Analysis; Tools and Methodology, (TR91-034), Chapel Hill, NC: UNC Department of Computer Science.
Trigg, R. H. (1989). Computer Support for transcribing recorded activity. SIGCHI Bulletin, (21:2), 72-74.
Walker, J. Q., II (1991). Automated analysis of computer-generated software usage protocols: An exploratory study, (TR91-052). Chapel Hill, NC: UNC Department of Computer Science.
Waterman, D. A., and Newell, A. (1971). Protocol Analysis as a task for artificial intelligence. Artificial Intelligence, (2), 285-318.
Waterman, D. A., and Newell, A. (1973). PAS-II: An interactive task-free version of an automatic protocol analysis system. Pittsburgh, PA: Carnegie-Mellon University Department of Computer Science Technical Report.