Open Access

Modeling and Supporting Web-Navigation

Journal of Interaction Science 2015, 3:3

DOI: 10.1186/s40166-015-0008-9

Received: 19 November 2014

Accepted: 5 July 2015

Published: 29 July 2015


Navigation within a website is an important factor for the success of a website. Faster and easier web-navigation leads to better usability and reduces the cognitive load on the user. Several cognitive models exist that simulate the web-navigation process. In this paper we propose a new cognitive model, CoLiDeS++Pic (based on the Comprehension-based Linked model of Deliberate Search, or CoLiDeS), that incorporates path adequacy and backtracking strategies. This model also takes into consideration the semantics of pictures. Firstly, we present the results of an experiment in which we test the efficacy of support based on the new model CoLiDeS++Pic and multi-tasking under cognitively demanding situations. The results show that the model-generated support is effective. Secondly, we propose that in this way navigation behavior can be modeled better than with previous models. We verify this hypothesis by simulating the model on a mock-up website and comparing the results with a previous model, CoLiDeS+. Extending our previous work, we demonstrate that the performance of the new model CoLiDeS++Pic is improved compared to the preceding model CoLiDeS+. We further discuss the challenges and advantages of automating navigation support using the proposed model.


Web-navigation; Hyperlink; Information scent; Navigation support; Cognitive models


In today’s world, any information we seek can be searched and possibly found on the internet. In fact the amount of information on the internet is so abundant that users are known to often face what is called infobesity or information overload. It is therefore of great importance to be able to sift and navigate through millions of pages by discarding the irrelevant ones and focusing only on the relevant ones to quickly find what we are looking for. However, this is cognitively challenging for most users and research has shown that users often get lost and disoriented in this process. Here cognitive models come into the picture: they can simulate user’s navigation through web pages, can provide support by identifying relevant pages and can rectify usability problems such as poor navigation architecture of a website [1, 2]. These models take into account all the cognitive processes underlying information seeking behavior: perception, comprehension, reasoning, decision making and problem solving. The first aim of this article is to describe our own modeling efforts of web-navigation. Several models exist, some of which we first review in the next section. After that we will describe our own modeling based on CoLiDeS or Comprehension-based Linked Model of Deliberate Search developed by Kitajima et al. [11]. The second aim of this article is to empirically investigate the usefulness of automated model-generated support using the cognitive model. This article mainly differs from our related proceedings paper [24] through emphasizing the modeling part and comparing the performance of different versions of the same model, i.e. a comparison of the CoLiDeS+ and CoLiDeS++Pic model.

Cognitive models of web-navigation

Before getting into details of models of human navigation behavior with computers, we first briefly look at two theories on which most of them are based. We first look at the Construction-Integration Theory of text comprehension proposed by [3]. When we read a piece of text, several other concepts not mentioned in the text and their vivid memories appear in our mind, in working memory. Construction-Integration Theory attempts to account for these cognitive processes when reading and comprehending a piece of text. As the name suggests, it consists of two phases: knowledge construction phase and knowledge integration phase. During construction phase, instructions in the form of text are encoded and elaborated with associated or closely related and alternative meanings. Knowledge to decide between the alternatives is also inferred and generated. During the integration phase, one representation from the many alternatives and activated concepts is determined and stored in memory. Just like in text comprehension, during the process of web-navigation as well, a user has to comprehend and understand not only the hyperlink texts (link labels) visible on the current page, but also to integrate and relate these to the hyperlinks chosen during the session before. Thus, the construction and integration cycles are highly relevant to the process of web-navigation.

The second theory is called Information Foraging Theory (IFT) proposed by [4], which tries to answer questions such as which hyperlink to choose, how much time to spend on a particular page and on what basis these decisions are made. This approach is motivated by a theory called optimal foraging theory (OFT) which explains the foraging behavior exhibited by animals when looking for food. It is known that animals follow their sense of smell to find their way searching for food. IFT postulates that human beings behave exactly the same way when they navigate on the web. Analogous to smell in OFT, IFT introduced the concept of information scent which is the (imperfect) estimate of the value or cost of information sources represented by proximal cues (such as hyperlinks and icons). At any point of time, users would assess the net information gain they would get by accessing particular information compared to the cost of processing that information and follow that path that maximizes its information scent. We are now ready to describe a few cognitive models of web-navigation.

Linked model of comprehension-based action planning and instruction taking (LICAI)

Kitajima and Polson introduced a model called Linked Model of Comprehension-based Action Planning and Instruction Taking (LICAI) that simulates learning by exploration [5, 6]. It is based on Construction-Integration theory of text comprehension, mentioned above, proposed by Kintsch [3]. LICAI takes the example of an expert Mac user working with a graphical task on an Excel application to simulate user interactions. LICAI also defines the cognitive processes involved in the generation of goals from a set of instructions by proposing several modified versions of construction-integration cycles.

Method for evaluating site architectures (MESA)

Miller and Remington proposed a Method for Evaluating Site Architectures (MESA) [7]. This model focuses on the quality of link labels and the effectiveness of various link selection strategies. User behavior is modeled by varying the link quality and using links that are not fully descriptive of the target goals. In this way the model captures the situation in which the user is not sure of his/her goal or is not knowledgeable enough to assess the relevance of the link texts to the goals. It uses three main cognitive principles: the limited capacity principle, that is, human memory has limited capacity and therefore the model focuses on only one link at a time; the simplicity principle, that is, the model favors simple approximations to complex features that add little value; and the rationality principle, that is, the model assumes that users are rational and always choose the best strategy. The model, however, does not give an account of how the link relevancies are assessed, but instead focuses on the effectiveness of selecting various links given their relevance to the search goal. The model also assumes that the structure of the website is known beforehand.

Scent-based navigation and information foraging in ACT architecture (SNIF-ACT)

SNIF-ACT (Scent-based Navigation and Information Foraging in ACT Architecture) was developed by Pirolli and Fu to predict navigational choices and simulate user behavior as they perform unfamiliar information retrieval tasks on the web [8]. Actions such as which hyperlink to click, where to go next, when to leave the website are decided based on the measure of information scent. SNIF-ACT 1.0 assumed that users assess all the hyperlinks on a web-page before making a choice. However, several studies showed that user choices are sensitive to the location of hyperlinks on the web-page. SNIF-ACT 2.0 was later introduced by Fu and Pirolli [9] incorporating mechanisms from Bayesian Satisficing Model [10]. It combines the measure of information scent, the position of hyperlinks on a search result page and the number of hyperlinks evaluated so far into a satisficing process that determines whether to continue to evaluate more hyperlinks or to click on the best hyperlink found so far.

Comprehension-based linked model of deliberate search (CoLiDeS)

CoLiDeS, or Comprehension-based Linked Model of Deliberate Search, developed by Kitajima et al. [11], divides user navigation behavior into four stages of cognitive processing: parsing the webpage into high-level schematic regions, focusing on one of those schematic regions, elaboration/comprehension of screen objects (e.g. hypertext links) within that region, and evaluating and selecting the most appropriate screen object (e.g. hypertext link) in that region by determining its semantic similarity to the goal. Figure 1 shows a schematic representation of CoLiDeS with an example. In the example, the user goal is “I want to know at least three regions in the human body where lymph nodes are present”. The user is on the “Circulatory System” page. The user first parses the web page into high-level schematic regions such as the logo, the left navigation column, the main content and a picture. The navigation menu attracts attention and the user focuses on that region. Next, this region is elaborated and comprehended, that is, using the words in the hyperlink text, related terms from long-term memory are activated. Finally, one of the hyperlinks is chosen to click on. Based on Information Foraging Theory, the model postulates that this hyperlink will be the one with the highest semantic similarity with the goal.
Fig. 1

Schematic diagram of processes involved in CoLiDeS

For the specific goal that we took as an example, the user has to first select “Circulatory System” and then “Lymphatic System”. This process is repeated for every new page until the user reaches the target destination. The CoLiDeS model uses Latent Semantic Analysis (LSA), introduced by Landauer et al., to determine the information scent between the user goal and the content of hyperlinks on a given web page [12]. The information scent, or semantic similarity, between the goal and the hyperlink text is based on the overlap in meaning between the two [12]. LSA is an unsupervised machine learning technique that builds a high-dimensional semantic space from a large corpus of documents that represents a given user population’s knowledge and understanding of words. The meaning of a word or sentence is represented as a vector in that high-dimensional space. The degree of similarity between a link and the goal of the reader is measured by the cosine (correlation) between the corresponding vectors [2, 12]. Each cosine value lies between −1 and +1. The closer the value is to +1, the higher the similarity between the two texts. Values near zero indicate two unrelated texts. The relatedness determined by LSA is not simply based on literal similarity but rather on deeper semantic similarity. One main advantage of LSA is that the process of determining information scent can be automated.
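The cosine measure described above can be illustrated with a minimal sketch. The function name `cosine_similarity` and the toy vectors are assumptions made for illustration only; real LSA vectors come from a trained semantic space and have several hundred dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors (hypothetical, not taken from an LSA space).
goal_vector = [1.0, 0.0]
link_same_direction = [2.0, 0.0]   # same meaning direction as the goal
link_unrelated = [0.0, 3.0]        # orthogonal to the goal

similar = cosine_similarity(goal_vector, link_same_direction)
unrelated = cosine_similarity(goal_vector, link_unrelated)
```

Because the cosine depends only on direction, scaling a vector does not change its similarity to the goal.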

The CoLiDeS model has been successful in simulating and predicting user link selections, though the websites and web pages used were very restricted [13]. The model has also been successfully applied to finding usability problems, by predicting links that would be unclear to users [1, 2]. Note that all the four models discussed so far, LICAI, MESA, SNIF-ACT and its variants and CoLiDeS do not include any information from pictures when modeling navigation behavior. We will next describe some of the recent work addressing this limitation.

Related work: CoLiDeS + Pic

It is important to note that CoLiDeS relies only on textual information from hyperlinks. Since pictures are an essential component of almost all websites it would be useful to extend the model so that it also takes into consideration semantic information from pictures. This is done in the CoLiDeS + Pic model, using the basic CoLiDeS architecture. A central assumption in this model is that the processing of pictures happens at the initial stage of processing a web page. Semantic information from pictures can be readily available and in this way influence the link selection process (positively or negatively depending on the relevance of the pictures). In the example goal above, the model assumes that semantic features from the body picture are available for the user already at the first stages of processing and used during estimating the LSA of the most appropriate hyperlink. It is important to note here that the semantic features for a specific picture are obtained in a pre-phase of the research by a semantic feature generation task. In this task participants (non-expert users: university students) generate semantic features based on their judgment of the picture (in context) and then the features are selected, which are common between participants. For a detailed description of the procedure for obtaining the semantic information from pictures, as embedded in CoLiDeS + Pic modeling, we ask the readers to check [14, 15]. We will describe briefly previous work focused on getting empirical support in different ways for the CoLiDeS + Pic model, because this work will be the basis for the next model and the following empirical study and comparison of models.

The moment of processing pictures on a web page

First, we examined empirically the moment of processing of the pictures, testing the assumption of the CoLiDeS + Pic model that pictures are processed in the initial stages of navigation through a webpage [16]. In an eye-tracking experiment we recorded the eye fixation durations of participants (mainly technology students) on relevant and irrelevant pictures that were included on the pages of the presented website. The website contained biological information and the pictures were either of high or of low relevance to the content of a page. We found that pictures were processed mainly in the first 10% of the total time spent on a web-page during the first visit of participants to that page. These results confirm the assumption of CoLiDeS + Pic that pictures are processed particularly in the initial stages of processing a web-page.

Efficacy of the CoLiDeS + Pic model

Second, in a simulation study we analyzed the behavior of the model when highly relevant pictures, low-relevance pictures or no pictures were presented in the context of search goals. In a pre-phase, semantic features from the pictures were collected and used for computing the LSA values between search goals and hyperlinks. We found that CoLiDeS + Pic with highly relevant pictures increases the information scent of task-relevant hyperlinks, and therefore increases the probability of selecting those hyperlinks, compared to CoLiDeS without pictures or CoLiDeS + Pic with low-relevance pictures. Furthermore, the results showed that CoLiDeS + Pic with highly relevant pictures chooses the hyperlinks on the shortest path to the search goals more often, and that the model also finds the shortest path more frequently. An important next question is how accurate the predictions made by CoLiDeS + Pic are compared to the behavior of actual readers. This question was investigated in the next validation study.

Validation of CoLiDeS + Pic model with behavioral data

Third, in a behavioral study we varied the relevance of pictures on web pages, studied the impact of this variation on the navigation pattern of real participants, and compared that with the behavior of the model, that is, the links the model chose. Most importantly, in the highly relevant picture condition, CoLiDeS + Pic predicted significantly more actual user clicks. This supports the second assumption made by CoLiDeS + Pic, that pictures can influence the link selection process positively or negatively depending on their relevance.

Usefulness of automatic model-generated support

Fourth, in a final experiment we used the CoLiDeS + Pic model to automatically provide navigation support [17]. The navigation support offered was based on simulation of successful paths, that is, the links chosen by the model and leading to the requested information were subsequently emphasized to the reader by highlighting the links in a contrasting green color. Compared to a control condition (without support), model-generated support had a very positive influence on navigation and search performance: participants were significantly faster and more accurate in answering questions and were less disoriented.

Summarizing the main results of our work on CoLiDeS + Pic, incorporating semantic information from highly relevant pictures improves the performance of the CoLiDeS model. Also, when we compared the model-predicted clicks with actual user clicks, we found that CoLiDeS + Pic with highly relevant pictures predicts user behavior better. Furthermore, we found that the model-generated support helped to improve the performance of users in terms of accuracy, disorientation, and the total time needed to perform the search tasks.

Implementation of a new model: CoLiDeS++Pic

However, several amendments to the CoLiDeS + Pic model are possible. Firstly, we can assume that users base their link selections not only on the goal-relevance of incoming information on a web page but also on whether a candidate selection is consistent with past selections. Past selections include not only information from hyperlinks on which the user clicked but also from the pictures that those pages contained. Secondly, CoLiDeS + Pic only models the ideal situation of forward linear navigation. Backtracking steps are considered erratic actions. However, backtracking seems to be natural in web-navigation [18]. Therefore both need to be modeled: consistency with previous steps as well as backtracking. Consistency with past information and backtracking were partially implemented in CoLiDeS+ [19], another extension of CoLiDeS. CoLiDeS+ introduces a new parameter called path adequacy, which is the semantic similarity between the navigation path so far and the goal statement. The navigation path is derived by concatenating the text of all the hyperlinks selected by a user up to a certain moment during the navigation session. When information scent does not continue to increase, CoLiDeS+ checks path adequacy. If path adequacy continues to increase, then there is no backtracking. However, if path adequacy fails to increase, then CoLiDeS+ triggers backtracking strategies. Other hyperlinks with lower information scent that could increase path adequacy are then considered. If there is no such hyperlink, the model opts for going back to earlier pages. However, as mentioned already, CoLiDeS+ considers only information from hyperlink text for backtracking and ignores information from pictures. In the next section, we try to address this gap by including semantic information from pictures seen on earlier pages in the computation of path adequacy and using that to implement backtracking strategies. The flow chart in Fig. 2 shows the work-flow of this model. To compute the LSA values we used the semantic space ‘tasaALL’ provided by the LSA website (http: This space is meant to represent the knowledge and vocabulary levels of first year university students.
Fig. 2

Working of CoLiDeS++Pic
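The path-adequacy check and the backtracking rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `similarity` callable (a stand-in for an LSA cosine), and the way picture features are appended to the path text are all hypothetical.

```python
def path_adequacy(goal, selected_links, picture_features, similarity):
    """Similarity between the navigation path so far and the goal.

    The path text concatenates the selected link labels; appending the
    picture features seen on earlier pages is an assumption about how
    picture semantics could enter the computation.
    """
    path_text = " ".join(list(selected_links) + list(picture_features))
    return similarity(path_text, goal)


def choose_step(scent_increased, adequacy_before, adequacy_after, alternatives):
    """Decide the next action after a tentative link selection.

    alternatives maps each other link on the page to the path adequacy
    that would result from selecting it instead (hypothetical input).
    """
    # No check is needed while information scent keeps increasing, and
    # no backtracking happens while path adequacy keeps increasing.
    if scent_increased or adequacy_after > adequacy_before:
        return ("continue", None)
    # Backtracking strategy 1: another link that would raise path adequacy.
    improving = {link: a for link, a in alternatives.items() if a > adequacy_before}
    if improving:
        return ("try_alternative", max(improving, key=improving.get))
    # Backtracking strategy 2: return to an earlier page.
    return ("go_back", None)
```

Passing `similarity` as a parameter keeps the decision logic independent of the semantic space used to compute the values.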

The new cognitive model CoLiDeS++Pic is derived from CoLiDeS + Pic, but as indicated we added the concept of path adequacy and included the strategy of backtracking. We termed this combination as CoLiDeS++Pic. The new model CoLiDeS++Pic also uses the semantic information from pictures, just like CoLiDeS + Pic. We were interested in testing the impact of including semantic information from pictures in computation of path adequacy and inclusion of backtracking on modeling user navigation behavior. To test this, we ran both CoLiDeS + Pic and CoLiDeS++Pic on a mock-up website and used their predictions to generate support. The results of the simulations will be described after we have introduced the experiment on model-generated support.

Current experiment on model-generated support

The second aim of our research, as indicated in the introduction, is to investigate the usefulness of automated model-generated support using CoLiDeS++Pic. There have been earlier efforts in this direction; we cite the two studies that are most relevant to our experiment. The first [19] used CoLiDeS+ to predict the correct hyperlinks and provided auditory cues as support. However, participants found the auditory cues annoying and not useful. Later, [17] used CoLiDeS + Pic to generate the correct hyperlinks and used visual highlighting to provide support. Participants in that study found the support very useful and showed significant improvements in navigation performance. While the first study did not include any semantic information from pictures, the second study did not include information from past selections. In the current experiment, we used the CoLiDeS++Pic model, which incorporates both aspects, to automatically generate navigation support. The basic idea is that navigation support generated by a model that incorporates information from past selections as well as information from pictures can be even more helpful. It is possible to determine at each step in the simulation of the model what the successful path is (minus detours). The hypothesis is that indicating this path to users would facilitate navigation. The navigation support offered was based on simulation of successful paths. That is, for each goal, the model was run and the links chosen by the model leading to the requested information were recorded. We used the same methodology as [17] to provide support: the model-predicted hyperlinks were emphasized to the reader in a contrasting green color. For example, take the goal (or task): “Lymphatic System contains immune cells called lymphocytes, which protect our body from antigens. They are produced by lymph nodes. Name at least three locations in the body where lymph nodes are present”. 
On the home page, the user sees the following four hyperlinks: respiratory system, nervous system, digestive system and circulatory system. The semantic similarity values obtained for these hyperlinks in relation to the goal are 0.251, 0.251, 0.27 and 0.273 respectively. Thus, CoLiDeS++Pic predicts circulatory system as the correct hyperlink. On the next level, the user sees two hyperlinks: cardiovascular system and lymphatic system. Their semantic similarities with the goal are 0.238 and 0.242 respectively. CoLiDeS++Pic now predicts lymphatic system as the correct hyperlink. Therefore, for this goal, two hyperlinks are highlighted with a green color arrow pointing towards ‘Circulatory System’ at the first level and then towards ‘Lymphatic System’ at the second level as shown in Fig. 3. Similarly, support for all other goals is generated.
Fig. 3

Mock-up website with suggested links as green-colored arrows
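The link predictions in the lymph-node example can be reproduced with a short sketch that picks, at each level, the hyperlink with the highest semantic similarity to the goal. The similarity values are those reported above; the function name `pick_link` is hypothetical, and the sketch covers only the scent-maximizing step, not path adequacy, picture features or backtracking.

```python
def pick_link(scent_by_link):
    """Return the link label with the highest similarity to the goal."""
    return max(scent_by_link, key=scent_by_link.get)


# Semantic similarity of each hyperlink to the lymph-node goal (from the text).
home_page = {
    "respiratory system": 0.251,
    "nervous system": 0.251,
    "digestive system": 0.27,
    "circulatory system": 0.273,
}
second_level = {
    "cardiovascular system": 0.238,
    "lymphatic system": 0.242,
}

# Links that would be marked with green arrows for this goal.
highlighted_path = [pick_link(home_page), pick_link(second_level)]
```

Note how small the margins are (0.273 against 0.27, and 0.242 against 0.238), so the prediction for this goal rests on narrow differences in similarity.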



Compared to the control condition (without model-generated support), we expect the support condition (with model-generated support) to have a positive influence on navigation and search performance. We hypothesize participants to be significantly faster, more accurate in answering questions, and less disoriented in navigation behavior in the support condition.

In order to examine our hypothesis regarding the influence of the variable (no-)support we performed the following experiment in which participants received support (or no support). Furthermore we introduced a second variable, multi-tasking, because we assume that model-generated support can be particularly helpful when the context demands multi-tasking [20]. More specifically, we hypothesize that when the user is provided with support, he can perform the navigation and information search task better and faster, particularly when doing multi-tasking compared to single-tasking. In the case of multi-tasking one could hypothesize that due to restricted cognitive capacity of users, offering navigation support would be more helpful compared to the situation that no extreme cognitive load is present [21, 22]. In our experiment, while users were working with the information retrieval tasks, they had also to monitor a comedy video in parallel. By introducing a secondary task during navigation, we assume that cognitive load will be higher, participants will be distracted and performance on their main task (navigation and search) will be hindered.

Design and subjects

We followed a 2 (support vs no support) x 2 (multi-tasking vs no multi-tasking) factorial design, where the two factors were between-subject variables. Forty students of the International Institute of Information Technology-Hyderabad, 34 males and 6 females (age M = 27.14, SD = 6.75), participated in the experiment on a voluntary basis. The participants were randomly assigned to the four groups: Group NS-NMT (no support, no multi-tasking), Group NS-MT (no support, multi-tasking), Group S-NMT (support, no multi-tasking) and Group S-MT (support, multi-tasking). Each group had ten participants.


In order to conduct the experiment, we took the same mockup website as in [14] due to the availability of semantic features for pictures present in the website. Figure 3 presents a screenshot of one of the pages in this website. On all pages the navigation menu was presented on the left. The website was hierarchically organized, four levels deep, containing in total 34 pages. We created two versions of the website. In the first version, hints for the links to be selected were not given, while in the second version, there were green-colored arrows pointing to the link where users can find the answer for the given task. This support was automatically obtained by running the navigation support tool at the back-end. Figure 3 shows the version of the mockup website with the green arrow. Thus, group NS-NMT and group NS-MT were given the website without link suggestions, while group S-NMT and group S-MT were given the same website with link suggestions. There were in total eight search questions for which the participants were supposed to find the answers. These questions were equally divided into four levels based on the number of web pages to be browsed.

The level 1 tasks require only one web page to be visited. Similarly level 2, level 3, level 4 tasks require 2, 3, 4 web pages respectively. Table 1 lists all the information retrieval tasks used in this experiment.
Table 1

Information retrieval tasks (Human body website)

Level 1

The muscles of the esophagus contract in waves to move the food down into the stomach. What name is given to these contractions?

Which center in the brain controls the automatic process of breathing?

Level 2

Name at least three chemicals that aid in transmission of information from one neuron to the other in our nervous system.

Lymphatic System contains immune cells called lymphocytes, which protect our body from antigens. They are produced by lymph nodes. Name at least three locations in the body where lymph nodes are present.

Level 3

In the respiratory system, what name is given to the valve that drops down when we swallow in order to protect our lungs and trachea?

Name the three layers of tissue that form the heart wall.

Level 4

If a blood sample contains A-antigens and anti-B antibodies, what name is given to this according to ABO system?

What specific name is given to those motor neurons that act on the muscles of the face and the neck?


We measured the performance of the participants in terms of accuracy, disorientation and task-completion time.

Accuracy was scored 1 if the participant answered the specific question correctly and 0 if not. It was scored 0.5 if the participant reached the correct page at the end but the answer was not correct. Mean accuracy over all eight tasks was computed (score range 0–1).
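The scoring rule above can be written as a small sketch. The function and parameter names are hypothetical, and the eight scores in the example are made up for illustration; they are not data from the experiment.

```python
def accuracy_score(answer_correct, reached_target_page):
    """Score one task: 1 for a correct answer, 0.5 for reaching the
    correct page with an incorrect answer, 0 otherwise."""
    if answer_correct:
        return 1.0
    return 0.5 if reached_target_page else 0.0


def mean_accuracy(scores):
    """Mean accuracy of one participant over all tasks (range 0-1)."""
    return sum(scores) / len(scores)


# Illustrative scores for one participant over eight tasks (hypothetical).
example_scores = [1.0, 1.0, 0.5, 0.0, 1.0, 1.0, 1.0, 0.5]
example_mean = mean_accuracy(example_scores)
```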

Disorientation was measured based on the ratio of visited and optimal node counts [23]. The formula for calculating disorientation is as follows:
$$ L=\sqrt{{\left(N/S-1\right)}^2+{\left(R/N-1\right)}^2} $$

R is the minimum number of pages that need to be visited in order to finish the task,

S is the actual number of pages visited,

N is the number of distinct pages visited and

L is the disorientation. A higher value of L means less goal-directed navigation behavior with many detours.
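As a worked illustration of the disorientation formula, the sketch below computes L from R, S and N as defined above. The function name and the example counts are assumptions chosen for illustration, not values from the experiment.

```python
import math


def disorientation(R, S, N):
    """Disorientation L from the page counts of one task.

    R: minimum number of pages that need to be visited
    S: actual number of pages visited (revisits included)
    N: number of distinct pages visited
    """
    if S <= 0 or N <= 0:
        raise ValueError("S and N must be positive")
    return math.sqrt((N / S - 1) ** 2 + (R / N - 1) ** 2)


# A perfectly direct path: every visit is new and no page is unnecessary.
direct = disorientation(R=2, S=2, N=2)

# A path with detours: 6 visits to 4 distinct pages for a 2-page task.
with_detours = disorientation(R=2, S=6, N=4)
```

The first term grows with revisits (N smaller than S) and the second with visits to unnecessary pages (R smaller than N), so a direct path gives L = 0.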

Task-completion Time was the time taken by participants to complete each task. It is based on the interval between presenting the question and submitting the answer.


Participants in groups S-NMT and S-MT were instructed that the link suggestions were only advice, and that they were free to ignore it and click on a hyperlink of their choice. Participants in groups NS-NMT and S-NMT were asked to perform the search tasks of finding answers to questions with no secondary task in parallel. The other two groups (NS-MT and S-MT) were asked to perform a secondary task (monitoring a comedy video) while they were completing the main tasks. These participants were informed that at the end of the session they would be asked questions about the video. At the end, participants were asked to name the characters present in that video. This was done to ensure that participants would also pay attention to the video and thus really multi-task.



We did a 2 × 2 between-subjects ANOVA with support and multi-tasking as independent variables and accuracy as dependent variable. We found a significant main effect of support, F(1, 36) = 6.40, p < .05. Participants were significantly more accurate in the support condition compared to the no-support condition. The main effect of multi-tasking was not significant (p > .05). The interaction of support and multi-tasking was also not significant (p > .05). Figure 4 shows the means of accuracy in all four groups. Accuracy did not differ significantly between participants who were multi-tasking and those who were not. Also, there was no significant additional advantage of support for the participants in the multi-tasking group.
Fig. 4

Mean accuracy in relation to support and multi-tasking
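For readers who want to see where F values with (1, 36) degrees of freedom come from, the following is a minimal sketch of a balanced 2 × 2 between-subjects ANOVA. It assumes equal group sizes, as in this design; the function name is hypothetical and the scores in the example are synthetic, not the data of this experiment.

```python
from statistics import mean


def anova_2x2(cells):
    """F ratios for a balanced 2 x 2 between-subjects design.

    cells[(i, j)] is the list of scores for level i of factor A
    (e.g. support) and level j of factor B (e.g. multi-tasking).
    """
    n = len(cells[(0, 0)])
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("all four cells must have the same size")
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    grand = mean(list(cell_mean.values()))
    a_mean = [mean([cell_mean[(i, 0)], cell_mean[(i, 1)]]) for i in (0, 1)]
    b_mean = [mean([cell_mean[(0, j)], cell_mean[(1, j)]]) for j in (0, 1)]
    # Sums of squares for the two main effects and the interaction.
    ss_a = 2 * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = 2 * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum(
        (cell_mean[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in (0, 1) for j in (0, 1)
    )
    # Within-cell (error) sum of squares and its degrees of freedom.
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )
    df_within = 4 * (n - 1)
    ms_within = ss_within / df_within
    return {
        "F_A": ss_a / ms_within,
        "F_B": ss_b / ms_within,
        "F_AB": ss_ab / ms_within,
        "df": (1, df_within),
    }


# Synthetic example: factor A shifts the scores, factor B does not.
synthetic = {
    (0, 0): [1, 2, 3], (0, 1): [1, 2, 3],
    (1, 0): [3, 4, 5], (1, 1): [3, 4, 5],
}
result = anova_2x2(synthetic)
```

With ten participants per cell, the error degrees of freedom are 4 × (10 − 1) = 36, which matches the F(1, 36) values reported in this section.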


We did a 2 × 2 between-subjects ANOVA with support and multi-tasking as independent variables and disorientation as dependent variable. The main effect of support on disorientation was highly significant, F(1, 36) = 18.99, p < .01. The main effect of multi-tasking was also significant, F(1, 36) = 5.35, p < .05. However, the interaction of support and multi-tasking was not significant (p > .05).

We can see from Fig. 5 that participants were significantly less disoriented in the support condition compared to the no-support condition. The participants deviated less from the correct navigation path when provided with the hints. However, they were significantly more disoriented when there was no multi-tasking and were better off with multi-tasking! To investigate this further, we checked the participants’ performance on the secondary task by summing up the number of characters named correctly. We found that they performed equally well on the secondary task in both conditions (support vs no support). Maybe multi-tasking forced participants to focus better on both tasks. The support was not found to be significantly more beneficial for participants in the multi-tasking condition.
Fig. 5

Mean disorientation in relation to support and multi-tasking

Task-completion time

Lastly, we ran a similar 2 × 2 between-subjects ANOVA with support and multi-tasking as independent variables and task-completion time as the dependent variable (see Fig. 6).
Fig. 6

Mean task-completion time in relation to support and multi-tasking

The main effect of support was significant, F(1, 36) = 16.54, p < .001: participants completed the tasks significantly faster in the support condition than in the no-support condition. The main effect of multi-tasking was not significant (p > .05); task-completion times did not differ significantly between participants who were multi-tasking and those who were not. The interaction of support and multi-tasking was also not significant (p > .05), so the impact of support was not significantly greater under multi-tasking.

Summarizing, we found a significant impact of providing support on all three of our metrics. Participants in the support condition were significantly more accurate, less disoriented and faster than participants without support. We did not find any significant effects of multi-tasking on accuracy or task-completion time, nor did we find that support is particularly helpful for participants who were multi-tasking. Since all our participants were technology students who were very used to the internet and spend a considerable amount of time on the computer each day, we think the multi-tasking manipulation had little influence on their performance. However, we did find one counter-intuitive effect on disorientation: participants were significantly less disoriented when asked to multi-task.
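As a rough illustration of how F ratios of the kind reported above arise, the following is a minimal sketch of a balanced 2 × 2 between-subjects ANOVA in plain Python. The function name `anova_2x2`, the 0/1 coding of the factors and the toy scores are assumptions made for illustration; they are not the experimental data, and the sketch assumes equal group sizes. An error term with 36 degrees of freedom, as in the F(1, 36) values above, is what such a design yields with 40 participants (40 − 4).

```python
from statistics import mean


def anova_2x2(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    `cells` maps (support, multitask) -> list of scores, with each factor
    coded 0/1 and the same number of scores in every cell.
    Returns the F ratios for both main effects and the interaction,
    plus the error degrees of freedom.
    """
    n = len(cells[(0, 0)])
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("sketch assumes equal cell sizes")

    all_scores = [x for scores in cells.values() for x in scores]
    grand = mean(all_scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}

    def marginal_ss(factor):
        # Sum of squares for one factor, from its two marginal means.
        ss = 0.0
        for level in (0, 1):
            pooled = [x for key, scores in cells.items()
                      if key[factor] == level for x in scores]
            ss += len(pooled) * (mean(pooled) - grand) ** 2
        return ss

    ss_a = marginal_ss(0)
    ss_b = marginal_ss(1)
    ss_cells = sum(n * (m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - cell_mean[key]) ** 2
                   for key, scores in cells.items() for x in scores)

    df_error = len(all_scores) - 4
    ms_error = ss_error / df_error
    # Every effect in a 2 x 2 design has 1 degree of freedom.
    return {
        "F_support": ss_a / ms_error,
        "F_multitask": ss_b / ms_error,
        "F_interaction": ss_ab / ms_error,
        "df_error": df_error,
    }


# Toy scores (hypothetical), two participants per cell.
toy = {
    (0, 0): [1, 3], (0, 1): [2, 4],
    (1, 0): [5, 7], (1, 1): [6, 8],
}
result = anova_2x2(toy)
```

Each F ratio is then compared against the F distribution with 1 and `df_error` degrees of freedom to obtain the p value; that lookup is left out of the sketch.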

Comparing model performance of CoLiDeS + Pic and CoLiDeS++Pic

In this section we return to the first aim of this article, which in particular extends our previous conference publication (at WIMS14, [24]): to investigate whether taking into account path adequacy and backtracking strategies, next to the semantics of pictures, leads to better model performance. For that purpose we compare and contrast the two models CoLiDeS + Pic and CoLiDeS++Pic. Does the inclusion of information from past selections and backtracking have any impact on the model’s performance?

First we present an example of the implementation applied to the mock-up website used in the experiment. The user goal is, for instance, “Lymphatic System contains immune cells called lymphocytes, which protect our body from antigens. They are produced by lymph nodes. Name at least three locations in the body where lymph nodes are present”. The correct links are Home: Introduction > Circulatory System > Lymphatic System.

At level 1, one of four given links needs to be selected. For each link, a cosine value is calculated that represents the semantic similarity between the user goal and the hyperlink text together with the semantic features of the picture present on the webpage. The link ‘Circulatory System’, which has the highest LSA value (0.273), is selected. At level 2, the LSA values are calculated again. The highest LSA value obtained at this level is 0.242 (which is < 0.273). As the LSA value is not increasing, path adequacy (PA) is calculated. The PA of the first path (Circulatory System + Cardiovascular System) with respect to the goal is .284, and that of the second path (Circulatory System + Lymphatic System) is .308, so the second one is chosen. Thus we get “Lymphatic System” as the suggested link, which is the correct link for achieving the goal. The LSA value has thus increased from .242 to .308. Table 2 shows the LSA values at different levels for this example.
Table 2

LSA values for the example

Level 1                  LSA Value
Respiratory System
Nervous System
Digestive System
Circulatory System       0.273

Level 2                  LSA Value    Path Adequacy
Cardiovascular System                 .284
Lymphatic System                      .308

Note: the links selected by the system (bold in the original table) are Circulatory System at level 1 and Lymphatic System at level 2
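The selection rule in this example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `select_link` is hypothetical, the LSA and path adequacy values are passed in rather than computed, and the backtracking step of the model is omitted. The text gives only the maximum level-2 LSA value (0.242), so which link carries it and the second level-2 value are placeholders.

```python
def select_link(lsa, previous_lsa=None, path_adequacy=None):
    """Pick the next hyperlink for a goal.

    lsa            -- link text -> cosine with the goal
    previous_lsa   -- LSA value of the link chosen one level up
                      (None at level 1)
    path_adequacy  -- link text -> cosine between the goal and the whole
                      path ending in that link; used only as a fallback
    Returns (chosen link, value that decided the choice, rule used).
    """
    best = max(lsa, key=lsa.get)
    if previous_lsa is None or lsa[best] > previous_lsa:
        return best, lsa[best], "lsa"
    if not path_adequacy:
        raise ValueError("LSA did not increase and no path adequacy given")
    # Information scent is not increasing: decide on path adequacy instead.
    best_path = max(path_adequacy, key=path_adequacy.get)
    return best_path, path_adequacy[best_path], "path adequacy"


# Level 1: value of the selected link as quoted in the text.
level1_value = 0.273

# Level 2: the text gives the maximum (0.242) but not which link it
# belongs to; the assignment and the 0.200 below are placeholders.
level2_lsa = {"Cardiovascular System": 0.242, "Lymphatic System": 0.200}
# Path adequacy values as reported in the text.
level2_pa = {"Cardiovascular System": 0.284, "Lymphatic System": 0.308}

choice = select_link(level2_lsa, previous_lsa=level1_value,
                     path_adequacy=level2_pa)
print(choice)
```

With these inputs the sketch suggests Lymphatic System through the path adequacy rule, which matches the walk-through above.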

Table 3 lists the average similarity (LSA) value that the CoLiDeS + Pic model and the CoLiDeS++Pic model assign to the correct hyperlinks for each of the 8 goals, computed as in the example above. These values give us a measure of the strength of each choice made by the models.
Table 3

Overall (mean) performance of CoLiDeS++Pic and CoLiDeS + Pic in terms of LSA value



CoLiDeS + Pic

1 (level1)



2 (level1)



3 (level2)



4 (level2)



5 (level3)



6 (level3)



7 (level4)



8 (level4)



Overall mean



Note: bold numbers indicate cases where CoLiDeS++Pic has a higher LSA value

We can see from Table 3 that the overall mean LSA value of CoLiDeS++Pic is significantly higher than that of CoLiDeS + Pic (t(df = 7) = 2.59, p < .05). The differences in LSA values between the two models arise mainly when path adequacy comes into play, which is particularly the case at deeper levels.
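A df of 7 for 8 goals is consistent with a paired comparison of the two models across goals. Assuming that is the test behind the statistic above, the sketch below shows how such a paired t value is computed. The function name `paired_t` and the per-goal values are hypothetical; they are not the values of Table 3.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic for two equally long lists of scores.

    Returns (t, df), where t tests whether the mean of the pairwise
    differences (second - first) differs from zero.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of equal length >= 2")
    diffs = [b - a for a, b in zip(first, second)]
    # Standard error of the mean difference (sample standard deviation).
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical per-goal LSA values for 8 goals (not the values of Table 3).
colides_pic = [0.27, 0.31, 0.24, 0.22, 0.20, 0.25, 0.18, 0.21]
colides_pp_pic = [0.27, 0.31, 0.29, 0.26, 0.27, 0.25, 0.26, 0.28]

t_value, df = paired_t(colides_pic, colides_pp_pic)
```

Goals on which the two models agree contribute a difference of zero, so the statistic is driven by the goals where path adequacy changes the value.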

Effects of support on user behavior

Table 4 presents the main results (means) on user behavior from the previous experiment, which used support based on the CoLiDeS + Pic model as in the study of [17] (indicated as Experiment 1), next to the results of our current experiment (indicated as Experiment 2), which used support based on the CoLiDeS++Pic model. Please note that the same tasks, website and metrics were used in both experiments. However, the participants were different and the two studies were two years apart. For this reason we do not statistically test differences between the experiments.
Table 4

Mean behavioral results in Experiments 1 and 2


Experiment 1

Experiment 2

Control condition

Support condition

Control condition

Support condition











Task-completion time (seconds)





We make two observations on Table 4. First, as already mentioned, in the previous study (Experiment 1), which used CoLiDeS + Pic, participants with support were significantly more accurate in solving the tasks, needed fewer clicks and were less disoriented. Second, in the current experiment (Experiment 2) model-generated support likewise had a positive, though weak, effect on accuracy (t(df = 18) = 1.34, p < .10 one-sided), a strong positive effect on disorientation (t(df = 18) = 3.62, p < .01) and a positive effect on the time needed to solve the tasks (t(df = 18) = 2.85, p < .01). These results mean that, just as in the previous study with CoLiDeS + Pic, participants in the support condition of this experiment were faster, less disoriented and, to a weaker degree, more accurate.

Discussion and Conclusion

We presented here a new cognitive model, CoLiDeS++Pic. The model is derived from two models, CoLiDeS+ and CoLiDeS + Pic, and combines the advantages of both [15, 19]. Based on this model we also introduced and tested a support tool for navigation. Given a specific goal, the tool provides navigation hints to the user. It uses the semantic similarity between the user goal and the website hyperlinks, and so does not require any past experience or navigation history to provide hints. Furthermore, it uses the navigation path, a backtracking mechanism and the semantic information available from pictures. When we compared the performance of the new model with that of the CoLiDeS + Pic model, we observed that performance in terms of LSA value improved (Table 3). The new model is able to select the correct hyperlinks with higher information scent (a higher LSA value; with higher confidence, so to speak). Regarding our first aim we can conclude that the modeling is enhanced when backtracking, the navigation path and the semantics of pictures are included in the computational cognitive model. In subsequent analysis of the data we still have to compare the different cognitive models (CoLiDeS+, CoLiDeS + Pic and CoLiDeS++Pic) on how well they predict actual user behavior; we will do that by examining how well the behavior of each model matches user navigation patterns (see [25] for the specific method). These outcomes are needed to decide which model best predicts user navigation behavior, and thereby how much improvement each enhancement provides (backtracking via CoLiDeS+, pictures via CoLiDeS + Pic, and both backtracking and pictures via CoLiDeS++Pic).

One of the main reasons for developing the model is of course to be able to provide more adequate support during navigation, resulting in better information-seeking performance. Summarizing the main results of the current experiment regarding user performance, we found that the model-generated support helped to improve the performance of users in terms of accuracy, disorientation and total time needed to perform the search task (Table 4). Regarding the second aim of this article we can thus conclude that automatic support generated by a cognitive model can be useful for users. Multi-tasking had no negative effects, and even had a positive effect on disorientation. Apparently we did not succeed in creating a secondary task that taxed working memory: even in the no-support conditions, the differences in accuracy and total time between the multi-tasking and no multi-tasking conditions were minimal. The positive effect of multi-tasking on disorientation is in line with a study reported in [26], which suggests that interruption is not always as deleterious to productivity as one might expect, although it creates more stress (which we did not measure). We interpret this pattern as suggesting that the multi-tasking groups may have been more motivated (from the beginning or because of the instruction) to do their search tasks. In subsequent studies a stricter and more taxing secondary task has to be used.

At present, the tool is automated except for the module that extracts semantic features from pictures, which still relies on manually generated semantic features. The semantic aspects can be incorporated using the metadata and keywords of pictures when these are available. Unfortunately, techniques for extracting meaning from pictures are not yet sophisticated enough to do this automatically. State-of-the-art systems are beginning to extract only low-level perceptual features such as color, shape and orientation, and cannot yet extract higher-level semantics. However, we demonstrated in this study that, even with non-expert participants, semantic aspects of pictures can be incorporated into the modeling and that it is possible and useful to do so. For the time being the system can also be used without picture information, and with the materials we used, the model and tool function adequately, as we have shown in our experiments.

Of course the system is preliminary and open to extensions, such as studying the model with web sites containing more than one picture on a page. It is worth noting that we also tested our tool (not reported here) with a few real-time websites (without considering the semantics of pictures), where it worked well and provided useful and usable suggestions to the user. We used a small website here for evaluating the tool, and in all cases (all questions) the model found the answer to the question. This raises the questions of what happens when a set of related websites is used, and what information should be given to the user when the model fails, in the sense that no solution to the search goal is found. In that case one could think of suggesting several alternatives consisting of the best links (more than one). Another possibility is to present two suggestions when information scent values are close together. Several options are possible and need to be examined empirically. Another issue for further research concerns the situation in which the system proposes wrong links that mislead the user. Quite another interesting research area involves the specificity of goals (given to the user or formulated by the user him/herself). Goals (or tasks) can be too general or too specific given the information available in the website(s) [27]. The efficacy of the model will depend on the appropriateness of the goal. This is as it should be: when the model (or user) asks a too general or too specific question, its behavior deteriorates or slows down. The support that can be given is equally dependent on posing the goal at the right level, or asking the right question, so to speak. We are currently working on using real-time websites and testing the model in this environment. Here we will also pay attention to the respective contributions of the different features (navigation path, backtracking strategy and picture information). The issues mentioned will be tackled in subsequent research.
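The idea of presenting more than one suggestion when information scent values are close together could be sketched as follows. This is only one possible design, not something tested in the experiments: the function name `suggest_links`, the `margin` threshold of 0.02 and the cap of two suggestions are assumptions that would need empirical tuning.

```python
def suggest_links(scent, margin=0.02, max_suggestions=2):
    """Return the best link, plus runners-up whose scent is within `margin`.

    scent -- link text -> information scent (e.g. LSA cosine with the goal)
    """
    ranked = sorted(scent.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return []
    top_value = ranked[0][1]
    # Keep every link whose scent is close enough to the best one.
    close = [link for link, value in ranked if top_value - value <= margin]
    return close[:max_suggestions]


# Hypothetical scent values for three links on one page.
print(suggest_links({"Link A": 0.30, "Link B": 0.29, "Link C": 0.10}))
```

When one link clearly dominates, the function returns a single suggestion, so the behavior of the current tool is preserved in the unambiguous case.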

The tool can be helpful to a wide variety of internet users. The experiment reported here shows that when users are provided with the tool’s suggestions (the green arrows), their performance in terms of accuracy, (dis)orientation and time needed to perform the search tasks improves. We also assume that visually impaired people (when the tool is augmented with a text-to-speech facility), people with memory problems and new internet users can benefit from the tool. In the future, we plan to validate the tool support for these use cases. We also plan to study the system with older readers, because we assume that navigation support could be particularly helpful for readers with constrained cognitive abilities, such as limited working memory capacity. Finally, we want to make it possible for readers to indicate and formulate their own goals, instead of the externally given goals used now. We will address these questions in our further studies. Overall, we claim that making use of a cognitive model for navigation support is a useful and promising research area.



We would like to thank Saraschandra Karanam for his critical input and efforts in improving this article. This research is funded by the Netherlands Organization for Scientific Research (project MISSION 464-13-043).

Authors’ Affiliations

Institute of Information and Computing Sciences, Utrecht University
International Institute of Information Technology


  1. Blackmon, MH, Kitajima, M, & Polson, PG. (2005). Tool for accurately predicting website navigation problems, non-problems, problem severity, and effectiveness of repairs (Proc. CHI 2005, pp. 31–40). New York: ACM.
  2. Blackmon, MH, Mandalia, DR, Polson, PG, & Kitajima, M. (2007). Automating usability evaluation: Cognitive walkthrough for the web puts LSA to work on real-world HCI design problems. In T Landauer, D McNamara, S Dennis, & W Kintsch (Eds.), Handbook of latent semantic analysis (pp. 345–375). Mahwah, NJ: L. Erlbaum Associates.
  3. Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York: Cambridge University Press.
  4. Pirolli, P, & Card, S. (1999). Information foraging. Psychological Review, 106(4), 643–675.
  5. Kitajima, M, & Polson, PG. (1995). A comprehension-based model of correct performance and errors in skilled, display-based, human-computer interaction. International Journal of Human-Computer Studies, 43(1), 65–99.
  6. Kitajima, M, & Polson, PG. (1997). A comprehension-based model of exploration. Human-Computer Interaction, 12(4), 345–389.
  7. Miller, CS, & Remington, RW. (2004). Modeling information navigation: Implications for information architecture. Human-Computer Interaction, 19(3), 225–271.
  8. Pirolli, P, & Fu, WT. (2003). SNIF-ACT: A model of information foraging on the world wide web. In Proceedings of the 9th International Conference on User Modeling. Johnstown, Pennsylvania, USA (pp. 45–54). Heidelberg: Springer.
  9. Fu, WT, & Pirolli, P. (2007). SNIF-ACT: A cognitive model of user navigation on the world wide web. Human–Computer Interaction, 22(4), 355–412.
  10. Fu, WT, & Gray, WD. (2006). Suboptimal tradeoffs in information seeking. Cognitive Psychology, 52(3), 195–242.
  11. Kitajima, M, Blackmon, MH, & Polson, PG. (2000). A comprehension-based model of web navigation and its application to web usability analysis. In People and Computers XIV (pp. 357–373). London: Springer.
  12. Landauer, TK, Foltz, PW, & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25, 259–284.
  13. Kitajima, M, Polson, PG, & Blackmon, MH. (2007). CoLiDeS and SNIF-ACT: Complementary models for searching and sensemaking on the web. Human Computer Interaction Consortium (HCIC) 2007 Winter Workshop.
  14. Karanam, S. (2011). A Cognitive model of web-navigation based on semantic information from pictures, PhD Thesis. India: IIIT-Hyderabad.
  15. Van Oostendorp, H, Karanam, S, & Indurkhya, B. (2012). CoLiDeS+ Pic: a cognitive model of web-navigation based on semantic information from pictures. Behaviour & Information Technology, 31(1), 17–30.
  16. Aggarwal, S, & van Oostendorp, H. (2012). When are pictures processed on a web-page? In Intelligent Human Computer Interaction (IHCI), IEEE: 1–6.
  17. Karanam, S, van Oostendorp, H, & Indurkhya, B. (2011). Towards a fully automatic model of web-navigation. In Modern Approaches in Applied Intelligence (pp. 327–337). Berlin Heidelberg: Springer.
  18. Cockburn, A, & McKenzie, B. (2001). What do web users do? An empirical analysis of web use. International Journal of Human-Computer Studies, 54(6), 903–922.
  19. Juvina, I, & van Oostendorp, H. (2008). Modeling semantic and structural knowledge in web navigation. Discourse Processes, 45(4–5), 346–364.
  20. Kahneman, D. (1973). Attention and Effort. Englewood Cliffs, NJ: Prentice Hall.
  21. Logan, GD. (2002). An instance theory of attention and memory. Psychological Review, 109, 376–400.
  22. Logan, GD, & Burkell, J. (1986). Dependence and independence in responding to double stimulation: A comparison of stop, change and dual task paradigms. Journal of Experimental Psychology: Human Perception and Performance, 12, 549–563.
  23. Smith, PA. (1996). Towards a practical measure of hypertext usability. Interacting with Computers, 8(4), 365–381.
  24. Aggarwal, S, van Oostendorp, H, & Indurkhya, B. (2014). Automating Web-Navigation Support by Using a Cognitive Model. In R Akerkar, N Bassiliades, J Davies, & V Ermolayev (Eds.), Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS’14) (pp. 1–6). New York: ACM.
  25. Karanam, S, van Oostendorp, H, & Indurkhya, B. (2012). Evaluating CoLiDeS+ Pic: the role of relevance of pictures in user navigation behaviour. Behaviour & Information Technology, 31(1), 31–40.
  26. Mark, G, Gudith, D, & Klocke, U. (2008). The cost of interrupted work: more speed and stress (Proc. CHI 2008, pp. 107–110). New York: ACM.
  27. Athukorala, K, Oulasvirta, A, Glowacka, D, Vreeken, J, & Jacucci, G. (2014). Narrow or Broad? Estimating Subjective Specificity in Exploratory Search. International Conference on Information and Knowledge Management 2014. Shanghai, China: 819–828.


© van Oostendorp and Aggarwal. 2015

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.