User Interface Evaluation by Two Factor Model and Heuristic Evaluation: A Perspective from Ecommerce Industry of Pakistan

Objective: In today's world, buyers have the luxury of purchasing products through e-commerce applications on their mobile phones or other handheld devices. This study extends the research on user behavior pertaining to the user interface. Methods: The study seeks to understand a wider audience by analyzing three well-known e-commerce applications in the market: kaymu.pk, daraz.pk and yayvo.com. The focus of the research is on the design factors of online store applications and their influence on the buying actions of the consumer. For this purpose, the hygiene-motivation theory was selected. The variables used in this research are Satisfaction and Dissatisfaction (dependent variables) and, as independent variables, Navigation, Information Display and Response Time (hygiene factors) together with Screen Complexity, Visual Appearance and User Empowerment (motivation factors). Three evaluation methods were used: statistical evaluation, heuristic evaluation and cognitive walkthrough. Findings: The statistical findings are used to determine which application satisfies customers the most. Application: Our analysis leads to the conclusion that Daraz has the best interface among the three applications.


Introduction
The e-commerce industry in Pakistan is set for expansion. As internet technology reaches the masses, it allows electronic retailers, also known as e-tailers, faster entry into the market with significantly lower investment costs and better chances of a quicker return on investment. This has created robust growth and competition. How service providers differentiate themselves in the marketplace is important for sustaining their growth and profit margins. Hence, understanding user behavior and evaluating the interface design factors that affect it is crucial for e-tailers today. The technology war has also intensified competition in this market segment. Websites remain the prime focus of all e-tailers in the business, but high smartphone penetration opens a new horizon. Nowadays, smartphone applications are more accessible to people than websites, so user satisfaction becomes the major focus of e-commerce sites. This paper analyzes three online store applications operating in Pakistan.
A previous analysis 1 of e-commerce trends with respect to user behavior in Pakistan by Kaymu.pk reports that an estimated 56% of the website's visitors are new and 44% are returning. The conversion rate of returning visitors (users visiting the website again) is double (specifically 98.5%) that of first-time visitors. The average website session lasts 3 minutes for a new visitor and 5 minutes for a returning visitor. These statistics suggest that, after a first visit, a buyer usually makes up his or her mind to purchase a product on a second or third visit. Returning visitors are usually brand loyal; they spend time browsing products and making purchases 1 .
Our study extends the research on user behavior pertaining to the user interface. We seek to understand a wider audience by analyzing three well-known e-commerce applications in the market: kaymu.pk, daraz.pk and yayvo.com. We focus on the design factors of online store applications and their influence on the buying actions of the consumer.
User satisfaction is the most important metric for understanding user retention on a website. Satisfied users spend more time on a website and tend to return often. Many theories have been proposed that describe satisfaction in terms of various factors 2 . We explore these theories in the literature and base our research on Herzberg's two-factor theory, which classifies attributes into hygiene and motivation factors. According to the theory, "the emotions of satisfaction and dissatisfaction are caused by two different categories of characteristics and lack of dissatisfaction does not mean satisfaction, and lack of satisfaction does not mean dissatisfaction" 3 .
These factors are tested using two statistical techniques, factor analysis and regression analysis. In addition, heuristic evaluation and cognitive walkthrough were carried out to strengthen the research by clarifying the influence of these factors on user behavior. The remainder of the paper is organized as follows. Section 2 reviews the literature on factors of online store design and the evaluation techniques used to measure usability. Section 3 describes the research methodology. Section 4 presents the data analysis. Section 5 presents the research findings and their implications. Section 6 concludes the discussion.

Literature Survey
There is a large body of research on the attributes of websites based on their nature, and different researchers have presented their theories. The two-factor theory (Herzberg, Mausner & Snyderman, 1959) classifies factors into hygiene and motivation. Hygiene factors are utility-preserving and motivation factors are utility-enhancing 2 . The theory originally explained the factors of job satisfaction in the workplace. It has since been used to understand the motivation and hygiene factors induced by internet services. For online stores, hygiene factors capture the website attributes that attract consumers to enter the marketplace, while motivation factors are crucial in determining the conversion rate, that is, how many visitors convert into paying customers. The spectrum of measurement is different for motivation and hygiene factors. Hygiene factors relate to user dissatisfaction, whereas motivation factors correlate with user satisfaction 4 .
The disadvantage of this theory is that classification can be largely based on opinion. Some factors can be classified as either hygiene or motivation depending on individual differences. However, the factors included in our study are supported by the research in 4 , which classifies online store design factors as hygiene or motivation factors with the help of a wide survey.
Another theory that has been widely used to understand user satisfaction is expectation disconfirmation theory. The theory bases satisfaction on three elements: expectation, performance, and disconfirmation. Expectations are developed, and meeting them leads to satisfaction. Satisfaction, in particular, is achieved if performance is above expectations, i.e. positive disconfirmation. Dissatisfaction is the result of performance that is below expectations, i.e. negative disconfirmation. There can also be a situation where performance matches expectation and there is no impact on satisfaction (zero disconfirmation) 2 .
The work in 6 proposes a model that classifies attributes into three groups. The first group, called 'dissatisfiers', consists of basic requirements that lead to dissatisfaction if not fulfilled. The second group enhances satisfaction if performance is good but does not lead to dissatisfaction if it is not met. It can be based on gentle surprises that cause delight. The third group has attributes that lead to satisfaction if performance is good and to dissatisfaction otherwise.
There are more theories explaining user satisfaction based on attributes. One of them, by 6 , states that negative effects have a greater impact than positive effects or pleasant conditions. All these theories have been used to evaluate user satisfaction with website interfaces. We proceed with Herzberg's two-factor theory because of its simplicity and wide use in the literature.
Vol 12 (22) | June 2019 | www.indjst.org Syed Asim Ali, Afshan Ejaz, Hira Anwar Khan and Sania Siddiqui
The research is strengthened with three evaluation techniques: regression analysis, cognitive walkthrough and heuristic evaluation. In a heuristic evaluation, UI specialists draw on their experience to study the interface and look for usability properties that can be improved in order to avoid problems or inefficiency. Heuristic evaluation is low cost and time-saving. It has been found to identify many more problems than any other method 7 . We have adopted a set of heuristics that suits e-commerce needs. The cognitive walkthrough, in turn, is a usability evaluation method in which one or more evaluators work through a series of tasks and ask a set of questions from the perspective of the user. The focus of the cognitive walkthrough is on understanding the system's learnability for new or infrequent users. In addition, the three models above were tested on a 5-point Likert scale, converged by factor analysis and evaluated by regression analysis. A scatter plot of mean scores was used to identify the most highly rated application and to analyze the zones of intolerance, efficiency and satisfaction using a model from 3 .
Cognitive walkthroughs help in understanding the core tasks from the user's perspective. The actions and feedback of the interface are compared with the user's goals and knowledge, and differences between user understanding and interface implications are noted 7 . Evaluation in this study uses three different techniques in order to overcome the drawbacks of each individual technique and to gain a more profound understanding of user behavior and application attributes.

Methodology
Three different approaches were followed in this research. Firstly, a statistical approach was used to test for significant relationships between satisfaction and motivation factors and between dissatisfaction and hygiene factors, and also to rate the three applications on a graph to check which zone they belong to. A quantitative research methodology was used to collect data. Data was collected indirectly through a survey questionnaire. Convenience sampling was employed because of limited resources. A total of 96 out of 100 responses were received. The questionnaire was distributed in person (hard copy) and online (Google Forms). The respondents were asked to download the applications (Daraz, Kaymu, Yayvo) if they had not used them before and to review them before filling in the questionnaire. Millennials, representing people between the ages of 18 and 34, were the target respondents for this research. The respondents include students, industry professionals, teachers, and experts. The whole survey was conducted over around 5 weeks.
Secondly, heuristic evaluation was used to test the overall usability of the applications; the respondents for this evaluation were usability experts. This helps to find the detailed and hidden flaws of an application.
Finally, the cognitive walkthrough tests the process of completing a task. Tasks are designed by the surveyor to check the overall flow of the application and how the user feels about the process and the end goal. This test is intended to explain the problems encountered in completing a task or transaction and to highlight the most significant issues in the applications. Novice and infrequent users were the respondents for this test. The tools and techniques used for all three methods are described below.

Statistical Evaluation
Controlled, psychologically oriented experiment techniques were used to collect data from respondents. Scientific methods and techniques were applied to evaluate aspects of human-computer interaction. In our case, a problem of scarce literature was identified on e-commerce application user interface design.
Considering the theoretical framework, clear and testable hypotheses were designed. The hypotheses explained in the sections below were tested with Satisfaction and Dissatisfaction as dependent variables and Screen Complexity (SC), Visual Appearance (VA), User Empowerment (UE), Navigation (N), Information Display (ID), Ease of Learning (L) and Response Time (RS) as independent variables. Factor analysis was used to group the questionnaire items into their underlying factors. Principal component analysis and the varimax rotation method were used for data reduction. Two factors (User Empowerment and Ease of Learning) were eliminated by factor analysis because of high variation in responses. Lastly, regression analysis was used to explain Satisfaction and Dissatisfaction with respect to motivation and hygiene needs. SC, VA and UE are sub-factors of motivation, and N, ID, L and RS are sub-factors of hygiene needs.
SPSS was the software used for the evaluation because of its easy-to-use interface and accurate results. The questionnaire was designed in two sections. The first section covered Hygiene, Motivation, Satisfaction, and Dissatisfaction, tested on a five-point Likert scale from '5' (Strongly Agree) to '1' (Strongly Disagree). The second section covered demographics and collected basic information about the respondents (age, occupation, qualification, etc.) (Figure 1).

Heuristic Evaluation
The ten rules for heuristic evaluation were followed to evaluate the three applications under discussion. A single evaluator is reported to find approximately 35% of usability problems, and with an optimal number of evaluators more than 80% of the usability problems can be identified. In our case, three experts evaluated the applications and filled in a questionnaire based on the heuristics in 8 . The questionnaire is given in the appendices.
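As a rough illustration of how the share of problems found grows with the number of evaluators, the sketch below uses a simple independence model in which every evaluator finds the same fixed share of problems independently of the others. The model, the function names `proportion_found` and `evaluators_needed`, and the 80% target passed in the usage note are assumptions introduced here for illustration; only the 35% single-evaluator rate comes from the text, and this is not the procedure followed in the study.

```python
def proportion_found(single_rate: float, evaluators: int) -> float:
    """Expected share of problems found by `evaluators` independent evaluators.

    Assumes each evaluator independently finds a share `single_rate`
    of all problems (a simplifying assumption, not a measured result).
    """
    if not 0.0 < single_rate <= 1.0:
        raise ValueError("single_rate must be in (0, 1]")
    if evaluators < 0:
        raise ValueError("evaluators must be non-negative")
    # A problem is missed only if every evaluator misses it.
    return 1.0 - (1.0 - single_rate) ** evaluators


def evaluators_needed(single_rate: float, target: float) -> int:
    """Smallest number of evaluators whose expected coverage reaches `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    count = 0
    while proportion_found(single_rate, count) < target:
        count += 1
    return count
```

Under this assumed model, `proportion_found(0.35, 3)` is about 0.73, and `evaluators_needed(0.35, 0.8)` returns 4.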

Cognitive Walkthrough
The cognitive walkthrough is a usability evaluation method in which one or more evaluators work through a series of tasks and ask a set of questions from the perspective of the user. The focus of the cognitive walkthrough is on the understanding of the system's learnability for new or infrequent users. We asked every participant to do some basic tasks using the application so we could check whether the user can easily understand the application and perform the mentioned task easily. The cognitive walkthrough also revealed to us how efficient applications were and if the feedback provided was sufficient for the users.

Data Analysis
This section explains the sequence of tests applied and their results. Like the previous section, it is divided into three subsections, each explaining one of the three techniques used.

Statistical Evaluation
A quantitative research methodology was used to collect data. Data was collected indirectly through a survey questionnaire. Convenience sampling was employed because of limited resources. A total of ninety-six out of one hundred responses were received. The questionnaire was distributed in person (hard copy) and online (Google Forms). The respondents were asked to download the applications (Daraz, Kaymu, Yayvo) if they had not used them before and to review them before filling in the questionnaire. Millennials, representing people between the ages of 18 and 34, were the target respondents for this research. In Pakistan, millennials constitute more than 30% of the population and are the prime drivers of technology usage. This is the most tech-savvy generation and wants to stay updated about advancements. One study found that about ten percent of e-shoppers are millennials classified as Technophiles. For these reasons, millennials were the focus group for this study. Respondents were mostly students who were about to graduate or had already graduated.
The respondents of this survey found the apps easy to use, and about forty-five respondents had already used one of the applications before. Figure 2 shows the demarcation of ease of use and prior application use, broken down by age and application.

Figure 2. Ease of Use and Application used before representations
The tool used for statistical evaluation is SPSS. It was chosen for its accurate results and user-friendly interface. Microsoft Excel was used for all graphical representations.
The research question tests user satisfaction against the motivation factors and analyzes user dissatisfaction with respect to the hygiene factors.
Eight variables were used, each with two to five sub-factors. The questions were designed according to the questionnaire from 3 . Principal component analysis was used because the subclasses were already identified in the questionnaire. Two different hypotheses were tested; they are explained below according to their respective models.

Satisfaction Explained by Motivation Factors
Using the model identified by 3 , the dependent variable Satisfaction (S) was explained by Motivation factors derived by the sub elements Screen Complexity (SC), Visual Appearance (VA), and User Empowerment (UE).
Reliability analysis was done to check the overall consistency of the construct (Figure 3). Cronbach's alpha is used to measure internal consistency and is considered a measure of scale reliability. A "high" value for alpha does not imply that the measure is unidimensional. Here the value of Cronbach's alpha is greater than 0.6, which indicates that the data set is reliable for analysis. The data collected for the construct are therefore reliable.
Each sub-element contained several sub-factors as questions; therefore, factor analysis was used to converge the results, as explained below. Firstly, the determinant value and the KMO and Bartlett's test (Figures 4 and 5) are examined to analyze the overall consistency of the construct. The determinant should not be zero; otherwise there will be computational difficulties in the construct. For this factor analysis the value is not 0, so it is fit for computations 9 . The KMO and Bartlett test indicates the sampling adequacy of the model, which should be greater than 0.6. In this case it is 0.711, which is a good score (i.e. 71% adequate). The factor analysis can therefore be considered fit for computations.
Figure 6 shows the Rotated Component Matrix, which lists the sub-factors that have converged to make up each variable. Notably, one variable is missing: User Empowerment was unable to converge into a single factor and was removed from the analysis. All other factors have converged and are ready for modeling.
Figure 6. Rotated component matrix.
The model for this analysis is shown in Figure 7. It can be seen that one variable, UE, is missing. This is because it did not converge in factor analysis.
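The reliability check used in this subsection can be sketched in a few lines. The example below computes Cronbach's alpha from a table of Likert responses using the standard formula based on item variances and the variance of the total score. The function name `cronbach_alpha` and the sample responses are hypothetical; the study itself obtained this statistic from SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    `responses` is a list of rows, one per respondent, each holding
    the scores for the same k questionnaire items.
    """
    item_count = len(responses[0])
    if item_count < 2:
        raise ValueError("at least two items are required")
    # Transpose so that each tuple holds all answers to one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in responses])
    return (item_count / (item_count - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert answers from five respondents to three items.
sample = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(sample)
is_reliable = alpha > 0.6  # the threshold used in this paper
```

The comparison against 0.6 mirrors the acceptance rule stated in the text; other studies adopt different cut-offs.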
The highlighted text illustrates the hypothesis for Satisfaction in terms of Motivation, represented by Visual Appearance and Screen Complexity.
H0: There is no significant and positive impact of Screen Complexity (SC), and Visual Appearance (VA) on Satisfaction.
H1: There is a significant and positive impact of Screen Complexity (SC), and Visual Appearance (VA) on Satisfaction.
Finally, a regression analysis was carried out considering the variables given above. Following are the interpretations of the analysis.

Model Summary
R² is the proportion of variance in the dependent variable (S) that can be explained by the independent variables (VA and SC). It is an overall measure of the strength of association and does not reflect the extent to which any particular independent variable is associated with the dependent variable. The R² statistic represents a good value for the construct (Figures 8 and 9).
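To make the meaning of R² concrete, the following sketch fits an ordinary least squares model with an intercept and then computes the proportion of variance explained. It is a minimal pure-Python illustration with hypothetical function names (`solve_linear_system`, `fit_ols`, `r_squared`) and made-up data; the figures reported in this paper come from SPSS, not from this code.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for i in range(size - 1, -1, -1):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, size))
        solution[i] = (aug[i][size] - known) / aug[i][i]
    return solution


def fit_ols(predictors, outcome):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    width = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(width)] for i in range(width)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(width)]
    return solve_linear_system(xtx, xty)


def r_squared(predictors, outcome, coefficients):
    """Proportion of variance in `outcome` explained by the fitted model."""
    fitted = [coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], p))
              for p in predictors]
    mean_outcome = sum(outcome) / len(outcome)
    residual = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    total = sum((y - mean_outcome) ** 2 for y in outcome)
    return 1.0 - residual / total
```

With two predictor columns standing in for SC and VA scores and one outcome column for S, `fit_ols` returns the intercept and two slopes, and `r_squared` returns a value between 0 and 1 for a model fitted this way.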

ANOVA
The F-statistic and its associated p-value are shown in Figure 10. The p-value is compared to a chosen alpha level in testing the null hypothesis that all of the model coefficients are 0. The value is acceptable because it is less than 0.05 (Figure 11).

Coefficients
The coefficient for SC (0.351) is significantly different from 0 because its p-value is 0.074, which is smaller than 0.1. The coefficient for VA (0.311) is significantly different from 0 because its p-value is 0.054, which is smaller than 0.1. The intercept is not significantly different from 0 at the 0.1 alpha level (Figure 12).

Dissatisfaction Explained by Hygiene Factors
Using the model identified by 3 , the dependent variable Dissatisfaction (DS) was explained by hygiene factors derived from the sub-elements Navigation (N), Information Display (ID), Ease of Learning (L), and Response Time (RS) (Figure 13). Reliability analysis was done to check the overall consistency of the construct. The value of Cronbach's alpha is greater than 0.6, which indicates that the data set is reliable for analysis (Figure 14).
The data collected for the construct are reliable. Each sub-element contained several sub-factors as questions; therefore, factor analysis was used to converge the results, as explained below. Firstly, the determinant value and the KMO and Bartlett's test (Figures 15 and 16) are examined. The determinant should not be zero; otherwise there will be computational difficulties in the construct. For this factor analysis, the value is not 0, so it is fit for computations 9 .
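The determinant condition can be illustrated with a small sketch: when two questionnaire items are perfectly correlated, the correlation matrix becomes singular and its determinant drops to zero, which is the situation the check is meant to rule out. The function names and the sample item columns below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def pearson(first, second):
    """Pearson correlation between two equally long score lists."""
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    spread = (sum((a - mean_first) ** 2 for a in first)
              * sum((b - mean_second) ** 2 for b in second)) ** 0.5
    return covariance / spread


def correlation_matrix(columns):
    """Correlation matrix of several item columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    work = [row[:] for row in matrix]
    size = len(work)
    result = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(work[r][col]))
        if abs(work[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            work[col], work[pivot] = work[pivot], work[col]
            result = -result  # a row swap flips the sign
        result *= work[col][col]
        for r in range(col + 1, size):
            factor = work[r][col] / work[col][col]
            for c in range(col, size):
                work[r][c] -= factor * work[col][c]
    return result


# Hypothetical items: the second column is exactly twice the first.
redundant_items = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]]
singular_det = determinant(correlation_matrix(redundant_items))
```

Here `singular_det` is zero up to rounding error, signalling that one of the redundant items would have to be dropped before factor analysis.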
The KMO and Bartlett test indicates the sampling adequacy of the model, which should be greater than 0.6. In this case it is 0.719, which is a good score (i.e. 72% adequate). The factor analysis can therefore be considered fit for computations. Figure 17 shows the Rotated Component Matrix, which lists the sub-factors that have converged.
Figure 17. Rotated component matrix.
The model for this analysis is shown in Figure 18. It can be seen that one variable, L, is missing. This is because it did not converge in factor analysis.
H0: There is no significant and positive impact of Navigation (N), Information Display (ID), and Response Time (RS) on Dissatisfaction.
H1: There is a significant and positive impact of Navigation (N), Information Display (ID), and Response Time (RS) on Dissatisfaction.
Finally, a regression analysis was carried out considering the variables given above. The interpretations of the analysis are as follows. R² is the proportion of variance in the dependent variable (DS) that can be explained by the independent variables (RS, N and ID). It is an overall measure of the strength of association and does not reflect the extent to which any particular independent variable is associated with the dependent variable. The R² statistic represents a good value for the construct. The F-statistic and its associated p-value are also reported. The p-value is compared to a chosen alpha level in testing the null hypothesis that all of the model coefficients are 0. The value is acceptable because it is less than 0.05.
The coefficient for Information Display (ID) (0.214) is significantly different from 0 because its p-value is 0.050, which is smaller than 0.1. The coefficient for Navigation (-0.168) is significantly different from 0 because its p-value is 0.010, which is smaller than 0.1.

Heuristic Evaluation
The ten rules for heuristic evaluation were followed for the three applications under discussion. A single evaluator is reported to find approximately 35% of usability problems, and with an optimal number of evaluators more than 80% of the usability problems can be identified. In our case, three experts evaluated the applications and filled in a questionnaire based on Nielsen's rules. The questionnaire is given in Appendix 2.

Cognitive Walkthrough
The cognitive walkthrough is a usability evaluation method in which one or more evaluators work through a series of tasks and ask a set of questions from the perspective of the user. The focus of the cognitive walkthrough is on understanding the system's learnability for new or infrequent users. We asked every participant to do some basic tasks using the app so that we could check whether the user could easily understand the app and perform each of the given tasks. The cognitive walkthrough also revealed how efficient the applications are and whether the feedback provided was sufficient for the users. We used two scenarios (Figure 19).
Figure 19. Task analysis.
After completing the tasks, users were asked to fill in the questionnaire attached in Appendix 3.

Statistical Evaluation
The satisfaction model expresses satisfaction in terms of Screen Complexity (SC) and Visual Appearance (VA). Screen Complexity is the most significant factor because it increases the rate of human error 5 . If the screen is too complex, the user is likely to slip on some task. Simplicity and low screen complexity can make an application more usable. Visual Appearance also plays a positive role in satisfaction: if the visual appearance is not appealing, the user will not want to stay for long. The more time the user spends on the application, the more likely a sale becomes.
$$\text{Dissatisfaction} = 1.007 + 0.214\,ID - 0.168\,N + 0.601\,RS$$
Dissatisfaction is explained in terms of Information Display (ID), Navigation (N) and Response Time (RS). Information that is not displayed on the individual screens when required is of no use. This is the biggest factor that distracts and frustrates the user. For e-commerce, information display is important because most buyers want to be sure of the specifications of the product they are purchasing. Prompt information display helps the user and makes the buying decision possible.
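As a worked illustration of the fitted equation on ID, N and RS reported above, the sketch below plugs scores into the published coefficients (1.007, 0.214, -0.168 and 0.601). The function name `predict_from_hygiene` and the input values are hypothetical; the inputs are assumed to be mean scores on the 5-point Likert scale used in the questionnaire.

```python
# Coefficients as reported in the regression equation above.
INTERCEPT = 1.007
COEF_INFO_DISPLAY = 0.214
COEF_NAVIGATION = -0.168
COEF_RESPONSE_TIME = 0.601


def predict_from_hygiene(info_display, navigation, response_time):
    """Evaluate the fitted linear equation for one set of hygiene scores."""
    return (INTERCEPT
            + COEF_INFO_DISPLAY * info_display
            + COEF_NAVIGATION * navigation
            + COEF_RESPONSE_TIME * response_time)


# Hypothetical respondent who rates all three factors as 3 (neutral).
neutral_score = predict_from_hygiene(3, 3, 3)
# Raising only the navigation score by one point lowers the prediction by 0.168.
better_navigation = predict_from_hygiene(3, 4, 3)
```

The negative navigation coefficient means that, holding the other two scores fixed, a higher navigation rating reduces the predicted value, while Response Time has the largest positive weight of the three.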
Response Time is the most important feature of any application. Users get frustrated easily if there is a lack of feedback and RS is greater than a few milliseconds. This leads them to perceive the application as unresponsive, which means immediate session closure. In the heuristic evaluation for the survey, we found that Kaymu had the worst RS, and the respondents complained about how long it took to load a single product line.
Navigation has a negative relationship with Dissatisfaction because navigation is a strong focus of any application, and if it is not placed at a visually appealing point it can be mistaken for some other feature.
Figure 20 plots Hygiene against Motivation. For this, we took the mean of all variables in Motivation (SC, UE, VA) and Hygiene (N, RS, ID and L). This was done per application so that we could rate which application scores best on both hygiene and motivation factors. We used the plotting technique from 3 for a min-max model of the zone of intolerance, the zone of efficiency and the zone of satisfaction, and applied it to the mean scores calculated here. The zone of intolerance is represented by the two left quadrants, where hygiene is low. Any app that lies here is rated as intolerable, meaning that users did not feel good about its hygiene factors. In the data sample for this research, Kaymu was found to be in the zone of intolerance, as shown in Figure 20. Yayvo lies in the zone of efficiency. Daraz was rated high on both hygiene and motivation, which is why it is placed in the zone of satisfaction.
The above results suggest that Daraz is the best of the three applications compared with Yayvo and Kaymu; it fulfills user needs and is the biggest satisfier among the applications made available by e-tailers.
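The zone assignment described above can be sketched as a simple quadrant rule on the two mean scores. Two points in the sketch are assumptions rather than details from the paper: the cut-off of 3.0 (the midpoint of the 5-point scale) and the reading of the zone of efficiency as high hygiene with low motivation. The function names and sample scores are hypothetical.

```python
from statistics import mean


def classify_zone(hygiene_mean, motivation_mean, threshold=3.0):
    """Assign a zone from mean hygiene and motivation scores.

    `threshold` is an assumed cut-off (scale midpoint), not a value
    taken from the paper.
    """
    if hygiene_mean < threshold:
        # Both left quadrants: low hygiene regardless of motivation.
        return "intolerance"
    if motivation_mean < threshold:
        # Assumed meaning of the efficiency zone: hygiene met, motivation low.
        return "efficiency"
    return "satisfaction"


def zone_for_app(hygiene_scores, motivation_scores, threshold=3.0):
    """Average the per-variable scores of one app, then classify it."""
    return classify_zone(mean(hygiene_scores), mean(motivation_scores), threshold)


# Hypothetical per-variable means for one app (hygiene: N, RS, ID, L;
# motivation: SC, UE, VA).
example_zone = zone_for_app([4.1, 3.8, 4.0, 3.9], [3.7, 3.5, 4.2])
```

With these made-up scores the app lands in the zone of satisfaction; lowering the hygiene scores below the cut-off would move it into the zone of intolerance regardless of its motivation scores.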

Heuristic Evaluation
The results derived from the survey showed that Daraz.pk maintains good feedback and a reasonable response time, which keeps users informed about the application status at all times. Users enjoy good control and freedom while browsing the application or making a transaction. This shows that the application supports cancel or undo/redo operations for any action that the user has performed by mistake. Daraz.pk ranks high on learnability. It is consistent in design within the application and maintains the standards of the e-commerce industry. The user interface design also favors recognition over recall through well-categorized items and by caching user details for return visits. Well-specified instructions and help contact details also add to the user's ease and satisfaction. The application was rated highest for flexibility and efficiency. Evaluators pointed out that the home page of the application displays quite a lot of information; the interface does not follow a simplistic or minimalistic design pattern.
For Yayvo, evaluators mentioned room for improvement in response time for better feedback and system visibility. The items are well categorized and grouped, so users do not have to remember exact details on their return visits. For frequent users, it is adaptive enough to show their items of interest. Help numbers and live chat enable users to feel at home while interacting with the application.
Kaymu.pk gives appropriate feedback for user actions, but a delay in response can often confuse the user about the application status. The application scores low on flexibility and efficiency of use. Kaymu.pk maintains consistency of user interface design throughout the application and uses the standard terms of the e-commerce industry. However, a few actions that are unique to the application need a clearer description. The instructions are easy to comprehend, and the application provides help numbers.

Cognitive Walkthrough
From the above analysis, we conclude that users find Kaymu and Yayvo hard to use because the systems lack consistency. Users find these apps difficult to use because their functionality is poor, and they also get confused while performing the return policy task. Users do not find daraz.pk especially simple to use, but its design incorporates simplicity and its visual appearance is good, so users perform all activities effectively and experience no confusion regarding terms and conditions.

Conclusion
Pakistan, as a growing economy, has new horizons opening up with the prevalent advances in technology. Today buyers have the luxury of purchasing products through e-commerce applications on their mobile phones or other handheld devices. This was the reason we conducted this research to find the factors that affect user behavior toward application interfaces and how they affect user buying behavior. This research is important because of its unique perspective in using Herzberg's theory to depict user behavior. Future research could test these questionnaires on a random sample collected from all over Pakistan. The variables excluded from this research could also be examined in future studies.