
Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC

Refining the Use Case Classification for Use Case Point Method for Software Effort Estimation

Divya Kashyap 1, Durgesh Shukla 2 and A. K. Misra 2

1,2,3 Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology, Allahabad, India
Email: {rcs0904@mnnit.ac.in, durgeshkrshukla@gmail.com, Kashyap.div2011@gmail.com}

Abstract— Software cost estimation is a key open issue for the software industry, which frequently suffers from cost overruns. The Use Case Points (UCP) method is the most popular technique for object-oriented software cost estimation; however, it has two major drawbacks: the uncertainty of the cost factors and the abrupt classification of use cases. To address these two issues, this paper refines the use case complexity classification using fuzzy logic theory, which mitigates the uncertainty of the cost factors and improves the accuracy of the classification. Software estimation is a crucial task in software engineering and encompasses cost, effort, schedule, and size. It becomes critical in the early stages of the software life cycle, when the details of the software have not yet been revealed. Several commercial and non-commercial tools exist to estimate software in the early stages. Most software effort estimation methods require software size as one of their important metric inputs; consequently, software size estimation in the early stages becomes essential. The proposed method presents a technique based on fuzzy logic theory to improve the accuracy of the use case points method by refining the use case classification.

Index Terms— Cost Estimation, Function Point, Fuzzy Logic, MMRE, Use Case Point.

I. INTRODUCTION

Software cost estimation is vital for project bidding, budgeting, controlling and planning. A bad estimate can lead to project failure, while a good one enables the project to be delivered on time and within budget. Because little information is available for decision making at an early stage of development, cost estimation is full of uncertainty. Although many estimation models, such as the Constructive Cost Model (COCOMO) and Function Points (FP), have been proposed in the literature to help managers in the estimation task, there is no clear evidence that accuracy has improved in recent decades [3]. Achieving a highly accurate estimate is still a challenging issue in software engineering. We believe that there are two reasons: first, some parts of the cost models are not appropriate in practice; second, assessing the cost from the cost factors relies on expert judgment and is therefore inherently uncertain. To address these problems, we first need to identify the parts that are inappropriate in practice and improve them accordingly; second, we need to manage these uncertainties to lower the risk of project failure [4].

Among the various existing cost models, UCP has become well known and widely accepted because it builds on two common industry practices: the object-oriented paradigm and the use of use cases to describe functional requirements. Taking deterministic values as inputs, UCP gives the most likely cost estimate.

The remainder of this paper is organized as follows. Section II describes the UCP method. Section III describes the fuzzy logic concept. Section IV discusses different techniques used to estimate software costs. Section V gives the details of the proposed method with a related case study. The paper ends with the conclusion and future work for the proposed method.

DOI: 02.ITC.2014.5.501 © Association of Computer Electronics and Electrical Engineers, 2014

II. USE CASE POINTS METHOD

An early estimate of effort based on use cases can be made when there is some understanding of the problem domain, system size and architecture at the stage at which the estimate is made. The use case points method is a software sizing and estimation method based on use case counts called use case points.

TABLE I: ACTOR WEIGHT FACTOR

Actor Type    Weighting Factor
Simple        1
Average       2
Complex       3

TABLE II: USE CASE COMPLEXITY WEIGHT FACTOR

Use Case Type    Number of Transactions    Weighting Factor
Simple           ≤ 3                       5
Average          4 to 7                    10
Complex          ≥ 8                       15

A. Classifying Actors and Use Cases

Use case points can be counted from the use case analysis of the system. The first step is to classify the actors as simple, average or complex. A simple actor represents another system with a defined Application Programming Interface (API), an average actor is another system interacting through a protocol such as TCP/IP, and a complex actor may be a person interacting through a GUI or a Web page. A weighting factor is assigned to each actor type according to Table I. The total unadjusted actor weight (UAW) is calculated by counting how many actors there are of each kind (by degree of complexity), multiplying each total by its weighting factor, and adding up the products.

Each use case is then classified as simple, average or complex, depending on the number of transactions in the use case description, including secondary scenarios. A transaction is a set of activities that is either performed entirely or not at all. The number of transactions can be counted by counting the use case steps. Another mechanism for measuring use case complexity is counting analysis classes, which can be used in place of transactions once it has been determined which classes implement a specific use case. A simple use case is implemented by 5 or fewer classes, an average use case by 5 to 10 classes, and a complex use case by more than 10 classes. The weights are as in Table II. The number of use cases of each type is then multiplied by the weighting factor, and the products are added up to get the unadjusted use case weight (UUCW). The UAW is added to the UUCW to get the unadjusted use case points (UUCP):

$$UUCP = UAW + UUCW \qquad (1.1)$$
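As a rough illustration of the counting procedure in this subsection, the sketch below computes UAW, UUCW and UUCP in Python. The use case thresholds and weights follow Table II; the actor weights of 1, 2 and 3 are the values usually quoted for Karner's method and are an assumption here, as are the function names and the sample project.

```python
from typing import Dict, List

# Assumed actor weights (values usually quoted for Karner's method).
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
# Use case weights from Table II.
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}


def classify_use_case(transactions: int) -> str:
    """Classify a use case by its transaction count (Table II thresholds)."""
    if transactions <= 3:
        return "simple"
    if transactions <= 7:
        return "average"
    return "complex"


def unadjusted_actor_weight(actor_counts: Dict[str, int]) -> int:
    """UAW: number of actors of each type times the weight of that type."""
    return sum(ACTOR_WEIGHTS[kind] * count for kind, count in actor_counts.items())


def unadjusted_use_case_weight(transaction_counts: List[int]) -> int:
    """UUCW: sum of the weights of all use cases."""
    return sum(USE_CASE_WEIGHTS[classify_use_case(t)] for t in transaction_counts)


def unadjusted_use_case_points(actor_counts: Dict[str, int],
                               transaction_counts: List[int]) -> int:
    """UUCP = UAW + UUCW (Equation 1.1)."""
    return (unadjusted_actor_weight(actor_counts)
            + unadjusted_use_case_weight(transaction_counts))


# Hypothetical project: 2 simple, 1 average and 3 complex actors,
# and four use cases with 2, 5, 9 and 3 transactions.
actors = {"simple": 2, "average": 1, "complex": 3}
use_cases = [2, 5, 9, 3]
print(unadjusted_actor_weight(actors))                # 2*1 + 1*2 + 3*3 = 13
print(unadjusted_use_case_weight(use_cases))          # 5 + 10 + 15 + 5 = 35
print(unadjusted_use_case_points(actors, use_cases))  # 13 + 35 = 48
```

Note how a use case with 3 transactions and one with 4 transactions receive weights of 5 and 10 respectively; this step is the abrupt classification that the paper sets out to smooth.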

B. Technical and Environmental Factors

The UCP method also employs a technical factors multiplier, corresponding to the Technical Complexity Adjustment factor of the FPA method, and an environmental factors multiplier in order to quantify non-functional requirements such as ease of use and programmer motivation. Various factors influencing productivity are associated with weights, and a value is assigned to each factor depending on its degree of influence: 0 means no influence, 3 is average, and 5 means strong influence throughout. Table III and Table IV show the technical and environmental factors. The adjustment factors are multiplied by the unadjusted use case points to produce the adjusted use case points, yielding an estimate of the size of the software.

The Technical Complexity Factor (TCF) is calculated by multiplying the value of each factor (T1-T13) by its weight and then adding all these numbers to get the sum called the TFactor. The following formula is applied:

$$TCF = 0.6 + (0.01 \times TFactor) \qquad (1.2)$$

The Environmental Factor (EF) is calculated by multiplying the value of each factor (F1-F8) by its weight and adding the products to get the sum called the EFactor. The following formula is applied:

$$EF = 1.4 + (-0.03 \times EFactor) \qquad (1.3)$$

The adjusted use case points (UCP) are calculated as follows:

$$UCP = UUCP \times TCF \times EF \qquad (1.4)$$

TABLE III: TECHNICAL COMPLEXITY FACTORS

Factor    Description                  Weight
T1        Distributed System           2
T2        Response adjectives          1
T3        End-User efficiency          1
T4        Complex processing           1
T5        Reusable code                1
T6        Easy to install              0.5
T7        Easy to use                  0.5
T8        Portable                     2
T9        Easy to change               1
T10       Concurrent                   1
T11       Security features            1
T12       Access for third party       1
T13       Special training required    1

TABLE IV: ENVIRONMENTAL FACTORS

Factor    Description                       Weight
F1        Familiar with RUP                 1.5
F2        Application experience            0.5
F3        Object-oriented experience        1
F4        Lead analyst capability           0.5
F5        Motivation                        1
F6        Stable Requirements               2
F7        Part-time workers                 -1
F8        Difficult programming language    2
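The adjustment step of Equations 1.2 to 1.4 can be sketched in Python as below. This is a minimal illustration, not a reference implementation: the function names, the two-factor rating and weight lists, and the sample values TFactor = 30, EFactor = 15 and UUCP = 48 are all hypothetical.

```python
import math
from typing import Sequence


def weighted_sum(ratings: Sequence[float], weights: Sequence[float]) -> float:
    """Sum of rating * weight; yields the TFactor or the EFactor."""
    if len(ratings) != len(weights):
        raise ValueError("ratings and weights must have the same length")
    return sum(r * w for r, w in zip(ratings, weights))


def technical_complexity_factor(tfactor: float) -> float:
    """TCF = 0.6 + 0.01 * TFactor (Equation 1.2)."""
    return 0.6 + 0.01 * tfactor


def environmental_factor(efactor: float) -> float:
    """EF = 1.4 + (-0.03 * EFactor) (Equation 1.3)."""
    return 1.4 - 0.03 * efactor


def adjusted_use_case_points(uucp: float, tfactor: float, efactor: float) -> float:
    """UCP = UUCP * TCF * EF (Equation 1.4)."""
    return uucp * technical_complexity_factor(tfactor) * environmental_factor(efactor)


# Hypothetical ratings (0 to 5) for two factors with weights 2 and 0.5.
print(weighted_sum([3, 5], [2, 0.5]))            # 3*2 + 5*0.5 = 8.5

# Hypothetical project totals.
ucp = adjusted_use_case_points(uucp=48, tfactor=30, efactor=15)
print(round(ucp, 2))                             # 48 * 0.9 * 0.95 = 41.04
assert math.isclose(ucp, 41.04)
```

Because EF decreases as the EFactor grows, a team with high ratings on positively weighted environmental factors lowers the adjusted size.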

C. Producing Estimates Based on Use Case Points

Karner proposed a factor of 20 staff hours per use case point for a project estimate. Field experience has shown that effort can range from 15 to 30 hours per use case point; therefore, converting use case points directly to hours may be an uncertain measure. Schneider and Winters [2] recommend that the environmental factors should determine the number of staff hours per use case point (PF) as follows:

NF = (the number of factors in F1 through F6 that have a value < 3) + (the number of factors in F7 through F8 that have a value > 3)

If NF ≤ 2 then PF = 20
If NF = 3 or 4 then PF = 28
If NF > 4 then PF = 36

The reason for this approach is that the environmental factors measure the experience level of the staff and the stability of the project. Negative numbers mean extra effort spent on training team members or problems due to instability. However, using this method of calculation means that even small adjustments of an environmental factor, for instance by half a point, can make a great difference to the estimate.

III. FUZZY LOGIC

Fuzzy logic is derived from the fuzzy set theory that was proposed by Lotfi Zadeh in 1965 [5]. In contrast to conventional binary logic, which can only handle the two values True and False (1 and 0), fuzzy logic can have a truth value anywhere between 0 and 1. This means that in binary logic a member either completely belongs or does not belong to a certain set, whereas in fuzzy logic a member can partially belong to a certain set. Mathematically, a fuzzy set A is represented by a membership function as follows:

$$\mu_A(x) : X \rightarrow [0, 1] \qquad (2.1)$$

where μA is the degree of membership of element x in the fuzzy set A. Each element has a grade of membership that represents the degree to which it belongs to the set. Membership functions include triangular, trapezoidal and S-shaped functions. In fuzzy logic, linguistic variables are used to express a rule or fact. For example, the statement "the temperature is thirty degrees" is expressed in fuzzy logic as "the temperature is low" or "the temperature is high", where the words low and high are linguistic variables. In fuzzy logic, the knowledge base is represented by if-then rules, for example: if the temperature is high, then turn on the fan. A fuzzy system is mainly composed of three parts: fuzzification, fuzzy rule application and defuzzification. Fuzzification means applying fuzzy membership functions to the inputs. Fuzzy rule application makes inferences and associations among members in different groups. The third step, defuzzification, turns these inferences and associations into a decision and provides an output that can be understood. In this proposal, fuzzy logic is used to calibrate the complexity weight of use cases.

IV. RELATED WORK

A method for mapping the object-oriented approach into Function Point Analysis (FPA) is described by Thomas Fetke et al. [6]. The authors propose mapping the use cases directly into the function point model using a set of concise rules that support the measurement process. These mapping rules are based on the standard FPA defined in the IFPUG Counting Practices manual. Since the concept of actors in the use case model is broader than the concept of users and external applications in FPA, there cannot be a one-to-one mapping of actors to users and external applications. However, each user of the system is defined as an actor. In the same manner, all applications that communicate with the system under consideration must also appear as actors. This corresponds to Karner's use case point method [7]. The level of detail in the use case model may vary, and the use case model does not provide enough information on how to count a specific use case according to function point rules. Therefore, as in Karner's method, the use cases must be described in further detail in order to be able to count transactions.

John Smith of Rational Software describes a framework for estimation based on use cases translated into lines of code [8]. There does not seem to be any further research on this method, although the tool 'Estimate Professional', supplied by the Software Productivity Center Inc., and the tool 'CostXpert' from Marotz Inc. produce estimates of effort per use case calculated from the number of lines of code.

David Longstreet of Software Metrics observed that applying function points helps to determine whether a use case is written at a suitable level of detail [9]. If it is possible to describe how data passes from the actor to inside the application boundary, or how data flows from inside the application boundary to the actor, then the use case is at the right level of detail; otherwise it needs more detail. By adopting both the use case method and the function points method, the quality of the requirements documents can be improved, and thus sizing and estimating are improved.

V. PROBLEM DISCUSSED AND PROPOSED SOLUTION

The UCP technique has two major drawbacks:
• The uncertainty of the cost factors
• The ambiguous classification of use cases

UUCP is calculated by adding UUCW and UAW. The UAW is easy to determine, whereas the UUCW is calculated from the number of transactions in each use case, which is somewhat uncertain: when counting transactions, human experience, the use case writing style and the level of domain knowledge all affect the human decision. The number of transactions of a use case may therefore differ when counted by different people.

A. Proposed Solution

This proposal introduces a new approach to overcome the limitations of UCP. Rather than classifying a use case as simple, average, or complex, the use case is classified as ux, where x ∈ [1, 25] represents the number of transactions. This means that there are 25 degrees of complexity for use cases (u1, u2, u3, etc.). The proposed approach is implemented in two independent stages. First, rules are defined to keep the transactions at an appropriate level and to avoid duplicated transactions. Second, a fuzzy logic approach is applied to determine the complexity factor of ux.

B. Refining Use Case Classification

The use case specification is a formal document based on a detailed template with fields that include the use case name, actors, user interface screens, pre-conditions, primary scenario, alternative scenarios, exception scenarios, post-conditions and so on. The steps in the scenarios describe the interaction between actors and the system from the user's point of view; a step includes an actor's action and the system's response. The number of transactions in a use case is used to classify the difficulty level of the use case. There is no standard definition of a transaction. In this work we define a transaction as a round trip from the user to the system and back to the user [10]. According to the use case specification template, a step in the specification can be regarded as a transaction, subject to the following counting rules:
• Transactions that are too simple are ignored, e.g. the user enters a user name and the name is shown on the screen.
• Transactions that differ only in instantiation are regarded as the same transaction, e.g. searching for a medical book and searching for an architectural book may be two different scenarios in the use case specification but are the same operation in the database, so they can be regarded as the same transaction.
• All exception scenarios in a use case are regarded as one transaction, supposing that a framework is designed to handle all types of exceptions.
• All transactions in a use case for validating the fields on the screen are regarded as one transaction, supposing that a framework is designed for validation.

By defining these rules, we can keep the transactions at an appropriate level and avoid duplicated transactions. The following technique refines the use case classification with fuzzy theory.

C. Fuzzification

The linguistic terms of the use case complexity matrix are fuzzified by generating fuzzy numbers, and the three categories are extended to five. The classification of use cases is extended by adopting the method proposed in [1]: two linguistic terms, very simple and very complex, are added. The values of these terms are presented in Table V. The corresponding fuzzy numbers are presented in Fig. 1 according to the membership function in Formula 3.1.

$$f(x) = \begin{cases} 0 & x < a \\ \dfrac{x - a}{m - a} & x \in [a, m) \\ 1 & x \in [m, n] \\ \dfrac{b - x}{b - n} & x \in (n, b] \\ 0 & x > b \end{cases} \qquad (3.1)$$
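A membership function of this trapezoidal shape can be sketched in Python as follows. The function name is hypothetical, and the sample parameters (a = 2.5, m = 4, n = 6, b = 8) are our reading of the Simple term in Table V, so they should be treated as an assumption.

```python
def trapezoid_membership(x: float, a: float, m: float, n: float, b: float) -> float:
    """Degree of membership of x in a trapezoidal fuzzy number (a, m, n, b).

    The value rises linearly on [a, m), equals 1 on the plateau [m, n],
    falls linearly on (n, b], and is 0 outside [a, b].
    """
    if not a <= m <= n <= b:
        raise ValueError("parameters must satisfy a <= m <= n <= b")
    if x < a or x > b:
        return 0.0
    if x < m:
        return (x - a) / (m - a)   # rising edge
    if x <= n:
        return 1.0                 # plateau
    return (b - x) / (b - n)       # falling edge


# Assumed parameters for the term "simple".
SIMPLE = dict(a=2.5, m=4, n=6, b=8)

for transactions in (2, 3, 5, 7, 9):
    degree = trapezoid_membership(transactions, **SIMPLE)
    print(transactions, round(degree, 3))
```

With these parameters a use case with 5 transactions is fully "simple", while one with 7 transactions is only half "simple", which leaves room for it to be partly "average" as well.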

D. Defuzzification

After fuzzification, defuzzification of the linguistic terms gives the UUCW. According to the membership function and the term values in Table V, the weight w of a use case with x transactions can be obtained by the formula:

$$w = \begin{cases} w_i & x \in [m_i, n_i] \\ f(x) \times w_i + (1 - f(x)) \times w_{i+1} & x \in [n_i, b_i] \end{cases} \qquad (3.2)$$

Finally, the UUCW can be calculated by Formula 3.3:

$$UUCW = \sum (\text{number of use cases} \times w) \qquad (3.3)$$
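The defuzzification step can be sketched in Python as below. This is only an illustration under stated assumptions: the term parameters are our reading of Table V (with open-ended cells written as None), the names Term, use_case_weight and uucw are hypothetical, and f(x) is taken to be the falling edge of the membership function of term i. The outputs follow Formulas 3.2 and 3.3 literally and are not guaranteed to coincide with the weights tabulated in Table VI.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    name: str
    m: float             # start of the plateau
    n: Optional[float]   # end of the plateau (None: open-ended)
    b: Optional[float]   # end of the falling edge (None: open-ended)
    weight: float


# Assumed reading of Table V.
TERMS = [
    Term("very simple", 1, 2.5, 4, 5),
    Term("simple", 4, 6, 8, 10),
    Term("average", 8, 10, 12, 15),
    Term("complex", 12, 14, 16, 20),
    Term("very complex", 16, None, None, 25),
]


def use_case_weight(x: float) -> float:
    """Weight of a use case with x transactions (Formula 3.2)."""
    if x < TERMS[0].m:
        raise ValueError("a use case needs at least one transaction")
    for i, term in enumerate(TERMS):
        if term.n is None or x <= term.n:
            return term.weight                      # x lies on the plateau [m_i, n_i]
        if x <= term.b:
            f = (term.b - x) / (term.b - term.n)    # falling edge of term i
            return f * term.weight + (1 - f) * TERMS[i + 1].weight
    raise AssertionError("unreachable: the last term is open-ended")


def uucw(transaction_counts: List[float]) -> float:
    """UUCW as the sum of the use case weights (Formula 3.3)."""
    return sum(use_case_weight(x) for x in transaction_counts)


for x in (2, 3, 5, 7, 30):
    print(x, round(use_case_weight(x), 2))
print(uucw([2, 5, 7]))   # 5 + 10 + 12.5 = 27.5
```

Between two plateaus the weight is a linear blend of the neighbouring term weights, so adding one transaction changes the weight gradually instead of jumping by 5.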

Rather than classifying the use cases into three classes (simple, average and complex) as in Karner's work [7], the use cases are classified into five categories according to the number of transactions per use case. Since the main goal of our approach is to enhance the current model proposed by Karner and not to modify it completely, we assume that the largest use case contains 21 transactions and that the complexity factor of the largest use case is 25. Table VI presents a comparison between the original work (Karner's method) and the proposed approach. It shows that in the proposed approach the weights of the use cases increase gradually, as opposed to the abrupt increase in Karner's UCP method.

TABLE V: VALUES OF THE TERMS

Linguistic Terms    i    ni     mi    ai     bi    Weight
Very Simple         1    2.5    1     -      4     5
Simple              2    6      4     2.5    8     10
Average             3    10     8     6      12    15
Complex             4    14     12    10     16    20
Very Complex        5    -      16    14     -     25

Fig 1: Fuzzy numbers After Fuzzification and Extension

TABLE VI: ADJUSTED WEIGHT

No. of Transactions in Use Case    UCP Weight    Refined Weight
1                                  5             5
2                                  5             5
3                                  5             6.45
4                                  10            7.5
5                                  10            8.55
6                                  10            10
7                                  10            11.4
8                                  15            12.5
9                                  15            13.6
10                                 15            15
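To make the contrast in Table VI concrete, the following Python sketch computes the UUCW of a project under both weightings. The refined weights are copied from Table VI; the function names and the four-use-case project are hypothetical.

```python
import math
from typing import Callable, List

# Refined weights copied from Table VI (1 to 10 transactions).
REFINED_WEIGHT = {1: 5, 2: 5, 3: 6.45, 4: 7.5, 5: 8.55,
                  6: 10, 7: 11.4, 8: 12.5, 9: 13.6, 10: 15}


def karner_weight(transactions: int) -> float:
    """Step-wise weight of the original UCP method (Table II)."""
    if transactions <= 3:
        return 5
    if transactions <= 7:
        return 10
    return 15


def refined_weight(transactions: int) -> float:
    """Table VI lookup; raises KeyError outside 1 to 10 transactions."""
    return REFINED_WEIGHT[transactions]


def uucw(transaction_counts: List[int], weight_fn: Callable[[int], float]) -> float:
    """Sum of use case weights under the given weighting function."""
    return sum(weight_fn(t) for t in transaction_counts)


# Hypothetical project with use cases sitting next to the class boundaries.
project = [3, 4, 7, 8]
print(uucw(project, karner_weight))               # 5 + 10 + 10 + 15 = 40
print(round(uucw(project, refined_weight), 2))    # 6.45 + 7.5 + 11.4 + 12.5 = 37.85

# Going from 3 to 4 transactions: a jump of 5 versus a step of 1.05.
print(karner_weight(4) - karner_weight(3))
print(round(refined_weight(4) - refined_weight(3), 2))
assert math.isclose(uucw(project, refined_weight), 37.85)
```

The difference between the two totals comes entirely from use cases near a class boundary, which is where the crisp classification is most sensitive to how transactions are counted.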

E. Evaluation Criteria

Several methods exist to compare cost estimation models, each with its advantages and disadvantages. In this work, three methods are used: the Mean of the Magnitude of Relative Error (MMRE), the Mean of the Magnitude of Error Relative to the Estimate (MMER) and the Mean Error with Standard Deviation.

MMRE

This is a very common criterion used to evaluate software cost estimation models [11]. The Magnitude of Relative Error (MRE) for each observation i can be obtained as:

$$MRE_i = \frac{|Actual\ Effort_i - Estimated\ Effort_i|}{Actual\ Effort_i} \qquad (4.1)$$

MMRE is obtained by averaging the MRE over N observations:

$$MMRE = \frac{1}{N} \sum_{i=1}^{N} MRE_i \qquad (4.2)$$

MMER

MMER is another method for evaluating cost estimation models [12]. MER is similar to MRE, with the difference that the denominator is the predicted effort instead of the actual effort. Consequently, the equations for MER and MMER are:

$$MER_i = \frac{|Actual\ Effort_i - Estimated\ Effort_i|}{Estimated\ Effort_i} \qquad (4.3)$$

$$MMER = \frac{1}{N} \sum_{i=1}^{N} MER_i \qquad (4.4)$$

When using the MMRE and the MMER in evaluation, lower values imply better results.

Mean Error with Standard Deviation

Although MMRE and MMER have been used for a long time, both methods might lack accuracy. If the actual effort is small, MMRE will be high. On the other hand, if the predicted effort is low, MMER will also be high. Foss et al. argued that MMRE should not be used when comparing cost estimation models and that using the standard deviation would be better [13]. The equation for the mean error over N observations is:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (4.5)$$

where $x_i = (Actual\ Effort_i - Estimated\ Effort_i)$. The standard deviation is defined as:

$$SD = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (4.6)$$
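The three criteria can be sketched in Python as follows. The function names and the three-observation data set are hypothetical and serve only to show how Equations 4.1 to 4.6 are evaluated.

```python
import math
from typing import List, Sequence, Tuple


def mre(actual: Sequence[float], estimated: Sequence[float]) -> List[float]:
    """Magnitude of relative error per observation (Equation 4.1)."""
    return [abs(a - e) / a for a, e in zip(actual, estimated)]


def mer(actual: Sequence[float], estimated: Sequence[float]) -> List[float]:
    """Magnitude of error relative to the estimate (Equation 4.3)."""
    return [abs(a - e) / e for a, e in zip(actual, estimated)]


def mmre(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean of the MRE values (Equation 4.2)."""
    values = mre(actual, estimated)
    return sum(values) / len(values)


def mmer(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean of the MER values (Equation 4.4)."""
    values = mer(actual, estimated)
    return sum(values) / len(values)


def mean_error_and_sd(actual: Sequence[float],
                      estimated: Sequence[float]) -> Tuple[float, float]:
    """Mean error (Equation 4.5) and standard deviation with N - 1 (Equation 4.6)."""
    errors = [a - e for a, e in zip(actual, estimated)]
    n = len(errors)
    mean = sum(errors) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in errors) / (n - 1))
    return mean, sd


# Hypothetical efforts in person hours.
actual = [100, 200, 50]
estimated = [120, 150, 60]
print(round(mmre(actual, estimated), 4))   # (0.2 + 0.25 + 0.2) / 3
print(round(mmer(actual, estimated), 4))   # (20/120 + 50/150 + 10/60) / 3
print(mean_error_and_sd(actual, estimated))
```

Unlike MMRE and MMER, the mean error keeps the sign of each deviation, so over-estimates and under-estimates can cancel; the standard deviation is what reveals the spread in that case.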

F. Case Study

The evaluation of this work was conducted on seven different projects. We collected information about the seven projects from project plans, reports, and analysis and design documents [14]. There is no standard, known conversion between the size in UCP and the size in function points or SLOC. Since some information about the complexity of the projects and the team experience is known for each project, the Technical Factor (TF) and the Environmental Factor (EF) were calculated. Karner suggested that the effort required to develop one UCP is twenty person hours. This method has been criticized by many researchers. Schneider et al. [2] refined Karner's UCP method for determining the effort from UCP. Schneider suggested counting the number of factor ratings of F1-F6 in Table IV (Environmental Factors) that are below three and the number of factor ratings of F7-F8 that are above three. If the total is two or less, twenty person hours per UCP should be used. If the total is three or four, twenty-eight person hours per UCP should be used. If the total is five or more, the project team should be restructured so that the number falls below five, because a value of five or more indicates that the project is at significant risk of failure with this team. In this work, Schneider's method has been used to calculate the size of the projects in UCP from the effort. Equation 1.4 is used to determine the size of each project in UUCP. To distinguish between the results of the UCP method (UCPM) and the proposed fuzzy logic approach, each approach was evaluated separately. Table VII and Fig. 2 compare the estimates of the UCPM and the proposed model with the actual size in UUCP, and Fig. 3 shows the error comparison between the UCPM and the proposed method. It can be concluded that the proposed model improves software effort estimation, lowering the mean MRE by 22 percentage points (from 0.57 to 0.35) and the mean MER by 9 percentage points (from 0.35 to 0.26).

TABLE VII: COMPARISON BETWEEN UCPM AND THE PROPOSED MODEL

Project         Actual Size UUCP    UCPM      Proposed Model    MRE UCPM    MRE Proposed Model    MER UCPM    MER Proposed Model    Error UCPM    Error Proposed Model
Project 1       72.44               128.96    104.98            0.78        0.45                  0.44        0.31                  56.52         32.54
Project 2       74.33               128.54    108.65            0.73        0.46                  0.42        0.32                  54.21         34.32
Project 3       55.50               51.00     48.70             0.08        0.12                  0.09        0.14                  -4.50         -6.80
Project 4       68.00               108.50    92.40             0.60        0.36                  0.37        0.26                  40.50         24.40
Project 5       48.75               74.25     61.25             0.52        0.26                  0.34        0.20                  25.50         12.50
Project 6       94.50               168.75    144.00            0.79        0.52                  0.44        0.34                  74.25         49.50
Project 7       72.50               108.41    92.44             0.50        0.28                  0.33        0.22                  35.91         19.94
Mean            -                   -         -                 0.57        0.35                  0.35        0.26                  40.34         23.77
Standard Dev    -                   -         -                 -           -                     -           -                     25.33         17
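The Mean row of Table VII can be recomputed from the three size columns with a few lines of Python. This is a sketch for checking the arithmetic, not part of the original study: the variable and function names are hypothetical, and the error is taken as estimate minus actual, which is the sign convention visible in the Error columns of the table.

```python
from statistics import mean, stdev

# Columns "Actual Size UUCP", "UCPM" and "Proposed Model" of Table VII.
actual = [72.44, 74.33, 55.50, 68.00, 48.75, 94.50, 72.50]
ucpm = [128.96, 128.54, 51.00, 108.50, 74.25, 168.75, 108.41]
proposed = [104.98, 108.65, 48.70, 92.40, 61.25, 144.00, 92.44]


def summarize(actual_sizes, estimates):
    """Return MMRE, MMER, mean error and sample standard deviation."""
    errors = [e - a for a, e in zip(actual_sizes, estimates)]
    return {
        "mmre": mean(abs(x) / a for x, a in zip(errors, actual_sizes)),
        "mmer": mean(abs(x) / e for x, e in zip(errors, estimates)),
        "mean_error": mean(errors),
        "sd": stdev(errors),  # divides by N - 1
    }


for label, estimates in (("UCPM", ucpm), ("Proposed", proposed)):
    summary = summarize(actual, estimates)
    print(label, {key: round(value, 2) for key, value in summary.items()})

# Gain of the proposed model in percentage points of MMRE and MMER.
base, new = summarize(actual, ucpm), summarize(actual, proposed)
print(round(100 * (base["mmre"] - new["mmre"])),
      round(100 * (base["mmer"] - new["mmer"])))
```

Rounded to two decimals, the recomputed MMRE, MMER and mean error values agree with the Mean row of the table, and the last line reproduces the 22 and 9 percentage point differences quoted in the text.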

Fig: 2 Comparison of Estimated Effort

Fig: 3 Error Comparison

VI. CONCLUSION AND FUTURE WORK

In this paper, the weaknesses of the UCP model are discussed and an enhancement to this model is proposed that uses fuzzy logic to create the linguistic terms very simple and very complex. These terms make the estimation more accurate, especially when the use cases are comparatively complex. Moreover, this approach provides some rules for counting transactions, which is crucial for the use case complexity weight. From the case study results it can be concluded that the proposed approach improves UCP software effort estimation by up to 22%. In future work, the approach can be applied to other UML diagrams such as sequence diagrams and state transition diagrams. In future work the model can also be extended as follows:
• The largest use case should contain at least 21 transactions.
• Extend and include use cases should be considered when estimating the software size.

REFERENCES

[1]. O. Junior, P. Farias, and A. Belchior, “A Fuzzy Model for Function Point Analysis to Development and Enhancement Project Assessments”, CLE Electronic Journal, 5:2, 1999.
[2]. G. Schneider and J. P. Winters, “Applying Use Cases: A Practical Guide”, Second Edition, Addison-Wesley, 2001.
[3]. K. Molokken and M. Jorgensen, “A review of software surveys on software effort estimation”, in Proceedings of the 2003 International Symposium on Empirical Software Engineering (ISESE 2003), pages 223-230, 2003.
[4]. B. Kitchenham and S. Linkman, “Estimates, uncertainty, and risk”, IEEE Software, 14(3):69-74, 1997.
[5]. L. A. Zadeh, “Fuzzy sets”, Information and Control, vol. 8, pp. 338-353, 1965.
[6]. Thomas Fetke, Alan Abran, and Tho-Hau Ngyen, “Mapping the OO-Jacobson approach into function point analysis”, The Proceedings of TOOLS, 23, 1997.
[7]. G. Karner, “Metrics for Objectory”, Diploma thesis, University of Linköping, Sweden, No. LiTH-IDA-Ex-9344:21, December 1993.
[8]. John Smith, “The estimation of effort based on use cases”, Rational Software White Paper, 1999.
[9]. David Longstreet, “Use cases and function points”, Longstreet Consulting Inc., www.softwaremetrics.com, 2001.
[10]. I. Jacobson, M. Christerson, P. Jonsson, and G. Overgaard, “Object-Oriented Software Engineering: A Use Case Driven Approach”, Harlow, Essex, England: Addison Wesley Longman, 1992.
[11]. L. C. Briand, K. E. Emam, D. Surmann, I. Wieczorek and K. D. Maxwell, “An assessment and comparison of common software cost estimation modeling techniques”, ICSE’99, pp. 313-322, 1999.
[12]. B. A. Kitchenham, L. M. Pickard, S. G. MacDonell and M. J. Shepperd, “What Accuracy Statistics Really Measure”, IEE Proceedings - Software, vol. 148, pp. 81-85, 2001.
[13]. T. Foss, E. Stensrud, B. Kitchenham and I. Myrtveit, “A Simulation Study of the Model Evaluation Criterion MMRE”, IEEE Transactions on Software Engineering, vol. 29, pp. 985-995, 2003.
[14]. K. Ribu, “Estimating Object-Oriented Software Projects with Use Cases”, Master of Science Thesis, University of Oslo, Department of Informatics, 7th November 2001.