Software Measurement
Jonathan I. Maletic
Department of Computer Science, Kent State University
4.11.2013

Definitions
Measure - a quantitative indication of the extent, amount, dimension, capacity, or size of some attribute of a product or process.
- Example: number of errors
Metric - a quantitative measure of the degree to which a system, component, or process possesses a given attribute. "A handle or guess about a given attribute."
- Example: number of errors found per person-hour expended

Why Measure Software?
• Determine the quality of the current product or process
• Predict qualities of a product or process
• Improve the quality of a product or process

Example Metrics
• Defect rates
• Error rates
• Measured by:
- individual
- module
- during development
• Errors should be categorized by origin, type, and cost

Metric Classification
• Products
- Explicit results of software development activities
- Deliverables, documentation, by-products
• Processes
- Activities related to the production of software
• Resources
- Inputs into the software development activities
- Hardware, knowledge, people

Product vs. Process
Process Metrics:
- Provide insight into the process paradigm, software engineering tasks, work products, or milestones
- Lead to long-term process improvement
Product Metrics:
- Assess the state of the project
- Track potential risks
- Uncover problem areas
- Adjust workflow or tasks
- Evaluate the team's ability to control quality

Types of Measures
Direct measures (internal attributes)
- Cost, effort, LOC, speed, memory
Indirect measures (external attributes)
- Functionality, quality, complexity, efficiency, reliability, maintainability

Size-Oriented Metrics
• Size of the software produced
• Lines of code (LOC)
• 1000 lines of code = KLOC
• Effort measured in person-months
• Errors/KLOC
• Defects/KLOC
• Cost/LOC
• Documentation pages/KLOC
• LOC is programmer- and language-dependent

LOC Metrics
• Easy to use
• Easy to compute
• Can compute LOC of existing systems, though cost and requirements traceability may be lost
• Language- and programmer-dependent

Function-Oriented Metrics
• Function Point Analysis [Albrecht '79, '83]
• International Function Point Users Group (IFPUG)
• Indirect measure
• Derived using empirical relationships based on countable (direct) measures of the software system (domain and requirements)

Computing Function Points
• Number of user inputs
- Distinct input from the user
• Number of user outputs
- Reports, screens, error messages, etc.
• Number of user inquiries
- On-line input that generates some result
• Number of files
- Logical files (database)
• Number of external interfaces
- Data files/connections used as interfaces to other systems

Compute Function Points
• FP = Total Count * [0.65 + 0.01 * Sum(Fi)]
• Total count is the sum of all the counts, each multiplied by a weighting factor determined for each organization via empirical data
• Fi (i = 1 to 14) are complexity adjustment values

Complexity Adjustment
• Does the system require reliable backup and recovery?
• Are data communications required?
• Are there distributed processing functions?
• Is performance critical?
• Will the system run in an existing, heavily utilized operational environment?
• Does the system require on-line data entry?
• Does the on-line data entry require the input transaction to be built over multiple screens or operations?

Complexity Adjustment (cont.)
• Are the master files updated on-line?
• Are the inputs, outputs, files, or inquiries complex?
• Is the internal processing complex?
• Is the code designed to be reusable?
• Are conversion and installation included in the design?
• Is the system designed for multiple installations in different organizations?
• Is the application designed to facilitate change and ease of use by the user?

Using FP
• Errors per FP
• Defects per FP
• Cost per FP
• Pages of documentation per FP
• FP per person-month

FP and Languages
Language    LOC/FP
Assembly    320
C           128
COBOL       106
FORTRAN     106
Pascal      90
C++         64
Ada         53
VB          32
SQL         12

Using FP
• FP- and LOC-based metrics have been found to be relatively accurate predictors of effort and cost
• A baseline of historical information is needed to use them properly
• Language-dependent
• Productivity factors: people, problem, process, product, and resources
• FP cannot easily be reverse-engineered from existing systems

Complexity Metrics
• LOC - a function of complexity
• Language- and programmer-dependent
• Halstead's Software Science (entropy measures)
- n1 - number of distinct operators
- n2 - number of distinct operands
- N1 - total number of operators
- N2 - total number of operands

Example
if (k < 2) {
    if (k > 3)
        x = x * k;
}
• Distinct operators: if ( ) { } > < = * ;
• Distinct operands: k 2 3 x
• n1 = 10
• n2 = 4
• N1 = 13
• N2 = 7

Halstead's Metrics
• Amenable to experimental verification [1970s]
• Length: N = N1 + N2
• Vocabulary: n = n1 + n2
• Estimated length: Ñ = n1 log2 n1 + n2 log2 n2
- A close estimate of length for well-structured programs
• Purity ratio: PR = Ñ / N

Program Complexity
• Volume: V = N log2 n
- The number of bits needed to provide a unique designator for each of the n items in the program vocabulary
• Program effort: E = V / L
- L = V* / V
- V* is the volume of the most compact design implementation
- This is a good measure of program understandability

McCabe's Complexity Measures
• McCabe's metrics are based on a control-flow representation of the program.
• A program graph is used to depict control flow.
• Nodes represent processing tasks (one or more code statements).
• Edges represent control flow between nodes.

Flow Graph Notation
Flow graph constructs: sequence, while, if-then-else, repeat-until

Cyclomatic Complexity
• The number of independent paths through the graph (the size of the basis set)
• V(G) = E - N + 2
- E is the number of flow graph edges
- N is the number of nodes
• V(G) = P + 1
- P is the number of predicate nodes

Example
i = 0; while (i