Measurement theory is a branch of applied mathematics. The specific theory we use is called the representational theory of measurement.
scales of measurement theory:
nominal:
group subjects into different categories. eg. weather: sunny, rainy, cloudy, other.
jointly exhaustive and mutually exclusive: each measurement can be classified into one and only one category; all categories together should cover all possible values of the measurement.
ordinal:
subjects can be compared in order. eg. worst, bad, average, good, best.
subjects can not be numerically compared, eg. we know that good is better than bad, but we cannot compute the value of good - bad.
we can not use arithmetic operations on this scale
interval:
indicates the exact difference between measurement points. requires well-defined, fixed unit of measurement. eg. temperature scale (celsius, fahrenheit, kelvin): we know exactly how 25C differs from 15C.
ratio:
an interval scale where an absolute or non-arbitrary zero can be located.
absolute zero means the absence of the property being measured. eg. money: 0 means no money in the account
all arithmetic operations can be applied
values may be integers or non-integers.
absolute:
when there is only one way to measure a property, it is independent of any specific substance. eg. counting: the number of objects in a room
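The difference between the scales can be made concrete in code. A minimal sketch (the `Rating` class and its values are hypothetical, purely for illustration): on an ordinal scale, ordering comparisons are meaningful, but arithmetic is not.

```python
from enum import Enum
from functools import total_ordering

# Hypothetical ordinal scale: worst < bad < average < good < best.
@total_ordering
class Rating(Enum):
    WORST = 1
    BAD = 2
    AVERAGE = 3
    GOOD = 4
    BEST = 5

    def __lt__(self, other):
        return self.value < other.value

# ordering is meaningful on an ordinal scale
ordering_works = Rating.GOOD > Rating.BAD

# arithmetic is not: "good - bad" has no defined value on this scale
try:
    Rating.GOOD - Rating.BAD
    arithmetic_allowed = True
except TypeError:
    arithmetic_allowed = False
```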
0 means the technical factor is irrelevant to the project or the use case, 3 is average, 5 corresponds to major effort.
Table 4-5
calculate TCF:
TCF = Constant-1 + Constant-2 * TechnicalFactorTotal = C1 + C2 * sum(W(i) * F(i)), for i from 1 to 13
Where:
C1 = 0.6   # Constant-1
C2 = 0.01  # Constant-2
W(i) = weight of the i-th technical factor, retrieved from table 4-5
F(i) = perceived complexity of the i-th technical factor, a value between 0 and 5 according to complexity
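The TCF calculation can be sketched as follows. The weights are the values commonly used for the 13 technical factors (assumed here to match table 4-5), and the F(i) ratings are hypothetical assessments:

```python
# Sketch of the TCF formula (weights assumed, ratings hypothetical).
C1 = 0.6    # Constant-1
C2 = 0.01   # Constant-2

weights = [2, 1, 1, 1, 1, 0.5, 0.5, 2, 1, 1, 1, 1, 1]  # W(1)..W(13)
ratings = [3, 3, 5, 1, 0, 2, 4, 3, 3, 2, 1, 0, 2]      # F(1)..F(13), each 0-5

technical_factor_total = sum(w * f for w, f in zip(weights, ratings))
TCF = C1 + C2 * technical_factor_total   # 0.6 + 0.01 * 32 = 0.92
```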
measure the experience level of the people on the project and the stability of the project
other external factors can be taken into account, such as the available budget, company’s market position, the state of the economy, etc.
people with greater experience will have a lower ECF (things will be easier for them), while people with lesser experience will have a higher ECF.
has 8 factors, each one ranging from 0 to 5 (table 4-7).
for factors E1-E4, 0 means no experience on the subject, 3 average, 5 means expert.
for factor E5, 0 means no motivation, 3 average, 5 means highly motivated.
for E6, 0 means no changes in requirements, 3 average, 5 means significant changes in requirements.
for E7, 0 means no part-time people in the team, 3 average, 5 means many part-time people in the team.
for E8, 0 means programming language used is easy, 3 average, 5 means programming language used is hard.
table 4-7 summarizes the weights for each factor.
Table 4-7
calculate ECF:
ECF = Constant-1 + Constant-2 * EnvironmentalFactorTotal = C1 + C2 * sum(W(i) * F(i)), for i from 1 to 8
Where:
C1 = 1.4    # Constant-1
C2 = -0.03  # Constant-2
W(i) = weight of the i-th environmental factor, retrieved from table 4-7
F(i) = perceived complexity of the i-th environmental factor, a value between 0 and 5 according to complexity
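The ECF calculation follows the same shape. The weights below are the commonly used environmental-factor weights (assumed here); the F(i) ratings are hypothetical assessments:

```python
# Sketch of the ECF formula (weights assumed, ratings hypothetical).
C1 = 1.4     # Constant-1
C2 = -0.03   # Constant-2

weights = [1.5, 0.5, 1, 0.5, 1, 2, -1, -1]  # W(1)..W(8); E7/E8 weigh negatively
ratings = [4, 2, 3, 5, 3, 2, 1, 2]          # F(1)..F(8), each 0-5

environmental_factor_total = sum(w * f for w, f in zip(weights, ratings))
ECF = C1 + C2 * environmental_factor_total   # 1.4 - 0.03 * 16.5 = 0.905
```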
measures the complexity of the program’s conditional logic (control flow).
a program with no branches is the least complex; a program with loops is more complex; a program with two crossed loops is even more complex.
it also represents the number of linearly independent paths in the program.
a program with fewer different paths is less complex.
calculating the cyclomatic complexity requires converting the program into a graph, to which we can apply graph-theory formulas to calculate the complexity as below.
V(G) = e - n + 2 where e is the number of edges in the graph and n is the number of nodes in the graph.
calculating the cyclomatic complexity:
cyclomatic complexity equals the number of binary decisions in the program + 1.
if the decision is not binary (eg. select, switch, if-elseif-else, etc.), it counts as n - 1 binary decisions, where n is the number of cases in the switch/select statement.
in loops, the loop condition counts as one binary decision (regardless of how many iterations are executed).
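The counting rules above can be sketched as a rough decision counter. This is an assumption-laden sketch: it parses Python source and counts only if/while/for nodes (each elif appears as a nested if, so an if-elif chain naturally contributes one decision per test), and it ignores boolean operators, ternaries, and match statements.

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Count binary decisions (if/while/for) and add 1."""
    tree = ast.parse(source)
    decisions = sum(isinstance(node, (ast.If, ast.While, ast.For))
                    for node in ast.walk(tree))
    return decisions + 1

# `data` is never executed, only parsed
snippet = """
total = 0
for x in data:
    if x > 0:
        total += x
    elif x < 0:
        total -= x
"""
# one loop + two branch tests = 3 decisions, so complexity is 4
complexity = cyclomatic_complexity(snippet)
```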
converting a program into a graph:
The cyclomatic complexity is additive. The complexity of several graphs considered as a group is equal to the sum of individual graphs’ complexities.
cyclomatic complexity formulas:
the original formula: V(G) = e - n + 2p, where e is the number of edges in the graph, n is the number of nodes in the graph, and p is the number of connected components in the graph.
the linearly-independent cyclomatic complexity formula: e - n + p + 1, where e is the number of edges, n is the number of nodes, and p is the number of connected components in the graph.
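Both formulas can be checked on a small control-flow graph. The `cyclomatic` helper and the edge list below are illustrative, not a standard API:

```python
# Compute both cyclomatic complexity formulas for a control-flow
# graph given as an edge list over nodes 0..num_nodes-1.
def cyclomatic(edges, num_nodes, components=1):
    e, n, p = len(edges), num_nodes, components
    original = e - n + 2 * p               # V(G) = e - n + 2p
    linearly_independent = e - n + p + 1   # e - n + p + 1
    return original, linearly_independent

# CFG of a single if-then-else: entry(0) -> then(1)/else(2) -> exit(3)
if_else_edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
result = cyclomatic(if_else_edges, num_nodes=4)   # one decision + 1 = 2
```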
Cyclomatic complexity ignores the complexity of sequential statements: a program with no branches has a complexity of 1, no matter how many sequential statements it contains.
cyclomatic complexity does not distinguish different kinds of control flow complexity, such as loops vs. IF-THEN-ELSE statements or selection statements vs. nested IF-THEN-ELSE statements.
Cyclomatic complexity metric was originally designed to indicate program’s:
testability
understandability.
determines the number of unique tests that need to be run on the program (to cover all branches)
programs with higher cyclomatic complexity are more difficult to understand, test, and maintain.
the recommended value for a program is between 1 and 10.
Internal cohesion = syntactic cohesion: evaluated by examining the code of each individual module.
modularization is very close to the concept of internal cohesion; some simple modularization rules that increase cohesion (examples):
each module should not exceed a certain size. eg. 50 lines of code.
each class has a maximum number of methods and attributes. eg. 10 methods and 10 attributes.
each package has a maximum number of classes. eg. 10 classes.
Coincidental cohesion:
happens when a module performs unrelated tasks, eg. the class needs 3 methods, but to keep consistency with other classes we may add a few methods that we don't need.
this rarely happens in the initial design.
as we modify the code (due to bug fixes or requirements changes), coincidental cohesion may appear.
By identifying different types of module cohesion we can create a nominal scale for cohesion measurement.
A stronger scale is an ordinal scale, which can be created by asking an expert to subjectively assess the quality of different types of module cohesion and produce a rank-ordering.
functional cohesion: the module (class, unit) performs a single well-defined function or achieves a single well-defined goal.
sequential cohesion: the module performs more than one function, but these functions occur in an order prescribed by the specification, eg. they are strongly related to each other.
communication cohesion: the module performs multiple functions, but all are targeted on the same data or the same set of data; the data is not organized in an OOP manner (the data does not belong to a specific class or structure).
procedural cohesion: the module performs multiple functions that are procedurally related, the code in each module represents a single piece of functionality defining a control sequence of activities.
Temporal cohesion: the module performs multiple functions, but all are related to the same time period. eg. a module that combines all initializations at the beginning of the program although these are not related to each other.
logical cohesion: module performs a series of similar functions, eg. Math module which contains all the mathematical functions (logically related), but they are totally independent eg. logarithm operations are not related to square root or trigonometric operations.
interface-based metrics: compute class cohesion from information in method signatures
code-based metrics: compute class cohesion from the code itself
code-based cohesion metrics can also be classified into 4 sub-types:
disjoint component-based metrics: count the number of disjoint sets of methods or attributes in a class
pairwise connection-based metrics: compute cohesion as a function of the number of connected and disjoint method pairs.
connection magnitude-based metrics: count the accessing methods per attribute and indirectly find an attribute-sharing index in terms of the count.
decomposition-based metrics: compute cohesion in terms of recursive decompositions of a given class. the decompositions are generated by removal of pivotal elements that keep the class connected.
These metrics evaluate the consistency of methods in a class’s interface using the lists of parameters of the methods.
requires only class prototypes to be available and not the actual implementation of the class.
one of these metrics is Cohesion Among Methods of Classes (CAMC)
CAMC is based on the assumption that the parameters of a method reasonably define the types of interaction that method may implement.
4.3.3 Cohesion Metrics using Disjoint Sets of Elements
An early metric of this type is the Lack of Cohesion of Methods (LCOM1).
this metric counts the number of pairs of methods that do not share any class attributes, eg. method pairs for which the intersection of the sets of class attributes used by each method is empty.
perfect cohesion is achieved when all methods access all class attributes, so the expected value of LCOM1 is 0.
the worst cohesion occurs when each method accesses only one (or none) of the class attributes; LCOM1 then equals the total number of method pairs.
there are newer versions of this metric:
LCOM2: calculates the difference between the number of method pairs that do and do not share their class attributes.
LCOM3: counts the number of connected components in the graph whose nodes are the methods and whose edges connect methods that share at least one attribute.
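A minimal sketch of LCOM1 (plus the LCOM2 variant) on a hypothetical class, representing each method by the set of class attributes it uses:

```python
from itertools import combinations

# Hypothetical class: method name -> set of class attributes it uses.
methods = {
    "open":  {"path", "handle"},
    "read":  {"handle", "buffer"},
    "close": {"handle"},
    "log":   {"logfile"},
}

pairs = list(combinations(methods.values(), 2))
disjoint = sum(1 for a, b in pairs if not (a & b))  # pairs sharing nothing
sharing = len(pairs) - disjoint                     # pairs sharing something

lcom1 = disjoint                   # "log" shares nothing with the other 3 methods
lcom2 = max(0, disjoint - sharing) # difference, floored at 0
```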
Cohesion = module “strength”: refers to the notion of module-level “togetherness” viewed at the system abstraction level.
semantic cohesion assesses whether the module semantically represents a whole
Semantic complexity metrics evaluate whether an individual class is really an abstract data type in the sense of being complete and also coherent.
to be semantically cohesive, a class should contain everything that one would expect from this class and no more.
It is possible to have a class with high internal, syntactic cohesion but little semantic cohesion.
Individually semantically cohesive classes may be merged to give an externally semantically nonsensical class while retaining internal syntactic cohesion.
example:
a CAR_PERSON class that defines a one-to-one relationship between a car and a person.
assuming that there is no intersection between the CAR and PERSON classes, the CAR_PERSON class can contain both the CAR and PERSON classes in it.
if the CAR and PERSON classes are internally cohesive syntactically, then the merged CAR_PERSON class is syntactically cohesive.
however, semantically, a CAR_PERSON class does not make sense, so this class is semantically incoherent yet syntactically coherent.
our abstractions are unavoidably approximate. The term often used is “coarse graining,” which means that we are blurring detail in the world picture and singling out only the phenomena we believe are relevant to the problem at hand.
One way of defining the complexity of a program or system is by means of its description, that is, the length of the description.
The crude complexity of the system: the length of the shortest message that one party needs to employ to describe the system, at a given level of coarse graining, to the other party.
Algorithmic information content (AIC): is defined as the length of the shortest possible program that prints out a given string.
logical depth: the difficulty of going from the shortest program that can print the description to the actual description of the system. (Charles Bennett).
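AIC itself is uncomputable in general, but compressed length is a widely used practical proxy for it: a regular string has a short description ("print 'ab' 500 times"), so it compresses far better than an irregular string of the same length. A sketch using zlib (the byte strings are illustrative):

```python
import zlib

# Regular string: short description, so a small compressed size.
regular = b"ab" * 500

# Irregular-looking deterministic string of the same length.
irregular = bytes((i * 97 + 13) % 256 for i in range(1000))

len_regular = len(zlib.compress(regular))
len_irregular = len(zlib.compress(irregular))
```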
4.6.1 Deriving Project Duration from Use Case Points
Duration = UCP * PF where UCP is the use case points and PF is the productivity factor (the team velocity).
The Productivity Factor (PF) is the ratio of development person-hours needed per use case point. Past projects and experience are important for estimating this factor.
example:
a past project with a UCP of 112 took 2,550 hours to complete.
divide 2,550 by 112 to obtain a PF of 23 person-hours per use case point.
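The worked example above, extended with a hypothetical new project (the 90-UCP figure is invented for illustration):

```python
# Derive PF from a completed project, then estimate a new one.
past_ucp = 112
past_hours = 2550
PF = past_hours / past_ucp        # ~22.8, rounded to 23 person-hours per UCP

new_ucp = 90                      # hypothetical new project
duration_hours = new_ucp * round(PF)
```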
If no historical data has been collected, the developer can consider one of these options:
Establish a baseline by computing the UCP for projects previously completed by your team (if such are available).
Use a value for PF between 15 and 30 depending on the development team’s overall experience and past accomplishments (Do they normally finish on time? Under budget? etc.). For a team of beginners, such as undergraduate students, use the highest value (i.e., 30) on the first project.