Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalization of LCOM produces counter intuitive results and makes results less comparable #76

Open
maxmeffert opened this issue Apr 26, 2021 · 1 comment
Assignees

Comments

@maxmeffert
Copy link

maxmeffert commented Apr 26, 2021

Hi,
as far as I understand the Code and the somewhat related blog posts [1][2] and papers [3][4] the method for computing LCOM implemented in DesigniteJava is YALCOM. YALCOM seems to be an extended version of LCOM proposed by Hitz & Montazeri [3] adding support for measuring aspects of cohesion related to inheritance and applicability. I strongly agree with the idea of capturing the cohesive properties of inheritance. On a side node: I would naively like to propose to also consider calls to inherited methods since attributes of super types could also be transitively accessed by them from a sub type. Moreover, consideration of applicability seems nothing but an improvement to me. However, I have some problems comprehending the method normalization, i.e. division by number of methods.

My Understanding of LCOM

From my graph theoretic understanding of LCOM and YALCOM count the number of disconnected components of an graph representing direct and indirect usages of attributes by methods. Perfect cohesion is indicated if this graph consists of only one such component. The graph is considered less cohesive for each additional component present. Eventually the graph is considered least cohesive if it consists of the maximum number of possible disconnected components which is the number of vertices, i.e. the total number of methods.

So we have:

(perfect cohesion) 1 <= number of disconnected components <= number of methods (least cohesion)
(perfect cohesion) 0 <= normalized LCOM <= 1 (least cohesion)

Problem 1: Counter-intuitive Results

Considering a class with 3 disconnected components and 3 methods YALCOM produces a normalized result of 1 indicating least cohesion as expected. For a class with 3 disconnected components but now with 30 methods YALCOM produces a normalized result of 0.1 indicating acceptable cohesion. Obviously both cases have the same structural cohesive properties. However, normalization with division by number methods causes YALCOM to produce diametrically opposed results. Sadly, in-cohesive classes with 30 or more methods can be encountered in the wild which makes this anomaly counter intuitive to me.

Problem 2: Results are less comparable

Dividing the number of disconnected components by the number of methods of the same class tightly couples the measure to that class since different classes can have different numbers of methods. The only practical interpretation of the normalized LCOM I could come up with is "the cohesion is not as bad as it could be". This makes it impossible to use the normalized LCOM to compare different classes which seems to somewhat defeat the purpose of normalization to me.

Proposed Solution

Instead of dividing the number disconnected components by the number of methods it seems more practical to me to implement normalization with the following ratio:

Let NDC be the number of disconnected components.
LCOM := (NDC-1)/NDC

Since cases where the number of disconnected components could be zero is already handled by the applicability portion of YALCOM division by zero should not be problem. Also this ratio emphasizes the perfect cohesion with only one disconnected component and should facilitate an intuitive understanding of LCOM by encoding the Lack of cohesion in a percentage value, i.e. given the examples from Problem 1 two-thirds or 66% of a class are not cohesive with the rest.

Happy to hear your thoughts on the matter.

Regards
Max

References

  1. http://www.tusharma.in/technical/revisiting-lcom/
  2. http://www.tusharma.in/research/yalcom-yet-another-lcom-metric/
  3. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=F450176CD6FB83AA1F137BFBBEFC79C7?doi=10.1.1.409.4862&rep=rep1&type=pdf
  4. https://arxiv.org/pdf/2012.12324.pdf
@tushartushar
Copy link
Owner

Thanks Max for the detailed explanation and issue. It indeed needs a deeper look.
I am putting it in my TODO list.

@tushartushar tushartushar self-assigned this Apr 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants