The Basic Four Measures and their Derivates in Dichotomous Diagnostic Tests

Citation: Ostrowski TR, Ostrowski T (2020) The Basic Four Measures and their Derivates in Dichotomous Diagnostic Tests. Int J Clin Biostat Biom 6:026. doi.org/10.23937/2469-5831/1510026 Accepted: June 03, 2020: Published: June 05, 2020 Copyright: © 2020 Ostrowski TR, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Tadeusz R Ostrowski, MD

D+, D- stand for disease positive and disease negative, respectively. For convenience, the same symbol also denotes the corresponding frequency, e.g. T+ = TP + FP is the frequency of T+ (and likewise for the other concepts); this should not cause confusion in the paper.
The elements of the matrix M are also named according to the common convention (recall that the intersection of two sets A and B, denoted by A ∩ B, is the set containing all elements of A that also belong to B). Therefore the confusion matrix (1) contains all the information needed for the quantitative assessment of diagnostic test accuracy. Regrettably, only in an ideal world can a test be perfect (a positive patient has the target condition, and a negative patient does not have the condition of interest, so FP = FN = 0). In the real world, i.e. in practice, that kind of test is "a rare bird", so the FN and FP cells are not empty, and we have to deal with false results.

Preliminaries
The aim of dichotomous diagnostic tests is to determine or predict the presence or absence of a target condition (a disease or an infection) in study subjects. As is known, the clinical development of new treatments is impossible without them. Different diagnostic measures relate to different aspects of the diagnostic procedure: some of them are used to assess the discriminative property of the test, others to estimate its predictive ability or overall accuracy. Let 2 by 2 matrix M (also called a confusion matrix) representing a contingency
Sensitivity (Se) can equivalently be seen as the measure answering the question "What are the odds of having a positive test in the presence of disease?" Because the odds of Se equal TP/FN, the lower FN is, the higher the odds of Se, and for this reason Se is said to be sensitive to disease.
Specificity (Sp), or true negative rate, sometimes called selectivity, is a measure answering the question "If a patient does not have the disease, how likely is the patient to have a negative test?" Equivalently, it is a measure answering the question "What are the odds of having a negative test in the absence of disease?"
The derivate of sensitivity and specificity is the so-called Youden index, and the derivate of the predictive values is known as the predictive summary index.
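The two questions about sensitivity and specificity can be made concrete with a small numerical sketch. The `ConfusionMatrix` container, the function names, and the counts below are illustrative assumptions and do not come from the paper; the odds are taken in the usual sense p / (1 - p).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts of a 2 by 2 table (hypothetical container, not from the paper)."""
    tp: int  # diseased, test positive
    fp: int  # healthy, test positive
    fn: int  # diseased, test negative
    tn: int  # healthy, test negative


def sensitivity(m: ConfusionMatrix) -> float:
    """Share of diseased subjects with a positive test."""
    return m.tp / (m.tp + m.fn)


def specificity(m: ConfusionMatrix) -> float:
    """Share of healthy subjects with a negative test."""
    return m.tn / (m.tn + m.fp)


def odds(p: float) -> float:
    """Odds in favor of an event with probability p (requires p < 1)."""
    return p / (1.0 - p)


# Made-up counts, chosen only for illustration.
m = ConfusionMatrix(tp=90, fp=20, fn=10, tn=80)
se = sensitivity(m)
sp = specificity(m)

# The odds of a positive test given disease reduce to TP / FN,
# and the odds of a negative test given no disease reduce to TN / FP.
odds_se = odds(se)
odds_sp = odds(sp)
```

With these counts the odds of Se agree with TP/FN up to rounding, which is why a smaller FN cell pushes the odds of Se upward.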
Recall that if E, F are two events, then P(E/F) stands for the conditional probability of E given F, i.e. the probability of the event E occurring given that the event F has occurred. Formally, for P(F) > 0,
$$P(E/F) = \frac{P(E \cap F)}{P(F)}.$$
Closely related to the idea of probability are the odds (familiar to gamblers and in betting), used to describe the chance of an event occurring. Probability and odds are different ways of expressing a similar concept. The odds of an event E happening, or the odds in favor of E, denoted in this paper by O(E), are the ratio of the probability that E will occur to the probability that E will not occur. Formally,
$$O(E) = \frac{P(E)}{1 - P(E)}.$$
By the way, there is another way [2,3] to determine this measure, widely used by clinicians, namely through the ratio PPV / (1 - NPV), i.e. the relative risk of the disease for an exposure to that for non-exposure, called the positive predictive ratio (PPR). As is known, clinicians commonly prefer to use it, but from a patient's point of view the negative predictive ratio (NPR), i.e. the relative risk of non-disease for the exposure to that for non-exposure, given by (1 - PPV) / NPV, usually seems preferable. Therefore the diagnostic odds ratio is the ratio of these two ratios, PPR relative to NPR.
Note that applying the notation Se = 1 - β, Sp = 1 - α, the matrix equivalent to M_D has the following form

Recall
where α is related to the type I error (the probability of falsely classifying a healthy person as diseased), and β is related to the type II error (the probability of falsely classifying a diseased person as healthy).
The Youden index (γ) is a measure of the goodness of detectability, formally defined by the following formula:
$$\gamma = Se + Sp - 1, \qquad \gamma \in \langle -1, 1 \rangle.$$
The Predictive Summary Index (ψ), similar in construction to γ and introduced by Linn and Grunau [1], is a measure of the goodness of predictability in a diagnostic test, defined as follows:
$$\psi = PPV + NPV - 1, \qquad \psi \in \langle -1, 1 \rangle.$$
Note that, as a derivate of sensitivity and specificity, the Youden index γ can be interpreted as four "excess coins". Similarly, as a derivate of PPV and NPV, the predictive summary index ψ can also be interpreted as four excess coins.
$$MCC = \pm\sqrt{\gamma \cdot \psi},$$
where the sign of MCC is determined by the sign of detM, and the product γ * ψ on the right side is always nonnegative because γ and ψ are both negative, both zero, or both positive.
The distance between the vertex C (1 - Sp, Se) of the trapezoid OBCD and the point (Se, Se) of the straight line y = x is equal to γ, and it achieves its maximum. The point C lies on the parallel straight line y = x + γ. It can be seen that "anywhere you look, there γ is a cook" (it sounds like poetry, and it is reality!).
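The relations among the derivates discussed above can be checked numerically. The following is a minimal sketch, not the authors' code: the helper `derived_measures` and both sets of counts are hypothetical, MCC is computed from its standard count-based formula, and all four cells are assumed to be nonzero so that no division by zero occurs.

```python
import math


def derived_measures(tp, fp, fn, tn):
    """Derived measures of a 2 by 2 table (hypothetical helper; all cells > 0)."""
    se = tp / (tp + fn)        # sensitivity
    sp = tn / (tn + fp)        # specificity
    ppv = tp / (tp + fp)       # positive predictive value
    npv = tn / (tn + fn)       # negative predictive value
    gamma = se + sp - 1.0      # Youden index
    psi = ppv + npv - 1.0      # predictive summary index
    det_m = tp * tn - fp * fn  # determinant of the confusion matrix
    # Standard count-based formula for MCC.
    mcc = det_m / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    ppr = ppv / (1.0 - npv)    # positive predictive ratio
    npr = (1.0 - ppv) / npv    # negative predictive ratio
    return {
        "se": se, "sp": sp, "gamma": gamma, "psi": psi,
        "det_m": det_m, "mcc": mcc, "dor": ppr / npr,
    }


# Made-up counts: one test with detM > 0 and one with detM < 0.
good = derived_measures(tp=90, fp=20, fn=10, tn=80)
bad = derived_measures(tp=10, fp=80, fn=90, tn=20)

# MCC squared equals gamma * psi, and MCC carries the sign of detM.
identity_holds = math.isclose(good["mcc"] ** 2, good["gamma"] * good["psi"])

# The ROC point C = (1 - Sp, Se) lies on the line y = x + gamma.
on_parallel_line = math.isclose(good["se"], (1.0 - good["sp"]) + good["gamma"])
```

In this sketch the ratio PPR / NPR also reduces to the familiar cross-product (TP * TN) / (FP * FN), which is the diagnostic odds ratio.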

Discussion
The sufficient condition for a diagnostic test to be useless was established as detM ≤ 0. Unfortunately, for a test to be possibly useful, only the necessary condition detM > 0 can be determined. Sufficiency depends upon the aim of applying the diagnostic test results. As is known, if the aim is to rule out a target condition, then high sensitivity is required; if the aim is to rule in the condition, then high specificity is needed.
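The determinant criterion can be illustrated with a short sketch. It assumes a layout in which the rows of M are the test results (T+, T-) and the columns are the disease status (D+, D-), so that detM = TP * TN - FP * FN; the function names and the counts are hypothetical. Exact fractions are used so that equalities can be compared without rounding.

```python
from fractions import Fraction


def det_m(tp, fp, fn, tn):
    """Determinant of the 2 by 2 matrix [[TP, FP], [FN, TN]] (assumed layout)."""
    return tp * tn - fp * fn


def is_useless(tp, fp, fn, tn):
    """Sufficient condition for a useless test: detM <= 0."""
    return det_m(tp, fp, fn, tn) <= 0


def ppv_and_prevalence(tp, fp, fn, tn):
    """PPV and sample prevalence as exact fractions."""
    ppv = Fraction(tp, tp + fp)
    prevalence = Fraction(tp + fn, tp + fp + fn + tn)
    return ppv, prevalence


# Made-up counts: the first table has proportional rows (30, 60) and (10, 20).
singular = dict(tp=30, fp=60, fn=10, tn=20)
informative = dict(tp=90, fp=20, fn=10, tn=80)

# With detM = 0 a positive result leaves the probability of disease unchanged.
ppv_s, prev_s = ppv_and_prevalence(**singular)

# With detM > 0 a positive result raises it above the prevalence.
ppv_i, prev_i = ppv_and_prevalence(**informative)
```

Passing the check `not is_useless(...)` only establishes the necessary condition; whether the test is actually useful still depends on the clinical aim.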
Note that PPV and NPV are not intrinsic to the test; they also depend on the prevalence Pr. Applying Bayes' Theorem to the formula PPV = P(D+/T+), we obtain the well-known adjusted formula for the positive predictive value:
$$PPV = \frac{Se \cdot Pr}{Se \cdot Pr + (1 - Sp) \cdot (1 - Pr)}.$$
If the sample sizes do not reflect the real prevalence of the disease, then PPV should be calculated using the adjusted formula. As can be checked, we also have (I) if detM > 0, then PPV > Pr; (II) the partial derivative $\partial PPV / \partial Pr > 0$ for all fixed Se, Sp, so PPV is an increasing function of Pr. Roughly, the lower Pr, the smaller PPV; the higher Pr, the greater PPV. When Pr is low, a greater Sp is needed to achieve a higher PPV.
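Properties (I) and (II) can be verified directly. The derivation below is a sketch under the assumptions that 0 < Pr < 1, Se > 0 and Sp < 1, and that detM > 0 is equivalent to Se + Sp - 1 > 0; the shorthand Δ for the denominator is introduced here only for brevity.

```latex
\begin{align*}
PPV = P(D^{+}/T^{+})
  &= \frac{P(T^{+}/D^{+})\,P(D^{+})}
          {P(T^{+}/D^{+})\,P(D^{+}) + P(T^{+}/D^{-})\,P(D^{-})}
   = \frac{Se \cdot Pr}{\Delta},
  \qquad \Delta = Se \cdot Pr + (1 - Sp)(1 - Pr), \\[4pt]
PPV - Pr
  &= \frac{Pr\,(1 - Pr)\,\bigl(Se - (1 - Sp)\bigr)}{\Delta}
   = \frac{Pr\,(1 - Pr)\,(Se + Sp - 1)}{\Delta}, \\[4pt]
\frac{\partial PPV}{\partial Pr}
  &= \frac{Se\,\Delta - Se \cdot Pr\,\bigl(Se - (1 - Sp)\bigr)}{\Delta^{2}}
   = \frac{Se\,(1 - Sp)}{\Delta^{2}} > 0 .
\end{align*}
```

The second line shows that PPV exceeds Pr exactly when Se + Sp - 1 is positive, and the third line shows that PPV grows with Pr for fixed Se and Sp.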
The adjusted formula for the negative predictive value is known as
$$NPV = \frac{Sp \cdot (1 - Pr)}{(1 - Se) \cdot Pr + Sp \cdot (1 - Pr)}.$$
Similarly, if the sample sizes do not reflect the real prevalence of the disease, then NPV should be calculated using the adjusted formula. Furthermore, (I) if detM > 0, then NPV > 1 - Pr; (II) the partial derivative $\partial NPV / \partial Pr < 0$ for all fixed Se, Sp, so NPV is a monotonically decreasing function of Pr. Roughly speaking, the higher Pr, the smaller NPV, and the lower Pr, the greater NPV. When a disease is common (Pr is high), a greater Se is needed to achieve a higher NPV. An illustration of the effect of disease prevalence on PPV and NPV can be found in [7].
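The effect of prevalence on both predictive values can be explored with a short sketch of the adjusted formulas. The function names, the fixed Se and Sp, and the prevalence grid are illustrative assumptions, not values taken from the paper.

```python
def adjusted_ppv(se, sp, pr):
    """Adjusted positive predictive value for prevalence pr."""
    return se * pr / (se * pr + (1.0 - sp) * (1.0 - pr))


def adjusted_npv(se, sp, pr):
    """Adjusted negative predictive value for prevalence pr."""
    return sp * (1.0 - pr) / ((1.0 - se) * pr + sp * (1.0 - pr))


# Hypothetical test characteristics with Se + Sp - 1 > 0.
se, sp = 0.9, 0.8

# Hypothetical prevalence grid, from a rare to a common disease.
prevalences = [0.01, 0.1, 0.3, 0.5, 0.9]
ppvs = [adjusted_ppv(se, sp, pr) for pr in prevalences]
npvs = [adjusted_npv(se, sp, pr) for pr in prevalences]

for pr, ppv, npv in zip(prevalences, ppvs, npvs):
    print(f"Pr={pr:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Along the grid PPV rises and NPV falls as Pr grows. When the sample prevalence equals Pr, the adjusted values coincide with the values computed directly from the counts.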
The paper shows connections between linear algebra on the one side and test statistics on the other. In particular, if a contingency matrix is singular, i.e. its determinant is 0, then the test is uninformative; this happens when its rows (columns) are proportional. Matrix