What statement about the relationship between statistical power and a statistical probability is true?

Both concepts are unified in the form of the risk function.

Generally, a statistical procedure $t$ can be considered a principled way of determining some action to take upon observing data governed by some probability distribution $F$ (aka "model" or "state of nature") that is not completely determined (some of its properties are unknown). When the action can be quantitatively compared to the action that would be performed if the distribution were known, it becomes possible to compare procedures quantitatively, too.

The comparison requires us to consider--hypothetically--every possible $F$ that might occur. For a given $F$, the data $X$ are considered a random sample of $F.$ The procedure $t$ determines the action $t(X)$ to take upon observing $X.$ Let $a(F)$ be the best possible action to take if $F$ were known. The value of the loss function $\mathcal L$ is zero when $t(X)=a(F)$ and otherwise is some positive number representing the "cost" or "loss" associated with taking the possibly inferior action $t(X).$ Usually the loss is directly expressed as a cost of doing $t(X)$ when $F$ is the true state of nature, written

$$\mathcal{L}(a, F).$$

The risk associated with $F$ for the procedure $t$ is the expected loss,

$$r_{\mathcal{L};\,t}(F) = E_F[\mathcal{L}(t(X), F)],$$

where the expectation is taken with $X$ distributed according to $F.$

Notice how (given $\mathcal L$ and $t$) this is a function of $F.$ This fact complicates the comparison of statistical procedures, because typically one procedure may have lower risk for some $F$ and the other procedure may have lower risk for other $F.$
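To make this concrete, here is a minimal Monte Carlo sketch in Python (my own illustration; the two procedures, the loss, and all parameter values are invented for the example). It estimates the risk of two procedures for estimating the mean of a Normal$(\mu,1)$ distribution under squared-error loss, where the best action $a(F)$ is $\mu$ itself:

```python
import numpy as np

rng = np.random.default_rng(1)

def monte_carlo_risk(procedure, mu, n=10, reps=20_000):
    """Estimate the risk E[L(t(X), F)] for F = Normal(mu, 1),
    squared-error loss, and best action a(F) = mu."""
    X = rng.normal(mu, 1.0, size=(reps, n))    # `reps` data sets drawn from F
    return np.mean((procedure(X) - mu) ** 2)   # average loss over data sets

t_mean = lambda X: X.mean(axis=1)              # t1: the sample mean
t_shrunk = lambda X: 0.5 * X.mean(axis=1)      # t2: a shrinkage procedure

for mu in (0.0, 0.5, 1.0):
    print(f"mu={mu:3.1f}  risk(t1)={monte_carlo_risk(t_mean, mu):.3f}"
          f"  risk(t2)={monte_carlo_risk(t_shrunk, mu):.3f}")
# Neither procedure dominates: t2 has lower risk near mu = 0 and higher
# risk for large |mu| -- exactly the comparison problem described above.
```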

In its most basic form, null hypothesis testing (NHT) concerns just two possible actions: "accept" or "reject." The "null hypothesis" is a set $\Theta_0$ of models, while the "alternative hypothesis" is a complementary set $\Theta_A.$ People doing null hypothesis testing are concerned about getting the correct answer: that is, there is no loss whenever $t(X)$ is "accept" when $F\in\Theta_0$ or $t(X)$ is "reject" when $F\in\Theta_A.$ Otherwise, an "error" is made: a false positive occurs when $t(X)$ is "reject" and $F\in\Theta_0$ and a false negative occurs when $t(X)$ is "accept" and $F\in\Theta_A.$ All errors are considered to have the same loss.

We might as well choose units in which this common value is $1.$ Mathematically, this binary loss function can be expressed as

$$\begin{aligned} \mathcal{L}(\text{accept}, F) &= \begin{cases} 0 & \text{if } F\in\Theta_0 \\ 1 & \text{if } F\in\Theta_A \end{cases} \\ \mathcal{L}(\text{reject}, F) &= \begin{cases} 1 & \text{if } F\in\Theta_0 \\ 0 & \text{if } F\in\Theta_A \end{cases} \end{aligned}$$

From this we may easily compute that the risk function is

$$r_t(F) = \begin{cases} \Pr_F(t(X)=\text{reject}) & \text{if } F\in\Theta_0 \\ \Pr_F(t(X)=\text{accept}) & \text{if } F\in\Theta_A \end{cases}$$

(Since we will be discussing the binary loss function from now on, I have dropped all references to it in the notation.)
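For a concrete instance, this risk function can be computed in closed form for the usual one-sided Z test (anticipating the example below). A sketch, with illustrative values of $n,$ $\sigma,$ and $\alpha$ chosen by me:

```python
import numpy as np
from scipy.stats import norm

def z_test_risk(mu, n=25, sigma=1.0, alpha=0.10):
    """Binary-loss risk of the one-sided Z test that rejects when
    sqrt(n) * xbar / sigma exceeds the (1 - alpha) Normal quantile."""
    z_crit = norm.ppf(1 - alpha)                          # rejection threshold
    p_reject = norm.sf(z_crit - mu * np.sqrt(n) / sigma)  # Pr_F(t(X) = reject)
    return p_reject if mu <= 0 else 1 - p_reject          # Pr(reject) on the null,
                                                          # Pr(accept) on the alternative
for mu in (-0.2, 0.0, 0.2, 0.5):
    print(f"mu={mu:5.2f}  risk={z_test_risk(mu):.3f}")
# The risk rises to alpha = 0.10 at the null boundary mu = 0; on the
# alternative it is the false-negative probability, which falls as the
# effect grows.
```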

So far, there has been no essential difference between $\Theta_0$ and $\Theta_A,$ nor need there be. However, null hypothesis testing is usually conducted in an asymmetrical way: one chooses procedures that limit the risk when the null is true. This limiting risk is the "size" of the test, or its "alpha," given by

$$\alpha = \sup\left\{r_t(F)\mid F\in \Theta_0\right\}.$$

One selects a test that tends to make the values of the risk function for the alternative hypothesis as small as possible. Which values, exactly? It depends on your objectives and assumptions. Thus, it's common for the statistician to ask her client for (a) an acceptably small value of $\alpha$ and (b) some indication of the largest risks the client can endure for some key models in the alternative hypothesis. Let's look at an illustrative example.

Figure 1 shows a textbook case where $\Theta_0$ is the set of Normal$(\mu,\sigma^2/n)$ distributions with $\mu \le 0$ and $\Theta_A$ is the set of Normal$(\mu,\sigma^2/n)$ distributions with $\mu > 0;$ both $\sigma^2$ and $n$ are fixed. The procedure is the usual "Z test." ($n$ is a potential sample size; varying $n$ causes the risk to change, enabling one to determine a sample size that makes the risk function acceptably low.)

[Figure 1. The risk function $r_t(F)$ of the Z test, plotted against the effect $\mu,$ with shaded background regions.]

The red curve graphs $r_t(F)$ (where $F$ can be identified with a multiple of the parameter $\mu,$ termed the "effect"). $\Theta_0$ therefore corresponds to the points on the horizontal ($F$) axis at or to the left of $0.$ The test has been chosen so that the highest point on the graph for $\Theta_0$ is $\alpha=0.10.$
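A curve like Figure 1's can be reproduced along the following lines (a sketch; the original figure's $\sigma,$ $n,$ and axis scaling are not specified, so these values are guesses for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

n, sigma, alpha = 25, 1.0, 0.10      # illustrative values only
mu = np.linspace(-0.5, 1.0, 301)     # the "effect" axis
p_reject = norm.sf(norm.ppf(1 - alpha) - mu * np.sqrt(n) / sigma)
risk = np.where(mu <= 0, p_reject, 1 - p_reject)  # binary-loss risk r_t(F)

plt.plot(mu, risk, color="red")
plt.axhline(alpha, linestyle=":", color="gray")   # the size of the test
plt.xlabel("effect $\\mu$")
plt.ylabel("risk $r_t(F)$")
plt.show()
```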

The background regions summarize the conversation between statistician and client:

  • The black rectangle at the left is the region where the risk does not exceed $\alpha=0.10$ for the null hypothesis.

  • The black rectangle at the right is a region within the alternative hypothesis where the risk does not exceed a value $\beta=0.20.$

  • The gray rectangle is a "gray area" between these regions where no restrictions are placed on the risk.

The client has selected $\alpha,$ $\beta,$ and the left-hand limit $\Delta$ of the right black rectangle. This value $\Delta$ is an "effect size." We may characterize the states of nature with $\mu \ge \Delta$ as being "extreme" because they differ most substantially from any state in the null hypothesis. By limiting the risk in the extreme states we are, in a way, modifying (or weighting) the loss function to reflect a sense that false negative errors made for extreme states are worse than false negative errors made for states in the intermediate gray area.
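Once $\alpha,$ $\beta,$ and $\Delta$ are fixed, the Z-test sample size follows from requiring the risk at $\mu = \Delta$ to be at most $\beta,$ which yields the familiar bound $n \ge \left((z_{1-\alpha}+z_{1-\beta})\,\sigma/\Delta\right)^2.$ A quick sketch (the numerical inputs are illustrative, not taken from the figure):

```python
import math
from scipy.stats import norm

def z_test_sample_size(alpha, beta, delta, sigma=1.0):
    """Smallest n at which the one-sided Z test of size alpha has risk
    (false-negative probability) at most beta at the effect size delta."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    return math.ceil(((z_a + z_b) * sigma / delta) ** 2)

print(z_test_sample_size(alpha=0.10, beta=0.20, delta=0.5))  # 19 for these inputs
```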

What this test accomplishes, then, can be stated as follows:

The test $t$ is a procedure that limits the risk to $\alpha$ for the null hypothesis and limits it to $\beta$ for the "more extreme" parts of the alternative hypothesis.

It is conceptually very useful to keep Figure 1 in mind when thinking about NHTs, for it shows clearly that although the error rates are controlled, they nevertheless depend on the true (but still unknown!) state of nature.

This figure is usually drawn by flipping the risk upside-down in the alternative hypothesis, as in Figure 2:

[Figure 2. The power curve of the test: the risk curve of Figure 1 flipped upside-down over the alternative hypothesis.]

This "power curve" is simply the chance of making a "reject" decision. By comparing this figure to the previous one, you can see that where the power is high, the chance of making a false negative error is low. Because the test is constructed to assure the chance of a "reject" is low throughout $\Theta_0,$ often the power curve isn't even plotted for the null hypothesis: it simply is summarized by the test size $\alpha.$ The "significance level" of the test is just $1-\alpha,$ or $90\%$ in this example.


References

Kiefer, J.C. (1987), Introduction to Statistical Inference.

US EPA (2006), Guidance on Systematic Planning Using the Data Quality Objectives Process.

NUREG-1575, Rev. 1 (2000), Multi-Agency Radiation Survey and Site Investigation Manual.

Kiefer is a decision-theoretic account of statistics. I have used his language and notation. US EPA uses graphical explanations (similar to the second figure here) to help people develop appropriate quantitative criteria and statistical procedures for environmental decision making. This guidance has been adopted by many other US agencies, such as the Nuclear Regulatory Commission and Department of Energy: see NUREG-1575 for one example out of many. The term "gray area" or "gray region" is taken from these guidance documents.
