Difference between revisions of "Notes:Distribution of the sample median"
Revision as of 17:05, 16 December 2017
Problem overview
Let X_1,\ldots,X_{2m+1} be a sample from a population X, for some m\in\mathbb{N}_{0} , meaning that the X_i are i.i.d random variables. We wish to find:
- \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r} - the cdf of the median.
Initial work
Since the X_i are i.i.d, every ordering of them is as likely as any other, so each ordering has probability \frac{1}{(2m+1)!} (assuming ties have probability zero, as they do when X has a density f). I proved this the long way rather than just jumping to \frac{1}{(2m+1)!} - silly me. However, the result found in Probability of i.i.d random variables being in an order and not greater than something will be useful.
I believe that \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{X_1\le\cdots\le X_{m+1}\le r}{X_1\le\cdots\le X_{2m+1} } . Let us make some definitions to make this shorter.
- \mathcal{O}:\eq X_1\le\cdots\le X_{2m+1} - representing the order part
- \mathcal{M}:\eq X_1\le\cdots\le X_{m+1}\le r - representing the median part
- \mathcal{Q}:\eq\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{\mathcal{M} }{\mathcal{O} } - representing the question
We should also have some sort of converse, related to r\le X_{m+2}\le\cdots\le X_{2m+1} or something.
We also have:
- An expression for \P{X_1\le \cdots\le X_n\le r} from Probability of i.i.d random variables being in an order and not greater than something
- It's \eq\frac{1}{n!}F_X(r)^n
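The formula above can be sanity-checked by simulation. The following is only a sketch: the choice of Uniform(0, 1) for X (so that F_X(r) = r on [0, 1]) and the function names are assumptions made for illustration, not part of the notes.

```python
import math
import random

def ordered_and_bounded_probability(n, r, trials=200_000, seed=0):
    """Estimate P(X_1 <= ... <= X_n <= r) for i.i.d. Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        # The event needs the draws in increasing order and the last one <= r.
        if xs[-1] <= r and all(a <= b for a, b in zip(xs, xs[1:])):
            hits += 1
    return hits / trials

def closed_form(n, r):
    """F(r)^n / n! with F(r) = r, the Uniform(0, 1) cdf for r in [0, 1]."""
    return r ** n / math.factorial(n)

estimate = ordered_and_bounded_probability(3, 0.8)
exact = closed_form(3, 0.8)
print(estimate, exact)
```

The two printed values should agree up to Monte Carlo noise; the estimate is not exact.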
Analysis
Let us look at X\le r and X\le Y to see what we can say if both are true (the "and")
- Claim: (X\le r\wedge X\le Y)\iff(X\le\Min{r,Y})
- Proof:
- \implies
- Suppose r\le Y, so \Min{r,Y}\eq r, obviously X\le r\ \implies\ X\le r\eq\Min{r,Y} , so the implication holds in this case
- Suppose Y\le r, so \Min{r,Y}\eq Y, obviously X\le Y\ \implies\ X\le Y\eq\Min{r,Y} , so the implication holds in this case too.
- \impliedby
- Suppose X\le\Min{r,Y}. We notice that \Min{r,Y}\eq r if r\le Y, and \Min{r,Y}\eq Y if Y\le r (if r\eq Y both cases apply and give the same value, so the overlap does not matter)
- Thus if r\le Y then X\le r and as r\le Y by assumption, we use the transitivity of \le to see X\le r\le Y thus X\le Y too - as required
- Thus if Y\le r then X\le Y and as Y\le r by assumption, we use the transitivity of \le to see X\le Y\le r and thus X\le r too - as required.
- So in either case, we have X\le Y and X\le r - as required
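The claim is a statement about three real numbers, so it can also be checked exhaustively on a small grid. This is a quick check rather than a proof; the grid and the name claim_holds are assumptions made for illustration.

```python
from itertools import product

def claim_holds(x, r, y):
    """Check (x <= r and x <= y) <=> (x <= min(r, y)) for one triple."""
    return (x <= r and x <= y) == (x <= min(r, y))

# Grid of values from -2 to 2 in steps of 0.5, which includes ties such as r == y.
grid = [i / 2 for i in range(-4, 5)]
all_hold = all(claim_holds(x, r, y) for x, r, y in product(grid, repeat=3))
print(all_hold)
```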
Problem statement
Thus we really want to find:
- \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{X_1\le\cdots\le X_{m+1}\le r}{X_1\le\cdots\le X_{2m+1} }
- \eq\frac{\P{\mathcal{M}\ \text{and}\ \mathcal{O} } }{\P{\mathcal{O} } }
- \eq \big((2m+1)!\big)\P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1} } , as \P{\mathcal{O} }\eq\frac{1}{(2m+1)!}
- Caveat: we now need \big(X\le r\wedge X\le Y\le Z\big)\implies\big(X\le\Min{r,Y}\le Y\le Z\big) to justify this format, although that's arguably not that helpful for the integral.
Initial integral
- This isn't about the median specifically, this is just looking at the specific integral.
Suppose we have a sample of length 3, X,Y,Z then we are looking at:
- \P{X\le\Min{r,Y}\le Y\le Z\le t} (where t will be used for a limit towards \infty to get \P{X\le \Min{r,Y}\le Y\le Z} in the end), or as an integral:
- \int^t_{-\infty}f(z)\left(\int^z_{-\infty}f(y)\left(\int^{\Min{r,y} }_{-\infty} f(x)\d x\right)\d y\right)\d z
- if t>r then the minimum takes effect (for those y above r) and caps the innermost upper limit at r; otherwise the limit always stays at or below r. In practice, as we'll take t\rightarrow\infty, the first case will certainly happen.
Progression: 1
We are evaluating: \P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1}\le t } (our answer is \big((2m+1)!\big) times the limit of this as t\rightarrow\infty ). The full integral follows:
- \int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1}){\left(\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m\right)}\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1}
We operate on the inner bit:
- {\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m}\eq \frac{1}{m!}F(x_{m+1})^m
We substitute this back in to yield:
- \frac{1}{m!}\int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1})F(x_{m+1})^m\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1}
Progression: 2
This will involve induction, and dealing with the \text{Min}() will be "tricky". Both for practice and as a basis for the induction we will consider the special cases m\eq 1 and m\eq 2 by evaluating:
- m\eq 1 yields I_1:\eq\frac{1}{1!}\int^t_{-\infty} f(x_3)\left(\int^{\Min{r,x_3} }_{-\infty}f(x_2)F(x_2) \d x_2\right)\d x_3, by case analysis:
- if t\le r then x_3\le t\le r, so x_3\le r over the entire domain of interest and \Min{r,x_3}\eq x_3 there, giving:
- I_1\eq\frac{1}{1!}\int^t_{-\infty}f(x_3)\left(\int^{x_3}_{-\infty}f(x_2)F(x_2)\d x_2\right)\d x_3
- We now use the corollary below to see:
- I_1\eq\frac{1}{2!}\int^t_{-\infty}f(x_3)F(x_3)^2\d x_3
- \eq\frac{1}{3!}F(t)^3
- if t\ge r then we split (-\infty,t] into (-\infty,r) and [r,t], giving:
- I_1\eq\frac{1}{1!}\left[\int^r_{-\infty} f(x_3)\left(\int^{\Min{r,x_3} }_{-\infty}f(x_2)F(x_2) \d x_2\right)\d x_3+\int_r^tf(x_3)\left(\int^{\Min{r,x_3} }_{-\infty}f(x_2)F(x_2) \d x_2\right)\d x_3\right]
- \eq\frac{1}{1!}\left[\int^r_{-\infty}f(x_3)\left(\int^{x_3}_{-\infty}f(x_2)F(x_2) \d x_2\right)\d x_3+\int_r^tf(x_3)\left(\int^r_{-\infty}f(x_2)F(x_2) \d x_2\right)\d x_3\right]
- We now use the required corollary immediately below to yield:
- I_1\eq\frac{1}{1!}\left[\int^r_{-\infty}f(x_3)\cdot\frac{1}{2}F(x_3)^2\d x_3+\int_r^tf(x_3)\cdot\frac{1}{2}F(r)^2\d x_3\right]
- \eq\frac{1}{2!}\left[\frac{1}{3}F(r)^3+F(r)^2\int^t_rf(x_3)\d x_3\right], note that: \int^t_rf(x)\d x\eq\int_{-\infty}^tf(x)\d x-\int_{-\infty}^rf(x)\d x \eq F(t)-F(r)
- \eq\frac{1}{2!}F(r)^2\left[\frac{1}{3}F(r)+\big(F(t)-F(r)\big)\right], note that: F(t)-F(r)\eq\frac{3F(t)-3F(r)}{3} which we'll use next
- \eq\frac{1}{2!}F(r)^2\left[\frac{3F(t)-2F(r)}{3}\right]
- \eq\frac{1}{3!}F(r)^2\big(3F(t)-2F(r)\big)
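Both cases of I_1 can be compared against direct numerical integration. The sketch below assumes X is Uniform(0, 1), so f = 1 and F(x) = x on [0, 1], and uses a simple midpoint rule; the helper names are hypothetical and r, t are taken in [0, 1].

```python
def midpoint_integral(g, a, b, steps=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def I1_numeric(r, t):
    """Evaluate the double integral defining I_1 numerically (Uniform(0, 1) case)."""
    f = lambda x: 1.0  # assumed density on [0, 1]
    F = lambda x: x    # assumed cdf on [0, 1]
    # Inner integral over x_2 from 0 to min(r, x_3); the lower limit is 0
    # because the assumed density vanishes below 0.
    inner = lambda x3: midpoint_integral(lambda x2: f(x2) * F(x2), 0.0, min(r, x3))
    return midpoint_integral(lambda x3: f(x3) * inner(x3), 0.0, t)

def I1_closed(r, t):
    """The two cases derived above, with F(x) = x."""
    if t <= r:
        return t ** 3 / 6
    return r ** 2 * (3 * t - 2 * r) / 6

print(I1_numeric(0.4, 0.7), I1_closed(0.4, 0.7))  # case t >= r
print(I1_numeric(0.9, 0.5), I1_closed(0.9, 0.5))  # case t <= r
```

Each pair should agree up to the discretisation error of the midpoint rule.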
As t\rightarrow\infty we have F(t)\rightarrow 1, so we end up with I_1\eq\frac{1}{3!}F(r)^2\big(3-2F(r)\big)
Thus: \P{X_1\le X_2\le\Min{r,X_3}\le X_3}\eq\frac{1}{3!}F(r)^2\big(3-2F(r)\big)
Finally, dividing by \P{X_1\le X_2\le X_3}\eq\frac{1}{3!}:
- \Pcond{X_1\le X_2\le r}{X_1\le X_2\le X_3}\eq F(r)^2\big(3-2F(r)\big)
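As a cross-check that does not come from these notes: the median of 2m+1 values is at most r exactly when at least m+1 of them are at most r, which gives a binomial sum for the cdf. The sketch below compares that sum with the m = 1 result above; the function names are assumptions for illustration.

```python
from math import comb

def median_cdf_binomial(F_r, m):
    """P(median <= r) as P(at least m+1 of the 2m+1 draws are <= r)."""
    n = 2 * m + 1
    return sum(comb(n, k) * F_r ** k * (1 - F_r) ** (n - k)
               for k in range(m + 1, n + 1))

def median_cdf_m1(F_r):
    """The m = 1 result derived above: F(r)^2 (3 - 2 F(r))."""
    return F_r ** 2 * (3 - 2 * F_r)

for p in (0.0, 0.25, 0.5, 0.8, 1.0):
    print(p, median_cdf_binomial(p, 1), median_cdf_m1(p))
```

Here p stands for F(r), so the comparison holds for any distribution with a cdf F, not just a particular choice of X.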
Required corollary
Recall from Probability of i.i.d random variables being in an order and not greater than something that:
- \frac{1}{k!}\int^r_{-\infty}f(x)F(x)^k\d x\eq \frac{1}{(k+1)!}F(r)^{k+1}
So:
- \int^r_{-\infty}f(x)F(x)^k\d x\eq \frac{1}{k+1}F(r)^{k+1}
Applying this with k\eq 1 to the x_2 integrals above:
- \int^r_{-\infty}f(x)F(x)^1\d x\eq \frac{1}{2}F(r)^2 , we then substitute this for the cases r:\eq r and r:\eq x_3
We then apply it with k\eq 2 to the x_3 integrals.
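The corollary can be checked numerically for a concrete distribution. The sketch below assumes an Exponential(1) population, with f(x) = e^{-x} and F(x) = 1 - e^{-x} for x \ge 0, and uses a midpoint rule; both choices and the function names are illustrative assumptions.

```python
import math

def corollary_lhs(k, r, steps=2000):
    """Midpoint-rule value of the integral of f(x) F(x)^k over [0, r]."""
    f = lambda x: math.exp(-x)        # assumed Exponential(1) density
    F = lambda x: 1.0 - math.exp(-x)  # assumed Exponential(1) cdf
    h = r / steps
    # The lower limit is 0 because the assumed density vanishes below 0.
    return h * sum(f((i + 0.5) * h) * F((i + 0.5) * h) ** k for i in range(steps))

def corollary_rhs(k, r):
    """F(r)^(k+1) / (k+1) for the same distribution."""
    return (1.0 - math.exp(-r)) ** (k + 1) / (k + 1)

for k in (1, 2, 3):
    print(k, corollary_lhs(k, 1.5), corollary_rhs(k, 1.5))
```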