Difference between revisions of "Notes:Distribution of the sample median"
Revision as of 07:15, 12 December 2017
Problem overview
Let X_1,\ldots,X_{2m+1} be a sample from a population X for some m\in\mathbb{N}_{0} , meaning that the X_i are i.i.d. random variables. We wish to find:
- \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r} - the cdf of the median.
Initial work
Since the variables are i.i.d., any ordering is as likely as any other, provided ties have probability zero (which I proved the long way, rather than just jumping to \frac{1}{(2m+1)!} - silly me). However, the result found in Probability of i.i.d random variables being in an order and not greater than something will be useful.
I believe that \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{X_1\le\cdots\le X_{m+1}\le r}{X_1\le\cdots\le X_{2m+1} } . Let us make some definitions to make this shorter.
- \mathcal{O}:\eq X_1\le\cdots\le X_{2m+1} - representing the order part
- \mathcal{M}:\eq X_1\le\cdots\le X_{m+1}\le r - representing the median part
- \mathcal{Q}:\eq\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{\mathcal{M} }{\mathcal{O} } - representing the question
We should also have some sort of converse, related to r\le X_{m+2}\le\cdots\le X_{2m+1} or something.
We also have:
- An expression for \P{X_1\le \cdots\le X_n\le r} from Probability of i.i.d random variables being in an order and not greater than something
- It's \eq\frac{1}{n!}F_X(r)^n
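As a rough sanity check of this expression, the Monte Carlo sketch below estimates \P{X_1\le\cdots\le X_n\le r} for Uniform(0,1) variables, where F_X(r)\eq r, and compares it with \frac{1}{n!}F_X(r)^n. The uniform distribution, the function names, the seed and the trial count are all assumptions made for illustration; none of them come from the referenced page.

```python
import math
import random


def ordered_and_below(xs, r):
    """True when xs[0] <= xs[1] <= ... <= xs[-1] <= r."""
    return all(a <= b for a, b in zip(xs, xs[1:])) and xs[-1] <= r


def estimate_ordered_below(n, r, trials=200_000, seed=0):
    """Monte Carlo estimate of P(X_1 <= ... <= X_n <= r) for Uniform(0,1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if ordered_and_below(xs, r):
            hits += 1
    return hits / trials


def closed_form(n, r):
    """F(r)^n / n! with F(r) = r, valid for 0 <= r <= 1 (uniform assumption)."""
    return r ** n / math.factorial(n)


if __name__ == "__main__":
    # The two printed numbers should be close to each other.
    print(estimate_ordered_below(3, 0.7), closed_form(3, 0.7))
```

The estimate only agrees with the formula up to sampling noise, so this is evidence for the expression rather than a proof of it.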
Analysis
Let us look at X\le r and X\le Y to see what we can say if both are true (the "and")
- Claim: (X\le r\wedge X\le Y)\iff(X\le\Min{r,Y})
- Proof:
  - \implies
    - Suppose r\le Y, so \Min{r,Y}\eq r. By hypothesis X\le r, so X\le r\eq\Min{r,Y} , and the implication holds in this case.
    - Suppose Y\le r, so \Min{r,Y}\eq Y. By hypothesis X\le Y, so X\le Y\eq\Min{r,Y} , and the implication holds in this case too.
  - \impliedby
    - We notice either \Min{r,Y}\eq r if r\le Y, or \Min{r,Y}\eq Y if Y\le r (the two cases overlap when r\eq Y, which does no harm).
      - If r\le Y then X\le\Min{r,Y}\eq r, and as r\le Y by assumption, transitivity of \le gives X\le r\le Y, thus X\le Y too - as required.
      - If Y\le r then X\le\Min{r,Y}\eq Y, and as Y\le r by assumption, transitivity of \le gives X\le Y\le r, thus X\le r too - as required.
      - So in either case, we have X\le Y and X\le r - as required.
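The claim can also be spot-checked mechanically. The sketch below tests the equivalence on every triple (x, y, r) taken from a small grid of values; the grid and the function names are illustrative assumptions, and a finite check like this complements the proof rather than replacing it.

```python
from itertools import product


def lhs(x, y, r):
    """Left-hand side of the claim: X <= r and X <= Y."""
    return x <= r and x <= y


def rhs(x, y, r):
    """Right-hand side of the claim: X <= Min(r, Y)."""
    return x <= min(r, y)


def claim_holds_on_grid(values):
    """Check the equivalence for every triple (x, y, r) drawn from values."""
    return all(lhs(x, y, r) == rhs(x, y, r)
               for x, y, r in product(values, repeat=3))


if __name__ == "__main__":
    grid = [k / 2 for k in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
    print(claim_holds_on_grid(grid))
```

The grid deliberately includes repeated values across coordinates, so the boundary cases r = Y and X = Y are covered as well.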
Problem statement
Thus we really want to find:
- \P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\Pcond{X_1\le\cdots\le X_{m+1}\le r}{X_1\le\cdots\le X_{2m+1} }
- \eq\frac{\P{\M\ \text{and}\ \O} }{\P{\O} }
- \eq \big((2m+1)!\big)\P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1} }
- Caveat: we now need \big(X\le r\wedge X\le Y\le Z\big)\implies\big(X\le\Min{r,Y}\le Y\le Z\big) to justify this format, although that's arguably not that helpful for the integral.
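To see whether the conjectured identity is at least plausible, the sketch below simulates the case m = 1 with Uniform(0,1) variables. It estimates \P{\text{Median}\le r} directly and also as (2m+1)! times the frequency of the event "\M and \O". The distribution, seed, trial count and function name are assumptions chosen for illustration only.

```python
import math
import random


def median_check(r, m=1, trials=200_000, seed=1):
    """Return (direct, via_order) estimates of P(Median <= r) for Uniform(0,1).

    direct    : fraction of samples whose median is <= r
    via_order : (2m+1)! times the fraction of samples with
                X_1 <= ... <= X_{2m+1} and X_{m+1} <= r
    """
    rng = random.Random(seed)
    n = 2 * m + 1
    direct_hits = 0
    order_hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if sorted(xs)[m] <= r:  # the median is the (m+1)-th smallest value
            direct_hits += 1
        in_order = all(a <= b for a, b in zip(xs, xs[1:]))  # event O
        if in_order and xs[m] <= r:  # given O, this is exactly event M
            order_hits += 1
    direct = direct_hits / trials
    via_order = math.factorial(n) * order_hits / trials
    return direct, via_order


if __name__ == "__main__":
    # The two estimates should agree up to sampling noise.
    print(median_check(0.4))
```

The second estimate is noisier than the first, because only about one sample in (2m+1)! lands in the required order; that is why the comparison is made with a fairly large number of trials.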
Initial integral
- This isn't about the median specifically, this is just looking at the specific integral.
Suppose we have a sample of size 3, X,Y,Z; then we are looking at:
- \P{X\le\Min{r,Y}\le Y\le Z\le t} (where t will later be sent to \infty to recover \P{X\le \Min{r,Y}\le Y\le Z} ), or as an integral (with f the pdf of the population, assuming it has one):
  - \int^t_{-\infty}f(z)\left(\int^z_{-\infty}f(y)\left(\int^{\Min{r,y} }_{-\infty} f(x)\d x\right)\d y\right)\d z
    - If t>r then the minimum takes effect: for z>r the y-integral reaches values y>r, where the upper limit \Min{r,y} is capped at r. Otherwise (t\le r) we always have y\le z\le t\le r, so \Min{r,y}\eq y throughout. In practice, as we take t\rightarrow\infty, the first case certainly occurs.
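The behaviour of this nested integral in the two regimes t\le r and t>r can be explored numerically. The sketch below assumes a Uniform(0,1) population, so that f\eq 1 on [0,1] and the innermost integral is simply F(\Min{r,y}) clamped to [0,1]; it then applies a midpoint rule to the two outer integrals. The distribution, the function name and the step count are assumptions for illustration, not part of the notes.

```python
def triple_integral_uniform(r, t, steps=2000):
    """Midpoint-rule value of the nested integral for Uniform(0,1).

    Uses f = 1 on [0, 1], and replaces the innermost integral by
    F(min(r, y)) = min(r, y) clamped to [0, 1].
    """
    t = min(max(t, 0.0), 1.0)  # f vanishes outside [0, 1]
    h = t / steps
    inner_cum = 0.0  # running value of int_0^z f(y) F(min(r, y)) dy
    total = 0.0
    for i in range(steps):
        y_mid = (i + 0.5) * h
        inner = min(1.0, max(0.0, min(r, y_mid)))  # F(min(r, y)) at the midpoint
        piece = inner * h  # contribution of this cell to the y-integral
        # Outer integrand f(z) * G(z) at the cell midpoint, where G is the
        # y-integral; half of this cell's piece lies below the midpoint.
        total += (inner_cum + 0.5 * piece) * h
        inner_cum += piece
    return total


if __name__ == "__main__":
    # t <= r: the minimum never takes effect.
    print(triple_integral_uniform(r=0.8, t=0.5))
    # t > r: the minimum caps the innermost upper limit at r.
    print(triple_integral_uniform(r=0.3, t=1.0))
```

Under the uniform assumption the first value can be compared with t^3/6 and the second with (1-(1-r)^3)/6, which is the probability that the smallest of three uniforms is at most r, divided by the 3! equally likely orderings.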