Difference between revisions of "Notes:Distribution of the sample median"
From Maths
Revision as of 07:15, 12 December 2017
Problem overview
Let $X_1,\ldots,X_{2m+1}$ be a sample from a population $X$, meaning that the $X_i$ are i.i.d. random variables, for some $m\in\mathbb{N}_0$. We wish to find:
- $\mathbb{P}[\text{Median}(X_1,\ldots,X_{2m+1})\le r]$, the cdf (cumulative distribution function) of the median.
Initial work
Since the variables are i.i.d. (and assuming ties occur with probability zero), any ordering is as likely as any other, so each ordering has probability $\frac{1}{(2m+1)!}$ (which I proved the long way rather than just jumping to it, silly me). However, the result found in Probability of i.i.d random variables being in an order and not greater than something will be useful.
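I believe that $\mathbb{P}[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=\mathbb{P}[X_1\le\cdots\le X_{m+1}\le r\ \vert\ X_1\le\cdots\le X_{2m+1}]$. Let us make some definitions to make this shorter.
- $\mathcal{O}:=X_1\le\cdots\le X_{2m+1}$ - representing the order part
- $\mathcal{M}:=X_1\le\cdots\le X_{m+1}\le r$ - representing the median part
- $\mathcal{Q}:=\mathbb{P}[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=\mathbb{P}[\mathcal{M}\ \vert\ \mathcal{O}]$ - representing the question
We should also have some sort of converse, related to $r\le X_{m+2}\le\cdots\le X_{2m+1}$ or something.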
We also have:
- An expression for $\mathbb{P}[X_1\le\cdots\le X_n\le r]$ from Probability of i.i.d random variables being in an order and not greater than something
  - It's $=\frac{1}{n!}F_X(r)^n$
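As a quick sanity check of the expression above, here is a minimal Monte Carlo sketch. It assumes the population is Uniform(0,1), so that $F_X(r)=r$ for $0\le r\le1$; that choice, and the helper names `ordered_and_below` and `closed_form`, are illustrative assumptions and not part of the notes.

```python
import math
import random


def ordered_and_below(n, r, trials=200_000, seed=0):
    """Estimate P[X_1 <= ... <= X_n <= r] for i.i.d. Uniform(0,1) draws (assumed population)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        # Check adjacent pairs for the ordering; if it holds, xs[-1] is the largest value.
        if all(a <= b for a, b in zip(xs, xs[1:])) and xs[-1] <= r:
            hits += 1
    return hits / trials


def closed_form(n, r):
    """F_X(r)^n / n!, with F_X(r) = r for Uniform(0,1) and 0 <= r <= 1."""
    return r ** n / math.factorial(n)


estimate = ordered_and_below(3, 0.8)
exact = closed_form(3, 0.8)
print(estimate, exact)  # the two values should agree up to sampling noise
```

Any other continuous population could be substituted by swapping the sampler and using its cdf in `closed_form`.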
Analysis
Let us look at $X\le r$ and $X\le Y$ to see what we can say if both are true (the "and").
- Claim: $(X\le r\wedge X\le Y)\iff(X\le\text{Min}(r,Y))$
- Proof:
  - $\implies$
    - Suppose $r\le Y$, so $\text{Min}(r,Y)=r$. Then $X\le r$ gives $X\le r=\text{Min}(r,Y)$, so the implication holds in this case.
    - Suppose $Y\le r$, so $\text{Min}(r,Y)=Y$. Then $X\le Y$ gives $X\le Y=\text{Min}(r,Y)$, so the implication holds in this case too.
  - $\impliedby$
    - We notice that either $\text{Min}(r,Y)=r$ (if $r\le Y$) or $\text{Min}(r,Y)=Y$ (if $Y\le r$); when $r=Y$ both cases apply and agree.
      - If $r\le Y$ then $X\le\text{Min}(r,Y)=r$, and as $r\le Y$ by assumption, the transitivity of $\le$ gives $X\le r\le Y$, thus $X\le Y$ too, as required.
      - If $Y\le r$ then $X\le\text{Min}(r,Y)=Y$, and as $Y\le r$ by assumption, the transitivity of $\le$ gives $X\le Y\le r$, thus $X\le r$ too, as required.
      - So in either case we have $X\le Y$ and $X\le r$, as required.
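The claim can also be checked mechanically on a small grid of values. The sketch below is only an exhaustive check on a finite grid (including ties), not a replacement for the proof; the function names `lhs` and `rhs` and the grid itself are illustrative choices.

```python
from itertools import product


def lhs(x, r, y):
    """Left-hand side of the claim: X <= r and X <= Y."""
    return x <= r and x <= y


def rhs(x, r, y):
    """Right-hand side of the claim: X <= Min(r, Y)."""
    return x <= min(r, y)


# A small grid with repeated values, so the equality cases r = Y, X = r, X = Y are covered.
grid = [-2, -1, 0, 1, 2]
counterexamples = [
    (x, r, y)
    for x, r, y in product(grid, repeat=3)
    if lhs(x, r, y) != rhs(x, r, y)
]
print(counterexamples)  # an empty list means both sides agree on the whole grid
```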
Problem statement
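Thus we really want to find:
- $\mathbb{P}[\text{Median}(X_1,\ldots,X_{2m+1})\le r]=\mathbb{P}[X_1\le\cdots\le X_{m+1}\le r\ \vert\ X_1\le\cdots\le X_{2m+1}]$
  - $=\frac{\mathbb{P}[\mathcal{M}\text{ and }\mathcal{O}]}{\mathbb{P}[\mathcal{O}]}$
  - $=\big((2m+1)!\big)\mathbb{P}[X_1\le\cdots\le X_{m+1}\le\text{Min}(r,X_{m+2})\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1}]$
    - Caveat: We now need: $(X\le r\wedge X\le Y\le Z)\implies(X\le\text{Min}(r,Y)\le Y\le Z)$ to justify this format, although that's arguably not that helpful for the integral.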
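The identity above can be probed numerically. The following sketch estimates the cdf of the median directly and via $(2m+1)!$ times the probability of "median part and order part", for an assumed Uniform(0,1) population. The function name `median_cdf_two_ways` is hypothetical, and the comparison value $3r^2-2r^3$ is the standard order-statistic result for the median of three uniforms, brought in here for comparison rather than taken from these notes.

```python
import math
import random


def median_cdf_two_ways(m, r, trials=200_000, seed=1):
    """Return (direct estimate, (2m+1)! * estimate of P[M and O]) for Uniform(0,1) draws."""
    rng = random.Random(seed)
    n = 2 * m + 1
    direct = 0  # counts samples whose median is <= r
    joint = 0   # counts samples with X_1 <= ... <= X_n and X_{m+1} <= r
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if sorted(xs)[m] <= r:
            direct += 1
        in_order = all(a <= b for a, b in zip(xs, xs[1:]))
        # When the sample is already in order, X_{m+1} is xs[m] (0-indexed).
        if in_order and xs[m] <= r:
            joint += 1
    return direct / trials, math.factorial(n) * joint / trials


direct, via_ordering = median_cdf_two_ways(1, 0.5)
known = 3 * 0.5 ** 2 - 2 * 0.5 ** 3  # median of three uniforms, evaluated at r = 0.5
print(direct, via_ordering, known)  # all three should agree up to sampling noise
```

The second estimate is noisier than the first, because only about one sample in $(2m+1)!$ lands in the required order.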
Initial integral
- This isn't about the median specifically, this is just looking at the specific integral.
Suppose we have a sample of size 3, $X,Y,Z$; then we are looking at:
- $\mathbb{P}[X\le\text{Min}(r,Y)\le Y\le Z\le t]$ (where $t$ will be sent to $\infty$ to get $\mathbb{P}[X\le\text{Min}(r,Y)\le Y\le Z]$ in the end), or as an integral:
  - $$\int^t_{-\infty}f(z)\left(\int^z_{-\infty}f(y)\left(\int^{\text{Min}(r,y)}_{-\infty}f(x)\,\mathrm{d}x\right)\mathrm{d}y\right)\mathrm{d}z$$
    - If $t>r$ then the minimum gets involved (for those $y>r$, which occur once $z>r$) and caps the inner upper limit at $r$; otherwise the inner upper limit is always $y\le r$. Of course in practice (as we'll take $t\rightarrow\infty$) the first case will certainly happen.
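One way the integral might be carried through is sketched below. It assumes $X$ has a density $f$ with cdf $F$ (the notes use $f$ without saying so) and that $t>r$; the split of the $y$ and $z$ ranges at $r$ is a choice made for this sketch, not something the notes commit to.

```latex
\begin{align*}
\int^{\mathrm{Min}(r,y)}_{-\infty} f(x)\,\mathrm{d}x &= F(\mathrm{Min}(r,y)) \\
\int^{z}_{-\infty} f(y)\,F(\mathrm{Min}(r,y))\,\mathrm{d}y &=
  \begin{cases}
    \tfrac{1}{2}F(z)^2 & \text{if } z\le r \\
    \tfrac{1}{2}F(r)^2 + F(r)\big(F(z)-F(r)\big) & \text{if } z> r
  \end{cases} \\
\int^{t}_{-\infty} f(z)\,(\cdots)\,\mathrm{d}z
  &= \tfrac{1}{6}F(r)^3
   + \int^{t}_{r} f(z)\Big(F(r)F(z)-\tfrac{1}{2}F(r)^2\Big)\mathrm{d}z \\
  &= \tfrac{1}{6}F(r)^3 + \tfrac{1}{2}F(r)F(t)^2 - \tfrac{1}{2}F(r)^2F(t) \\
  &\xrightarrow[t\to\infty]{} \tfrac{1}{6}F(r)^3 + \tfrac{1}{2}F(r) - \tfrac{1}{2}F(r)^2
\end{align*}
```

As a check, letting $r\rightarrow\infty$ as well gives $\frac{1}{6}$, the probability of one particular ordering of three i.i.d. variables. The limit also equals $\frac{1}{6}\big(1-(1-F(r))^3\big)$, that is, $\frac{1}{3!}$ times the probability that the smallest of the three is at most $r$.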