Difference between revisions of "Notes:Distribution of the sample median"

m (Saving work)
(Saving work)
Line 45: Line 45:
 
** {{MM|\int^t_{-\infty}f(z)\left(\int^z_{-\infty}f(y)\left(\int^{\Min{r,y} }_{-\infty} f(x)\d x\right)\d y\right)\d z}}
 
 
*** if {{M|t>r}} then the minimum comes into play (for those {{M|z}} with {{M|z>r}}, where {{M|y}} can exceed {{M|r}}) and caps the innermost upper limit at {{M|r}}; otherwise the limit {{M|y}} always stays below {{M|r}}. In practice (as we'll take {{M|t\rightarrow\infty}}) the first case certainly happens.
 
==Progression: 1==
We are evaluating: {{MM|\P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1}\le t } }} (our answer is {{M|(2m+1)!}} times the limit of this as {{M|t\rightarrow\infty}}); the full integral follows:
* {{MM|\int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1})<!--
MARKER: (int^x_m+1 ... \d x_m starts here
I want to put an underbrace around it.
--><!--\underbrace-->{\left(\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m\right)}<!--
Marker: \d x_m ends here
-->\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1} }}
We operate on the inner bit:
* {{MM|{\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m}\eq \frac{1}{m!}F(x_{m+1})^m}}
We substitute this back in to yield:
* {{MM|\frac{1}{m!}\int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1})F(x_{m+1})^m\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1} }}

Revision as of 00:34, 16 December 2017

Problem overview

Let {{M|X_1,\ldots,X_{2m+1} }} be a sample from a population {{M|X}}, meaning that the {{M|X_i}} are i.i.d. random variables, for some {{M|m\in\mathbb{N}_0}}. We wish to find:

  • {{M|\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r} }} - the cdf of the median.

Initial work

Since the variables are i.i.d., any ordering is as likely as any other (which I proved the long way, rather than just jumping to {{M|\frac{1}{(2m+1)!} }} - silly me). However, the result found in "Probability of i.i.d random variables being in an order and not greater than something" will be useful.
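
As a quick sanity check of the equal-likelihood claim, here is a minimal Monte Carlo sketch in Python. The function name `ordering_frequencies`, the standard normal population and the trial count are assumptions made for illustration only, they are not part of these notes. For a sample of size 3 each of the 6 orderings should appear with frequency close to 1/6.

```python
import random
from collections import Counter


def ordering_frequencies(n=3, trials=120_000, seed=1):
    """Estimate how often each ordering of an i.i.d. sample of size n occurs.

    Assumption: the population is standard normal; any continuous
    distribution should behave the same way.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # The ordering is the permutation of indices that sorts the sample.
        order = tuple(sorted(range(n), key=xs.__getitem__))
        counts[order] += 1
    return {order: c / trials for order, c in counts.items()}


freqs = ordering_frequencies()
for order, freq in sorted(freqs.items()):
    print(order, round(freq, 4))
```

The printed frequencies are estimates, so they will wobble around 1/6 rather than hit it exactly.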


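I believe that {{M|\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\P{X_1\le\cdots\le X_{m+1}\le r\ \big\vert\ X_1\le\cdots\le X_{2m+1} } }}. Let us make some definitions to make this shorter.

  • {{M|O:\eq X_1\le\cdots\le X_{2m+1} }} - representing the order part
  • {{M|M:\eq X_1\le\cdots\le X_{m+1}\le r}} - representing the median part
  • {{M|Q:\eq\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\P{M\ \big\vert\ O} }} - representing the question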


We should also have some sort of converse, related to {{M|r\le X_{m+2}\le\cdots\le X_{2m+1} }} or something.


We also have:

Analysis

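Let us look at {{M|X\le r}} and {{M|X\le Y}} to see what we can say if both are true (the "and")

  • Claim: {{M|(X\le r\wedge X\le Y)\iff(X\le\Min{r,Y})}}
  • Proof:
      1. Suppose {{M|r\le Y}}, so {{M|\Min{r,Y}\eq r}}, obviously {{M|X\le r\implies X\le r\eq\Min{r,Y} }}, so the implication holds in this case
      2. Suppose {{M|Y\le r}}, so {{M|\Min{r,Y}\eq Y}}, obviously {{M|X\le Y\implies X\le Y\eq\Min{r,Y} }}, so the implication holds in this case too.
      • Conversely, suppose {{M|X\le\Min{r,Y} }}. We notice either {{M|\Min{r,Y}\eq r}} if {{M|r\le Y}}, or {{M|\Min{r,Y}\eq Y}} if {{M|Y\le r}} (when {{M|r\eq Y}} both cases apply and agree, so the overlap doesn't matter)
        • Thus if {{M|r\le Y}} then {{M|X\le r}} and as {{M|r\le Y}} by assumption, we use the transitivity of {{M|\le}} to see {{M|X\le r\le Y}} thus {{M|X\le Y}} too - as required
        • Thus if {{M|Y\le r}} then {{M|X\le Y}} and as {{M|Y\le r}} by assumption, we use the transitivity of {{M|\le}} to see {{M|X\le Y\le r}} and thus {{M|X\le r}} too - as required.
      • So in either case, we have {{M|X\le Y}} and {{M|X\le r}} - as required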
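
The claim above can also be checked by brute force on a small grid of values. This is only a sketch: the helper name `claim_holds` and the grid are made up for illustration, and a finite check is of course no substitute for the proof.

```python
from itertools import product


def claim_holds(x, r, y):
    """Compare (x <= r and x <= y) with (x <= min(r, y)) for one triple."""
    lhs = (x <= r) and (x <= y)
    rhs = x <= min(r, y)
    return lhs == rhs


# Assumed test grid; it includes ties such as r == y on purpose.
grid = [-2, -1, -0.5, 0, 0.5, 1, 2]
all_ok = all(claim_holds(x, r, y) for x, r, y in product(grid, repeat=3))
print(all_ok)
```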

Problem statement

Thus we really want to find:

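  • {{MM|\P{\text{Median}(X_1,\ldots,X_{2m+1})\le r}\eq\P{X_1\le\cdots\le X_{m+1}\le r\ \big\vert\ X_1\le\cdots\le X_{2m+1} } }}
    {{MM|\eq\frac{\P{M\text{ and }O} }{\P{O} } }}
    {{MM|\eq\big((2m+1)!\big)\P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1} } }}
    • Caveat: We now need: {{M|(X\le r\wedge X\le Y\le Z)\iff(X\le\Min{r,Y}\le Y\le Z)}} to justify this format. Although that's arguably not that helpful for the integral.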
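
The identity above can be sanity-checked by simulation. The sketch below assumes a uniform population on [0,1] and uses a made-up helper `estimate`; neither is from these notes. It estimates the cdf of the median directly, and separately estimates (2m+1)! times the probability of "M and O". For m=1 in the uniform case the standard order-statistic value is 3r^2 - 2r^3, so both estimates should land near it.

```python
import random
from math import factorial


def estimate(m, r, trials=200_000, seed=0):
    """Return (P[median <= r], (2m+1)! * P[M and O]) estimated by simulation.

    Assumption: the population is uniform on [0, 1].
    """
    rng = random.Random(seed)
    n = 2 * m + 1
    median_hits = 0
    ordered_hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        # Direct route: sort and look at the middle value.
        if sorted(xs)[m] <= r:
            median_hits += 1
        # Ordered route: X_1 <= ... <= X_{2m+1} and X_{m+1} <= r
        # (xs[m] is X_{m+1} because Python indexes from 0).
        in_order = all(xs[i] <= xs[i + 1] for i in range(n - 1))
        if in_order and xs[m] <= r:
            ordered_hits += 1
    return median_hits / trials, factorial(n) * ordered_hits / trials


direct, via_ordering = estimate(m=1, r=0.3)
print(direct, via_ordering, 3 * 0.3**2 - 2 * 0.3**3)
```

The second estimate is noisier than the first, since only about one sample in (2m+1)! happens to be in order.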

Initial integral

This isn't about the median specifically, this is just looking at the specific integral.

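Suppose we have a sample of size 3, {{M|X,Y,Z}}, then we are looking at:

  • {{M|\P{X\le\Min{r,Y}\le Y\le Z\le t} }} (where {{M|t}} will be used for a limit towards {{M|+\infty}} to get {{M|\P{X\le\Min{r,Y}\le Y\le Z} }} in the end), or as an integral:
    • {{MM|\int^t_{-\infty}f(z)\left(\int^z_{-\infty}f(y)\left(\int^{\Min{r,y} }_{-\infty} f(x)\d x\right)\d y\right)\d z}}
      • if {{M|t>r}} then the minimum comes into play (for those {{M|z}} with {{M|z>r}}, where {{M|y}} can exceed {{M|r}}) and caps the innermost upper limit at {{M|r}}; otherwise the limit {{M|y}} always stays below {{M|r}}. In practice (as we'll take {{M|t\rightarrow\infty}}) the first case certainly happens.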
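
To see the integral above in action, here is a rough numerical sketch for one special case. It assumes a uniform population on [0,1] (so f is 1 there, and taking t to infinity is the same as t=1) and 0 <= r <= 1; the function name and the midpoint rule are illustrative choices, not part of these notes. In this case the integral works out by hand to (1-(1-r)^3)/6, which is one sixth of the probability that the smallest of the three values is at most r.

```python
def triple_integral_uniform(r, n=400):
    """Midpoint-rule value of the nested integral for f = 1 on [0, 1].

    Computes int_0^1 int_0^z int_0^min(r, y) dx dy dz, assuming 0 <= r <= 1.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        # Middle integral over y in [0, z], split into n sub-steps.
        hy = z / n
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * hy
            # The innermost integral of f(x) = 1 over [0, min(r, y)].
            inner += min(r, y) * hy
        total += inner * h
    return total


r = 0.3
print(triple_integral_uniform(r), (1 - (1 - r) ** 3) / 6)
```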

Progression: 1

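We are evaluating: {{MM|\P{X_1\le\cdots\le X_{m+1}\le\Min{r,X_{m+2} }\le X_{m+2}\le X_{m+3}\le\cdots\le X_{2m+1}\le t } }} (our answer is {{M|(2m+1)!}} times the limit of this as {{M|t\rightarrow\infty}}); the full integral follows:

  • {{MM|\int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1}){\left(\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m\right)}\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1} }}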

We operate on the inner bit:

  • {{MM|\int^{x_{m+1} }_{-\infty}f(x_{m} )\left(\cdots\int^{x_2}_{-\infty}f(x_1)\d x_1\cdots\right)\d x_m\eq \frac{1}{m!}F(x_{m+1})^m}}
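
The inner-bit identity can be checked exactly in one special case. The sketch below assumes a uniform population on [0,1], where f is 1 and F(x) is x, so the m-fold nested integral should be the polynomial x^m/m!. The helper `nested_integral_poly` is a made-up name; it integrates a polynomial from 0 up to the next variable, m times over, using exact fractions.

```python
from fractions import Fraction
from math import factorial


def nested_integral_poly(m):
    """Coefficients (lowest degree first) of the m-fold nested integral.

    Assumption: f = 1 on [0, 1], so each step integrates from 0 to the
    next variable.
    """
    poly = [Fraction(1)]  # zero integrals: the constant 1
    for _ in range(m):
        # Integrate term by term: c * x^k becomes c * x^(k+1) / (k+1).
        poly = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
    return poly


m = 4
poly = nested_integral_poly(m)
print(poly[-1], Fraction(1, factorial(m)))
```

Only the top coefficient is non-zero, and it matches 1/m!, in line with the formula above for this uniform case.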

We substitute this back in to yield:

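  • {{MM|\frac{1}{m!}\int^t_{-\infty}f(x_{2m+1})\left(\int^{x_{2m+1} }_{-\infty}f(x_{2m})\left(\cdots\int^{x_{m+3} }_{-\infty}f(x_{m+2})\left(\int^{\Min{r,x_{m+2} } }_{-\infty} f(x_{m+1})F(x_{m+1})^m\d x_{m+1}\right)\d x_{m+2}\cdots\right)\d x_{2m}\right)\d x_{2m+1} }}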