Bernoulli(p) Distribution

Discrete Random Variable: there are only TWO possible outcomes (e.g., male or female, success or failure, 1 or 0).

$\bigstar$ $f(x)=p^{x}(1-p)^{1-x}$ , where $x=0$ or $1$ (two outcomes) and $0\leq p\leq 1$ (a probability is always between 0 and 1)

$\bigstar$ $E(X)=p$, $Var(X)=p(1-p)=pq$ (where $q=1-p$)
Proof
$f(1|p)=p^1(1-p)^{1-1}=p$ (probability of being 1)
$f(0|p)=p^0(1-p)^1=1-p$  (probability of being 0)
$\Rightarrow E(X)=0\cdot(1-p)+1\cdot p=p$
$\Rightarrow E(X^{2})=0^{2}\cdot(1-p)+1^{2}\cdot p=p$ , so $Var(X)=E(X^{2})-[E(X)]^{2}=p-p^{2}=p(1-p)$
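A quick numerical check of both facts, as a minimal Python sketch (the value p = 0.3 is just an illustration):

    # Bernoulli mean and variance by direct enumeration over x in {0, 1}
    p = 0.3                                    # illustrative value
    mean = 0 * (1 - p) + 1 * p                 # E(X) = p
    second_moment = 0**2 * (1 - p) + 1**2 * p  # E(X^2) = p
    var = second_moment - mean**2              # p - p^2 = p(1 - p)
    print(mean, var)                           # ~0.3 0.21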

If we have $n$ data points ($n$ independent samples), we can calculate the likelihood as the joint distribution, which factors by independence.
$\Rightarrow$ $P(X_{1}=x_{1},X_{2}=x_{2},...,X_{n}=x_{n})=L=\prod_{i=1}^{n}p^{x_{i}}(1-p)^{1-x_{i}}$

To maximize, take the derivative with respect to $p$, set it equal to 0, and solve for the maximizer.
$\Rightarrow$ $\frac{dL}{dp}=0$ $\Rightarrow \hat{p}_{MLE}$
(differentiating $L$ directly would require the product rule on every factor, so it is easier to maximize the log-likelihood instead)

$\Rightarrow \log L=l=\log\left(\prod_{i=1}^{n}p^{X_{i}}(1-p)^{1-X_{i}}\right)=\sum_{i=1}^{n} \log\left(p^{X_i}(1-p)^{1-X_{i}}\right)$
   $=\sum_{i=1}^{n}\left[X_{i}\cdot \log p+(1-X_{i})\cdot \log(1-p)\right]$
   $= \log p \sum_{i=1}^{n}X_{i}+\log(1-p)\sum_{i=1}^{n}(1-X_{i})$
   $= n\bar{X}\cdot \log p+n(1-\bar{X})\cdot \log(1-p)$
   $\because \sum_{i=1}^{n}X_{i}=n\bar{X}\Rightarrow \bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$

$\Rightarrow \frac{dl}{d\hat{p}}=\frac{n\bar{X}}{\hat{p}} - \frac{n(1-\bar{X})}{1-\hat{p}} = 0$ (for maximizing)
   $\Rightarrow \frac{n\bar{X}}{\hat{p}}=\frac{n(1-\bar{X})}{1-\hat{p}} \Rightarrow \bar{X}(1-\hat{p})=\hat{p}(1-\bar{X}) \Rightarrow \bar{X}-\hat{p}\bar{X}=\hat{p}-\hat{p}\bar{X}$ $\Rightarrow$ $\hat{p}_{MLE}=\bar{X}$
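The same answer can be checked numerically. Below is a minimal Python sketch (the sample and the grid of candidate $p$ values are made up for illustration): it evaluates the log-likelihood over the grid and confirms the maximizer agrees with the sample mean.

    import numpy as np

    x = np.array([1, 0, 1, 1, 0, 1, 0, 1])  # made-up Bernoulli sample
    p_grid = np.linspace(0.01, 0.99, 9801)  # candidate values of p

    # log L(p) = (sum x_i) log p + (n - sum x_i) log(1 - p)
    loglik = x.sum() * np.log(p_grid) + (len(x) - x.sum()) * np.log(1 - p_grid)

    print(p_grid[np.argmax(loglik)])  # ~0.625, the grid maximizer
    print(x.mean())                   # 0.625, the closed-form MLE p-hat = x-bar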



$\bigstar$ Show that $T=\sum_{i=1}^{n}X_{i}$ is a sufficient statistic for $p$.
Proof
By independence, the joint distribution of the random sample is
$\prod_{i=1}^{n}p^{x_{i}}(1-p)^{1-x_{i}}=p^{\sum x_{i}}(1-p)^{n-\sum x_{i}} \cdot 1$ ,
where $p^{\sum x_{i}}(1-p)^{n-\sum x_{i}}=g\left(\sum x_{i},p\right)$ depends on the data only through $\sum x_{i}$, and $1=h(x_{1},...,x_{n})$ is free of $p$. By the factorization theorem, $T=\sum_{i=1}^{n}X_{i}$ is sufficient for $p$.
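The factorization also shows up numerically: in the minimal Python sketch below (data made up for illustration), two samples with the same $\sum x_{i}$ give identical likelihoods at every $p$, so the likelihood depends on the data only through $T=\sum X_{i}$.

    import numpy as np

    def likelihood(x, p):
        # joint pmf: prod_i p^{x_i} (1 - p)^{1 - x_i}
        return np.prod(p**x * (1 - p)**(1 - x))

    x1 = np.array([1, 1, 0, 0, 0])  # sum = 2
    x2 = np.array([0, 0, 1, 0, 1])  # same sum = 2, different ordering

    for p in (0.2, 0.5, 0.8):
        print(likelihood(x1, p), likelihood(x2, p))  # equal at every p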


$\bigstar$ Show that the Bernoulli distribution is part of the exponential family. 
Proof 
We need to show that the pmf can be written in the form $f_{\theta}(x)=\exp\left[\sum_{j=1}^{k}c_{j}(\theta)\cdot T_{j}(x)+d(\theta)+s(x)\right]$.

The parameter is $p$, where $p=P(X=1)$.
$p(x|p)=p^{x}(1-p)^{1-x}$
$p(x|p)=\exp\left[\log\left(p^{x}(1-p)^{1-x}\right)\right]=\exp\left[x\cdot \log p+(1-x)\cdot \log(1-p)\right]$
          $=\exp\left[x\cdot \log \frac{p}{1-p} + \log(1-p)\right]$
This shows the Bernoulli distribution belongs to the exponential family, with $c(\theta)=\log \frac{p}{1-p}$ (the log-odds), $T(x)=x$, $d(\theta)=\log(1-p)$, and $s(x)=0$.
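As a sanity check, a minimal Python sketch (p = 0.4 is illustrative) verifying that $\exp[x\cdot c(\theta)+d(\theta)+s(x)]$ reproduces the Bernoulli pmf at $x=0$ and $x=1$:

    import numpy as np

    p = 0.4
    c = np.log(p / (1 - p))  # c(theta): the log-odds
    d = np.log(1 - p)        # d(theta)
    for x in (0, 1):
        pmf = p**x * (1 - p)**(1 - x)  # direct Bernoulli pmf
        expfam = np.exp(x * c + d)     # exponential-family form, s(x) = 0
        print(x, pmf, expfam)          # the two values agree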



Likelihood Ratio Example 
$Y_{1},...,Y_{n}$ denote a random sample from a Bernoulli distribution, $P(Y_{i}=y_{i}\mid p)=p^{y_{i}}(1-p)^{1-y_{i}}$ , where $y_{i}=0$ or $1$. Suppose $H_{0}: p=p_{0}$ and $H_{1}: p=p_{a}$, where $p_{0} < p_{a}$.

(a) Show that  $\frac{L(p_{0})}{L(p_{a})}=\left[\frac{p_{0}(1-p_{a})}{(1-p_{0})\,p_{a}}\right]^{\sum y_{i}}\cdot \left(\frac{1-p_{0}}{1-p_{a}}\right)^{n}$ 
(b) Argue that $\frac{L(p_{0})}{L(p_{a})} < K$ if and only if $\sum y_{i} > k$ for some constant $k$
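A quick sketch of both parts: since $L(p)=\prod_{i=1}^{n}p^{y_{i}}(1-p)^{1-y_{i}}=p^{\sum y_{i}}(1-p)^{n-\sum y_{i}}$ ,
$\frac{L(p_{0})}{L(p_{a})}=\frac{p_{0}^{\sum y_{i}}(1-p_{0})^{n-\sum y_{i}}}{p_{a}^{\sum y_{i}}(1-p_{a})^{n-\sum y_{i}}}=\left[\frac{p_{0}(1-p_{a})}{(1-p_{0})\,p_{a}}\right]^{\sum y_{i}}\cdot \left(\frac{1-p_{0}}{1-p_{a}}\right)^{n}$ ,
which gives (a). For (b), $p_{0}<p_{a}$ implies $\frac{p_{0}(1-p_{a})}{(1-p_{0})p_{a}}<1$, so the ratio is a decreasing function of $\sum y_{i}$; it therefore falls below any constant $K$ exactly when $\sum y_{i}$ exceeds the corresponding threshold $k$.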

