Inferential Statistics and Hypothesis Testing

Tags: EDA, Week4

Updated: December 1, 2022

Estimation and Inference

Estimation is the application of an algorithm to estimate a parameter, e.g., the mean or the variance. Inference involves putting an accuracy on the estimated value, i.e., statistical significance.
Machine learning and statistical inference are similar. ML uses data to learn/infer qualities of the distribution that generated the data, which is called the data-generating process.
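To make the distinction concrete, here is a minimal sketch using only the Python standard library. The simulated sample (200 draws from a normal distribution with mean 5 and standard deviation 2) is an assumption made for illustration, not data from these notes. The first part is estimation (point estimates), and the second part is inference (a normal-approximation 95% confidence interval for the mean).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sample: 200 draws from a normal distribution (mean 5, sd 2).
sample = [random.gauss(5.0, 2.0) for _ in range(200)]

# Estimation: point estimates of the mean and the variance.
mean_hat = statistics.fmean(sample)
var_hat = statistics.variance(sample)  # unbiased sample variance (divides by n - 1)

# Inference: attach an accuracy statement to the estimate of the mean.
std_err = math.sqrt(var_hat / len(sample))
z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile
ci_low, ci_high = mean_hat - z * std_err, mean_hat + z * std_err

print(f"mean estimate: {mean_hat:.3f}")
print(f"95% CI for the mean: ({ci_low:.3f}, {ci_high:.3f})")
```

The interval uses the normal approximation, which is reasonable for a sample of this size; for small samples a t-quantile would be the more usual choice.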

Codes

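import pandas as pd
import seaborn as sns

# df is assumed to have columns "variable", "value", "x", "y", "z"; n_bins is a placeholder bin count
sns.barplot(x="variable", y="value", data=df)
sns.barplot(x=pd.cut(df["variable"], bins=n_bins), y="value", data=df)  # bin a continuous variable first
subset = df[["x", "y", "z", "variable"]]  # the hue column must be part of the frame
sns.pairplot(subset, hue="variable")
sns.jointplot(x="x", y="y", data=df, kind="hex")  # hexbin plot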

                

Parametric vs Non-parametric

A non-parametric approach makes no assumption about the form of the distribution. For example, it builds the distribution (CDF) of the data directly from a histogram.
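As a rough sketch of the non-parametric idea, the snippet below builds an empirical CDF and equal-width histogram counts directly from data, without assuming any distributional form. The exponential sample and the helper names `ecdf` and `histogram` are hypothetical and only serve the illustration.

```python
import bisect
import random

random.seed(1)

# Hypothetical data: 500 draws from an exponential distribution, sorted once.
data = sorted(random.expovariate(1.0) for _ in range(500))

def ecdf(x, sorted_data):
    """Empirical CDF: fraction of observations less than or equal to x."""
    return bisect.bisect_right(sorted_data, x) / len(sorted_data)

def histogram(values, n_bins):
    """Counts of observations in n_bins equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

print("ECDF at 1.0:", ecdf(1.0, data))
print("histogram counts:", histogram(data, 10))
```

In seaborn, `sns.histplot` and `sns.ecdfplot` draw the same two objects.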

Parametric:

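A parametric model is a particular type of statistical model that is fully described by a finite set of parameters, e.g., the normal distribution with its mean and variance. Customer lifetime value (CLV) models are an example of parametric models.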

Maximum Likelihood Estimation (MLE)

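The likelihood function is related to probability and is a function of the parameters of the model. For independent observations $X_1, \dots, X_n$ it is the product of their densities, viewed as a function of $\theta$, and MLE picks the $\theta$ that maximizes it.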

$$\Lambda_n(\theta) = \prod_{i=1}^{n} f(X_i; \theta)$$
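As an illustration of the product form above, the sketch below fits a normal model by maximum likelihood. The simulated data and the function name `log_likelihood` are assumptions for the example. For the normal distribution the maximizer has a closed form (the sample mean, and the standard deviation computed with a divisor of n), so the code only checks that nearby parameter values give a lower log-likelihood.

```python
import math
import random

random.seed(2)

# Hypothetical data: 1000 draws from a normal distribution (mean 3, sd 1.5).
x = [random.gauss(3.0, 1.5) for _ in range(1000)]

def log_likelihood(mu, sigma, data):
    """Log of the product of normal densities f(X_i; mu, sigma)."""
    n = len(data)
    ss = sum((xi - mu) ** 2 for xi in data)
    return -n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - ss / (2 * sigma ** 2)

# Closed-form maximum likelihood estimates for the normal model.
mu_mle = sum(x) / len(x)
sigma_mle = math.sqrt(sum((xi - mu_mle) ** 2 for xi in x) / len(x))  # divides by n, not n - 1

best = log_likelihood(mu_mle, sigma_mle, x)
print(f"mu_mle={mu_mle:.3f}, sigma_mle={sigma_mle:.3f}, log-likelihood={best:.1f}")
print("shifted mu gives lower value:", log_likelihood(mu_mle + 0.1, sigma_mle, x) < best)
```

Working with the log of the likelihood turns the product into a sum, which avoids numerical underflow and does not change the maximizer.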

Frequentist vs Bayesian

Frequentist

The frequentist approach is concerned with repeated observations in the limit. Processes may have true frequencies, but we focus on the repetition of an experiment.

Derive the probabilistic property of a procedure
Apply the probability directly to the observed data
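One way to see what "the probabilistic property of a procedure" means is to simulate repeating the same experiment many times. The sketch below is an illustration under assumed values (true mean 10, known standard deviation 2, 30 observations per experiment, 2000 repetitions). It checks how often a 95% z-interval for the mean contains the true mean. The 95% is a property of the interval-building procedure over repetitions, and it is then applied to the single interval computed from the observed data.

```python
import math
import random
import statistics

random.seed(3)

# Assumed setup for the illustration.
TRUE_MEAN, TRUE_SD, N, REPS = 10.0, 2.0, 30, 2000

z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * TRUE_SD / math.sqrt(N)  # z-interval with known standard deviation

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = statistics.fmean(sample)
    # Does this repetition's interval contain the true mean?
    if m - half_width <= TRUE_MEAN <= m + half_width:
        covered += 1

coverage = covered / REPS
print(f"fraction of intervals covering the true mean: {coverage:.3f}")
```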

Bayesian

The Bayesian approach describes parameters by probability distributions. A prior distribution is formulated, and this prior is updated into a posterior distribution after seeing data.
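A small sketch of the prior-to-posterior update, assuming a coin-flip model with a Beta prior on the probability of heads. This conjugate pair is chosen only because the update is a one-liner; the counts and the helper name `update_beta` are hypothetical.

```python
def update_beta(a, b, successes, failures):
    """Posterior Beta(a + successes, b + failures) for a Beta(a, b) prior
    and binomial data (conjugate update)."""
    return a + successes, b + failures

# Prior: Beta(1, 1), i.e. uniform over the probability of heads.
prior = (1.0, 1.0)

# Hypothetical data: 7 heads and 3 tails.
heads, tails = 7, 3
posterior = update_beta(*prior, heads, tails)

# The mean of a Beta(a, b) distribution is a / (a + b).
prior_mean = prior[0] / sum(prior)
posterior_mean = posterior[0] / sum(posterior)

print("posterior parameters:", posterior)
print(f"prior mean: {prior_mean:.3f}, posterior mean: {posterior_mean:.3f}")
```

The posterior mean moves from the prior mean toward the observed frequency of heads, and it would move further with more data.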

Hypothesis testing

A hypothesis is a statement about a population parameter.

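Null hypothesis: $H_0$. Alternative hypothesis: $H_1$.
p-value: the probability, computed assuming $H_0$ is true, of observing a test statistic at least as extreme as the one actually observed. It is not $P(H_0)$, the probability that the null hypothesis is true.
In Bayesian inference, we don't get a decision boundary.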
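To make the p-value concrete, here is a minimal sketch of a two-sided one-sample z-test. The numbers (sample mean 103, null mean 100, known standard deviation 15, sample size 50) and the function name `z_test_p_value` are made up for illustration.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for H0: mean == mu0, with known standard deviation sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability, under H0, of a statistic at least as extreme as |z|.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

z_stat, p_value = z_test_p_value(sample_mean=103.0, mu0=100.0, sigma=15.0, n=50)
print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}")
```

Under these assumed numbers the p-value is well above 0.05, so the test would not reject $H_0$ at the 5% level.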

Bayesian interpretation

Given priors $P(H_1) = P(H_2) = 1/2$, Bayes' Rule gives the posterior odds as the prior odds times the likelihood ratio $P(x|H_1)/P(x|H_2)$:

$$\frac{P(H_1|x)}{P(H_2|x)} = \frac{P(H_1)P(x|H_1)}{P(H_2)P(x|H_2)}$$

With equal priors, the posterior odds equal the likelihood ratio. The likelihood ratio tells us how we should update the priors in relation to seeing a given set of data.
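A quick numerical sketch of this update, assuming two hypothetical hypotheses about a coin ($H_1$: probability of heads 0.5, $H_2$: probability of heads 0.7) and made-up data of 7 heads in 10 flips. The helper name `binom_likelihood` is an assumption for the example.

```python
from math import comb

def binom_likelihood(k, n, p):
    """P(x | H): probability of k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

prior_h1, prior_h2 = 0.5, 0.5  # equal priors
k, n = 7, 10                   # hypothetical data: 7 heads in 10 flips

lik_h1 = binom_likelihood(k, n, 0.5)  # H1: fair coin
lik_h2 = binom_likelihood(k, n, 0.7)  # H2: coin biased toward heads

likelihood_ratio = lik_h1 / lik_h2
posterior_odds = (prior_h1 / prior_h2) * likelihood_ratio
posterior_h1 = posterior_odds / (1 + posterior_odds)

print(f"likelihood ratio: {likelihood_ratio:.3f}")
print(f"posterior odds H1:H2: {posterior_odds:.3f}")
print(f"P(H1 | x): {posterior_h1:.3f}")
```

A likelihood ratio below 1 means the data shift belief from $H_1$ toward $H_2$.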

Types of Error

Neyman-Pearson paradigm (1933)

Non-Bayesian (frequentist) inference.

| | Accept $H_0$ | Reject $H_0$ |
|---|---|---|
| $H_0$ true | Correct | Type 1 Error |
| $H_1$ true | Type 2 Error | Correct |

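Power of a test: 1 - P(Type 2 Error), i.e., the probability of rejecting $H_0$ when $H_1$ is true. P(Type 1 Error) is the significance level of the test.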
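The two error probabilities and the power can be computed exactly for a simple case. The sketch below assumes a one-sided z-test of $H_0$: mean 0 against $H_1$: mean 0.5, with known standard deviation 1, sample size 25, and significance level 0.05; all of these values are chosen for illustration. Power is taken here as the probability of rejecting $H_0$ when $H_1$ is true.

```python
import math
from statistics import NormalDist

# Assumed setup: one-sided z-test, reject H0 when the sample mean is large.
mu0, mu1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05
se = sigma / math.sqrt(n)  # standard error of the sample mean

# Rejection region: sample mean above this critical value.
critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se

# Type 1 error: reject H0 although H0 is true (sample mean ~ N(mu0, se)).
type1 = 1 - NormalDist(mu0, se).cdf(critical)

# Type 2 error: accept H0 although H1 is true (sample mean ~ N(mu1, se)).
type2 = NormalDist(mu1, se).cdf(critical)

power = 1 - type2

print(f"critical value: {critical:.3f}")
print(f"P(Type 1 Error) = {type1:.3f}, P(Type 2 Error) = {type2:.3f}, power = {power:.3f}")
```

Increasing the sample size shrinks the standard error, which lowers the Type 2 error and raises the power while the Type 1 error stays at the chosen level.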

Terminology

test statistic, rejection region, acceptance region, null distribution

Younghun Lee
