## Covariance

Covariance of two random variables:

`cov(x, y) = E[ (x - E[x]) (y - E[y]) ]`

By linearity of expected value we have:

`cov(x, y) = E[xy] - E[x] E[y]`

The magnitude of the covariance is not easy to interpret. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation.

Covariance matrix: a matrix whose (i, j)th element is the covariance between the i-th and j-th elements of a random vector (a vector of random variables). It generalizes the notion of variance to multiple dimensions. This is equivalent to the following vector multiplication, with X treated as a column vector:

`Σ = E[ (X - E[X]) (X - E[X])^T ]`

Definition: given a random vector X = [x1, x2, ..., xn], where each xi is a random variable with its own mean and variance, its covariance matrix is the matrix whose (i, j) element is the covariance between random variables xi and xj:

`cov(xi, xj) = E[ (xi - E[xi]) (xj - E[xj]) ]`

Spearman or Pearson correlation can be used to find correlated features and keep only one of them, as a form of feature reduction.

Spearman's rank correlation coefficient: a nonparametric measure of statistical dependence between two variables. It assesses how well the relationship between two variables can be described by a monotonic function. A perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.

The Spearman correlation coefficient is often described as being "nonparametric". This can have two meanings. First, the fact that a perfect Spearman correlation results when X and Y are related by any monotonic function can be contrasted with the Pearson correlation, which only gives a perfect value when X and Y are related by a linear function. The other sense in which the Spearman correlation is nonparametric is that its exact sampling distribution can be obtained without requiring knowledge (i.e., knowing the parameters) of the joint probability distribution of X and Y.
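
The definitions above can be checked numerically. The following is a minimal sketch in base R on made-up toy vectors (`x`, `y`, `z` are assumptions for illustration, not data from these notes). It computes the covariance both from the definition and from the `E[xy] - E[x] E[y]` form, builds the covariance matrix with `cov()` and with the vector-product form, and compares Pearson and Spearman on a relation that is monotonic but not linear.

```r
# Hypothetical toy data: y is a strictly increasing, nonlinear function of x.
x <- c(1, 2, 3, 4, 5, 6)
y <- exp(x)
z <- c(2, 1, 4, 3, 6, 5)

# Covariance from the definition E[(x - E[x])(y - E[y])] (population form, divides by n).
cov_def <- mean((x - mean(x)) * (y - mean(y)))
# Equivalent form obtained by linearity of expectation: E[xy] - E[x]E[y].
cov_lin <- mean(x * y) - mean(x) * mean(y)

# Sample covariance matrix of the random vector (x, y, z).
# Note: R's cov() divides by n - 1, not n.
X <- cbind(x, y, z)
S <- cov(X)

# Same matrix via the vector-product form: centered data times its transpose.
Xc <- sweep(X, 2, colMeans(X))            # subtract each column mean
S_outer <- t(Xc) %*% Xc / (nrow(X) - 1)

# Pearson measures linear association; Spearman works on ranks.
pearson  <- cor(x, y, method = "pearson")   # below 1: the relation is not linear
spearman <- cor(x, y, method = "spearman")  # 1 up to floating point: the ranks of x and y coincide
```

The diagonal of `S` holds the variances of `x`, `y` and `z`, which is the sense in which the covariance matrix generalizes variance to several dimensions.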
## MaxDiff in multiple choice questions
In R, the following code enters the data and computes the counts:

```r
# 6 blocks (rows) by 6 alternatives A-F (columns); NA marks an entry with no value
mdData <- matrix(
  c(NA, NA,  1,  0, -1, NA,
     1, NA, NA,  0, -1, NA,
    NA,  1, NA,  0, NA, -1,
     0, NA,  1, NA, NA, -1,
     0, -1,  1, NA, NA, NA,
    NA,  1, NA, NA,  0, -1),
  6, byrow = TRUE,
  dimnames = list(Block = 1:6, Alternatives = LETTERS[1:6])
)
mdData                               # print the matrix
apply(mdData, 2, sum, na.rm = TRUE)  # column sums, skipping NA
```

Similarly, for unbalanced designs, the code in R uses the column means instead:

```r
apply(mdData, 2, mean, na.rm = TRUE)  # column means, skipping NA
```

To determine if a result is statistically significant: a relationship is called statistically "significant" when there is less than a 5% probability that it arose by chance. This is often read as saying that if the study were repeated, it would lead to the same result with 95% probability; more precisely, it means that if no real relationship existed, a result this strong would turn up by chance less than 5% of the time. The choice of 95% is arbitrary; it is a standard that we have chosen. Another conventional threshold that matters is 99%. When a correlation test reaches the 99% level, the result is said to be highly statistically significant.

## Linear discriminant analysis (LDA)

Linear discriminant analysis (LDA) and the related Fisher's linear discriminant find a linear combination of features which characterizes or separates two or more classes of objects. LDA seeks to reduce dimensionality while preserving as much of the class discriminatory information as possible:

- Assume we have a set of d-dimensional samples $\{x_1, ..., x_n\}$, $n_1$ of which belong to class $\omega_1$ and $n_2$ to class $\omega_2$.
- We seek to obtain a scalar $y$ by projecting the samples $x$ onto a line $y = w^{T}x$.
- Of all the possible lines, we would like to select the one that maximizes the separability of the scalars.

The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.
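
To make the projection step concrete, here is a small base R sketch on made-up two-class data (the points in `X1` and `X2`, and the helper names `project` and `mean_gap`, are assumptions for illustration). It projects every sample onto a candidate line and measures how far apart the projected class means end up, which shows that the choice of line matters.

```r
# Hypothetical 2-D samples, one row per sample.
X1 <- rbind(c(4, 1), c(2, 4), c(2, 3), c(3, 6), c(4, 4))     # class 1
X2 <- rbind(c(9, 10), c(6, 8), c(9, 5), c(8, 7), c(10, 8))   # class 2

# y = w^T x for every row x of X: one scalar per sample.
project <- function(X, w) as.vector(X %*% w)

# Distance between the projected class means for a direction w.
mean_gap <- function(w) {
  w <- w / sqrt(sum(w^2))   # unit length, so different directions are comparable
  abs(mean(project(X1, w)) - mean(project(X2, w)))
}

gap_a <- mean_gap(c(1, 0))  # project onto the first axis
gap_b <- mean_gap(c(0, 1))  # project onto the second axis
```

On this toy data the first axis keeps the class means further apart than the second, so the two candidate lines are not equally useful for separating the classes.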
How to calculate:
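Maximize the distance between the class means in Y space ($\mu_1$ is the mean of class 1 and $\mu_2$ the mean of class 2 in X space; their projections in Y space are written $\mu_1'$ and $\mu_2'$):

$|\mu_1' - \mu_2'| = |w^{T}(\mu_1 - \mu_2)|$

The distance between the projected means alone ignores how spread out each class is after projection.

Fisher's solution: maximize the difference between the projected means normalized by the within-class scatter of the projected samples,

$J(w) = \frac{|\mu_1' - \mu_2'|^2}{s_1'^2 + s_2'^2}$, where $s_i'^2 = \sum_{y \in \text{class } i} (y - \mu_i')^2$.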
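
As a hedged illustration of Fisher's approach, the sketch below assumes the standard Fisher criterion (squared distance between projected means divided by the summed within-class scatter of the projections) and its well-known closed-form maximizer, a direction proportional to `S_W^{-1} (mu1 - mu2)`. The data and the names `scatter`, `fisher_J`, `w_fisher` and `w_means` are made up for the example.

```r
# Hypothetical 2-D samples, one row per sample.
X1 <- rbind(c(4, 1), c(2, 4), c(2, 3), c(3, 6), c(4, 4))     # class 1
X2 <- rbind(c(9, 10), c(6, 8), c(9, 5), c(8, 7), c(10, 8))   # class 2

mu1 <- colMeans(X1)
mu2 <- colMeans(X2)

# Scatter matrix of one class: sum of outer products of centered samples.
scatter <- function(X) {
  Xc <- sweep(X, 2, colMeans(X))
  t(Xc) %*% Xc
}
S_W <- scatter(X1) + scatter(X2)   # within-class scatter

# Fisher criterion for a direction w, computed on the projected samples.
fisher_J <- function(w) {
  y1 <- as.vector(X1 %*% w)
  y2 <- as.vector(X2 %*% w)
  s1 <- sum((y1 - mean(y1))^2)     # projected scatter of class 1
  s2 <- sum((y2 - mean(y2))^2)     # projected scatter of class 2
  (mean(y1) - mean(y2))^2 / (s1 + s2)
}

# Fisher direction: solve S_W w = mu1 - mu2, then scale to unit length.
w_fisher <- as.vector(solve(S_W, mu1 - mu2))
w_fisher <- w_fisher / sqrt(sum(w_fisher^2))

# Baseline: the direction that only maximizes the distance between means.
w_means <- (mu1 - mu2) / sqrt(sum((mu1 - mu2)^2))

J_fisher <- fisher_J(w_fisher)
J_means  <- fisher_J(w_means)
```

On this toy data `J_fisher` is larger than `J_means`: taking the within-class scatter into account tilts the line away from the plain mean-difference direction. The sign of `w_fisher` is arbitrary, since `w` and `-w` define the same line.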