### Self-Organizing Maps

http://www.willamette.edu/~gorr/classes/cs449/Unsupervised/SOM.html

It is a learning approach used to visualize large data sets. I think of it as a form of clustering.

In its simplest form: there is a grid of neurons, each with a weight vector.
A new sample arrives, which has its own vector of the same dimension.
The neuron whose weight vector is closest to the sample is the winner.
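
The winner selection described above can be sketched roughly as follows. This is only a minimal illustration: the function name `best_matching_unit`, the 2x2 grid, and the choice of Euclidean distance as the closeness measure are all assumptions, not something fixed by these notes.

```python
import math

def best_matching_unit(grid, sample):
    """Return the (row, col) of the neuron whose weight vector is closest to sample."""
    best_pos, best_dist = None, math.inf
    for r, row in enumerate(grid):
        for c, weights in enumerate(row):
            # Euclidean distance is an assumed choice of metric.
            dist = math.dist(weights, sample)
            if dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos

# Hypothetical 2x2 grid of neurons, each holding a 3-dimensional weight vector.
grid = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]],
]
winner = best_matching_unit(grid, [0.9, 0.8, 1.0])
print(winner)  # grid position of the winning neuron
```

The returned grid position is what turns an n-dimensional sample into a 2D coordinate.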

A SOM can also be regarded as a means of dimensionality reduction from n-dimensional data to 2 dimensions (if the network is connected in 2D).
It can be combined with other dimensionality reduction and data visualization algorithms to better reveal the relationships among data distributed in a higher-dimensional space.

To calculate the distance between two samples, we can't just measure the Euclidean distance between their associated neurons; e.g. two samples might be separated by only one neuron, but that neuron can have a high contrast with (a large distance from) both samples. So to get a good understanding of the distances, we create a sample-to-sample distance matrix. For example, in the case of binary images of characters:

[Figure: SOM binary characters distance matrix]

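
A sample-to-sample distance matrix of this kind might be built as in the sketch below. The 3x3 glyphs for I, T and L are made up for illustration, and the Hamming distance (number of differing pixels) is an assumed metric; the notes do not say which distance was actually used.

```python
def hamming(a, b):
    """Number of pixel positions where two binary images differ."""
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(samples):
    """Return the sample names and the symmetric matrix of pairwise distances."""
    names = list(samples)
    matrix = [[hamming(samples[p], samples[q]) for q in names] for p in names]
    return names, matrix

# Hypothetical 3x3 binary glyphs, flattened row by row.
samples = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}
names, matrix = distance_matrix(samples)
for name, row in zip(names, matrix):
    print(name, row)
```
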
To visualize this, we can use a form of graph drawing to embed the samples in 2D while preserving the distance matrix as closely as possible. (Something like an iterative Monte Carlo search can be used to approach a drawing that conforms to these distances.)

[Figure: 2D illustration of distances and relative positions of samples]

From this we can see that I and T are very close to each other, which was not obvious from the distance matrix.
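
One way such a Monte Carlo layout could work is sketched below: randomly nudge one point at a time and keep the move only if the 2D distances get closer to the target matrix. This is an assumed, very simple accept-if-better scheme, not necessarily the procedure used to produce the figure; the function names, the step size, the iteration count, and the example matrix are all hypothetical.

```python
import math
import random

def layout_error(points, target):
    """Sum of squared differences between 2D distances and target distances."""
    n = len(points)
    return sum(
        (math.dist(points[i], points[j]) - target[i][j]) ** 2
        for i in range(n) for j in range(i + 1, n)
    )

def monte_carlo_embed(target, iterations=5000, step=0.5, seed=0):
    """Randomly perturb one point at a time, keeping only improving moves."""
    rng = random.Random(seed)
    n = len(target)
    points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    error = layout_error(points, target)
    for _ in range(iterations):
        i = rng.randrange(n)
        old = points[i][:]
        points[i][0] += rng.gauss(0, step)
        points[i][1] += rng.gauss(0, step)
        new_error = layout_error(points, target)
        if new_error < error:
            error = new_error   # accept the move
        else:
            points[i] = old     # reject and restore the previous position
    return points, error

# Hypothetical distance matrix for three samples (the first two are close).
target = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
points, error = monte_carlo_embed(target)
print(points, error)
```

Because only improving moves are accepted, this can get stuck in a poor layout for larger sample sets; restarting from several seeds is one cheap way to mitigate that.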

Since SOM distances don't correctly reflect the true relative distances among samples, a data visualization approach is highly encouraged.

## Sammon's projection (mapping)

An algorithm that maps a high-dimensional space to a space of lower dimensionality (see multidimensional scaling).
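Denote the distance between the ith and jth objects in the original space by $d^{*}_{ij}$, and the distance between their projections by $d_{ij}$. Sammon's projection aims to minimize the following error function, which is often referred to as Sammon's stress:

$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}$$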
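
As a rough sketch, Sammon's stress can be computed from two distance matrices, one for the original space and one for the projection. The function name `sammon_stress` and the example matrices are hypothetical, and the code assumes every original distance between distinct objects is non-zero (otherwise the division fails).

```python
def sammon_stress(original, projected):
    """Sammon's stress between original-space and projected-space distance matrices.

    original[i][j]  : distance between objects i and j in the original space
    projected[i][j] : distance between their projections
    Assumes original[i][j] > 0 for all i != j.
    """
    n = len(original)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    scale = sum(original[i][j] for i, j in pairs)
    weighted = sum(
        (original[i][j] - projected[i][j]) ** 2 / original[i][j]
        for i, j in pairs
    )
    return weighted / scale

# Hypothetical example: the projection shrinks one distance from 2 to 1.
original = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
projected = [[0, 1, 6], [1, 0, 6], [6, 6, 0]]
stress = sammon_stress(original, projected)
print(stress)
```

Dividing each squared error by the original distance means that errors on small original distances are penalized more heavily, so local structure is emphasized.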