t-SNE
t-distributed stochastic neighbor embedding, t-SNE
Fundamental Concept
- It is a method of dimensionally reducing complex data in high dimensions to two dimensions.
- t-SNE is one of manifold learning and aims to visualize complex data.
Algorithm
- t-SNE steps
- The similarity of x_i and x_j for all i and j pairs is expressed as similarity using Gaussian distribution.
- We randomly place the same number of points y_i as x_i in a low-dimensional space, and show the similarities of y_i and y_j for all i and j pairs using the t distribution.
- If possible, the data point y_i is updated so that the similarity distribution defined in 1 and 2 is the same.
- Repeat step 3 until the convergence condition.
Sample Code
from sklearn.manifold import TSNE
from sklearn.datasets import load_digits
data = load_digits()
model = TSNE(n_components=2)
print(model.fit_transform(data.data))
C:\Users\bl4an\anaconda3\lib\site-packages\sklearn\manifold\_t_sne.py:780: FutureWarning: The default initialization in TSNE will change from 'random' to 'pca' in 1.2.
warnings.warn(
C:\Users\bl4an\anaconda3\lib\site-packages\sklearn\manifold\_t_sne.py:790: FutureWarning: The default learning rate in TSNE will change from 200.0 to 'auto' in 1.2.
warnings.warn(
[[-26.89956 55.44969 ]
[ 17.039724 -16.77795 ]
[-13.765958 -10.344211 ]
...
[ 2.8657644 -3.5069797]
[ 1.727609 23.916462 ]
[ -5.1929913 -0.6161056]]
참고문헌
- 秋庭伸也 et al. 머신러닝 도감 : 그림으로 공부하는 머신러닝 알고리즘 17 / 아키바 신야, 스기야마 아세이, 데라다 마나부 [공] 지음 ; 이중민 옮김, 2019.
댓글남기기