Home - Yikyong/machine-learning-in-Python GitHub Wiki
Machine Learning in Python @ scikit-learn
- This repository provides a Korean translation of the scikit-learn user guide.
- The project is ongoing, so only the links shown in blue are active. Click an active link to view the original page and its translation.
Table of Contents
1. Supervised learning
- 1.1. Generalized Linear Models
- 1.2. Linear and Quadratic Discriminant Analysis
- 1.3. Kernel ridge regression
- 1.4. Support Vector Machines
- 1.5. Stochastic Gradient Descent
- 1.6. Nearest Neighbors
- 1.7. Gaussian Processes
- 1.8. Cross decomposition
- 1.9. Naive Bayes
- 1.10. Decision Trees
- 1.11. Ensemble methods
- 1.12. Multiclass and multilabel algorithms
- 1.13. Feature selection
- 1.14. Semi-Supervised
- 1.15. Isotonic regression
- 1.16. Probability calibration
- 1.17. Neural network models (supervised)
2. Unsupervised learning
- 2.1. Gaussian mixture models
- 2.2. Manifold learning
- 2.3. Clustering
- 2.4. Biclustering
- 2.5. Decomposing signals in components (matrix factorization problems)
- 2.6. Covariance estimation
- 2.7. Novelty and Outlier Detection
- 2.8. Density Estimation
- 2.9. Neural network models (unsupervised)
3. Model selection and evaluation
- 3.1. Cross-validation: evaluating estimator performance
- 3.2. Tuning the hyper-parameters of an estimator
- 3.3. Model evaluation: quantifying the quality of predictions
- 3.4. Model persistence
- 3.5. Validation curves: plotting scores to evaluate models
4. Dataset transformations
- 4.1. Pipeline and FeatureUnion: combining estimators
- 4.2. Feature extraction
- 4.3. Preprocessing data
- 4.4. Unsupervised dimensionality reduction
- 4.5. Random Projection
- 4.6. Kernel Approximation
- 4.7. Pairwise metrics, Affinities and Kernels
- 4.8. Transforming the prediction target (y)
5. Dataset loading utilities
- 5.1. General dataset API
- 5.2. Toy datasets
- 5.3. Sample images
- 5.4. Sample generators
- 5.5. Datasets in svmlight / libsvm format
- 5.6. Loading from external datasets
- 5.7. The Olivetti faces dataset
- 5.8. The 20 newsgroups text dataset
- 5.9. Downloading datasets from the mldata.org repository
- 5.10. The Labeled Faces in the Wild face recognition dataset
- 5.11. Forest covertypes
- 5.12. RCV1 dataset
- 5.13. Boston House Prices dataset
- 5.14. Breast Cancer Wisconsin (Diagnostic) Database
- 5.15. Diabetes dataset
- 5.16. Optical Recognition of Handwritten Digits Data Set
- 5.17. Iris Plants Database
- 5.18. Linnerud dataset
6. Strategies to scale computationally: bigger data
- 6.1. Scaling with instances using out-of-core learning
7. Computational Performance
- 7.1. Prediction Latency
- 7.2. Prediction Throughput
- 7.3. Tips and Tricks