
Machine Learning in Python @ scikit-learn

  • This repository provides the scikit-learn user guide translated into Korean.
  • Because this is an ongoing project, only the links shown in blue are active; click them to view the original page alongside its Korean translation.

Table of Contents

  • 1 Supervised learning

    • 1.1. Generalized Linear Models
    • 1.2. Linear and Quadratic Discriminant Analysis
    • 1.3. Kernel ridge regression
    • 1.4. Support Vector Machines
    • 1.5. Stochastic Gradient Descent
    • 1.6. Nearest Neighbors
    • 1.7. Gaussian Processes
    • 1.8. Cross decomposition
    • 1.9. Naive Bayes
    • 1.10. Decision Trees
    • 1.11. Ensemble methods
    • 1.12. Multiclass and multilabel algorithms
    • 1.13. Feature selection
    • 1.14. Semi-Supervised
    • 1.15. Isotonic regression
    • 1.16. Probability calibration
    • 1.17. Neural network models (supervised)
  • 2 Unsupervised learning

    • 2.1. Gaussian mixture models
    • 2.2. Manifold learning
    • 2.3. Clustering
    • 2.4. Biclustering
    • 2.5. Decomposing signals in components (matrix factorization problems)
    • 2.6. Covariance estimation
    • 2.7. Novelty and Outlier Detection
    • 2.8. Density Estimation
    • 2.9. Neural network models (unsupervised)
  • 3 Model selection and evaluation

    • 3.1. Cross-validation: evaluating estimator performance
    • 3.2. Tuning the hyper-parameters of an estimator
    • 3.3. Model evaluation: quantifying the quality of predictions
    • 3.4. Model persistence
    • 3.5. Validation curves: plotting scores to evaluate models
  • 4 Dataset transformations

    • 4.1. Pipeline and FeatureUnion: combining estimators
    • 4.2. Feature extraction
    • 4.3. Preprocessing data
    • 4.4. Unsupervised dimensionality reduction
    • 4.5. Random Projection
    • 4.6. Kernel Approximation
    • 4.7. Pairwise metrics, Affinities and Kernels
    • 4.8. Transforming the prediction target (y)
  • 5 Dataset loading utilities

    • 5.1. General dataset API
    • 5.2. Toy datasets
    • 5.3. Sample images
    • 5.4. Sample generators
    • 5.5. Datasets in svmlight / libsvm format
    • 5.6. Loading from external datasets
    • 5.7. The Olivetti faces dataset
    • 5.8. The 20 newsgroups text dataset
    • 5.9. Downloading datasets from the mldata.org repository
    • 5.10. The Labeled Faces in the Wild face recognition dataset
    • 5.11. Forest covertypes
    • 5.12. RCV1 dataset
    • 5.13. Boston House Prices dataset
    • 5.14. Breast Cancer Wisconsin (Diagnostic) Database
    • 5.15. Diabetes dataset
    • 5.16. Optical Recognition of Handwritten Digits Data Set
    • 5.17. Iris Plants Database
    • 5.18. Linnerrud dataset
  • 6 Strategies to scale computationally: bigger data

    • 6.1. Scaling with instances using out-of-core learning
  • 7 Computational Performance

    • 7.1. Prediction Latency
    • 7.2. Prediction Throughput
    • 7.3. Tips and Tricks