深度学习 |
deep learning |
|
机器学习 |
machine learning |
|
机器学习模型 |
machine learning model |
|
逻辑回归 |
logistic regression |
|
回归 |
regression |
|
人工智能 |
artificial intelligence |
|
朴素贝叶斯 |
naive Bayes |
|
表示 |
representation |
|
表示学习 |
representation learning |
|
自编码器 |
autoencoder |
|
编码器 |
encoder |
|
解码器 |
decoder |
|
多层感知机 |
multilayer perceptron |
|
人工神经网络 |
artificial neural network |
|
神经网络 |
neural network |
|
随机梯度下降 |
stochastic gradient descent |
SGD |
线性模型 |
linear model |
|
线性回归 |
linear regression |
|
整流线性单元 |
rectified linear unit |
ReLU |
分布式表示 |
distributed representation |
|
非分布式表示 |
nondistributed representation |
|
非分布式 |
nondistributed |
|
隐藏单元 |
hidden unit |
|
长短期记忆 |
long short-term memory |
LSTM |
深度信念网络 |
deep belief network |
DBN |
循环神经网络 |
recurrent neural network |
RNN |
循环 |
recurrence |
|
强化学习 |
reinforcement learning |
|
推断 |
inference |
|
上溢 |
overflow |
|
下溢 |
underflow |
|
softmax函数 |
softmax function |
|
softmax |
softmax |
|
欠估计 |
underestimation |
|
过估计 |
overestimation |
|
病态条件 |
poor conditioning |
|
目标函数 |
objective function |
|
目标 |
objective |
|
准则 |
criterion |
|
代价函数 |
cost function |
|
代价 |
cost |
|
损失函数 |
loss function |
|
PR曲线 |
PR curve |
|
F值 |
F-score |
|
损失 |
loss |
|
误差函数 |
error function |
|
梯度下降 |
gradient descent |
|
导数 |
derivative |
|
临界点 |
critical point |
|
驻点 |
stationary point |
|
局部极小点 |
local minimum |
|
极小点 |
minimum |
|
局部极小值 |
local minima |
|
极小值 |
minima |
|
全局极小值 |
global minima |
|
局部极大值 |
local maxima |
|
极大值 |
maxima |
|
局部极大点 |
local maximum |
|
鞍点 |
saddle point |
|
全局最小点 |
global minimum |
|
偏导数 |
partial derivative |
|
梯度 |
gradient |
|
样本 |
example |
|
二阶导数 |
second derivative |
|
曲率 |
curvature |
|
凸优化 |
convex optimization |
|
非凸 |
nonconvex |
|
数值优化 |
numerical optimization |
|
约束优化 |
constrained optimization |
|
可行 |
feasible |
|
等式约束 |
equality constraint |
|
不等式约束 |
inequality constraint |
|
正则化 |
regularization |
|
正则化项 |
regularizer |
|
正则化 |
regularize |
|
泛化 |
generalization |
|
泛化 |
generalize |
|
欠拟合 |
underfitting |
|
过拟合 |
overfitting |
|
偏差 |
bias |
|
方差 |
variance |
|
集成 |
ensemble |
|
估计 |
estimator |
|
权重衰减 |
weight decay |
|
协方差 |
covariance |
|
稀疏 |
sparse |
|
特征选择 |
feature selection |
|
特征提取器 |
feature extractor |
|
最大后验 |
Maximum A Posteriori |
MAP |
池化 |
pooling |
|
Dropout |
Dropout |
|
蒙特卡罗 |
Monte Carlo |
|
提前终止 |
early stopping |
|
卷积神经网络 |
convolutional neural network |
CNN |
小批量 |
minibatch |
|
重要采样 |
Importance Sampling |
|
变分自编码器 |
variational autoencoder |
VAE |
计算机视觉 |
Computer Vision |
CV |
语音识别 |
Speech Recognition |
|
自然语言处理 |
Natural Language Processing |
NLP |
有向模型 |
Directed Model |
|
原始采样 |
Ancestral Sampling |
|
随机矩阵 |
Stochastic Matrix |
|
平稳分布 |
Stationary Distribution |
|
均衡分布 |
Equilibrium Distribution |
|
索引 |
index of matrix |
|
磨合 |
Burn-in |
|
混合时间 |
Mixing Time |
|
混合 |
Mixing |
|
Gibbs采样 |
Gibbs Sampling |
|
吉布斯步数 |
Gibbs steps |
|
Bagging |
bootstrap aggregating |
|
掩码 |
mask |
|
批标准化 |
batch normalization |
|
参数共享 |
parameter sharing |
|
KL散度 |
KL divergence |
|
温度 |
temperature |
|
临界温度 |
critical temperatures |
|
并行回火 |
parallel tempering |
|
自动语音识别 |
Automatic Speech Recognition |
ASR |
级联 |
coalesced |
|
数据并行 |
data parallelism |
|
模型并行 |
model parallelism |
|
异步随机梯度下降 |
Asynchronous Stochastic Gradient Descent |
|
参数服务器 |
parameter server |
|
模型压缩 |
model compression |
|
动态结构 |
dynamic structure |
|
隐马尔可夫模型 |
Hidden Markov Model |
HMM |
高斯混合模型 |
Gaussian Mixture Model |
GMM |
转录 |
transcribe |
|
主成分分析 |
principal components analysis |
PCA |
因子分析 |
factor analysis |
|
独立成分分析 |
independent component analysis |
ICA |
稀疏编码 |
sparse coding |
|
定点运算 |
fixed-point arithmetic |
|
浮点运算 |
floating-point arithmetic |
|
生成模型 |
generative model |
|
生成式建模 |
generative modeling |
|
数据集增强 |
dataset augmentation |
|
白化 |
whitening |
|
深度神经网络 |
deep neural network |
DNN
端到端的 |
end-to-end |
|
图模型 |
graphical model |
|
有向图模型 |
directed graphical model |
|
依赖 |
dependency |
|
贝叶斯网络 |
Bayesian network |
|
模型平均 |
model averaging |
|
声明 |
statement |
|
量子力学 |
quantum mechanics |
|
亚原子 |
subatomic |
|
逼真度 |
fidelity |
|
信任度 |
degree of belief |
|
频率派概率 |
frequentist probability |
|
贝叶斯概率 |
Bayesian probability |
|
似然 |
likelihood |
|
随机变量 |
random variable |
|
概率分布 |
probability distribution |
|
联合概率分布 |
joint probability distribution |
|
归一化的 |
normalized |
|
均匀分布 |
uniform distribution |
|
概率密度函数 |
probability density function |
PDF |
累积函数 |
cumulative function |
|
边缘概率分布 |
marginal probability distribution |
|
求和法则 |
sum rule |
|
条件概率 |
conditional probability |
|
干预查询 |
intervention query |
|
因果模型 |
causal modeling |
|
因果因子 |
causal factor |
|
链式法则 |
chain rule |
|
乘法法则 |
product rule |
|
相互独立的 |
independent |
|
条件独立的 |
conditionally independent |
|
期望 |
expectation |
|
期望值 |
expected value |
|
特征 |
feature |
|
准确率 |
accuracy |
|
错误率 |
error rate |
|
训练集 |
training set |
|
解释因子 |
explanatory factor |
|
潜在 |
underlying |
|
潜在成因 |
underlying cause |
|
测试集 |
test set |
|
性能度量 |
performance measures |
|
经验 |
experience |
|
无监督 |
unsupervised |
|
有监督 |
supervised |
|
半监督 |
semi-supervised |
|
监督学习 |
supervised learning |
|
无监督学习 |
unsupervised learning |
|
数据集 |
dataset |
|
数据点 |
data point |
|
标签 |
label |
|
标注 |
labeled |
|
未标注 |
unlabeled |
|
目标 |
target |
|
设计矩阵 |
design matrix |
|
参数 |
parameter |
|
权重 |
weight |
|
均方误差 |
mean squared error |
MSE |
正规方程 |
normal equation |
|
训练误差 |
training error |
|
泛化误差 |
generalization error |
|
测试误差 |
test error |
|
假设空间 |
hypothesis space |
|
容量 |
capacity |
|
表示容量 |
representational capacity |
|
有效容量 |
effective capacity |
|
线性阈值单元 |
linear threshold units |
|
非参数 |
non-parametric |
|
最近邻回归 |
nearest neighbor regression |
|
最近邻 |
nearest neighbor |
|
验证集 |
validation set |
|
基准 |
benchmark |
|
基准 |
baseline |
|
点估计 |
point estimator |
|
估计量 |
estimator |
|
统计量 |
statistic |
|
无偏 |
unbiased |
|
有偏 |
biased |
|
异步 |
asynchronous |
|
渐近无偏 |
asymptotically unbiased |
|
标准误差 |
standard error |
|
一致性 |
consistency |
|
统计效率 |
statistical efficiency |
|
有参情况 |
parametric case |
|
贝叶斯统计 |
Bayesian statistics |
|
先验概率分布 |
prior probability distribution |
|
最大后验 |
maximum a posteriori |
|
最大似然估计 |
maximum likelihood estimation |
|
最大似然 |
maximum likelihood |
|
核技巧 |
kernel trick |
|
核函数 |
kernel function |
|
高斯核 |
Gaussian kernel |
|
核机器 |
kernel machine |
|
核方法 |
kernel method |
|
支持向量 |
support vector |
|
支持向量机 |
support vector machine |
SVM |
音素 |
phoneme |
|
声学 |
acoustic |
|
语音 |
phonetic |
|
专家混合体 |
mixture of experts |
|
高斯混合体 |
Gaussian mixtures |
|
选通器 |
gater |
|
专家网络 |
expert network |
|
注意力机制 |
attention mechanism |
|
对抗样本 |
adversarial example |
|
对抗 |
adversarial |
|
对抗训练 |
adversarial training |
|
切面距离 |
tangent distance |
|
正切传播 |
tangent prop |
|
正切传播 |
tangent propagation |
|
双反向传播 |
double backprop |
|
期望最大化 |
expectation maximization |
EM |
均值场 |
mean-field |
|
变分推断 |
variational inference |
|
二值稀疏编码 |
binary sparse coding |
|
前馈网络 |
feedforward network |
|
转移 |
transition |
|
重构 |
reconstruction |
|
生成随机网络 |
generative stochastic network |
|
得分匹配 |
score matching |
|
因子 |
factorial |
|
分解的 |
factorized |
|
均匀场 |
mean-field |
|
概率PCA |
probabilistic PCA |
|
随机梯度上升 |
Stochastic Gradient Ascent |
|
团 |
clique |
|
Dirac分布 |
Dirac distribution |
|
不动点方程 |
fixed point equation |
|
变分法 |
calculus of variations |
|
信念网络 |
belief network |
|
马尔可夫随机场 |
Markov random field |
|
马尔可夫网络 |
Markov network |
|
对数线性模型 |
log-linear model |
|
自由能 |
free energy |
|
局部条件概率分布 |
local conditional probability distribution |
|
条件概率分布 |
conditional probability distribution |
|
玻尔兹曼分布 |
Boltzmann distribution |
|
吉布斯分布 |
Gibbs distribution |
|
能量函数 |
energy function |
|
标准差 |
standard deviation |
|
相关系数 |
correlation |
|
标准正态分布 |
standard normal distribution |
|
协方差矩阵 |
covariance matrix |
|
Bernoulli分布 |
Bernoulli distribution |
|
Bernoulli输出分布 |
Bernoulli output distribution |
|
Multinoulli分布 |
multinoulli distribution |
|
Multinoulli输出分布 |
multinoulli output distribution |
|
范畴分布 |
categorical distribution |
|
多项式分布 |
multinomial distribution |
|
正态分布 |
normal distribution |
|
高斯分布 |
Gaussian distribution |
|
精度 |
precision |
|
多维正态分布 |
multivariate normal distribution |
|
精度矩阵 |
precision matrix |
|
各向同性 |
isotropic |
|
指数分布 |
exponential distribution |
|
指示函数 |
indicator function |
|
广义函数 |
generalized function |
|
经验分布 |
empirical distribution |
|
经验频率 |
empirical frequency |
|
混合分布 |
mixture distribution |
|
潜变量 |
latent variable |
|
隐藏变量 |
hidden variable |
|
先验概率 |
prior probability |
|
后验概率 |
posterior probability |
|
万能近似器 |
universal approximator |
|
饱和 |
saturate |
|
分对数 |
logit |
|
正部函数 |
positive part function |
|
负部函数 |
negative part function |
|
贝叶斯规则 |
Bayes' rule |
|
测度论 |
measure theory |
|
零测度 |
measure zero |
|
Jacobian矩阵 |
Jacobian matrix |
|
自信息 |
self-information |
|
奈特 |
nats |
|
比特 |
bit |
|
香农 |
shannons |
|
香农熵 |
Shannon entropy |
|
微分熵 |
differential entropy |
|
微分方程 |
differential equation |
|
KL散度 |
Kullback-Leibler (KL) divergence |
|
交叉熵 |
cross-entropy |
|
熵 |
entropy |
|
分解 |
factorization |
|
结构化概率模型 |
structured probabilistic model |
|
回退 |
back-off |
|
有向 |
directed |
|
无向 |
undirected |
|
无向图模型 |
undirected graphical model |
|
成比例 |
proportional |
|
描述 |
description |
|
决策树 |
decision tree |
|
因子图 |
factor graph |
|
结构学习 |
structure learning |
|
环状信念传播 |
loopy belief propagation |
|
卷积网络 |
convolutional network |
|
卷积网络 |
convolutional net |
|
主对角线 |
main diagonal |
|
转置 |
transpose |
|
广播 |
broadcasting |
|
矩阵乘积 |
matrix product |
|
AdaGrad |
AdaGrad |
|
逐元素乘积 |
element-wise product |
|
Hadamard乘积 |
Hadamard product |
|
团势能 |
clique potential |
|
因子 |
factor |
|
未归一化概率函数 |
unnormalized probability function |
|
循环网络 |
recurrent network |
|
梯度消失与爆炸问题 |
vanishing and exploding gradient problem |
|
梯度消失 |
vanishing gradient |
|
梯度爆炸 |
exploding gradient |
|
计算图 |
computational graph |
|
展开 |
unfolding |
|
求逆 |
invert |
|
时间步 |
time step |
|
维数灾难 |
curse of dimensionality |
|
平滑先验 |
smoothness prior |
|
局部不变性先验 |
local constancy prior |
|
局部核 |
local kernel |
|
流形 |
manifold |
|
流形正切分类器 |
manifold tangent classifier |
|
流形学习 |
manifold learning |
|
流形假设 |
manifold hypothesis |
|
环 |
loop |
|
弦 |
chord |
|
弦图 |
chordal graph |
|
三角形化图 |
triangulated graph |
|
三角形化 |
triangulate |
|
风险 |
risk |
|
经验风险 |
empirical risk |
|
经验风险最小化 |
empirical risk minimization |
|
代理损失函数 |
surrogate loss function |
|
批量 |
batch |
|
确定性 |
deterministic |
|
随机 |
stochastic |
|
在线 |
online |
|
流 |
stream |
|
梯度截断 |
gradient clipping |
|
幂方法 |
power method |
|
前向传播 |
forward propagation |
|
反向传播 |
backward propagation |
|
展开图 |
unfolded graph |
|
深度前馈网络 |
deep feedforward network |
|
前馈神经网络 |
feedforward neural network |
|
前向 |
feedforward |
|
反馈 |
feedback |
|
网络 |
network |
|
深度 |
depth |
|
输出层 |
output layer |
|
隐藏层 |
hidden layer |
|
宽度 |
width |
|
单元 |
unit |
|
激活函数 |
activation function |
|
反向传播 |
back propagation |
backprop |
泛函 |
functional |
|
平均绝对误差 |
mean absolute error |
|
赢者通吃 |
winner-take-all |
|
异方差 |
heteroscedastic |
|
混合密度网络 |
mixture density network |
|
梯度截断 |
clip gradient |
|
绝对值整流 |
absolute value rectification |
|
渗漏整流线性单元 |
Leaky ReLU |
|
参数化整流线性单元 |
parametric ReLU |
PReLU |
maxout单元 |
maxout unit |
|
硬双曲正切函数 |
hard tanh |
|
架构 |
architecture |
|
操作 |
operation |
|
符号 |
symbol |
|
数值 |
numeric value |
|
动态规划 |
dynamic programming |
|
自动微分 |
automatic differentiation |
|
并行分布式处理 |
Parallel Distributed Processing |
|
稀疏激活 |
sparse activation |
|
衰减 |
damping |
|
学成 |
learned |
|
信息传输 |
message passing |
|
泛函导数 |
functional derivative |
|
变分导数 |
variational derivative |
|
额外误差 |
excess error |
|
动量 |
momentum |
|
混沌 |
chaos |
|
稀疏初始化 |
sparse initialization |
|
共轭方向 |
conjugate directions |
|
共轭 |
conjugate |
|
条件独立 |
conditionally independent |
|
集成学习 |
ensemble learning |
|
独立子空间分析 |
independent subspace analysis |
|
慢特征分析 |
slow feature analysis |
SFA |
慢性原则 |
slowness principle |
|
整流线性 |
rectified linear |
|
整流网络 |
rectifier network |
|
坐标下降 |
coordinate descent |
|
坐标上升 |
coordinate ascent |
|
预训练 |
pretraining |
|
无监督预训练 |
unsupervised pretraining |
|
逐层的 |
layer-wise |
|
贪心算法 |
greedy algorithm |
|
贪心 |
greedy |
|
精调 |
fine-tuning |
|
课程学习 |
curriculum learning |
|
召回率 |
recall |
|
覆盖 |
coverage |
|
超参数优化 |
hyperparameter optimization |
|
超参数 |
hyperparameter |
|
网格搜索 |
grid search |
|
有限差分 |
finite difference |
|
中心差分 |
centered difference |
|
储层计算 |
reservoir computing |
|
谱半径 |
spectral radius |
|
收缩 |
contractive |
|
长期依赖 |
long-term dependency |
|
跳跃连接 |
skip connection |
|
门控RNN |
gated RNN |
|
门控 |
gated |
|
卷积 |
convolution |
|
输入 |
input |
|
输入分布 |
input distribution |
|
输出 |
output |
|
特征映射 |
feature map |
|
翻转 |
flip |
|
稀疏交互 |
sparse interactions |
|
等变表示 |
equivariant representations |
|
稀疏连接 |
sparse connectivity |
|
稀疏权重 |
sparse weights |
|
接受域 |
receptive field |
|
绑定的权重 |
tied weights |
|
等变 |
equivariance |
|
探测级 |
detector stage |
|
符号表示 |
symbolic representation |
|
池化函数 |
pooling function |
|
最大池化 |
max pooling |
|
池 |
pool |
|
不变 |
invariant |
|
步幅 |
stride |
|
降采样 |
downsampling |
|
全 |
full |
|
非共享卷积 |
unshared convolution |
|
平铺卷积 |
tiled convolution |
|
循环卷积网络 |
recurrent convolutional network |
|
傅立叶变换 |
Fourier transform |
|
可分离的 |
separable |
|
初级视觉皮层 |
primary visual cortex |
|
简单细胞 |
simple cell |
|
复杂细胞 |
complex cell |
|
象限对 |
quadrature pair |
|
门控循环单元 |
gated recurrent unit |
GRU |
门控循环网络 |
gated recurrent net |
|
遗忘门 |
forget gate |
|
截断梯度 |
clipping the gradient |
|
记忆网络 |
memory network |
|
神经网络图灵机 |
neural Turing machine |
NTM |
精调 |
fine-tune |
|
共因 |
common cause |
|
编码 |
code |
|
再循环 |
recirculation |
|
欠完备 |
undercomplete |
|
完全图 |
complete graph |
|
欠定的 |
underdetermined |
|
过完备 |
overcomplete |
|
去噪 |
denoising |
|
去噪 |
denoise |
|
重构误差 |
reconstruction error |
|
梯度场 |
gradient field |
|
得分 |
score |
|
切平面 |
tangent plane |
|
最近邻图 |
nearest neighbor graph |
|
嵌入 |
embedding |
|
近似推断 |
approximate inference |
|
信息检索 |
information retrieval |
|
语义哈希 |
semantic hashing |
|
降维 |
dimensionality reduction |
|
对比散度 |
contrastive divergence |
|
语言模型 |
language model |
|
标记 |
token |
|
一元语法 |
unigram |
|
二元语法 |
bigram |
|
三元语法 |
trigram |
|
平滑 |
smoothing |
|
级联 |
cascade |
|
模型 |
model |
|
层 |
layer |
|
半监督学习 |
semi-supervised learning |
|
监督模型 |
supervised model |
|
词嵌入 |
word embedding |
|
one-hot |
one-hot |
|
监督预训练 |
supervised pretraining |
|
迁移学习 |
transfer learning |
|
学习器 |
learner |
|
多任务学习 |
multitask learning |
|
领域自适应 |
domain adaptation |
|
一次学习 |
one-shot learning |
|
零次学习 |
zero-shot learning |
|
零数据学习 |
zero-data learning |
|
多模态学习 |
multimodal learning |
|
生成式对抗网络 |
generative adversarial network |
GAN |
前馈分类器 |
feedforward classifier |
|
线性分类器 |
linear classifier |
|
正相 |
positive phase |
|
负相 |
negative phase |
|
随机最大似然 |
stochastic maximum likelihood |
|
噪声对比估计 |
noise-contrastive estimation |
NCE |
噪声分布 |
noise distribution |
|
噪声 |
noise |
|
独立同分布 |
independent and identically distributed |
|
专用集成电路 |
application-specific integrated circuit |
ASIC |
现场可编程门阵列 |
field-programmable gate array |
FPGA |
标量 |
scalar |
|
向量 |
vector |
|
矩阵 |
matrix |
|
张量 |
tensor |
|
点积 |
dot product |
|
内积 |
inner product |
|
方阵 |
square matrix |
|
奇异的 |
singular |
|
范数 |
norm |
|
三角不等式 |
triangle inequality |
|
欧几里得范数 |
Euclidean norm |
|
最大范数 |
max norm |
|
对角矩阵 |
diagonal matrix |
|
对称 |
symmetric |
|
单位向量 |
unit vector |
|
单位范数 |
unit norm |
|
正交 |
orthogonal |
|
正交矩阵 |
orthogonal matrix |
|
标准正交 |
orthonormal |
|
特征分解 |
eigendecomposition |
|
特征向量 |
eigenvector |
|
特征值 |
eigenvalue |
|
分解 |
decompose |
|
正定 |
positive definite |
|
负定 |
negative definite |
|
半负定 |
negative semidefinite |
|
半正定 |
positive semidefinite |
|
奇异值分解 |
singular value decomposition |
SVD |
奇异值 |
singular value |
|
奇异向量 |
singular vector |
|
单位矩阵 |
identity matrix |
|
矩阵逆 |
matrix inversion |
|
原点 |
origin |
|
线性组合 |
linear combination |
|
列空间 |
column space |
|
值域 |
range |
|
线性相关 |
linear dependency |
|
线性无关 |
linearly independent |
|
列 |
column |
|
行 |
row |
|
同分布的 |
identically distributed |
|
机器翻译 |
machine translation |
|
推荐系统 |
recommender system |
|
词袋 |
bag of words |
|
协同过滤 |
collaborative filtering |
|
探索 |
exploration |
|
策略 |
policy |
|
关系 |
relation |
|
属性 |
attribute |
|
词义消歧 |
word-sense disambiguation |
|
误差度量 |
error metric |
|
性能度量 |
performance metrics |
|
共轭梯度 |
conjugate gradient |
|
在线学习 |
online learning |
|
逐层预训练 |
layer-wise pretraining |
|
自回归网络 |
auto-regressive network |
|
生成器网络 |
generator network |
|
判别器网络 |
discriminator network |
|
矩 |
moment |
|
可见层 |
visible layer |
|
无限 |
infinite |
|
容差 |
tolerance |
|
学习率 |
learning rate |
|
轮数 |
epochs |
|
轮 |
epoch |
|
对数尺度 |
logarithmic scale |
|
随机搜索 |
random search |
|
分段 |
piecewise |
|
汉明距离 |
Hamming distance |
|
可见变量 |
visible variable |
|
精确推断 |
exact inference |
|
潜层 |
latent layer |
|
知识图谱 |
knowledge graph |
|