Linux namespace background - songbingyu/docket_study GitHub Wiki
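Introduction to Linux namespaces
Linux namespaces are a lightweight form of virtualization. The operating system already virtualizes memory and CPU, so that each process believes it has the memory and CPU to itself. Other resources, such as storage, disks, and signals, are not isolated by the operating system in the same way.
Namespaces isolate these resources as well, so that a process sees only its own view of them. Combined with cgroups, this is enough to build a lightweight virtual machine, which helps raise host resource utilization. If virtual machines such as KVM exist mainly for isolation, container technology is aimed more at sharing.
Types of Linux namespaces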
-
mnt
Mount namespaces control mount points. Upon creation, the mounts from the current mount namespace are copied to the new namespace, but mount points created afterwards do not propagate between namespaces (using shared subtrees, it is possible to propagate mount points between namespaces[2]). The mount namespace was the first kind to be introduced, at a time when nobody had thought of other namespaces, which is why its clone flag is CLONE_NEWNS.
-
pid
Assigns each process a new PID and allows a separate init process inside the namespace. A process in a PID namespace also has a PID in the parent namespace, so processes in the parent can see it. PID namespaces can be nested, and they aid process migration between hosts.
-
net
Isolates the whole network stack. Network namespaces cannot be nested; each netns is attached to a userns.
-
ipc
System V IPC identifiers and the POSIX message queue filesystem.
-
uts
Hostname and domain name.
-
user
UIDs and GIDs. Permissions for namespaces of the other kinds are checked in the user namespace in which they were created.
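The six kinds listed above each correspond to one clone flag. Only CLONE_NEWNS is named in the text; the other five constants are the standard ones from glibc's <sched.h>. The lookup function `ns_clone_flag` is a hypothetical helper written for this page, not a kernel or libc API. A minimal sketch:

```c
#define _GNU_SOURCE   /* exposes the CLONE_NEW* constants in <sched.h> on glibc */
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: map a namespace kind name, as used in the list
 * above, to the clone flag that requests a new namespace of that kind.
 * Returns 0 for an unknown kind.
 */
int ns_clone_flag(const char *kind)
{
    static const struct {
        const char *kind;
        int flag;
    } table[] = {
        { "mnt",  CLONE_NEWNS   },  /* named NEWNS because it came first */
        { "pid",  CLONE_NEWPID  },
        { "net",  CLONE_NEWNET  },
        { "ipc",  CLONE_NEWIPC  },
        { "uts",  CLONE_NEWUTS  },
        { "user", CLONE_NEWUSER },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(kind, table[i].kind) == 0)
            return table[i].flag;
    }
    return 0;
}
```

Flags for several kinds can be combined with bitwise OR, for example `ns_clone_flag("uts") | ns_clone_flag("ipc")`, before being handed to a namespace-creating system call.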
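Implementation of Linux namespaces
Reference: 深入理解linux内核 (Understanding the Linux Kernel)
As shown in the figure:
Each task_struct holds a pointer to an nsproxy structure: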
struct task_struct {
    ...
    /* namespaces */
    struct nsproxy *nsproxy;
    ...
};
nsproxy
nsproxy.h
/*
 * A structure to contain pointers to all per-process
 * namespaces - fs (mount), uts, network, sysvipc, etc.
 *
 * 'count' is the number of tasks holding a reference.
 * The count for each namespace, then, will be the number
 * of nsproxies pointing to it, not the number of tasks.
 *
 * The nsproxy is shared by tasks which share all namespaces.
 * As soon as a single namespace is cloned or unshared, the
 * nsproxy is copied.
 */
struct nsproxy {
    atomic_t count;
    struct uts_namespace *uts_ns;
    struct ipc_namespace *ipc_ns;
    struct mnt_namespace *mnt_ns;
    struct pid_namespace *pid_ns;
    struct net *net_ns;
};
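The reference-counting rules in the comment above can be illustrated with a small user-space model. This is not kernel code: every type and function name below (`ns_model`, `nsproxy_model`, `task_model`, `task_fork`, `task_unshare_uts`, and so on) is hypothetical, only two namespace kinds are modeled, and plain integers stand in for atomic_t. It is a sketch of the sharing and copy-on-unshare behavior, assuming that reading of the comment:

```c
#include <stdlib.h>

/* Stand-in for one namespace; refs = number of nsproxies pointing to it. */
struct ns_model { int refs; };

/* Stand-in for struct nsproxy; count = number of tasks holding a reference. */
struct nsproxy_model {
    int count;
    struct ns_model *uts_ns;
    struct ns_model *ipc_ns;
};

struct task_model { struct nsproxy_model *nsproxy; };

struct ns_model *ns_new(void)
{
    struct ns_model *ns = malloc(sizeof *ns);
    if (ns)
        ns->refs = 1;
    return ns;
}

void ns_put(struct ns_model *ns)
{
    if (--ns->refs == 0)
        free(ns);
}

/* Drop one task reference; the last one releases the namespaces too. */
void nsproxy_put(struct nsproxy_model *p)
{
    if (--p->count > 0)
        return;
    ns_put(p->uts_ns);
    ns_put(p->ipc_ns);
    free(p);
}

/* First task: a fresh nsproxy pointing at fresh namespaces. */
int task_init(struct task_model *t)
{
    struct nsproxy_model *p = malloc(sizeof *p);
    if (!p)
        return -1;
    p->uts_ns = ns_new();
    p->ipc_ns = ns_new();
    if (!p->uts_ns || !p->ipc_ns) {
        free(p->uts_ns);
        free(p->ipc_ns);
        free(p);
        return -1;
    }
    p->count = 1;
    t->nsproxy = p;
    return 0;
}

/* Fork-like step: tasks that share all namespaces share one nsproxy. */
void task_fork(struct task_model *child, const struct task_model *parent)
{
    child->nsproxy = parent->nsproxy;
    child->nsproxy->count++;
}

/* Unshare-like step: copy the nsproxy, replacing only the UTS namespace. */
int task_unshare_uts(struct task_model *t)
{
    struct nsproxy_model *old = t->nsproxy;
    struct nsproxy_model *copy = malloc(sizeof *copy);
    if (!copy)
        return -1;
    copy->uts_ns = ns_new();
    if (!copy->uts_ns) {
        free(copy);
        return -1;
    }
    copy->ipc_ns = old->ipc_ns;   /* still shared with the old nsproxy */
    copy->ipc_ns->refs++;         /* one more nsproxy points to it */
    copy->count = 1;
    t->nsproxy = copy;
    nsproxy_put(old);
    return 0;
}

void task_exit(struct task_model *t)
{
    nsproxy_put(t->nsproxy);
    t->nsproxy = NULL;
}
```

After `task_init(&a)`, `task_fork(&b, &a)` and `task_unshare_uts(&b)`, the two tasks hold different nsproxy objects with a count of 1 each, while the shared IPC namespace has a reference count of 2: it counts nsproxies, not tasks.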
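Namespace-related functions
There are three main system calls:
clone() – the system call used to implement threads. It creates a new process, and passing the namespace flags mentioned above (such as CLONE_NEWNS) places that process in new namespaces.
unshare() – detaches the calling process from a namespace it currently shares and moves it into a new one.
setns() – joins the calling process to an existing namespace.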
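As an illustration of unshare(), the sketch below moves a forked child into a new UTS namespace and changes the hostname there, leaving the parent's hostname untouched. The function name `demo_unshare_uts` and its return convention are hypothetical. It assumes Linux with glibc; unshare(CLONE_NEWUTS) normally needs CAP_SYS_ADMIN, so an unprivileged run is expected to take the "refused" path.

```c
#define _GNU_SOURCE   /* for unshare() and CLONE_NEWUTS */
#include <sched.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo. Runs in a forked child so that the caller's own
 * namespaces are never modified.
 * Returns  0 if the child set and read back new_name in a new UTS namespace,
 *          1 if unshare() was refused (typically missing CAP_SYS_ADMIN),
 *         -1 on any other failure.
 */
int demo_unshare_uts(const char *new_name)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char buf[256];

        if (unshare(CLONE_NEWUTS) != 0)
            _exit(1);                      /* could not leave the old namespace */
        if (sethostname(new_name, strlen(new_name)) != 0)
            _exit(2);
        if (gethostname(buf, sizeof buf) != 0)
            _exit(2);
        buf[sizeof buf - 1] = '\0';
        _exit(strcmp(buf, new_name) == 0 ? 0 : 2);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    switch (WEXITSTATUS(status)) {
    case 0:
        return 0;
    case 1:
        return 1;
    default:
        return -1;
    }
}
```

Calling `demo_unshare_uts("ns-demo")` and then `gethostname()` in the caller should show the original hostname in every case, since the change is confined to the child's namespace. Using clone() with CLONE_NEWUTS instead would create the child directly inside the new namespace, and setns() would join a namespace that already exists.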