[Trace Code] gtp5g - ianchen0119/Introduce-to-5GC GitHub Wiki

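gtp5g has been refactored since this article was written, so the number of files and parts of the source code may differ from what is described here. Thank you for your understanding.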

Abbreviations

  • ndo: network device operations
  • PDR (packet detection rule): matching an incoming packet
  • PDI (packet detection information): fields for packet matching
  • FAR (forwarding action rule): packet forwarding
  • BAR (buffering action rule): traffic buffering (used when the radio link is released, during paging, during handover…)
  • QER (QoS enforcement rule): QoS management
  • URR (usage reporting rule): accounting

Basics

sk_buff

https://blog.51cto.com/weiguozhihui/1586777

proc

proc lets us obtain information about a kernel module from user space. For example:

ianchen0119@ubuntu:~$ ls /proc/gtp5g/
dbg  far  pdr  qer

This shows that there are four proc entries under /proc/gtp5g/. Next, we can read dbg to see the runtime state of gtp5g:

ianchen0119@ubuntu:~$ cat /proc/gtp5g/dbg
gtp5g kerenl debug level range: 0~4
         0 -> Logging
         1 -> Error(default)
         2 -> Warning
         3 -> Information
         4 -> Trace
Current: 1
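The same information can be read programmatically. The following is a minimal user-space sketch, assuming the output format shown above (a line of the form "Current: N"). The function names parse_dbg_level() and read_dbg_level() are hypothetical and not part of gtp5g.

```c
#include <stdio.h>
#include <string.h>

/* Extract N from the "Current: N" line of the text printed by
 * /proc/gtp5g/dbg. Returns -1 if no such line is present. */
int parse_dbg_level(const char *text)
{
    const char *p = strstr(text, "Current:");
    int level;

    if (!p || sscanf(p, "Current: %d", &level) != 1)
        return -1;
    return level;
}

/* Read /proc/gtp5g/dbg and return the current level, or -1 on error
 * (for example, when the gtp5g module is not loaded). */
int read_dbg_level(void)
{
    char buf[512];
    size_t n;
    FILE *f = fopen("/proc/gtp5g/dbg", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_dbg_level(buf);
}
```

Splitting the parser from the file access keeps the parsing logic testable on a machine where the module is not loaded.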

generic netlink

Rtnetlink

Rtnetlink allows the kernel's routing tables to be read and altered. It is used within the kernel to communicate between various subsystems, though this usage is not documented here, and for communication with user-space programs. Network routes, IP addresses, link parameters, neighbor setups, queueing disciplines, traffic classes and packet classifiers may all be controlled through NETLINK_ROUTE sockets. It is based on netlink messages; see netlink(7) for more information. -- linux man page

pernet

There is a way to get notified by the network core when a new network namespace is created or destroyed. For example, as a device driver developer or some other kernel code developer your module wants to get notified by the network core when a new network namespace created or destroyed. -- what is net_generic function in linux include/net/net_namespace.h?

Tracing the source code!

gtp5g uses late_initcall() to register gtp5g_init(), which does the following:

1. Registers with Rtnetlink, which is used to modify the Linux kernel's routing table.

2. Registers the generic netlink family

static struct genl_family gtp5g_genl_family __ro_after_init = {
    .name       = "gtp5g",
    .version    = 0,
    .hdrsize    = 0,
    .maxattr    = GTP5G_ATTR_MAX,
    .netnsok    = true,
    .module     = THIS_MODULE,
    .ops        = gtp5g_genl_ops,
    .n_ops      = ARRAY_SIZE(gtp5g_genl_ops),
};

The ops array, gtp5g_genl_ops, contains the get, add, and delete operations for PDR, FAR, and QER.

3. Calls register_pernet_subsys to register the pernet (network namespace) operations

4. Creates the proc entries

    proc_gtp5g = proc_mkdir("gtp5g", NULL);
    if (!proc_gtp5g) {
        GTP5G_ERR(NULL, "Failed to create /proc/gtp5g\n");
        goto unreg_pernet;
    }

    proc_gtp5g_dbg = proc_create("dbg", (S_IFREG | S_IRUGO | S_IWUGO),
        proc_gtp5g, &proc_gtp5g_dbg_ops);
    if (!proc_gtp5g_dbg) {
        GTP5G_ERR(NULL, "Failed to create /proc/gtp5g/dbg\n");
        goto remove_gtp5g_proc;
    }

    proc_gtp5g_pdr = proc_create("pdr", (S_IFREG | S_IRUGO | S_IWUGO),
        proc_gtp5g, &proc_gtp5g_pdr_ops);
    if (!proc_gtp5g_pdr) {
        GTP5G_ERR(NULL, "Failed to create /proc/gtp5g/pdr\n");
        goto remove_dbg_proc;
    }

    proc_gtp5g_far = proc_create("far", (S_IFREG | S_IRUGO | S_IWUGO),
        proc_gtp5g, &proc_gtp5g_far_ops);
    if (!proc_gtp5g_far) {
        GTP5G_ERR(NULL, "Failed to create /proc/gtp5g/far\n");
        goto remove_pdr_proc;
    }

    proc_gtp5g_qer = proc_create("qer", (S_IFREG | S_IRUGO | S_IWUGO),
        proc_gtp5g, &proc_gtp5g_qer_ops);
    if (!proc_gtp5g_qer) {
        GTP5G_ERR(NULL, "Failed to create /proc/gtp5g/qer\n");
        goto remove_far_proc;
    }
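Each failure above jumps to a label that undoes only the steps that already succeeded. The excerpt does not show the labels themselves, so the following is a user-space sketch of that goto-based unwinding pattern under assumed names: create(), note(), and init_proc_entries() are hypothetical stand-ins, and the real module calls the kernel's proc API instead.

```c
#include <string.h>

#define LOG_MAX 128
static char unwind_log[LOG_MAX];

/* Append "<sign><name> " to the log, if it still fits. */
static void note(char sign, const char *name)
{
    size_t len = strlen(unwind_log);

    if (len + strlen(name) + 3 <= LOG_MAX) {
        unwind_log[len] = sign;
        unwind_log[len + 1] = '\0';
        strcat(unwind_log, name);
        strcat(unwind_log, " ");
    }
}

/* Pretend to create entry `name`; fail when idx == fail_at. */
static int create(const char *name, int idx, int fail_at)
{
    if (idx == fail_at)
        return -1;
    note('+', name);
    return 0;
}

/* Create five entries in order. On failure, remove the ones already
 * created in reverse order. Returns the log of what happened. */
const char *init_proc_entries(int fail_at)
{
    unwind_log[0] = '\0';

    if (create("gtp5g", 0, fail_at))
        goto out;
    if (create("dbg", 1, fail_at))
        goto remove_gtp5g;
    if (create("pdr", 2, fail_at))
        goto remove_dbg;
    if (create("far", 3, fail_at))
        goto remove_pdr;
    if (create("qer", 4, fail_at))
        goto remove_far;
    return unwind_log;

remove_far:
    note('-', "far");
remove_pdr:
    note('-', "pdr");
remove_dbg:
    note('-', "dbg");
remove_gtp5g:
    note('-', "gtp5g");
out:
    return unwind_log;
}
```

Because the labels fall through into one another, a failure at any step releases exactly the resources acquired before it, in reverse order.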

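After reading gtp5g_init(), we can see that gtp5g is made up of three components:

  • gtp5g_netdev_ops, which relates to the network device
  • gtp5g_link_ops, which relates to rtnetlink
  • gtp5g_genl_ops, which relates to genl

All other structured data and member functions are used by the components above. Unless you need them for development or debugging, you can skip them for now.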

1. network device

gtp5g_netdev_ops shows the relevant hooks and functions:

static const struct net_device_ops gtp5g_netdev_ops = {
    .ndo_init           = gtp5g_dev_init,
    .ndo_uninit         = gtp5g_dev_uninit,
    .ndo_start_xmit     = gtp5g_dev_xmit,
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 11, 0)
    .ndo_get_stats64    = dev_get_tstats64,
#else
    .ndo_get_stats64    = ip_tunnel_get_stats64,
#endif
};

The definition of each hook can be found in the Linux kernel source code:

.ndo_init

This function is called once when a network device is registered. The network device can use this for any late stage initialization or semantic validation. It can fail with an error code which will be propagated back to register_netdev.

static int gtp5g_dev_init(struct net_device *dev)
{
    /* netdev_priv() returns the net device's private data */
    struct gtp5g_dev *gtp = netdev_priv(dev);

    gtp->dev = dev;

    dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
    if (!dev->tstats) {
        GTP5G_ERR(dev, "Failled to allocate stats\n");
        return -ENOMEM;
    }

    return 0;
}

gtp5g_dev is a structure defined by gtp5g itself:

struct gtp5g_dev {
    struct list_head        list;

    struct sock             *sk1u;

    struct net_device       *dev;

    unsigned int            role;

    unsigned int            hash_size;
    struct hlist_head       *pdr_id_hash;
    struct hlist_head       *far_id_hash;
    struct hlist_head       *qer_id_hash;

    struct hlist_head       *i_teid_hash;      // Used for GTP-U packet detect
    struct hlist_head       *addr_hash;        // Used for IPv4 packet detect

    /* IEs list related to PDR */
    struct hlist_head       *related_far_hash;     // PDR list waiting the FAR to handle
    struct hlist_head       *related_qer_hash;     // PDR list waiting the QER to handle

    /* Used by proc interface */
    struct list_head        proc_list;
};

A gtp5g_dev can be viewed as the head of a linked list. It records:

  • the corresponding network device
  • the corresponding socket
  • the related PDRs, FARs, and QERs (kept in hash tables)
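To make the role of fields such as hash_size and pdr_id_hash more concrete, here is a simplified user-space sketch of an ID-keyed hash table with chained buckets. It is only a model: the kernel code uses struct hlist_head and its own hash function, and the names dev_sketch, pdr_node, pdr_add(), and pdr_find(), as well as the modulo bucket choice, are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One rule entry, chained with others that fall in the same bucket. */
struct pdr_node {
    uint16_t id;
    struct pdr_node *next;
};

/* Simplified stand-in for the hash-related fields of gtp5g_dev. */
struct dev_sketch {
    unsigned int hash_size;
    struct pdr_node **pdr_id_hash;  /* array of hash_size bucket heads */
};

/* Assumed bucket selection: plain modulo on the rule ID. */
static unsigned int bucket_of(const struct dev_sketch *d, uint16_t id)
{
    return id % d->hash_size;
}

/* Insert a node at the front of its bucket. */
void pdr_add(struct dev_sketch *d, struct pdr_node *n)
{
    unsigned int b = bucket_of(d, n->id);

    n->next = d->pdr_id_hash[b];
    d->pdr_id_hash[b] = n;
}

/* Walk one bucket and return the node with a matching ID, or NULL. */
struct pdr_node *pdr_find(const struct dev_sketch *d, uint16_t id)
{
    struct pdr_node *n;

    for (n = d->pdr_id_hash[bucket_of(d, id)]; n; n = n->next)
        if (n->id == id)
            return n;
    return NULL;
}
```

A lookup only walks one bucket, which is why the device keeps separate tables keyed by what is being matched (rule ID, TEID, or address).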

.ndo_uninit

This function is called when device is unregistered or when registration fails. It is not called if init fails.

static void gtp5g_dev_uninit(struct net_device *dev)
{
    struct gtp5g_dev *gtp = netdev_priv(dev);

    gtp5g_encap_disable(gtp);
    free_percpu(dev->tstats);
}

This releases the socket recorded in gtp5g_dev (through gtp5g_encap_disable()) and frees the per-CPU statistics.

.ndo_start_xmit

Called when a packet needs to be transmitted. Returns NETDEV_TX_OK. Can return NETDEV_TX_BUSY, but you should stop the queue before that can happen; it's for obsolete devices and weird corner cases, but the stack really does a non-trivial amount of useless work if you return NETDEV_TX_BUSY. Required; cannot be NULL.

static netdev_tx_t gtp5g_dev_xmit(struct sk_buff *skb, struct net_device *dev)
{
    unsigned int proto = ntohs(skb->protocol);
    struct gtp5g_pktinfo pktinfo;
    int ret = 0;

    /* Ensure there is sufficient headroom */
    if (skb_cow_head(skb, dev->needed_headroom)) {
        goto tx_err;
    }

    skb_reset_inner_headers(skb);

    /* PDR lookups in gtp5g_build_skb_*() need rcu read-side lock. 
     * */
    rcu_read_lock();
    switch (proto) {
    case ETH_P_IP:
        ret = gtp5g_handle_skb_ipv4(skb, dev, &pktinfo);
        break;
    default:
        ret = -EOPNOTSUPP;
    }
    rcu_read_unlock();

    if (ret < 0)
        goto tx_err;

    if (ret == FAR_ACTION_FORW)
        gtp5g_xmit_skb_ipv4(skb, &pktinfo);

    return NETDEV_TX_OK;

tx_err:
    dev->stats.tx_errors++;
    dev_kfree_skb(skb);
    return NETDEV_TX_OK;
}
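  • skb_cow_head(skb, dev->needed_headroom) makes sure the sk_buff has at least needed_headroom bytes of writable headroom, reallocating the head if necessary. If that fails, there is no room for the extra headers, so the packet is dropped at tx_err.
  • skb_reset_inner_headers() records the packet's current header offsets as the inner headers, in preparation for encapsulation.
  • gtp5g_handle_skb_ipv4() looks up the PDR by destination IP. If one is found, it then checks the FAR to decide how to handle the packet: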
far = pdr->far;
    if (far) {
        // One and only one of the DROP, FORW and BUFF flags shall be set to 1.
        // The NOCP flag may only be set if the BUFF flag is set.
        // The DUPL flag may be set with any of the DROP, FORW, BUFF and NOCP flags.
        switch (far->action & FAR_ACTION_MASK) {
        case FAR_ACTION_DROP:
            return gtp5g_drop_skb_ipv4(skb, dev, pdr);
        case FAR_ACTION_FORW:
            return gtp5g_fwd_skb_ipv4(skb, dev, pktinfo, pdr);
        case FAR_ACTION_BUFF:
            /* gtp5g_buf_skb_ipv4() sends the packet to the UPF through a socket */
            return gtp5g_buf_skb_ipv4(skb, dev, pdr);
        default:
            GTP5G_ERR(dev, "Unspec apply action(%u) in FAR(%u) and related to PDR(%u)",
                far->action, far->id, pdr->id);
        }
    }
  • If the action is FAR_ACTION_FORW, gtp5g_xmit_skb_ipv4() is called:
static void gtp5g_xmit_skb_ipv4(struct sk_buff *skb, struct gtp5g_pktinfo *pktinfo)
{
    //GTP5G_ERR(pktinfo->dev, "gtp -> IP src: %pI4 dst: %pI4\n",
    //           &pktinfo->iph->saddr, &pktinfo->iph->daddr);
    udp_tunnel_xmit_skb(pktinfo->rt, 
        pktinfo->sk,
        skb,
        pktinfo->fl4.saddr,
        pktinfo->fl4.daddr,
        pktinfo->iph->tos,
        ip4_dst_hoplimit(&pktinfo->rt->dst),
        0,
        pktinfo->gtph_port, 
        pktinfo->gtph_port,
        true, 
        true);
}

With that, the packet has been forwarded!
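The three rules quoted in the comments of the FAR switch statement (exactly one of DROP, FORW, BUFF; NOCP only together with BUFF; DUPL with anything) can be checked with a few bit operations. The sketch below is illustrative only: the SK_* constants are redefined locally, their bit values are assumed from the Apply Action layout in 3GPP TS 29.244, and far_action_is_valid() is a hypothetical helper that does not appear in gtp5g.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values; names mirror FAR_ACTION_* from the excerpt. */
#define SK_FAR_ACTION_DROP 0x01
#define SK_FAR_ACTION_FORW 0x02
#define SK_FAR_ACTION_BUFF 0x04
#define SK_FAR_ACTION_NOCP 0x08
#define SK_FAR_ACTION_DUPL 0x10
#define SK_FAR_ACTION_MASK 0x07  /* DROP | FORW | BUFF */

/* Return the part of the action that the switch statement dispatches on. */
uint8_t far_base_action(uint8_t action)
{
    return action & SK_FAR_ACTION_MASK;
}

/* Check the flag combination rules quoted in the source comments. */
bool far_action_is_valid(uint8_t action)
{
    uint8_t base = far_base_action(action);

    /* Exactly one of DROP/FORW/BUFF: non-zero and a power of two. */
    if (base == 0 || (base & (base - 1)) != 0)
        return false;

    /* NOCP is only allowed together with BUFF. */
    if ((action & SK_FAR_ACTION_NOCP) && !(action & SK_FAR_ACTION_BUFF))
        return false;

    /* DUPL may accompany any otherwise valid combination. */
    return true;
}
```

Masking first is what lets FORW combined with DUPL still reach the FORW case in the switch.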

.ndo_get_stats64

Called when a user wants to get the network device usage statistics. Drivers must do one of the following:

  1. Define @ndo_get_stats64 to fill in a zero-initialised rtnl_link_stats64 structure passed by the caller.
  2. Define @ndo_get_stats to update a net_device_stats structure (which should normally be dev->stats) and return a pointer to it. The structure may be changed asynchronously only if each field is written atomically.
  3. Update dev->stats asynchronously and atomically, and define neither operation.

2. rtnetlink

The rtnl-related hooks and functions:

static struct rtnl_link_ops gtp5g_link_ops __read_mostly = {
    .kind         = "gtp5g",
    .maxtype      = IFLA_GTP5G_MAX,
    .policy       = gtp5g_policy,
    .priv_size    = sizeof(struct gtp5g_dev),
    .setup        = gtp5g_link_setup,
    .validate     = gtp5g_validate,
    .newlink      = gtp5g_newlink,
    .dellink      = gtp5g_dellink,
    .get_size     = gtp5g_get_size,
    .fill_info    = gtp5g_fill_info,
};

The definition of each hook can be found in the Linux kernel source code:

.kind

Identifier. For gtp5g, the identifier is gtp5g.

.maxtype

Highest device specific netlink attribute number.

.policy

Netlink policy for device specific attribute validation.

.priv_size

sizeof net_device private space.

.setup

net_device setup function.

In gtp5g, the net device is configured during rtnl setup:

static void gtp5g_link_setup(struct net_device *dev)
{
    dev->netdev_ops = &gtp5g_netdev_ops;
    dev->needs_free_netdev = true;

    dev->hard_header_len = 0;
    dev->addr_len = 0;
    dev->mtu = ETH_DATA_LEN -
        (sizeof(struct iphdr) +
         sizeof(struct udphdr) +
         sizeof(struct gtpv1_hdr));

    /* Zero header length. */
    dev->type = ARPHRD_NONE;
    dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;

    dev->priv_flags |= IFF_NO_QUEUE;
    dev->features |= NETIF_F_LLTX;
    netif_keep_dst(dev);

    /* TODO: Modify the headroom size based on
     * what are the extension header going to support
     * */
    dev->needed_headroom = LL_MAX_HEADER +
        sizeof(struct iphdr) +
        sizeof(struct udphdr) +
        sizeof(struct gtpv1_hdr) + 
        sizeof(struct gtp1_hdr_opt) +
        sizeof(struct gtp1_hdr_ext_pdu_sess_ctr);
}
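The MTU expression in gtp5g_link_setup() subtracts the outer IPv4, UDP, and GTP headers from the Ethernet payload size. The following user-space sketch reproduces that arithmetic, assuming an IPv4 header without options (20 bytes), an 8-byte UDP header, and an 8-byte mandatory GTPv1-U header. The SK_* constants, struct gtpv1_hdr_sketch, and gtp5g_sketch_mtu() are hypothetical stand-ins for the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_DATA_LEN 1500  /* same value as the kernel's ETH_DATA_LEN */
#define SK_IPV4_HDR_LEN 20    /* IPv4 header without options */
#define SK_UDP_HDR_LEN  8     /* UDP header */

/* Mandatory part of a GTPv1-U header; assumed to match struct gtpv1_hdr. */
struct gtpv1_hdr_sketch {
    uint8_t  flags;
    uint8_t  type;
    uint16_t length;
    uint32_t tid;
};

/* Payload bytes left for the inner packet after the tunnel headers. */
size_t gtp5g_sketch_mtu(void)
{
    return SK_ETH_DATA_LEN -
        (SK_IPV4_HDR_LEN +
         SK_UDP_HDR_LEN +
         sizeof(struct gtpv1_hdr_sketch));
}
```

Under these assumptions the result is 1500 - (20 + 8 + 8) = 1464 bytes. Note that needed_headroom additionally reserves space for the optional and extension headers, which the MTU expression does not account for.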

The most basic way to write a network device driver is to allocate a network device, assign it the hook functions in a struct net_device_ops, and finally register the device with the kernel, after which the kernel can use it.

.validate

Optional validation function for netlink/changelink parameters.

.newlink

Function for configuring and registering a new device.

static int gtp5g_newlink(struct net *src_net, struct net_device *dev,
    struct nlattr *tb[], struct nlattr *data[],
    struct netlink_ext_ack *extack)
{
    struct gtp5g_dev *gtp;
    struct gtp5g_net *gn;
    int hashsize, err;

    if (!data[IFLA_GTP5G_FD1]) {
        GTP5G_ERR(NULL, "Failed to create a new link\n");
        return -EINVAL;
    }

    gtp = netdev_priv(dev);

    err = gtp5g_encap_enable(gtp, data);
    if (err < 0) {
        GTP5G_ERR(dev, "Failed to enable the encap rcv\n");
        return err;
    }

    if (!data[IFLA_GTP5G_PDR_HASHSIZE])
        hashsize = 1024;
    else
        hashsize = nla_get_u32(data[IFLA_GTP5G_PDR_HASHSIZE]);

    err = gtp5g_hashtable_new(gtp, hashsize);
    if (err < 0) {
        GTP5G_ERR(dev, "Failed to create a hash table\n");
        goto out_encap;
    }

    err = register_netdevice(dev);
    if (err < 0) {
        GTP5G_ERR(dev, "Failed to register new netdev err(%d)\n", err);
        goto out_hashtable;
    }

    gn = net_generic(dev_net(dev), gtp5g_net_id);
    list_add_rcu(&gtp->list, &gn->gtp5g_dev_list);
    list_add_rcu(&gtp->proc_list, &proc_gtp5g_dev);

    GTP5G_LOG(dev, "Registered a new 5G GTP interface\n");
    return 0;
out_hashtable:
    gtp5g_hashtable_free(gtp);
out_encap:
    gtp5g_encap_disable(gtp);
    return err;
}

.dellink

Function to remove a device.

.get_size

Function to calculate required room for dumping device specific netlink attributes

.fill_info

Function to dump device specific netlink attributes.

3. genl

The following operations were registered together with the genl_family in gtp5g_init():

static const struct genl_ops gtp5g_genl_ops[] = {
    {
        .cmd = GTP5G_CMD_ADD_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_add_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_DEL_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_del_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_GET_PDR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_get_pdr,
        .dumpit = gtp5g_genl_dump_pdr,
        // .policy = gtp5g_genl_pdr_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_ADD_FAR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_add_far,
        // .policy = gtp5g_genl_far_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_DEL_FAR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_del_far,
        // .policy = gtp5g_genl_far_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_GET_FAR,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_get_far,
        .dumpit = gtp5g_genl_dump_far,
        // .policy = gtp5g_genl_far_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_ADD_QER,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_add_qer,
        // .policy = gtp5g_genl_qer_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_DEL_QER,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_del_qer,
        // .policy = gtp5g_genl_qer_policy,
        .flags = GENL_ADMIN_PERM,
    },
    {
        .cmd = GTP5G_CMD_GET_QER,
        // .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
        .doit = gtp5g_genl_get_qer,
        .dumpit = gtp5g_genl_dump_qer,
        // .policy = gtp5g_genl_qer_policy,
        .flags = GENL_ADMIN_PERM,
    },

};

The function mapped to each cmd deals with packet rules. The general flow is:

  • Call gtp5g_find_dev(sock_net(skb->sk), info->attrs) to find the corresponding gtp5g_dev.
  • If it is found, take the data out of the genl info and store it in the right place.
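The flow above (match the command, find the device, then apply the rule data) can be modeled in user space with a small dispatch table. This is only a sketch: in the kernel, the generic netlink core matches the command and each doit handler looks up the device itself, while here both steps are folded into one hypothetical function, sketch_dispatch(). All sketch_* and SK_* names are assumptions.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_cmd { SK_CMD_ADD_PDR = 1, SK_CMD_DEL_PDR };

/* Stand-in for gtp5g_dev: just counts installed PDRs. */
struct sketch_dev {
    int pdr_count;
};

typedef int (*sketch_doit_fn)(struct sketch_dev *dev);

struct sketch_op {
    enum sketch_cmd cmd;
    sketch_doit_fn doit;
};

static int sketch_add_pdr(struct sketch_dev *dev)
{
    dev->pdr_count++;
    return 0;
}

static int sketch_del_pdr(struct sketch_dev *dev)
{
    if (dev->pdr_count == 0)
        return -ENOENT;  /* nothing to delete */
    dev->pdr_count--;
    return 0;
}

static const struct sketch_op sketch_ops[] = {
    { SK_CMD_ADD_PDR, sketch_add_pdr },
    { SK_CMD_DEL_PDR, sketch_del_pdr },
};

/* dev == NULL models "device not found"; unknown cmds are rejected. */
int sketch_dispatch(struct sketch_dev *dev, int cmd)
{
    size_t i;

    if (!dev)
        return -ENODEV;
    for (i = 0; i < sizeof(sketch_ops) / sizeof(sketch_ops[0]); i++)
        if ((int)sketch_ops[i].cmd == cmd)
            return sketch_ops[i].doit(dev);
    return -EOPNOTSUPP;
}
```

Keeping the handlers in a table keyed by command mirrors how gtp5g_genl_ops pairs each .cmd with its .doit callback.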

4. pernet (network namespace)

https://blog.csdn.net/sidemap/article/details/102880341

The module initializer mentioned earlier (gtp5g_init) calls register_pernet_subsys(), and Linux pernet lets us reserve some space to use as private data:

static struct pernet_operations gtp5g_net_ops = {
    .init    = gtp5g_net_init,
    .exit    = gtp5g_net_exit,
    .id      = &gtp5g_net_id,
    .size    = sizeof(struct gtp5g_net),
};
  • id is used later to look up the private data.
  • The size of the private data is sizeof(struct gtp5g_net).

gtp5g_net can be viewed as the list head that holds gtp5g_dev_list:

struct gtp5g_net {
    struct list_head gtp5g_dev_list;
};

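With pernet and private data in mind, we can go back to the gtp5g source code and see that, whenever a gtp5g device needs to be looked up, the code first calls struct gtp5g_net *gn = net_generic(net, gtp5g_net_id); to get a pointer to that namespace's gtp5g_net.

Related resources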
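The idea behind the id and net_generic() can be modeled in user space as one slot per registered subsystem in every namespace. The sketch below is a simplified model under stated assumptions, not the kernel's implementation: net_sketch, sketch_register_pernet(), sketch_net_attach(), and sketch_net_generic() are hypothetical names, and the real kernel allocates the private area itself using the .size field.

```c
#include <stddef.h>

#define SK_MAX_SUBSYS 8

/* Model of a network namespace: one private-data slot per subsystem. */
struct net_sketch {
    void *gen[SK_MAX_SUBSYS];
};

/* Model of the private data that gtp5g keeps per namespace. */
struct gtp5g_net_sketch {
    int dev_count;
};

static unsigned int sketch_next_id;

/* Model of registration: hand out the next free slot index as the id. */
int sketch_register_pernet(unsigned int *id)
{
    if (sketch_next_id >= SK_MAX_SUBSYS)
        return -1;
    *id = sketch_next_id++;
    return 0;
}

/* Model of the per-namespace init step: attach private data to a slot. */
int sketch_net_attach(struct net_sketch *net, unsigned int id, void *data)
{
    if (id >= SK_MAX_SUBSYS)
        return -1;
    net->gen[id] = data;
    return 0;
}

/* Model of net_generic(): fetch this subsystem's data for one namespace. */
void *sketch_net_generic(const struct net_sketch *net, unsigned int id)
{
    if (id >= SK_MAX_SUBSYS)
        return NULL;
    return net->gen[id];
}
```

The same id yields different private data in different namespaces, which is what keeps one namespace's gtp5g device list separate from another's.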