How to realize load balance and configure it - Trojan-Plus-Group/trojan-plus GitHub Wiki

Abstract

Load balancing can increase the available bandwidth. This page looks at how it is implemented and how it works.

There is already an older way to get load balancing: the reuse-port feature of sockets. On Linux-like systems, different processes can bind the same port. If you run two trojan processes on the same port in nat/socks mode with the reuse_port option enabled, the kernel dispatches incoming client connections between them randomly. If the two trojan processes connect to different servers, you get the simplest form of bandwidth load balancing.

trojan_config.json
{
    "run_type": "client",
    "local_addr": "0.0.0.0",
    "local_port": 2062,
    ...
    "tcp": {
        ...
        "reuse_port": true,
        ...
    }
}
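The kernel behaviour this setup relies on can be tried outside trojan. The following is a minimal Python sketch, not trojan code: it assumes a Linux-like system where `socket.SO_REUSEPORT` is available, and the helper name `bind_reuse_port` is made up for illustration. It binds two listening sockets to the same port, which is roughly what two trojan processes with `reuse_port` enabled would end up doing.

```python
import socket


def bind_reuse_port(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Open a listening TCP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


# Port 0 asks the kernel for any free port; the second socket then reuses it.
first = bind_reuse_port(0)
port = first.getsockname()[1]
second = bind_reuse_port(port)  # would raise OSError without SO_REUSEPORT

print(first.getsockname()[1] == second.getsockname()[1])
```

Both sockets listen on the same port, and the kernel decides which one receives each new connection.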

However, the kernel's random dispatch does not know it is being used for load balancing to speed up data transfer; it simply dispatches at random. As a result, bandwidth can be unstable: sometimes several sockets of a multi-connection download all end up on the same host, and the bandwidth is then no better than with a single host.

So if the software itself dispatches connections in a deterministic way, the load-balancing effect should be stable.
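To get a feel for how often random dispatch wastes the second host, here is a small worked calculation in Python. It assumes each connection independently lands on any host with equal probability, which is a simplification of what the kernel really does; the function name `same_host_probability` is made up for this example.

```python
from fractions import Fraction


def same_host_probability(connections: int, hosts: int) -> Fraction:
    """Probability that uniform random dispatch puts every connection on one host."""
    # Any single host receives all connections with probability (1/hosts)^connections,
    # and there are `hosts` such hosts.
    return hosts * Fraction(1, hosts) ** connections


print(same_host_probability(2, 2))  # 1/2
print(same_host_probability(4, 2))  # 1/8
```

Under this assumption, a download that opens two connections gets no benefit from the second host half of the time.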

Realization and Config

Google Draw Image

As the diagram above shows, this load-balance feature depends on the new pipeline mode, so you need to enable pipeline mode and add the paths of the other config files:

trojan_config.json
{
    "run_type": "client",
    "local_addr": "0.0.0.0",
    "local_port": 2062,
    ...
    "experimental":{
        "pipeline_num" : 5,
        "pipeline_loadbalance_configs" : [
            "/etc/trojan/client_config1.json"
        ]
    }
}

Each of the other config files must be able to run on its own, and it can use an ssl config different from the main config. The password, however, is the same as in the main config, so all balanced servers must accept the same password.
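Before starting trojan it can help to check that every listed file exists and parses. The following Python sketch is not part of trojan: the function name `load_balance_configs` is hypothetical, and treating `pipeline_num` greater than 0 as "pipeline mode enabled" is my assumption. Only the keys `experimental`, `pipeline_num` and `pipeline_loadbalance_configs` come from the config above.

```python
import json
from pathlib import Path


def load_balance_configs(main_config_path: str) -> dict:
    """Return {path: parsed JSON} for every file in pipeline_loadbalance_configs."""
    main = json.loads(Path(main_config_path).read_text())
    experimental = main.get("experimental", {})
    # Assumption: pipeline mode counts as enabled when pipeline_num > 0.
    if experimental.get("pipeline_num", 0) <= 0:
        raise ValueError("pipeline mode is not enabled (pipeline_num must be > 0)")
    loaded = {}
    for path in experimental.get("pipeline_loadbalance_configs", []):
        # Raises FileNotFoundError or json.JSONDecodeError for a bad entry.
        loaded[path] = json.loads(Path(path).read_text())
    return loaded
```

Call it with the path of your main config file; it only checks that the files are readable JSON, not that trojan will accept their contents.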

Limitation

I have only implemented the simplest balancing strategy, sequential polling (round-robin): if you have 10 trojan pipelines, 5 connected to one host and 5 to the other, client requests are dispatched to the pipelines one by one in turn.
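Sequential polling can be sketched in a few lines of Python. This is an illustration of the idea, not the trojan-plus code: the host names are hypothetical, and the order in which the pipelines are arranged is my assumption.

```python
from collections import Counter
from itertools import cycle

# Hypothetical hosts; 10 pipelines, 5 connected to each host.
pipelines = ["host-a"] * 5 + ["host-b"] * 5


def dispatch(requests: int, pipelines: list) -> list:
    """Assign each request to the next pipeline in turn, wrapping around."""
    next_pipeline = cycle(pipelines)
    return [next(next_pipeline) for _ in range(requests)]


assignments = dispatch(20, pipelines)
print(Counter(assignments))  # Counter({'host-a': 10, 'host-b': 10})
```

Over a full cycle every host gets the same number of requests. With the ordering assumed here, a short burst of five requests would still all go to one host, so the arrangement of pipelines matters for small bursts.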

In other words, I have not used any complex algorithm to optimize the load balancing. A sophisticated load-balancing system would consider each node's availability, latency and bandwidth to select the best host for each client. Perhaps someone will continue this work.
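Just to show what such a selection might look like, here is a purely hypothetical Python sketch. Nothing like this exists in trojan-plus; the `HostStats` type, the `pick_host` function and the rule "lowest latency among available hosts, ties broken by higher bandwidth" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostStats:
    name: str
    available: bool
    latency_ms: float
    bandwidth_mbps: float


def pick_host(hosts: list) -> Optional[HostStats]:
    """Pick the available host with the lowest latency; break ties by higher bandwidth."""
    candidates = [h for h in hosts if h.available]
    if not candidates:
        return None
    return min(candidates, key=lambda h: (h.latency_ms, -h.bandwidth_mbps))
```

A real implementation would also have to measure these values continuously, which is where most of the complexity lies.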

Select Server Hosts

If you have several hosts running trojan servers, choose ones that are located near each other, even though they may show different traceroute paths from your client location:

Please DO NOT select server hosts in different locations:

Why don't I recommend using hosts in different locations? Because of DNS. If a DNS query goes through a proxy host in Singapore, a CDN such as Cloudflare returns an IP address close to that proxy host, that is, near Singapore. If you then connect to that IP address via a host in Los Angeles, the result is terrible.

You will get a high-latency connection. Of course, a stronger load-balancing algorithm could dispatch connections so that this situation never happens, but it is too complex for me to find the time to develop.