In this post, we will see how to communicate from the gVisor user space network stack to the internet. We will make an HTTP request to google.com from the user space network stack.

This post builds on my previous gVisor-related posts.

Routing Using Linux Network Stack

Let's briefly look at how a typical TCP request makes its way from the application to the target address.

Linux Network Stack Routing

Protocol Logic in Linux Network Stack

Assume we wrote a Go program that connects to google.com. The Go application is a user space program that calls net.Dial("tcp", address). This call eventually results in a connect() system call, which for a TCP socket reaches the kernel's tcp_connect() function. The TCP handshake logic lives in the Linux network stack, so the kernel performs the handshake, NOT the Go program we are executing. The same applies when the user space program closes the connection: the TCP termination logic is executed by the kernel.
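The point that the kernel, not the application, performs the handshake can be observed with a small sketch. It assumes a Linux host with a working loopback interface; the helper name dialWithoutAccept is made up for this illustration. The listener never calls Accept, yet the dial succeeds, because the kernel completes the three-way handshake and keeps the connection in the listen backlog.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialWithoutAccept opens a listener that never calls Accept and dials it.
// The dial still succeeds: the kernel finishes the TCP handshake on behalf
// of the listening socket and queues the connection in the backlog.
func dialWithoutAccept() (net.Addr, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the kernel pick a free port
	if err != nil {
		return nil, err
	}
	defer ln.Close()

	conn, err := net.DialTimeout("tcp", ln.Addr().String(), 2*time.Second)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	return conn.RemoteAddr(), nil
}

func main() {
	addr, err := dialWithoutAccept()
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	fmt.Println("connected to", addr, "although the server never called Accept")
}
```

Neither side of this program contains any SYN/ACK logic; all of it happens below the connect() and listen() system calls.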

There is an excellent post to understand this TCP handling in Linux in more depth - https://medium.com/@dipakkrdas/tcp-handling-in-linux-cc864f35818b

Routing Using User Space Network Stack

With gVisor, we can move the network-specific logic from the Linux kernel to user space and use the kernel only as a packet forwarder for communication that leaves the host.

Userspace Network Stack Routing

Protocol Logic in Userspace Network Stack

For the complete code, see https://github.com/viveksb007/gvisor-experiment/blob/main/cmd/userspace-tcpclient/main.go

Packets generated by the user space network stack are written to a TUN interface. The code hardcodes the tun0 interface, so some initial setup is needed before the user space network stack can communicate with the internet.

The setup.sh bash script creates a TUN interface for our user space network stack to write packets to. Packets from that TUN interface are not directly routable to the internet, so we have to enable forwarding and add an iptables rule to MASQUERADE the packets going from tun0 out through eth0/ens5 (or whichever interface you use for public routing).
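I have not reproduced setup.sh here, so the following is only a dry-run sketch of the kind of commands such a script typically runs. The function name setupCommands is hypothetical, and the interface names and the 192.168.1.1/24 address are taken from the sample output in this post. The real script may differ, for example by assigning the TUN device to a non-root user.

```go
package main

import "fmt"

// setupCommands returns the commands a setup script of this kind would
// typically run. It only builds strings; nothing is executed.
func setupCommands(tunName, tunCIDR, outIface string) []string {
	return []string{
		// create the TUN device and bring it up with an address
		fmt.Sprintf("ip tuntap add dev %s mode tun", tunName),
		fmt.Sprintf("ip addr add %s dev %s", tunCIDR, tunName),
		fmt.Sprintf("ip link set dev %s up", tunName),
		// let the kernel forward packets between interfaces
		"sysctl -w net.ipv4.ip_forward=1",
		// rewrite the source address of packets leaving via outIface
		fmt.Sprintf("iptables -t nat -A POSTROUTING -o %s -j MASQUERADE", outIface),
	}
}

func main() {
	for _, c := range setupCommands("tun0", "192.168.1.1/24", "eth0") {
		fmt.Println(c)
	}
}
```

The MASQUERADE rule is what makes replies find their way back: the user space stack sends from an address on the tun0 subnet, which nothing outside the host can route to, so the kernel substitutes the address of the outgoing interface.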

Run the setup script: sudo ./cmd/userspace-tcpclient/setup.sh

After running the setup script, you can verify the following:

  • tun0 interface created
  • forwarding is enabled
  • a POSTROUTING MASQUERADE iptables rule is added
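These checks can also be scripted. The sketch below is not part of the repository; the helper names forwardingEnabled and hasMasquerade are made up for illustration. It looks up tun0 with net.InterfaceByName, reads /proc/sys/net/ipv4/ip_forward, and parses iptables-save style text for a MASQUERADE rule. Reading the live iptables rules needs root, so main only parses a sample line.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// forwardingEnabled interprets the contents of /proc/sys/net/ipv4/ip_forward.
func forwardingEnabled(procValue string) bool {
	return strings.TrimSpace(procValue) == "1"
}

// hasMasquerade reports whether iptables-save style output contains a
// MASQUERADE rule appended to the POSTROUTING chain.
func hasMasquerade(rules string) bool {
	for _, line := range strings.Split(rules, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "-A POSTROUTING") && strings.Contains(line, "-j MASQUERADE") {
			return true
		}
	}
	return false
}

func main() {
	if _, err := net.InterfaceByName("tun0"); err != nil {
		fmt.Println("tun0: not found:", err)
	} else {
		fmt.Println("tun0: present")
	}

	if data, err := os.ReadFile("/proc/sys/net/ipv4/ip_forward"); err != nil {
		fmt.Println("ip_forward: cannot read:", err)
	} else {
		fmt.Println("ip_forward enabled:", forwardingEnabled(string(data)))
	}

	// Sample line only; pipe real `sudo iptables-save` output in for a live check.
	sample := "-A POSTROUTING -o eth0 -j MASQUERADE"
	fmt.Println("masquerade rule in sample:", hasMasquerade(sample))
}
```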

Sample outputs from my setup:

viveksb007@lima-ubuntux86-64:/Users/viveksb007/lima/gvisor-experiment$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.15  netmask 255.255.255.0  broadcast 192.168.5.255
        inet6 fe80::5055:55ff:fe00:d929  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:00:d9:29  txqueuelen 1000  (Ethernet)
        RX packets 502962  bytes 597559240 (597.5 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 113463  bytes 28050400 (28.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 114433  bytes 27282629 (27.2 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 114433  bytes 27282629 (27.2 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

tun0: flags=4241<UP,POINTOPOINT,NOARP,MULTICAST>  mtu 1500
        inet 192.168.1.1  netmask 255.255.255.0  destination 192.168.1.1
        inet6 fe80::2a1c:582d:2cfa:c76a  prefixlen 64  scopeid 0x20<link>
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 500  (UNSPEC)
        RX packets 403  bytes 31774 (31.7 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 509  bytes 400021 (400.0 KB)
        TX errors 0  dropped 37 overruns 0  carrier 0  collisions 0

viveksb007@lima-ubuntux86-64:/Users/viveksb007/lima/gvisor-experiment$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1

viveksb007@lima-ubuntux86-64:/Users/viveksb007/lima/gvisor-experiment$ sudo iptables-save
# Generated by iptables-save v1.8.10 (nf_tables) on Thu Nov 21 21:37:09 2024
*nat
:PREROUTING ACCEPT [27:1620]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [990:76515]
:POSTROUTING ACCEPT [201:13943]
:LIMADNS - [0:0]
-A PREROUTING -j LIMADNS
-A OUTPUT -j LIMADNS
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT
# Completed on Thu Nov 21 21:37:09 2024

NOTE - my iptables rules look a bit odd because I am working in Lima on macOS. Explaining exactly what is going on would need a deeper dive; it seems Lima created its own chain (LIMADNS) in the nat table. Some details: https://serverfault.com/questions/859953/custom-chains-iptables-and-predefined
Anyway, for now we can see that the POSTROUTING rule was added.

Running program and capturing packets

Now that all the network related bits are in place, we can run the user space program and verify that the network calls are going out to the internet.

go run cmd/userspace-tcpclient/main.go -domain google.com

I am passing the -domain flag, so the program does the following:

  1. Does a DNS lookup to get the IP corresponding to the given domain
  2. Establishes a TCP connection to the resulting IP from step 1
  3. Makes an HTTP GET request over the TCP connection created in step 2
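The three steps can be sketched with the standard library alone. This is not the code from the repository: it dials through Go's net package (and therefore the kernel stack), and the names dialer, fetchStatus and runLocalDemo are hypothetical. The dialer type marks the place where a dial function backed by a user space stack could be plugged in instead of net.Dial.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
	"os"
)

// dialer abstracts "open a TCP connection"; net.Dial satisfies it, and a
// user space stack could supply its own implementation.
type dialer func(network, addr string) (net.Conn, error)

// fetchStatus covers steps 2 and 3: connect to ip:port, send a GET for
// http://host/ over that connection, and return the response status.
func fetchStatus(dial dialer, host, ip string, port int) (string, error) {
	conn, err := dial("tcp", net.JoinHostPort(ip, fmt.Sprint(port)))
	if err != nil {
		return "", err
	}
	defer conn.Close()

	req, err := http.NewRequest(http.MethodGet, "http://"+host+"/", nil)
	if err != nil {
		return "", err
	}
	if err := req.Write(conn); err != nil { // writes the request in wire format
		return "", err
	}
	resp, err := http.ReadResponse(bufio.NewReader(conn), req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Status, nil
}

// runLocalDemo exercises fetchStatus against a throwaway local server,
// so the flow can be tried without internet access.
func runLocalDemo() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go http.Serve(ln, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	}))
	addr := ln.Addr().(*net.TCPAddr)
	return fetchStatus(net.Dial, "localhost", addr.IP.String(), addr.Port)
}

func main() {
	if len(os.Args) < 2 {
		status, err := runLocalDemo()
		fmt.Println("local demo:", status, err)
		return
	}
	domain := os.Args[1]
	ips, err := net.LookupIP(domain) // step 1: DNS lookup
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, ip := range ips {
		if v4 := ip.To4(); v4 != nil { // use the first IPv4 address
			status, err := fetchStatus(net.Dial, domain, v4.String(), 80)
			fmt.Println(domain, v4, status, err)
			return
		}
	}
	fmt.Println("no IPv4 address found for", domain)
}
```

Run without arguments, the sketch talks to a local test server; run with a domain such as google.com, it performs the DNS lookup and the GET against port 80.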

We also capture packets with tcpdump to verify protocol details such as the TCP handshake performed by the user space network stack.

Running the user space Go program, with tcpdump capturing packets on the tun0 interface:
viveksb007@lima-ubuntux86-64:/Users/viveksb007/lima/gvisor-experiment$ go run cmd/userspace-tcpclient/main.go -domain google.com
I1121 21:47:18.447863   48963 sniffer.go:378] send tcp 192.168.1.2:61026 -> 172.253.62.100:80 len:0 id:7622 flags:  S       seqnum: 3215243184 ack: 0 win: 29184 xsum:0xbc2e options: {MSS:1460 WS:7 TS:true TSVal:3731565394 TSEcr:0 SACKPermitted:false Flags:        }
I1121 21:47:18.467162   48963 sniffer.go:378] recv tcp 172.253.62.100:80 -> 192.168.1.2:61026 len:0 id:f42f flags:  S  A    seqnum: 3303876076 ack: 3215243185 win: 29184 xsum:0x72e2 options: {MSS:1460 WS:7 TS:true TSVal:592919306 TSEcr:3731565394 SACKPermitted:false Flags:        }
I1121 21:47:18.467348   48963 sniffer.go:378] send tcp 192.168.1.2:61026 -> 172.253.62.100:80 len:0 id:7623 flags:     A    seqnum: 3215243185 ack: 3303876077 win: 228 xsum:0xfb7 options: {TS:true TSVal:3731565413 TSEcr:592919306 SACKBlocks:[]}
2024/11/21 21:47:18 {1 192.168.1.2 61026 } <nil>
2024/11/21 21:47:18 {1 172.253.62.100 80 } <nil>
2024/11/21 21:47:18 TCP connection to 172.253.62.100:80 is successful
I1121 21:47:18.467487   48963 sniffer.go:378] send tcp 192.168.1.2:61026 -> 172.253.62.100:80 len:68 id:7624 flags:    PA    seqnum: 3215243185 ack: 3303876077 win: 4096 xsum:0x4646 options: {TS:true TSVal:3731565413 TSEcr:592919306 SACKBlocks:[]}
I1121 21:47:18.467747   48963 sniffer.go:378] recv tcp 172.253.62.100:80 -> 192.168.1.2:61026 len:0 id:0000 flags:     A    seqnum: 3303876077 ack: 3215243253 win: 227 xsum:0xf74 options: {TS:true TSVal:592919306 TSEcr:3731565413 SACKBlocks:[]}
I1121 21:47:18.493076   48963 sniffer.go:378] recv tcp 172.253.62.100:80 -> 192.168.1.2:61026 len:773 id:0000 flags:    PA    seqnum: 3303876077 ack: 3215243253 win: 4096 xsum:0x53 options: {TS:true TSVal:592919332 TSEcr:3731565413 SACKBlocks:[]}
...

-------------


viveksb007@lima-ubuntux86-64:/Users/viveksb007$ sudo tcpdump -vv -nn  -i tun0
tcpdump: listening on tun0, link-type RAW (Raw IP), snapshot length 262144 bytes
21:47:18.447928 IP (tos 0x0, ttl 64, id 30242, offset 0, flags [none], proto TCP (6), length 60)
    192.168.1.2.61026 > 172.253.62.100.80: Flags [S], cksum 0xbc2e (correct), seq 3215243184, win 29184, options [mss 1460,nop,nop,TS val 3731565394 ecr 0,nop,wscale 7], length 0
21:47:18.467054 IP (tos 0x0, ttl 63, id 62511, offset 0, flags [none], proto TCP (6), length 60)
    172.253.62.100.80 > 192.168.1.2.61026: Flags [S.], cksum 0x72e2 (correct), seq 3303876076, ack 3215243185, win 29184, options [mss 1460,nop,nop,TS val 592919306 ecr 3731565394,nop,wscale 7], length 0
21:47:18.467393 IP (tos 0x0, ttl 64, id 30243, offset 0, flags [none], proto TCP (6), length 52)
    192.168.1.2.61026 > 172.253.62.100.80: Flags [.], cksum 0x0fb7 (correct), seq 1, ack 1, win 228, options [nop,nop,TS val 3731565413 ecr 592919306], length 0
21:47:18.467503 IP (tos 0x0, ttl 64, id 30244, offset 0, flags [none], proto TCP (6), length 120)
    192.168.1.2.61026 > 172.253.62.100.80: Flags [P.], cksum 0x4646 (correct), seq 1:69, ack 1, win 4096, options [nop,nop,TS val 3731565413 ecr 592919306], length 68: HTTP, length: 68
	GET / HTTP/1.1
	Host: google.com
	User-Agent: Go-http-client/1.1

21:47:18.467687 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    172.253.62.100.80 > 192.168.1.2.61026: Flags [.], cksum 0x0f74 (correct), seq 1, ack 69, win 227, options [nop,nop,TS val 592919306 ecr 3731565413], length 0
21:47:18.493036 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 825)
    172.253.62.100.80 > 192.168.1.2.61026: Flags [P.], cksum 0x0053 (correct), seq 1:774, ack 69, win 4096, options [nop,nop,TS val 592919332 ecr 3731565413], length 773: HTTP, length: 773
	HTTP/1.1 301 Moved Permanently
	Location: http://www.google.com/
	Content-Type: text/html; charset=UTF-8
	Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-p4gx5v_3lcLcfGMxwDvIQw' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
	Date: Fri, 22 Nov 2024 02:47:18 GMT
	Expires: Sun, 22 Dec 2024 02:47:18 GMT
	Cache-Control: public, max-age=2592000
	Server: gws
	Content-Length: 219
	X-XSS-Protection: 0
	X-Frame-Options: SAMEORIGIN

	<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
	<TITLE>301 Moved</TITLE></HEAD><BODY>
	<H1>301 Moved</H1>
	The document has moved
	<A HREF="http://www.google.com/">here</A>.
	</BODY></HTML>
21:47:18.493234 IP (tos 0x0, ttl 64, id 30245, offset 0, flags [none], proto TCP (6), length 52)
    192.168.1.2.61026 > 172.253.62.100.80: Flags [.], cksum 0xfd24 (correct), seq 69, ack 774, win 4089, options [nop,nop,TS val 3731565439 ecr 592919332], length 0
21:47:18.493346 IP (tos 0x0, ttl 64, id 30246, offset 0, flags [none], proto TCP (6), length 152)
    192.168.1.2.61026 > 172.253.62.100.80: Flags [P.], cksum 0x943b (correct), seq 69:169, ack 774, win 4096, options [nop,nop,TS val 3731565439 ecr 592919332], length 100: HTTP, length: 100
	GET / HTTP/1.1
	Host: www.google.com
	User-Agent: Go-http-client/1.1
	Referer: http://google.com
...
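When comparing the two captures above, note that the sniffer log prints absolute sequence numbers, while tcpdump prints them relative to each side's initial sequence number once the SYN has been seen (unless it is run with -S). The small sketch below, with a helper name relSeq chosen for illustration, converts the values from the sniffer log into the ones tcpdump shows.

```go
package main

import "fmt"

// relSeq converts an absolute TCP sequence number into the relative form
// tcpdump prints by default. uint32 subtraction wraps around, which matches
// TCP's modulo-2^32 sequence arithmetic.
func relSeq(isn, abs uint32) uint32 {
	return abs - isn
}

func main() {
	const clientISN = 3215243184 // seqnum of the SYN sent by the user space stack
	const serverISN = 3303876076 // seqnum of the SYN-ACK sent by the server

	fmt.Println(relSeq(clientISN, 3215243185)) // 1: first byte of the GET request
	fmt.Println(relSeq(clientISN, 3215243253)) // 69: server's ack after the 68-byte request
	fmt.Println(relSeq(serverISN, 3303876077)) // 1: first byte of the HTTP response
}
```

The checksums (for example 0xbc2e on the SYN) and timestamp values are identical in both outputs, which confirms that tcpdump on tun0 sees exactly the segments the user space stack produced.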

References