iptables

Keywords: iptables network namespaces

A network packet, originating from a local process or arriving on a network interface, traverses the network stack. iptables is a user-space "client application" for specifying actions on packets that match some criterion (destination port, source IP, etc.). These actions can be: accept the packet, drop the packet, change its destination IP or port, and so on.

Terminology

A rule specifies an action (target) to be performed on packets matching some criteria.

A chain is a grouping of rules under a name. There are predefined chains:

  • INPUT: rules applied before sending the packet to a process
  • OUTPUT: rules applied when the packet leaves a process
  • PREROUTING: rules applied when a packet arrives on an interface
  • POSTROUTING: rules applied when a packet is about to be sent from an interface
  • FORWARD: rules applied when the packet is routed to a different network.

There are predefined tables that can be seen as a namespace to group chains:

  • filter: decides whether the packet continues or not. This is the default table. Contains the INPUT (for packets destined to local sockets), OUTPUT (for locally generated packets), and FORWARD (for packets being routed) chains.
  • nat: performs network address translation. Contains PREROUTING (to alter packets as soon as they come in), INPUT (on current kernels, to alter packets destined to local sockets), OUTPUT (to alter packets produced by a local process) and POSTROUTING (to alter packets that are about to go out).
  • mangle: specialized packet alteration. Contains all the predefined chains.
  • raw: processes packets before connection tracking takes place. It is used mainly for configuring exemptions from connection tracking in combination with the NOTRACK target. Contains the chains PREROUTING (for packets arriving on any interface) and OUTPUT (for packets generated by local processes).

Table    Chains
filter   INPUT, FORWARD, OUTPUT
nat      PREROUTING, INPUT, OUTPUT, POSTROUTING
mangle   PREROUTING, INPUT, OUTPUT, FORWARD, POSTROUTING
raw      PREROUTING, OUTPUT
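
The table/chain mapping can be checked on a live system. The following is a minimal sketch, assuming iptables is installed and sudo is available; the function name builtin_chains is hypothetical. It relies on the fact that `iptables -S` prints the policy of each built-in chain as a line starting with `-P`.

```shell
# Print the built-in chains of one table (hypothetical helper).
builtin_chains() {
    local table="$1"
    # -S prints the rules in iptables-save syntax; built-in chains
    # appear as policy lines such as "-P INPUT ACCEPT".
    sudo iptables -t "$table" -S | awk '$1 == "-P" { print $2 }'
}
```

For example, `for t in filter nat mangle raw; do echo "$t: $(builtin_chains "$t" | tr '\n' ' ')"; done` prints one line per table.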

Some sample targets:

  • ACCEPT: the packet continues its way
  • DROP: the packet simply disappears.
  • REJECT: like DROP, but the host that sent the packet is notified.
  • LOG: log the packet (it is a non-terminating target, so rule traversal continues)
  • DNAT: change destination IP of the packet
  • SNAT: change source IP of the packet
  • TOS: change type of service (usually routers do not make decisions based on this, but iproute2 can use it).
  • TTL: change time to live of the packet
  • MARK: set a mark on the packet. iproute2, bandwidth limiting, or Class Based Queuing can then act on this mark.

When a packet goes to a local process:

Table    Chain         Comment
                       The packet is on the wire
                       The packet comes to an interface
raw      PREROUTING    Handles the packet before connection tracking is executed
                       Connection tracking code is executed
mangle   PREROUTING    Modify TOS, TTL, etc.
nat      PREROUTING    This chain is used mainly for DNAT
                       Routing decision: the packet is for this host or should be forwarded (in this case it is for a local process)
mangle   INPUT         Alter the packet before sending it to the process
filter   INPUT         Filtering
                       The packet is delivered to a process

When the packet is generated by a local process:

Table    Chain         Comment
                       The packet is generated by a process
                       Routing decision: what source address, interface,…
raw      OUTPUT        Handles the packet before connection tracking is executed
                       Connection tracking code for local packets is executed
mangle   OUTPUT        Modify TOS, TTL, etc.
nat      OUTPUT        NAT outgoing packets
filter   OUTPUT        Filter outgoing packets
                       Routing decision, as mangle or nat can change the packet
mangle   POSTROUTING   Change packets before leaving
nat      POSTROUTING   To perform SNAT
                       The packet goes out on some interface
                       The packet is on the wire

Finally, if the packet is destined for another host on another network (forwarded packets):

Table    Chain         Comment
                       The packet is on the wire
                       The packet comes in on an interface
raw      PREROUTING
                       Connection tracking for incoming packets
mangle   PREROUTING    Mangle packets (change TOS, TTL, etc.)
nat      PREROUTING    Used mainly for DNAT
                       Routing decision: the packet must be forwarded
mangle   FORWARD       Mangle packets after routing decision
filter   FORWARD
mangle   POSTROUTING   Mangle packets before leaving
nat      POSTROUTING   Used for SNAT
                       The packet goes out on an interface
                       The packet is on the wire

Source of the tables: https://www.frozentux.net/iptables-tutorial/iptables-tutorial.html
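
The three traversal tables above can be condensed into a small lookup helper. This is only an illustrative sketch: the function name traversal_order and its arguments (input, output, forward) are hypothetical, and it encodes nothing beyond the order listed in the tables.

```shell
# Print the table/chain pairs a packet traverses, in order (hypothetical helper).
traversal_order() {
    case "$1" in
        input)    # packet destined to a local process
            printf '%s\n' "raw PREROUTING" "mangle PREROUTING" "nat PREROUTING" \
                          "mangle INPUT" "filter INPUT" ;;
        output)   # packet generated by a local process
            printf '%s\n' "raw OUTPUT" "mangle OUTPUT" "nat OUTPUT" "filter OUTPUT" \
                          "mangle POSTROUTING" "nat POSTROUTING" ;;
        forward)  # packet routed to another host
            printf '%s\n' "raw PREROUTING" "mangle PREROUTING" "nat PREROUTING" \
                          "mangle FORWARD" "filter FORWARD" \
                          "mangle POSTROUTING" "nat POSTROUTING" ;;
        *)
            echo "usage: traversal_order input|output|forward" >&2
            return 1 ;;
    esac
}
```

Running `traversal_order forward` lists the seven hooks of the forwarding path, which is handy when comparing against the log output later in this document.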

Sample iptables commands

# List all chains and rules in mangle
sudo iptables -t mangle -L -n

# List all chains and rules in nat table and show line numbers
sudo iptables -t nat -L -n --line-numbers

# Set default policy: if no rule matches, drop the packet
sudo iptables -P INPUT DROP

# Create a new chain in the nat table (DNAT is only valid there), add a rule,
# and jump to it from the PREROUTING chain of the nat table
sudo iptables -t nat -N my_chain
sudo iptables -t nat -A my_chain -d 10.0.0.10 -j DNAT --to-destination 10.20.0.10
sudo iptables -t nat -A PREROUTING -j my_chain

# Delete all rules in chain
sudo iptables -t nat -F my_chain

# Insert a rule in position 2
sudo iptables -I INPUT 2 -s 10.0.0.3 -j DROP
# Replace rule in position 2
sudo iptables -R INPUT 2 -s 10.0.0.4 -j DROP
# Delete rule in position 2
sudo iptables -D INPUT 2

# Specify interfaces:
sudo iptables -A INPUT -i eth0 -j ACCEPT
sudo iptables -A OUTPUT -o eth1 -d 10.0.0.3 -j DROP

# Use modules to perform more complex operations:
sudo iptables -A INPUT -p tcp -m tcp --dport 22 -s 10.0.0.4 -j DROP
sudo iptables -A INPUT -p tcp -m multiport --dports 20,5001 -s 10.0.0.4 -j DROP
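
Appending the same rule twice creates a duplicate. One way to avoid this is to test for the rule with `-C` (check) before appending it. The sketch below assumes sudo and iptables are available; the helper name ensure_rule is hypothetical.

```shell
# Append a rule only if it is not already present (hypothetical helper).
# Usage: ensure_rule [-t table] CHAIN rule-spec...
ensure_rule() {
    local table=filter
    if [ "$1" = "-t" ]; then
        table="$2"
        shift 2
    fi
    local chain="$1"
    shift
    # -C exits with status 0 when a matching rule exists in the chain.
    if sudo iptables -t "$table" -C "$chain" "$@" 2>/dev/null; then
        echo "already present"
    else
        sudo iptables -t "$table" -A "$chain" "$@" && echo "added"
    fi
}
```

For example, `ensure_rule INPUT -s 10.0.0.3 -j DROP` can be run repeatedly without growing the chain.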

Test environment and examples

Setup

We are going to create the following setup:

network-namespaces-vnics.png

# Create network namespaces
sudo ip netns add ns1
sudo ip netns add ns2
sudo ip netns add ns3

# Create veth pairs and assign to namespaces
sudo ip link add veth1_ns1 netns ns1 type veth peer name veth1_ns2 netns ns2
sudo ip link add veth2_ns2 netns ns2 type veth peer name veth2_ns3 netns ns3

# Set up all the interfaces
sudo ip netns exec ns1 ip link set veth1_ns1 up
sudo ip netns exec ns2 ip link set veth1_ns2 up
sudo ip netns exec ns2 ip link set veth2_ns2 up
sudo ip netns exec ns3 ip link set veth2_ns3 up

# Set address to interfaces in ns1 and ns3
sudo ip netns exec ns1 ip addr add 10.1.0.10/32 dev veth1_ns1
sudo ip netns exec ns3 ip addr add 10.2.0.10/32 dev veth2_ns3

# Set address to interfaces in ns2
sudo ip netns exec ns2 ip addr add 10.1.0.1/32 dev veth1_ns2
sudo ip netns exec ns2 ip addr add 10.2.0.1/32 dev veth2_ns2

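# Set routes (onlink: with /32 addresses the gateway is not covered by a
# connected route, so the kernel must be told it is reachable on the link)
sudo ip netns exec ns1 ip route add default via 10.1.0.1 dev veth1_ns1 onlink
sudo ip netns exec ns2 ip route add 10.1.0.0/24 dev veth1_ns2
sudo ip netns exec ns2 ip route add 10.2.0.0/24 dev veth2_ns2
sudo ip netns exec ns3 ip route add default via 10.2.0.1 dev veth2_ns3 onlink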

# We can ping all interfaces from all namespaces:
# sudo ip netns exec ns1 ping 10.1.0.1
# sudo ip netns exec ns2 ping 10.1.0.10
# sudo ip netns exec ns3 ping 10.2.0.1
# sudo ip netns exec ns2 ping 10.2.0.10
# sudo ip netns exec ns1 ping 10.2.0.10
# sudo ip netns exec ns3 ping 10.1.0.10
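
The pings between ns1 and ns3 only work if ns2 forwards packets. Whether IPv4 forwarding is enabled in a new namespace depends on the host configuration, so if those pings fail it may need to be switched on explicitly. A minimal sketch, assuming sudo is available (the helper name enable_forwarding is hypothetical):

```shell
# Enable IPv4 forwarding inside a network namespace (hypothetical helper).
enable_forwarding() {
    local ns="$1"
    # ip_forward is a per-namespace setting, so it is set from inside the namespace.
    sudo ip netns exec "$ns" sysctl -w net.ipv4.ip_forward=1
}
```

In this setup only the middle namespace routes traffic, so the call would be `enable_forwarding ns2`.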

To enable iptables logging from inside network namespaces (the write requires root, hence tee under sudo):

echo 1 | sudo tee /proc/sys/net/netfilter/nf_log_all_netns

Create iptables rules to log in all tables and chains:

function log_iptables_ns(){
    local ns="$1"

    echo "Configuring iptables in $ns"

    # Flush every table so that only the LOG rules below remain
    sudo ip netns exec "$ns" iptables -t filter -F
    sudo ip netns exec "$ns" iptables -t nat -F
    sudo ip netns exec "$ns" iptables -t mangle -F
    sudo ip netns exec "$ns" iptables -t raw -F

    # One LOG rule per chain; the prefix identifies namespace, table and chain
    sudo ip netns exec "$ns" iptables -t filter -A INPUT -j LOG --log-prefix "IPTABLES_${ns}_FILTER_INPUT "
    sudo ip netns exec "$ns" iptables -t filter -A FORWARD -j LOG --log-prefix "IPTABLES_${ns}_FILTER_FORWARD "
    sudo ip netns exec "$ns" iptables -t filter -A OUTPUT -j LOG --log-prefix "IPTABLES_${ns}_FILTER_OUTPUT "

    sudo ip netns exec "$ns" iptables -t nat -A PREROUTING -j LOG --log-prefix "IPTABLES_${ns}_NAT_PREROUTING "
    sudo ip netns exec "$ns" iptables -t nat -A INPUT -j LOG --log-prefix "IPTABLES_${ns}_NAT_INPUT "
    sudo ip netns exec "$ns" iptables -t nat -A OUTPUT -j LOG --log-prefix "IPTABLES_${ns}_NAT_OUTPUT "
    sudo ip netns exec "$ns" iptables -t nat -A POSTROUTING -j LOG --log-prefix "IPTABLES_${ns}_NAT_POSTROUTING "

    sudo ip netns exec "$ns" iptables -t mangle -A PREROUTING -j LOG --log-prefix "IPTABLES_${ns}_MANGLE_PREROUTING "
    sudo ip netns exec "$ns" iptables -t mangle -A INPUT -j LOG --log-prefix "IPTABLES_${ns}_MANGLE_INPUT "
    sudo ip netns exec "$ns" iptables -t mangle -A FORWARD -j LOG --log-prefix "IPTABLES_${ns}_MANGLE_FORWARD "
    sudo ip netns exec "$ns" iptables -t mangle -A OUTPUT -j LOG --log-prefix "IPTABLES_${ns}_MANGLE_OUTPUT "
    sudo ip netns exec "$ns" iptables -t mangle -A POSTROUTING -j LOG --log-prefix "IPTABLES_${ns}_MANGLE_POSTROUTING "

    sudo ip netns exec "$ns" iptables -t raw -A PREROUTING -j LOG --log-prefix "IPTABLES_${ns}_RAW_PREROUTING "
    sudo ip netns exec "$ns" iptables -t raw -A OUTPUT -j LOG --log-prefix "IPTABLES_${ns}_RAW_OUTPUT "
}

log_iptables_ns ns1
log_iptables_ns ns2
log_iptables_ns ns3
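
The LOG rules write to the kernel log, where each entry is long. To follow a packet more easily, the entries can be reduced to namespace, table, chain, and addresses. The filter below is a sketch that assumes the prefix format `IPTABLES_<ns>_<TABLE>_<CHAIN> ` used above; the function name summarize_iptables_log is hypothetical.

```shell
# Reduce iptables LOG entries read from stdin to
# "<ns> <TABLE> <CHAIN> <src> -> <dst>" (hypothetical helper).
summarize_iptables_log() {
    # Lines without the IPTABLES_ prefix are dropped (-n plus the p flag).
    sed -n -E 's/.*IPTABLES_([^_ ]+)_([A-Z]+)_([A-Z]+) .*SRC=([^ ]+) DST=([^ ]+).*/\1 \2 \3 \4 -> \5/p'
}
```

A possible invocation is `sudo dmesg | summarize_iptables_log`. An entry logged by the raw OUTPUT rule of ns3 for a packet from 10.2.0.10 to 169.254.169.254 is printed as `ns3 RAW OUTPUT 10.2.0.10 -> 169.254.169.254`.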

Analysis: perform DNAT in ns2 to change destination IP

We are going to change the destination IP of packets addressed to 169.254.169.254 (the link-local address that cloud-init queries to obtain information about the instance):

sudo ip netns exec ns2 iptables -t nat -A PREROUTING -d 169.254.169.254 -j DNAT --to-destination 10.1.0.10

From ns3 we ping that IP (the packet can leave ns3 because we set 10.2.0.1 as the default gateway). After adding the rule, the ping gets a response:

sudo ip netns exec ns3 ping -c1 169.254.169.254
# Output:
PING 169.254.169.254 (169.254.169.254) 56(84) bytes of data.
64 bytes from 169.254.169.254: icmp_seq=1 ttl=63 time=0.171 ms

--- 169.254.169.254 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.171/0.171/0.171/0.000 ms
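
To confirm that the DNAT rule actually matched, the rule counters and the connection tracking table of ns2 can be inspected. This is a sketch under two assumptions: sudo is available, and the conntrack command-line tool (from conntrack-tools) is installed. The function name show_dnat_state is hypothetical.

```shell
# Show the nat PREROUTING counters and tracked ICMP flows of a namespace
# (hypothetical helper).
show_dnat_state() {
    local ns="$1"
    # -v adds per-rule packet and byte counters to the listing.
    sudo ip netns exec "$ns" iptables -t nat -L PREROUTING -n -v --line-numbers
    # Lists tracked ICMP flows; assumes the conntrack tool is installed.
    sudo ip netns exec "$ns" conntrack -L -p icmp
}
```

After the ping, `show_dnat_state ns2` should show a non-zero packet counter on the DNAT rule.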

In this case we expect DNAT followed by forwarding. The annotated iptables log output follows.

The packet is produced in a process (ping) in namespace ns3:

[28115.222927] IPTABLES_ns3_RAW_OUTPUT IN= OUT=veth2_ns3 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.222933] IPTABLES_ns3_MANGLE_OUTPUT IN= OUT=veth2_ns3 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.222939] IPTABLES_ns3_FILTER_OUTPUT IN= OUT=veth2_ns3 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

[28115.222942] IPTABLES_ns3_MANGLE_POSTROUTING IN= OUT=veth2_ns3 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

ping-output-process-ns3.png

The packet comes to namespace ns2:

[28115.222959] IPTABLES_ns2_RAW_PREROUTING IN=veth2_ns2 OUT= MAC=1a:55:54:40:c2:b4:ea:c4:17:01:10:5c:08:00 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.222971] IPTABLES_ns2_MANGLE_PREROUTING IN=veth2_ns2 OUT= MAC=1a:55:54:40:c2:b4:ea:c4:17:01:10:5c:08:00 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.222977] IPTABLES_ns2_NAT_PREROUTING IN=veth2_ns2 OUT= MAC=1a:55:54:40:c2:b4:ea:c4:17:01:10:5c:08:00 SRC=10.2.0.10 DST=169.254.169.254 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

[28115.222992] IPTABLES_ns2_MANGLE_FORWARD IN=veth2_ns2 OUT=veth1_ns2 MAC=1a:55:54:40:c2:b4:ea:c4:17:01:10:5c:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.222997] IPTABLES_ns2_FILTER_FORWARD IN=veth2_ns2 OUT=veth1_ns2 MAC=1a:55:54:40:c2:b4:ea:c4:17:01:10:5c:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

[28115.223000] IPTABLES_ns2_MANGLE_POSTROUTING IN= OUT=veth1_ns2 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.223003] IPTABLES_ns2_NAT_POSTROUTING IN= OUT=veth1_ns2 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

We can see that the packet reaches the FORWARD chain of the mangle table with the destination IP already changed to 10.1.0.10 and the TTL decremented from 64 to 63.

forward-ns2-1.png

The packet arrives in namespace ns1 and is delivered locally, where the kernel answers the echo request with an echo reply (TYPE=0):

[28115.223016] IPTABLES_ns1_RAW_PREROUTING IN=veth1_ns1 OUT= MAC=ce:7b:cd:cb:83:40:de:57:dc:c6:0b:b6:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.223021] IPTABLES_ns1_MANGLE_PREROUTING IN=veth1_ns1 OUT= MAC=ce:7b:cd:cb:83:40:de:57:dc:c6:0b:b6:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

[28115.223030] IPTABLES_ns1_MANGLE_INPUT IN=veth1_ns1 OUT= MAC=ce:7b:cd:cb:83:40:de:57:dc:c6:0b:b6:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1
[28115.223034] IPTABLES_ns1_FILTER_INPUT IN=veth1_ns1 OUT= MAC=ce:7b:cd:cb:83:40:de:57:dc:c6:0b:b6:08:00 SRC=10.2.0.10 DST=10.1.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=8335 DF PROTO=ICMP TYPE=8 CODE=0 ID=39135 SEQ=1

[28115.223045] IPTABLES_ns1_RAW_OUTPUT IN= OUT=veth1_ns1 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223048] IPTABLES_ns1_MANGLE_OUTPUT IN= OUT=veth1_ns1 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223050] IPTABLES_ns1_FILTER_OUTPUT IN= OUT=veth1_ns1 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

[28115.223053] IPTABLES_ns1_MANGLE_POSTROUTING IN= OUT=veth1_ns1 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

forward-ns1.png

The reply arrives in namespace ns2 and is forwarded:

[28115.223060] IPTABLES_ns2_RAW_PREROUTING IN=veth1_ns2 OUT= MAC=de:57:dc:c6:0b:b6:ce:7b:cd:cb:83:40:08:00 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223065] IPTABLES_ns2_MANGLE_PREROUTING IN=veth1_ns2 OUT= MAC=de:57:dc:c6:0b:b6:ce:7b:cd:cb:83:40:08:00 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

[28115.223069] IPTABLES_ns2_MANGLE_FORWARD IN=veth1_ns2 OUT=veth2_ns2 MAC=de:57:dc:c6:0b:b6:ce:7b:cd:cb:83:40:08:00 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223073] IPTABLES_ns2_FILTER_FORWARD IN=veth1_ns2 OUT=veth2_ns2 MAC=de:57:dc:c6:0b:b6:ce:7b:cd:cb:83:40:08:00 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

[28115.223076] IPTABLES_ns2_MANGLE_POSTROUTING IN= OUT=veth2_ns2 SRC=10.1.0.10 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

forward-ns2-2.png

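Finally, the reply comes back to namespace ns3. Its source address is 169.254.169.254 again: connection tracking in ns2 reversed the DNAT after the mangle POSTROUTING entry was logged, so ns3 sees the reply coming from the address it pinged: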

[28115.223083] IPTABLES_ns3_RAW_PREROUTING IN=veth2_ns3 OUT= MAC=ea:c4:17:01:10:5c:1a:55:54:40:c2:b4:08:00 SRC=169.254.169.254 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223087] IPTABLES_ns3_MANGLE_PREROUTING IN=veth2_ns3 OUT= MAC=ea:c4:17:01:10:5c:1a:55:54:40:c2:b4:08:00 SRC=169.254.169.254 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

[28115.223093] IPTABLES_ns3_MANGLE_INPUT IN=veth2_ns3 OUT= MAC=ea:c4:17:01:10:5c:1a:55:54:40:c2:b4:08:00 SRC=169.254.169.254 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1
[28115.223098] IPTABLES_ns3_FILTER_INPUT IN=veth2_ns3 OUT= MAC=ea:c4:17:01:10:5c:1a:55:54:40:c2:b4:08:00 SRC=169.254.169.254 DST=10.2.0.10 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=35512 PROTO=ICMP TYPE=0 CODE=0 ID=39135 SEQ=1

response-ns3.png
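
Once the experiment is finished, the test environment can be removed. Deleting a network namespace also destroys the veth interfaces inside it, together with their peers. A minimal sketch, assuming sudo is available (the helper name teardown_lab is hypothetical):

```shell
# Delete the given network namespaces (hypothetical helper).
teardown_lab() {
    local ns
    for ns in "$@"; do
        # Removing the namespace also removes its veth endpoints.
        sudo ip netns delete "$ns"
    done
}
```

For the setup used here the call would be `teardown_lab ns1 ns2 ns3`.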

Date: 21/02/2021

Author: Juan Gutiérrez Aguado
