
Quo vadis, Xtables2?

Here are some notes on what is currently being worked on and what goals remain to be reached. This list is by no means exhaustive, but should give a picture of where I (want to) see Xtables2 heading.

Pending/In Progress

Submission (submitted to mailing list)

Xtables2 uses a strategy of gradual improvement and there have been a number of postings labeled Xtables2 over the years. The most recent submission is a patchset of approximately 35 patches to introduce a Netlink-based command transport channel and data structure manipulation functions. (Table/Chain/Rule manipulation functions are a must-have, because traditionally, iptables took complete care of building the ruleset in userspace, there being no designated way to edit the ruleset in the kernel.)

This latest submission has been accompanied by some debate, on which LWN.net editor Jake Edge has written an article that I would like to share here. (Please do consider supporting the work of LWN by subscribing.) The unfiltered mailing list conversation can be accessed at various archives: spinics.net, marc.info or gmane.org, whichever interface you like better.

Xtables/CLI library (requested by a packet filter user; submitted to mailing list)

An often-expressed desire is to embed (link) iptables (or whatever displaces it) and to use the well-known text representation to access and manipulate the ruleset. With the libxtadm library released in the xtadm tarball, exactly that will become possible. No dealing with binary blobs, guaranteed.

By the way, if you like to follow on Freecode, the projects are [1] and [2].

Plans

Roughly sorted in order of current priority.

Ruleset change notification (requested by a packet filter user)

Users would like to be notified of ruleset changes. To quote two requests [1], [2]: “My use case would be for a routing daemon to detect when nat was present on an interface, so as to not advertise invalid routes.” and “Our use case is to control an external dataplane.”

Make more xt_*.c modules NFPROTO_UNSPEC aware

Unhappy with the required rule duplication between iptables and ip6tables? Xtables2 is completely family-agnostic, so as to do away with this duplication, with the ultimate goal being that things like “-m tcp” work in either case. A handful of extensions (such as xt_tcpudp), however, do not yet provide a “NFPROTO_UNSPEC” entrypoint.

Enforce a cycle-free ruleset (iptables had this feature already)

N.B.: Graph theory defines two types: loops and cycles.

As everybody likely knows, iptables allows users to define their own chains (so-called “user-defined chains”, this technical term popularized by ebtables), and to call into, and return from, these chains. Up to and including Linux 2.6.34, iptables stored the return pointers which are required for the call-return semantics in the ruleset itself, that is, the execution logic in the ipt_do_table function used dedicated reserved spots in the ruleset (struct ipt_entry.comefrom). Allowing cycles/loops in the ruleset would have meant that this one field would be overwritten when a chain is re-entered, leading to problems during return. Therefore, the iptables kernel part (ip_tables.ko) simply rejects cycles and loops in the ruleset.

Starting with Linux 2.6.35, a dedicated jump stack that is independent from the ruleset has been introduced, to support the xt_TEE extension which does cause a cycle. When the jump stack is full, the packet is simply dropped. However, ip_tables.ko continues to check for non-TEE cycles/loops, and I think that is a good idea because it keeps rulesets simpler, for the sake of supporters in forums, IRC and other media.
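The call-return behaviour with a separate jump stack can be illustrated with a small sketch. This is a toy model in C and not the kernel's code: `run_ruleset`, `struct frame`, `JUMPSTACK_SIZE` and the rule kinds are hypothetical names, and the base chain's policy is assumed to be "accept". It only shows the idea that return positions live outside the ruleset and that a full stack leads to a drop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
enum verdict { V_ACCEPT, V_DROP };
enum rule_kind { R_JUMP, R_RETURN, R_ACCEPT, R_DROP };

struct rule {
    enum rule_kind kind;
    size_t target;              /* chain index to jump to (R_JUMP only) */
};

struct chain {
    const struct rule *rules;
    size_t nrules;
};

#define JUMPSTACK_SIZE 4        /* assumed depth, chosen for illustration */

struct frame {
    size_t chain;               /* chain to return into */
    size_t next;                /* rule index to resume at */
};

enum verdict run_ruleset(const struct chain *chains, size_t start)
{
    struct frame stack[JUMPSTACK_SIZE];   /* lives outside the ruleset */
    size_t depth = 0, cur = start, pos = 0;

    assert(chains != NULL);
    for (;;) {
        const struct rule *r = NULL;

        if (pos < chains[cur].nrules)
            r = &chains[cur].rules[pos];

        if (r == NULL || r->kind == R_RETURN) {
            /* End of chain or explicit return: pop the return position. */
            if (depth == 0)
                return V_ACCEPT;          /* assumed base chain policy */
            --depth;
            cur = stack[depth].chain;
            pos = stack[depth].next;
        } else if (r->kind == R_JUMP) {
            /* Stack full (e.g. because of a cycle): drop the packet. */
            if (depth == JUMPSTACK_SIZE)
                return V_DROP;
            stack[depth].chain = cur;
            stack[depth].next = pos + 1;
            ++depth;
            cur = r->target;
            pos = 0;
        } else {
            return r->kind == R_DROP ? V_DROP : V_ACCEPT;
        }
    }
}
```

Because the return positions are kept in `stack` rather than in the rules themselves, re-entering a chain cannot overwrite an earlier return pointer; a runaway cycle merely exhausts the stack.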

Xtables2, at the kernel level, rejects loops, but does not attempt to discover cycles. Given that loops and cycles no longer strictly matter for execution, cycle detection can occur in userspace instead. This is a trade-off between two extremes: (1) doing the full verification in the kernel, and (2) leaving it to userspace, bearing in mind that cycle detection requires a full view of the ruleset, which means that userspace has to retrieve the entire table from the kernel.
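As an illustration of what userspace cycle detection could look like once the whole table has been retrieved, here is a minimal sketch in C. It assumes the jumps between chains have already been extracted into an adjacency matrix; `struct jump_graph`, `has_loop` and `has_cycle` are hypothetical names and not part of any Xtables2 library. The technique is an ordinary depth-first search with three node colours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHAINS 16           /* assumed limit, for illustration only */

/* jumps[a][b] is true if some rule in chain a jumps to chain b. */
struct jump_graph {
    size_t nchains;
    bool jumps[MAX_CHAINS][MAX_CHAINS];
};

enum color { WHITE, GREY, BLACK };  /* unvisited, on DFS path, finished */

static bool visit(const struct jump_graph *g, size_t u, enum color *color)
{
    color[u] = GREY;
    for (size_t v = 0; v < g->nchains; ++v) {
        if (!g->jumps[u][v])
            continue;
        if (color[v] == GREY)
            return true;        /* jump back into the current path */
        if (color[v] == WHITE && visit(g, v, color))
            return true;
    }
    color[u] = BLACK;
    return false;
}

/* A loop: a chain that jumps directly to itself. */
bool has_loop(const struct jump_graph *g)
{
    for (size_t i = 0; i < g->nchains; ++i)
        if (g->jumps[i][i])
            return true;
    return false;
}

/* A cycle of any length; a self-jump is reported here as well. */
bool has_cycle(const struct jump_graph *g)
{
    enum color color[MAX_CHAINS] = { WHITE };

    assert(g->nchains <= MAX_CHAINS);
    for (size_t s = 0; s < g->nchains; ++s)
        if (color[s] == WHITE && visit(g, s, color))
            return true;
    return false;
}
```

The loop check only needs to look at one chain at a time, which is why it is cheap enough to do at the kernel level, whereas `has_cycle` needs the complete graph.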

Use hash-type datastructure for chain lookup

Xtables2, in its December 2012 submissions, uses only linked lists to store chains. (This decision was made to keep the patchset simple, because it already is 30+ patches.) This only affects chain lookups by name, e.g. when dumping a single chain. (A whole-table dump will just cheaply iterate through the list.) Ideally, this linked list should be replaced or augmented by something like a hash-based lookup.
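A rough sketch of the "augment" variant, in plain C rather than kernel code: the list is kept for whole-table dumps, and a small hash index is added for lookups by name. All names (`struct table`, `table_add_chain`, `table_find_chain`, `NBUCKETS`) are hypothetical, and the FNV-1a string hash is merely one convenient choice, not something the patchset prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 64             /* power of two, so the hash can be masked */

struct chain {
    const char *name;
    struct chain *list_next;    /* table-wide list, for whole-table dumps */
    struct chain *hash_next;    /* bucket list, for lookup by name */
};

struct table {
    struct chain *first;
    struct chain **lastp;       /* where to link the next chain */
    struct chain *buckets[NBUCKETS];
};

/* 32-bit FNV-1a string hash. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

void table_init(struct table *t)
{
    memset(t, 0, sizeof(*t));
    t->lastp = &t->first;
}

void table_add_chain(struct table *t, struct chain *c)
{
    struct chain **bucket;

    assert(c->name != NULL);
    bucket = &t->buckets[fnv1a(c->name) & (NBUCKETS - 1)];

    /* Append to the list so that dumps keep insertion order. */
    c->list_next = NULL;
    *t->lastp = c;
    t->lastp = &c->list_next;

    /* Prepend to the hash bucket. */
    c->hash_next = *bucket;
    *bucket = c;
}

struct chain *table_find_chain(const struct table *t, const char *name)
{
    struct chain *c = t->buckets[fnv1a(name) & (NBUCKETS - 1)];

    for (; c != NULL; c = c->hash_next)
        if (strcmp(c->name, name) == 0)
            return c;
    return NULL;
}
```

A by-name lookup then only walks one bucket instead of the whole list, while a table dump still iterates `first`/`list_next` exactly as before.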

Allow filtering before AF_PACKET sockets get their copy (requested by a packet filter user)

Linux has a concept of so-called protocol taps. The IPv4 protocol uses a tap, IPv6 too, and so does every AF_PACKET socket. Prominent users of AF_PACKET sockets are dhcpd and network diagnosis tools such as tcpdump. Every packet received by the networking code is passed on to all taps. Dropping a packet inside iptables however only (essentially) terminates processing in the particular tap, which is why filtering at the iptables level cannot be used to screen out DHCP packets, for example. (Moreover, the tap calling order is unspecified, so even if it did stop all further taps, prior ones would have received it.)

To enable AF_PACKET filtering, a call to invoke an Xtables2 chain would need to be added to __netif_receive_skb. It does not sound as simple though, since the skb network header pointer is not yet set.

Integration of Xtables2 into tc/act_ipt (submitted to mailing list)

The packet scheduler (net/sched/) has a module act_ipt.c which is able to run Xtables target extensions, and does so by direct invocation of the target. This way of executing Xtables code however is somewhat volatile, as act_ipt and the system administrator have to lie or guess about some parameters that are to be passed to the target. I think this can (and should) be replaced by a proper call to the Xtables2 chain execution function. The benefits of doing so: (1) act_ipt would gain the possibility to execute not only targets, but matches as well. (2) The layering issues in act_ipt would go away, and m_xt (the corresponding userspace plugin for tc) would no longer need to stab around in libxtables.

A draft of how this could look in code has been posted on December 16, 2012.

Implement kernel-level chain rename (maybe?)

(I am yet undecided about this one.) There shall be an NFXTM_CHAIN_MOVE Netlink message which causes a rename of the chain.

There already is a way to rename a chain (and have any referencing chains automatically switch from the inner representation of "-j foo" to "-j bar"): using NFXTM_TABLE_REPLACE. The rename is atomic, though this requires one to transfer the entire ruleset back and forth between userspace and the kernel.

So far, two possible approaches that incur less transfer have come to mind:

1. A new single NFXTM_CHAIN_MOVE-type Netlink message that will do the rename in the kernel, so to speak. This would be relatively simple (30 LOC?), and would be atomic to NFXTM_TABLE_DUMP, but non-atomic to everything else, like NFXTM_CHAIN_{NEW,DELETE,DUMP}, meaning that while one thread is issuing NFXTM_CHAIN_MOVE(foo, bar), another thread could issue NFXTM_CHAIN_DUMP(foo) and NFXTM_CHAIN_DUMP(bar) and get results for both.

2. A new single NFXTM_CHAIN_MOVE-type NL message that does a whole table-replace in the kernel. So one saves the large user<->kernel exchange, but the entire table still has to be duplicated (that's the essence of table-replace), atomically swapped, and the old one then discarded.

Jump target integration into ipset

Have ipset store jump targets such that one can use, for example, hash-based lookups on large lists to quickly reach specific chains.
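To make the idea a little more concrete, the following toy sketch in C maps an IPv4 address to the name of the chain to jump to, using a fixed-size hash table with linear probing. It does not use or mirror the real ipset API; `struct jump_set`, `jump_set_add`, `jump_set_lookup` and the example chain names are all hypothetical, and the multiplicative hash is just one simple choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SET_SLOTS 256           /* 2^8 slots; the hash keeps the top 8 bits */

struct jump_entry {
    bool used;
    uint32_t addr;              /* IPv4 address in host byte order */
    const char *chain;          /* name of the chain to jump to */
};

/* A zero-initialized jump_set is an empty set. */
struct jump_set {
    struct jump_entry slots[SET_SLOTS];
};

/* Multiplicative hash: keep the top 8 bits of the 32-bit product. */
static size_t slot_for(uint32_t addr)
{
    return (uint32_t)(addr * UINT32_C(2654435761)) >> 24;
}

/* Insert or update; returns false if the table is full. */
bool jump_set_add(struct jump_set *s, uint32_t addr, const char *chain)
{
    size_t i = slot_for(addr);

    assert(s != NULL);
    for (size_t n = 0; n < SET_SLOTS; ++n, i = (i + 1) % SET_SLOTS) {
        if (!s->slots[i].used) {
            s->slots[i].used = true;
            s->slots[i].addr = addr;
            s->slots[i].chain = chain;
            return true;
        }
        if (s->slots[i].addr == addr) {
            s->slots[i].chain = chain;
            return true;
        }
    }
    return false;
}

/* Returns the chain name for addr, or NULL if addr is not in the set. */
const char *jump_set_lookup(const struct jump_set *s, uint32_t addr)
{
    size_t i = slot_for(addr);

    assert(s != NULL);
    for (size_t n = 0; n < SET_SLOTS; ++n, i = (i + 1) % SET_SLOTS) {
        if (!s->slots[i].used)
            return NULL;
        if (s->slots[i].addr == addr)
            return s->slots[i].chain;
    }
    return NULL;
}
```

With such a structure, a single rule could dispatch to one of many per-host chains in roughly constant time, instead of the ruleset testing one address after another.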



Last modified: 2013-02-04 19:50 UTC