TTY replay daemon

   
The source is small, so I think it is easy to follow. Still, here is some background information.
 
First stage - Kernel Patch >

The first step in capturing data from a tty happens directly in the kernel, in drivers/char/tty_io.c. It is easier than it sounds. I was largely guided by the User Mode Linux patch that was applied to 2.4.21-SuSE, as it already had some form of TTY logging. As always, though, something was missing or did not quite fit.

Five function pointer variables -- rpl_qopen, rpl_qread, rpl_qwrite, rpl_qclose and rpl_qioctl -- are exported from tty_io.c for module hook-up. A module has to point them at its own functions when it is to go into action.

Excerpt from drivers/char/tty_io.c

#ifdef CONFIG_TTY_RPL
# include <linux/rpl.h>
int (*rpl_qopen)(struct tty_struct *, struct tty_struct *);
int (*rpl_qread)(const char *, size_t, struct tty_struct *);
int (*rpl_qwrite)(const char *, size_t, struct tty_struct *);
int (*rpl_qclose)(struct tty_struct *);
int (*rpl_qioctl)(struct tty_struct *, struct tty_struct *, unsigned int, unsigned long);
#endif

...

#ifdef CONFIG_TTY_RPL
if(rpl_qread != NULL) { rpl_qread(buf, i, tty); }
#endif
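
The hook-up mechanism can be tried outside the kernel. The following is a minimal userspace sketch of the pattern, not the actual patch: hook_read, fake_tty_read, counting_hook, attach_hook, detach_hook and bytes_seen are hypothetical names, and the tty is reduced to an integer id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rpl_qread: NULL while nothing is attached. */
int (*hook_read)(const char *buf, size_t count, int tty_id) = NULL;

/* Hypothetical "module" state: total number of bytes the hook has seen. */
size_t bytes_seen = 0;

/* Hypothetical hook implementation; a real one would copy the data away. */
int counting_hook(const char *buf, size_t count, int tty_id)
{
    assert(buf != NULL);
    (void)tty_id;
    bytes_seen += count;
    return 0;
}

/* Stand-in for the driver's read path: call the hook only if one is set. */
size_t fake_tty_read(const char *buf, size_t count, int tty_id)
{
    if (hook_read != NULL)
        hook_read(buf, count, tty_id);
    return count;
}

/* Going into action and standing down are plain pointer assignments. */
void attach_hook(void) { hook_read = counting_hook; }
void detach_hook(void) { hook_read = NULL; }
```

While no hook is attached, fake_tty_read() only pays for a NULL comparison, which is why the patch itself can stay this small.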

 
Second stage - Kernel Module >

The second stage consists of the functions behind rpl_qopen, etc., as present in the module. They copy the data captured by the tty driver to a buffer, so that rpl_qopen() and its siblings return as soon as possible and do not block the tty driver.

This can become a problem if many simultaneous users enter or produce a lot of text. I am not referring to disk speed or log size, but to the size of the module buffer, where the tty data is temporarily held before it is passed on.

I have not tested how much the buffer fills up under heavy load or collaborative shell development work.

Excerpt from kernel-2.6/rpldev.c

static int krn_read(const char *buf, size_t count, struct tty_struct *tty) {
    struct rpld_packet p;
    get_tty_devnr(&p.dev, tty);
    p.event = EV_READ;
    p.magic = P_MAGIC;
    p.size  = count;    /* payload length that follows the header */
    return mv_buffer(&p, buf, count);
}
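
mv_buffer() itself is not shown here. To illustrate the buffering idea only, here is a minimal userspace sketch of a fixed-size ring buffer. All names (struct ring, ring_put, ring_get, ring_avail, RING_SIZE) are hypothetical, and the all-or-nothing drop policy on a full buffer is an assumption, not necessarily what the module does.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 16  /* deliberately tiny; a real buffer would be far larger */

struct ring {
    unsigned char data[RING_SIZE];
    size_t head;   /* next write position */
    size_t tail;   /* next read position */
    size_t used;   /* bytes currently stored */
};

/* Number of bytes available for reading. */
size_t ring_avail(const struct ring *r)
{
    return r->used;
}

/* Copy len bytes in; all-or-nothing, so a packet is never truncated.
 * Returns 0 on success, -1 if the data does not fit (assumed drop policy). */
int ring_put(struct ring *r, const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t i;

    assert(r != NULL && src != NULL);
    if (len > RING_SIZE - r->used)
        return -1;
    for (i = 0; i < len; ++i) {
        r->data[r->head] = p[i];
        r->head = (r->head + 1) % RING_SIZE;
    }
    r->used += len;
    return 0;
}

/* Copy out at most len bytes; returns the number actually copied. */
size_t ring_get(struct ring *r, void *dst, size_t len)
{
    unsigned char *p = dst;
    size_t i, n = len < r->used ? len : r->used;

    for (i = 0; i < n; ++i) {
        p[i] = r->data[r->tail];
        r->tail = (r->tail + 1) % RING_SIZE;
    }
    r->used -= n;
    return n;
}
```

The writer side never waits: when the buffer is full, ring_put() fails immediately, so the caller (the tty driver in the real design) is not held up. How much data is lost in that case depends entirely on the buffer size relative to the load.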

 
Third stage - Userspace Gateway >

The module provides a character device node, /dev/rpl (or /dev/misc/rpl), to read from. The device and the userspace logging daemon must use the same protocol for the data passed over the channel.

The device node is created automatically by misc_register() in drivers/char/misc.c via devfs -- or what is left of it in 2.6. The device is read-only, even though the initial mknod that devfs (or misc) issues creates it read-write. That does no harm, because opening the character device with O_RDWR or O_WRONLY is prohibited and fails with EPERM. Furthermore, the device can only be opened once, because otherwise it would be unspecified which tty packet goes to which rpld instance.
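
The open-time rules can be summarized in a small userspace sketch. check_open and check_release are hypothetical names, and EBUSY for a second open is an assumption; only EPERM for write access is stated above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical open-time check. *in_use records whether a reader is attached.
 * Returns 0 on success or a negative errno value, kernel style. */
int check_open(int flags, int *in_use)
{
    assert(in_use != NULL);
    /* Only read-only access is allowed; EPERM is what the text states. */
    if ((flags & O_ACCMODE) != O_RDONLY)
        return -EPERM;
    /* Single reader only. EBUSY is an assumption made for this sketch. */
    if (*in_use)
        return -EBUSY;
    *in_use = 1;
    return 0;
}

/* Counterpart for close: let the next reader in. */
void check_release(int *in_use)
{
    assert(in_use != NULL);
    *in_use = 0;
}
```

Because the access mode is checked at open time, the permission bits on the device node matter less than they would otherwise.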

As far as ttyrpld is concerned, its Kernel module allocates the buffer and hooks up the rpl_* function pointers when the device is successfully opened, not when the module is loaded. I am not quite sure how I arrived at the first decision, but the second is due to the module design: krn_write() will always enqueue something into the buffer. (That is, I did not want a global variable indicating that the device is open, plus an if() within krn_write().)

The same applies in reverse when closing the device. That way, the Kernel module can stay loaded even when rpld is not active, without tying up resident memory of the size you chose for the ring buffer.

Excerpt from kernel-2.6/rpldev.c

static ssize_t uif_read(struct file *filp, char *buf, size_t count, loff_t *ppos) {
    ...
    // Data is available, so give it to the user
    count = imin(count, avail_R());
    mv_user(buf, count);
    ...
    return count;
}

 
Fourth stage - Userspace Logging Daemon >

A userspace daemon reads, evaluates and stores the data retrieved via the device.
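
As a rough illustration of the reading side, the sketch below pulls one packet out of a byte stream. Only the field names dev, size, event and magic come from the krn_read() excerpt; the field widths, their order, the byte order, the value of DEMO_MAGIC and the names demo_packet and parse_packet are assumptions made for this example and do not describe the real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC 0xEEu  /* hypothetical value, not the real P_MAGIC */

/* Hypothetical header layout; only the field names come from the module. */
struct demo_packet {
    uint32_t dev;
    uint16_t size;    /* payload length following the header */
    uint8_t  event;
    uint8_t  magic;
};

/* Try to extract one packet from buf[0..len).
 * Returns the number of bytes consumed, 0 if more data is needed,
 * or -1 if the magic byte does not match. */
long parse_packet(const unsigned char *buf, size_t len,
                  struct demo_packet *hdr, const unsigned char **payload)
{
    assert(buf != NULL && hdr != NULL && payload != NULL);
    if (len < sizeof(*hdr))
        return 0;
    memcpy(hdr, buf, sizeof(*hdr));   /* host byte order assumed */
    if (hdr->magic != DEMO_MAGIC)
        return -1;
    if (len - sizeof(*hdr) < hdr->size)
        return 0;
    *payload = buf + sizeof(*hdr);
    return (long)(sizeof(*hdr) + hdr->size);
}
```

A daemon loop would call such a function repeatedly on whatever read() returned, keep the unconsumed tail for the next round, and treat a bad magic byte as a sign that the two sides disagree on the protocol.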

I think this design is good for these reasons:

  • Only the really necessary parts are compiled into the Kernel; everything else goes into a module.
  • Few changes are needed when modifying one stage. You could change the inner workings of the module, change the way the device responds to the logging daemon, etc. without having to change much elsewhere.
  • User memory can be swapped out if it is not used; Kernel memory cannot. If there is no tty activity, the logging daemon will not become active and thus can be swapped out to give other applications some more physical memory.
  • You have all your favorite libraries in userspace.
 
"Fifth stage" - Replaying logs >

The only technically interesting part of ttyreplay is the delay overhead correction algorithm. The minimum delay period for user-space applications within the SCHED_OTHER priority is 1/HZ seconds (see linux/include/asm/param.h), which is 10000 µs at HZ=100. So when asking for a 5000 µs delay, the real delay is between 10000 and 15000 µs. To get around this, the algorithm checks the time it has actually spent on a particular delay.

Excerpt from user/replay.c

gettimeofday(&sa, NULL);
rv = nanosleep(req, NULL);
gettimeofday(&sb, NULL);

/* Calculate the actual duration and the overhead (actual time minus wanted time) */
dur = MICROSECOND * (sb.tv_sec - sa.tv_sec) + (sb.tv_usec - sa.tv_usec);
over = dur - (req->tv_sec * MICROSECOND + req->tv_nsec / 1000);
if(over > 0) { *xdelay += over; }

It keeps a running total of the accumulated overhead. On the next delay to be executed, the wanted time is reduced by a certain amount to work off some of that overhead.

if(*xdelay > 0) {
    ...
    *xdelay -= take;
    ...
        --req->tv_sec;
        req->tv_sec -= -new / NANOSECOND;
        req->tv_nsec = NANOSECOND - new;
    ...
}
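
To make the bookkeeping easier to follow, here is a self-contained sketch of the two arithmetic steps with the actual sleeping left out. The names overhead_us, pay_back, USEC_PER_SEC and the policy of taking as much as the next delay can absorb are assumptions for illustration; the excerpts above only say that "a certain amount" is taken.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

#define USEC_PER_SEC 1000000L

/* Overhead in microseconds: measured sleep duration minus requested one. */
long overhead_us(const struct timeval *before, const struct timeval *after,
                 const struct timespec *req)
{
    long dur  = USEC_PER_SEC * (long)(after->tv_sec - before->tv_sec) +
                (long)(after->tv_usec - before->tv_usec);
    long want = USEC_PER_SEC * (long)req->tv_sec + req->tv_nsec / 1000;

    return dur - want;
}

/* Shorten req by as much of the accumulated debt *xdelay (microseconds) as
 * it can absorb; the remainder stays in *xdelay for later delays.
 * Note: req is rounded down to whole microseconds in the process. */
void pay_back(struct timespec *req, long *xdelay)
{
    long want, take;

    assert(req != NULL && xdelay != NULL);
    if (*xdelay <= 0)
        return;
    want = USEC_PER_SEC * (long)req->tv_sec + req->tv_nsec / 1000;
    take = *xdelay < want ? *xdelay : want;
    *xdelay -= take;
    want -= take;
    req->tv_sec  = want / USEC_PER_SEC;
    req->tv_nsec = (want % USEC_PER_SEC) * 1000;
}
```

For example, if a 1005000 microsecond request took 1012000 microseconds, the debt is 7000. A following 5000 microsecond request is then skipped entirely, and the remaining 2000 are taken off the delay after that.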