Programmer's Life: Applied to Embedded Systems

Wednesday, November 11, 2009

Applied to Embedded Systems

You may be wondering how all this information on
network protocols relates to embedded systems firmware. The answer is
that, because I am building the entire interface, I must deal with
incoming packets at the Ethernet level. The Ethernet device itself
normally accepts only frames addressed to its own MAC address or to
the broadcast address. The driver firmware is then usually responsible
for dropping all packets, excluding broadcasts, that are not addressed
to the local IP address.

To write the driver code, you need to understand how to
detect when an incoming packet has arrived and how to retrieve the
packet from the Ethernet controller. Both operations are very specific
to the device itself, so you need to refer to the device documentation.
Generally, an Ethernet device looks like a FIFO or uses buffer
descriptors where each descriptor set describes an incoming or outgoing
buffer (or packet).

The buffer descriptor model is pretty common and quite
easy to work with once you get the hang of it. Each device imposes its
own definition of what a buffer descriptor looks like. The buffer
descriptor typically contains a command/status field, a length field, a
field containing a pointer to the actual buffer, and a field pointing
to the next buffer descriptor. This definition implies a linked list of
structures where each structure points to a buffer.

Figure 9.1: Ethernet Driver Buffer Descriptors.

It is common for Ethernet controllers to communicate
with applications about the location of data using a linked chain of
buffer descriptors. Notice that the descriptors form a circularly
linked list.

Figure 9.1
shows a typical buffer descriptor and buffer configuration. The number
of descriptors and buffers depends entirely on the application and the
amount of memory space available in the target system. The Ethernet
device is usually given a pointer to the first buffer descriptor in the
list, and, when data transfer begins, the device assumes that the
buffer descriptors have been properly initialized.

It is very common in C to interface to a
peripheral device by creating a structure that looks like the device
and then overlaying that structure onto the address space that is
occupied by the device. When doing this, you must be aware of some code
generation issues that can otherwise be taken for granted. When using
structures in normal programs, the compiler/CPU combination determines
precisely how the structure is mapped onto memory. When you write a
function to access a member in this structure, you can be assured that
the underlying code will access the proper member at the appropriate
offset from the base of the structure. You don’t really need to know
exactly how each member in the structure is mapped onto memory.

When using a C structure as a clean way to interface to
a peripheral, you can’t take anything for granted. The peripheral has a
fixed format onto which you are building a structure that must match
the peripheral’s hardware. Suddenly, you must know exactly how the
structure is formatted; otherwise, it will not properly align with the
peripheral’s hardware.

Without going into the various reasons why, the
compiler may add “padding” between members of the structure. This
“hidden” padding can break access to the device: although the
hardware and software structures appear to match,
the unseen padding shifts members to the wrong offsets. In most cases, the
first step toward a solution is simple awareness. You then need to tell the compiler that
you don’t want it to insert any padding in the structure definition.
You essentially force the compiler to not do something that it would
otherwise consider natural. If you investigate the cross-compiler
documentation, you are likely to find some special syntax (often
called pragmas) that will direct the compiler to not insert any invisible padding into the relevant structure.
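
One way to catch hidden padding early is to let the compiler check the layout at build time. The following is a minimal sketch, not taken from any particular device: the dev_regs structures and their fields are invented for illustration, the packing syntax shown is the GCC/Clang attribute form (your cross-compiler may use a pragma instead), and static_assert assumes a C11 compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical device layout: 1-byte control, 4-byte address,
 * 2-byte length, with no gaps between the fields. */

/* Natural layout: the compiler is free to pad after 'ctrl'. */
struct dev_regs_natural {
    uint8_t  ctrl;
    uint32_t addr;
    uint16_t len;
};

/* Packed layout (GCC/Clang syntax, an assumption about the toolchain). */
struct dev_regs_packed {
    uint8_t  ctrl;
    uint32_t addr;
    uint16_t len;
} __attribute__((packed));

/* The build fails here if the packed layout drifts from the device format. */
static_assert(offsetof(struct dev_regs_packed, addr) == 1, "addr offset");
static_assert(offsetof(struct dev_regs_packed, len)  == 5, "len offset");
static_assert(sizeof(struct dev_regs_packed) == 7, "total size");
```

Comparing sizeof(struct dev_regs_natural) against sizeof(struct dev_regs_packed) on your own target is a quick way to see how much padding the compiler would otherwise have added.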

Listing 9.1: Initializing a Block of Buffer Descriptors.

struct rdesc {
    ulong   cmdstat;        /* Command/Status             */
    uchar   *bp;            /* Pointer to buffer          */
    ushort  bsize;          /* Size of buffer             */
    ushort  psize;          /* Size of received packet    */
    struct  rdesc  *ndp;    /* Pointer to next descriptor */
};

unsigned char RbufBase[BUF_SIZE * RX_DESCRIPTOR_CNT];
struct  rdesc RdescBase[RX_DESCRIPTOR_CNT];

for(i=0;i<RX_DESCRIPTOR_CNT;i++) {
    RdescBase[i].cmdstat = DEVICE_IS_IDLE;
    RdescBase[i].bsize = BUF_SIZE;
    RdescBase[i].psize = 0;
    RdescBase[i].bp = (uchar *)&RbufBase[i * BUF_SIZE];
    if (i == RX_DESCRIPTOR_CNT-1) {
        RdescBase[i].ndp = RdescBase;
    }
    else {
        RdescBase[i].ndp = &RdescBase[i+1];
    }
}

The example code in Listing 9.1 shows how one might initialize a block of buffer descriptors. The structure rdesc is
used as the buffer descriptor and is declared to match the format of
the descriptor as it would be defined by the device. One block of
memory (RbufBase) is used for buffer allocation, and one block of memory (RdescBase) is used for buffer descriptor allocation. The for loop initializes the linked list. Note that this list is circular; the last item in the list points to the first.
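
Once the ring is initialized, detecting an incoming packet usually means checking a status bit in the next descriptor. The sketch below shows that idea under stated assumptions: the RD_OWNED_BY_DEVICE bit, the rx_* names, and the sizes are all hypothetical, and a real controller defines its own ownership and status bits in its documentation.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_DESC_CNT         4
#define RX_BUF_SIZE         64
#define RD_OWNED_BY_DEVICE  0x80000000u  /* hypothetical ownership bit */

struct rx_desc {
    volatile uint32_t cmdstat;  /* command/status, written by the device */
    uint8_t          *bp;       /* pointer to buffer                      */
    uint16_t          bsize;    /* size of buffer                         */
    volatile uint16_t psize;    /* size of received packet                */
    struct rx_desc   *ndp;      /* pointer to next descriptor             */
};

static uint8_t         rx_bufs[RX_DESC_CNT][RX_BUF_SIZE];
static struct rx_desc  rx_ring[RX_DESC_CNT];
static struct rx_desc *rx_next;  /* next descriptor the firmware will check */

/* Build the circular list and hand every descriptor to the device. */
void rx_ring_init(void)
{
    for (int i = 0; i < RX_DESC_CNT; i++) {
        rx_ring[i].cmdstat = RD_OWNED_BY_DEVICE;
        rx_ring[i].bp      = rx_bufs[i];
        rx_ring[i].bsize   = RX_BUF_SIZE;
        rx_ring[i].psize   = 0;
        rx_ring[i].ndp     = &rx_ring[(i + 1) % RX_DESC_CNT];
    }
    rx_next = &rx_ring[0];
}

/* Returns the packet length and sets *pkt, or returns 0 if the device
 * still owns the next descriptor (nothing has arrived yet). */
uint16_t rx_poll(uint8_t **pkt)
{
    if (rx_next->cmdstat & RD_OWNED_BY_DEVICE)
        return 0;
    *pkt = rx_next->bp;
    return rx_next->psize;
}

/* Give the current descriptor back to the device and advance. */
void rx_release(void)
{
    rx_next->psize   = 0;
    rx_next->cmdstat = RD_OWNED_BY_DEVICE;
    rx_next          = rx_next->ndp;
}
```

A receive loop would call rx_poll(), pass any returned packet to the protocol code, and then call rx_release() so the device can reuse the buffer.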

The processPACKET() Function

In the case of packet reception, the firmware
must retrieve the Ethernet payload and extract the packet type to
determine if the packet is ARP or IP. If the packet is neither ARP nor
IP, it is thrown away. That’s about all the Ethernet code does. Next,
the packet is passed to either the IP or ARP packet-processing
function. If the packet is ARP, the appropriate response is sent, and
handling of the packet is complete. If the packet is IP, the software
must further parse the header to determine which packet-processing
subsection should process it. The interface handles UDP, DHCP/BOOTP,
ICMP, and bits of TCP. So, based on the contents of the incoming
packet, one of several different higher layer packet processors is
called.

Listing 9.2: processPACKET().

/* processPACKET():
 *  This is the top level of the message processing after a complete
 *  packet has been received over ethernet.  It's all just a lot of
 *  parsing to determine whether the message is for this board's IP
 *  address (broadcast reception may be enabled), and the type of
 *  incoming protocol.  Once that is determined, the packet is either
 *  processed (TFTP, DHCP, ARP, ICMP-ECHO, etc...) or discarded.
 */
void
processPACKET(struct ether_header *ehdr, ushort size)
{
    int i;
    ushort  *datap, udpport;
    ulong   csum;
    struct ip *ihdr;
    struct Udphdr *uhdr;

    if (ehdr->ether_type == htons(ETHERTYPE_ARP)) {
        processARP(ehdr,size);
        return;
    }
    else if (ehdr->ether_type != htons(ETHERTYPE_IP)) {
        printPkt(ehdr,size,ETHER_INCOMING);
        return;
    }

    /* If we are NOT in the middle of a DHCP or BOOTP transaction, then
     * if destination MAC address is broadcast, return now.
     */
    if ((DHCPState == DHCPSTATE_NOTUSED) &&
        (!memcmp((char *)&(ehdr->ether_dhost),BroadcastAddr,6))) {
        return;
    }

    /* If source MAC address is this board, then assume we received our
     * own outgoing broadcast message...
     */
    if (!memcmp((char *)&(ehdr->ether_shost),BinEnetAddr,6)) {
        return;
    }

The processPACKET() function starts with Listing 9.2.
The incoming parameters are the raw packet from the Ethernet driver and
the size of the packet. The Ethernet header is tested immediately to
see if the packet is ARP or IP. If the packet is ARP, execution
branches to the ARP processing code. If the packet is some unrecognized
protocol, it is printed to the console (as a diagnostic aid) and then
discarded. The remaining code in this function assumes that the
packet is IP. Incoming broadcasts are dropped unless a DHCP/BOOTP
transaction is in progress. Also, because the incoming packet might be the outgoing
packet just sent by this target (this depends on how the Ethernet interface
device is configured), a packet whose Ethernet source address
is the same as this target’s address is dropped.

Listing 9.3: Verifying the Packet Integrity.

    ihdr = (struct ip *) (ehdr + 1);

    /* If not version # 4, return now... */
    if (getIP_V(ihdr->ip_vhl) != 4)
        return;

    /* IP address filtering:
     * At this point, the only packets accepted are those destined for this
     * board's IP address, plus, DHCP, if active.
     */
    if (memcmp((char *)&(ihdr->ip_dst),BinIpAddr,4)) {
        if (DHCPState == DHCPSTATE_NOTUSED)
            return;
        if (ihdr->ip_p != IP_UDP)
            return;
        uhdr = (struct Udphdr *)(ihdr+1);
        if (uhdr->uh_dport != htons(DhcpClientPort)) {
            return;
        }
    }

    /* Verify incoming IP header checksum...
     */
    csum = 0;
    datap = (ushort *) ihdr;
    for (i=0;i<(sizeof(struct ip)/sizeof(ushort));i++,datap++)
        csum += *datap;
    csum = (csum & 0xffff) + (csum >> 16);
    if (csum != 0xffff) {
        EtherIPERRCnt++;
        return;
    }
    
    printPkt(ehdr,size,ETHER_INCOMING);

The next phase is to overlay the IP structure (see Listing 9.3)
onto the incoming packet and perform some validation. First, the code
verifies that the packet is IP version 4, then that the incoming packet
is addressed to the local IP address (the one exception is a packet for
the DHCP client port while a DHCP transaction is active), and, finally, that the header checksum
is valid. If the packet passes all of this validation, I call
printPkt() to display the packet (printPkt() only prints if verbosity has been enabled).
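
To make the checksum test concrete, here is a small standalone sketch of the same one’s-complement sum. It is not MicroMonitor code: the names inet_sum() and ip_header_ok() are hypothetical, the sum is built from individual bytes so the result does not depend on CPU byte order or buffer alignment, and, unlike Listing 9.3, the header length is read from the header itself rather than taken from sizeof(struct ip).

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a buffer as big-endian 16-bit words and fold the carries back in.
 * A buffer that contains a correct Internet checksum sums to 0xffff.
 * The 32-bit accumulator cannot overflow for buffers up to 64 KB. */
uint16_t inet_sum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p   += 2;
        len -= 2;
    }
    if (len)                          /* odd trailing byte, zero padded */
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)                 /* fold until no carry remains */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Returns 1 if 'hdr' looks like a valid IPv4 header, 0 otherwise.
 * 'avail' is the number of bytes available starting at 'hdr'. */
int ip_header_ok(const uint8_t *hdr, size_t avail)
{
    size_t hlen;

    if (avail < 20)
        return 0;
    hlen = (size_t)(hdr[0] & 0x0f) * 4;   /* header length in bytes */
    if ((hdr[0] >> 4) != 4 || hlen < 20 || hlen > avail)
        return 0;
    return inet_sum(hdr, hlen) == 0xffff;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 b8 61 c0 a8 00 01 c0 a8 00 c7, the ten words add up to 0x2fffd, which folds to 0xffff, so the header is accepted. Changing any single byte makes the test fail.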

Listing 9.4: Dispatching for Protocol-Specific Processing.

    if (ihdr->ip_p == IP_ICMP) {
        processICMP(ehdr,size);
        return;
    }
    else if (ihdr->ip_p == IP_TCP) {
        processTCP(ehdr,size);
        return;
    }
    else if (ihdr->ip_p != IP_UDP) {
        int j;

        SendICMPUnreachable(ehdr,ICMP_UNREACHABLE_PROTOCOL);
        if (!(EtherVerbose & SHOW_INCOMING))
            return;
        for(j=0;protocols[j].pname;j++) {
            if (ihdr->ip_p == protocols[j].pnum) {
                printf("%s not supported\n",
                    protocols[j].pname);
                return;
            }
        }
        printf("<%02x> protocol unrecognized\n", ihdr->ip_p);
        return;
    }

The code in Listing 9.4
checks the IP protocol field to determine how (if at all) to process the
incoming IP packet. These checks are similar to the checks in Listing 9.2 for the incoming Ethernet frames. MicroMonitor supports some UDP, ICMP, and minimal TCP, so the logic calls processICMP() for incoming ICMP and processTCP() for incoming TCP. For any other protocol that is not UDP, MicroMonitor sends an ICMP Unreachable error response and, if verbose output is enabled, prints a diagnostic that identifies the protocol.

Listing 9.5: Processing UDP Packets.

    uhdr = (struct Udphdr *)(ihdr+1);

    /* If non-zero, verify incoming UDP packet checksum...
     */
    if (uhdr->uh_sum) {
        int     len;
        struct  UdpPseudohdr    pseudohdr;

        memcpy((char *)&pseudohdr.ip_src.s_addr,
            (char *)&ihdr->ip_src.s_addr,4);
        memcpy((char *)&pseudohdr.ip_dst.s_addr,
            (char *)&ihdr->ip_dst.s_addr,4);
        pseudohdr.zero = 0;
        pseudohdr.proto = ihdr->ip_p;
        pseudohdr.ulen = uhdr->uh_ulen;

        csum = 0;
        datap = (ushort *) &pseudohdr;
        for (i=0;i<(sizeof(struct UdpPseudohdr)/sizeof(ushort));i++)
            csum += *datap++;

        /* If length is odd, pad and add one. */
        len = ntohs(uhdr->uh_ulen);
        if (len & 1) {
            uchar   *ucp;
            ucp = (uchar *)uhdr;
            ucp[len] = 0;
            len++;
        }
        len >>= 1;

        datap = (ushort *) uhdr;
        for (i=0;i<len;i++)
            csum += *datap++;
        csum = (csum & 0xffff) + (csum >> 16);
        if (csum != 0xffff) {
            EtherUDPERRCnt++;
            return;
        }
    }
    udpport = ntohs(uhdr->uh_dport);

    if (udpport == MoncmdPort)
        processMONCMD(ehdr,size);
    else if (udpport == DhcpClientPort)
        processDHCP(ehdr,size);
    else if ((udpport == TftpPort) || (udpport == TftpSrcPort))
        processTFTP(ehdr,size);
    else {
        if (EtherVerbose & SHOW_INCOMING) {
            uchar *cp;
            cp = (uchar *)&(ihdr->ip_src);
            printf("  Unexpected IP pkt from %d.%d.%d.%d ",
                cp[0],cp[1],cp[2],cp[3]);
            printf("(sport=0x%x,dport=0x%x)\n",
                ntohs(uhdr->uh_sport),ntohs(uhdr->uh_dport));
        }
        SendICMPUnreachable(ehdr,ICMP_UNREACHABLE_PORT);
    }
}

The processing in Listing 9.5 is similar to that in Listing 9.3
but one level up in the protocol. This code overlays the UDP structure
onto the IP payload and verifies the UDP checksum, which covers the
pseudoheader, the UDP header, and the data. The verification is skipped if the incoming
packet has the checksum value set to zero, which in UDP means the sender
did not compute one. Finally, the handler
converts the destination UDP port number to host order and branches to the appropriate
code deeper in the MicroMonitor software (TFTP, DHCP, or MONCMD). If
the port is not one that MicroMonitor supports, the handler returns an
ICMP error response.
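
The pseudoheader arithmetic can also be written without building a separate structure. The sketch below is an alternative formulation, not the MicroMonitor implementation: udp_sum() and sum_be16() are hypothetical names, the sum is taken byte by byte so that an odd-length datagram is padded arithmetically instead of by writing a zero byte after the packet (as Listing 9.5 does), and the carries are folded until none remain.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a buffer to 'sum' as big-endian 16-bit words. */
static uint32_t sum_be16(const uint8_t *p, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p   += 2;
        len -= 2;
    }
    if (len)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Folded one's-complement sum over pseudoheader + UDP header + data.
 * src, dst: 4-byte IP addresses as they appear in the IP header.
 * udp:      start of the UDP header; ulen: UDP length in host order.
 * A datagram carrying a correct checksum yields 0xffff. */
uint16_t udp_sum(const uint8_t src[4], const uint8_t dst[4],
                 const uint8_t *udp, uint16_t ulen)
{
    uint32_t sum = 0;

    sum  = sum_be16(src, 4, sum);
    sum  = sum_be16(dst, 4, sum);
    sum += 17;                    /* zero byte + protocol number (UDP) */
    sum += ulen;                  /* UDP length field of the pseudoheader */
    sum  = sum_be16(udp, ulen, sum);
    while (sum >> 16)             /* fold until no carry remains */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

A sender would call udp_sum() with the checksum field set to zero and store the complement of the result in the header; a receiver accepts the datagram when the sum over the received bytes is 0xffff.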

Note

MONCMD is port 777 in the monitor. This port supports the ability to execute CLI commands over UDP.

In General

The implementation of the driver code is in C,
and structures are overlaid onto the incoming packet to simplify the
packet parsing. For example, the structure shown in Listing 9.6 could be used for the IP packet processing.

Listing 9.6
contains a few details of which you should be aware. First, depending
on where the packet ends up in the memory space of the target, you need
to consider different CPU alignment requirements. For example, some
CPUs do not like to look at a two- or four-byte integer (short or long,
respectively) unless the integer is suitably aligned: on an even address
for a two-byte value and, on many CPUs, on a multiple of four for a
four-byte value. If the data is misaligned and C code attempts to access it, the
processor can throw an alignment exception. Be
aware of this issue, and make adjustments where necessary. If you can’t
predict the alignment of your memory buffer, you have no choice but to
copy the two- or four-byte datum to an aligned space before accessing
it.
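
Copying to an aligned space does not need much code. A minimal sketch, using hypothetical helper names (load_u16, load_u32, store_u32) and nothing beyond the standard memcpy():

```c
#include <stdint.h>
#include <string.h>

/* Read a 16-bit value from any address by copying it into an
 * aligned local variable first. */
uint16_t load_u16(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Same idea for a 32-bit value. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to any address. */
void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

These helpers only solve the alignment problem. The bytes are copied in the order they sit in the packet, so a value read this way is still in network byte order.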

Listing 9.6: A Structure Declaration for the IP Header.

struct ip_header {
    uchar  ip_vhl;      /* version and header length */
    uchar  ip_tos;      /* type of service           */
    ushort ip_len;      /* length of packet          */
    ushort ip_id;       /* identification            */
    ushort ip_offset;   /* fragment offset field     */
    uchar  ip_ttl;      /* time to live              */
    uchar  ip_proto;    /* protocol                  */
    ushort ip_csum;     /* checksum                  */
    ulong  ip_source;   /* source IP address         */
    ulong  ip_dest;     /* destination IP address    */
};

An added complication is the fact that the CPU of
the target might be big or little endian. All data on a network is
transferred in network-byte order, which is big endian. If you are
lucky enough to be running with a big-endian CPU, big-endian data is no
issue; if the CPU is little endian, things like packet length,
checksum, and other multi-byte quantities in the headers must be
endian-reversed. The network-to-host long (ntohl), host-to-network long (htonl), network-to-host short (ntohs), and host-to-network short (htons)
conversion macros perform this endian reversal. On a big-endian
machine, these macros do nothing; on a little-endian machine, they
perform the conversion.
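
An alternative to the conversion macros is to assemble each multi-byte field from individual bytes. The sketch below uses hypothetical helper names (get_be16, get_be32, put_be16); it is not how the macros themselves are implemented, but it gives the same host-order result on big- and little-endian CPUs and has no alignment requirement.

```c
#include <stdint.h>

/* Read a 16-bit big-endian (network order) field into host order. */
uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Read a 32-bit big-endian (network order) field into host order. */
uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a host-order 16-bit value as a big-endian field. */
void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}
```

For example, the bytes 12 34 read with get_be16() give 0x1234 on any CPU, and put_be16() writes 0xabcd as the bytes ab cd.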
