Serial communications
Theory
How do we communicate data between computers, or between
peripherals and the CPU? We've already talked about one way to
communicate data: using a bus. The thing is, a bus is expensive
(it needs a wire for every bit, plus wires for control information).
It turns out that we can trade some speed for money, and cram all the
bits down a single wire. This is the idea behind serial communication.
Parallel-serial conversion
Let's suppose we have some data we want to pass down the line. To do
this, we need a "shift register" at each end. A shift register is a
register that we can shift data through, one bit at a time, bringing
data in one end and putting it out the other. Normally, you can also
read or write all of the bits of a shift register at the same time.
So the idea is that we can put a whole byte into the shift register
all at once, and then ship it down the wire one bit at a time.
To use shift registers in serial communication, we need two of them.
At the transmitter end, we need to be able to load up all the data
bits in parallel, and then shift them out one bit at a time. At the
receiver end, we need to be able to shift the data in one bit at a
time, and then read the whole eight bits at once.
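The two shift registers can be modelled in a few lines of Python. This is only a sketch of the idea, not a description of any particular chip: the class and function names are ours, and we assume the least significant bit goes down the wire first.

```python
class ShiftRegister:
    """An n-bit shift register with parallel load/read and serial shift."""

    def __init__(self, width=8):
        self.width = width
        self.bits = [0] * width  # bits[0] is the next bit to leave

    def load(self, value):
        """Parallel load: write all the bits at once (bit 0 of value first)."""
        self.bits = [(value >> i) & 1 for i in range(self.width)]

    def read(self):
        """Parallel read: return all the bits at once as an integer."""
        return sum(bit << i for i, bit in enumerate(self.bits))

    def shift(self, bit_in=0):
        """Shift one place: bits[0] goes out, bit_in comes in at the far end."""
        bit_out = self.bits[0]
        self.bits = self.bits[1:] + [bit_in]
        return bit_out


def send_byte(value):
    """Load a byte at the transmitter, shift it across, read it at the receiver."""
    tx, rx = ShiftRegister(), ShiftRegister()
    tx.load(value)
    for _ in range(8):
        rx.shift(tx.shift())  # one bit time: one bit crosses the wire
    return rx.read()
```

After eight shifts the receiver holds the same byte the transmitter was loaded with, which is all the parallel-serial-parallel round trip has to achieve.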
Bits per Second and Baud Rate
Now, we start to get into the details. First, how long (in time) is a
bit? If the transmitter and the receiver don't agree, the receiver
will get the wrong data. There are a bunch of standard bit-per-second
speeds that are used; most of these are multiples of 300 bps for
historical reasons (note -- the new 56K modems seem to be an exception
to this). Also for historical reasons, we frequently say baud (named
after Baudot) when we mean bps. It turns out that these don't mean
quite the same thing -- baud means the number of signal changes
(symbols) sent per second; modems use some pretty fancy signal
processing techniques to pack more than one bit into each symbol, and
so get more bits/second than the actual baud rate. Next
question: how many bits are in a character? The original Baudot code
was a five bit code (I have no idea how they got the whole alphabet
plus numbers in there!); for a very long time we normally saw a seven
bit character; today we normally see an eight bit character.
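To get a feel for the numbers, here is a small Python sketch (the function names are ours) that works out how long one bit lasts at a few standard speeds, and how bits per second relates to baud when each signal change carries more than one bit. The combination of 2400 baud and 4 bits per signal change is only an illustrative choice.

```python
def bit_time_us(bps):
    """Duration of one bit, in microseconds, at the given bits per second."""
    return 1_000_000 / bps


def bits_per_second(baud, bits_per_symbol):
    """Bit rate when every signal change (symbol) carries several bits."""
    return baud * bits_per_symbol


if __name__ == "__main__":
    for bps in (300, 1200, 9600):
        print(f"{bps:>5} bps: {bit_time_us(bps):7.1f} microseconds per bit")
    print(bits_per_second(2400, 4), "bps from a 2400 baud line")
```

At 9600 bps a bit lasts a little over 100 microseconds, so the transmitter and receiver clocks only need to agree to within a few percent over the length of one character.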
The next problem is making sure the data is sent correctly. There are
two subproblems here: getting it to work at all, and getting it to
work in the presence of noise.
Start Bit
The transmitter is not going to send data continuously. You've seen
this in doing downloads, and in running hexmon: there's normally more
time spent between sending characters, than actually sending them. When
you're not sending data, the line is held at a steady value of "1."
So, how do you tell when a data character is coming? Suppose we see a
binary sequence 11111111000011111. If we're using a seven-bit ASCII
character, this could be 1110000 ('p'), 1100001 ('a'), 1000011 ('C'),
or 0000111 (BEL). It could even be two characters, with some of the
0's in each one! These possibilities are 1111110 0001111 ('~' followed
by SI), 1111100 0011111 ('|' followed by US), or 1111000 0111111
('x?'). How can we tell the difference?
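The ambiguity is easy to see by brute force. This Python sketch (our own illustration) slides a seven-bit window along the sequence and decodes each position as ASCII, reading the bits in the order they are written above.

```python
stream = "11111111000011111"


def windows(bits, width=7):
    """Every way of cutting one width-bit character out of the stream."""
    return [bits[i:i + width] for i in range(len(bits) - width + 1)]


if __name__ == "__main__":
    for offset, w in enumerate(windows(stream)):
        code = int(w, 2)
        print(f"offset {offset:2}: {w} = 0x{code:02x} {chr(code)!r}")
```

Every offset gives a legal seven-bit code, so without some extra marker the receiver has no way of knowing which window is the real character.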
What we'll do is put a 0 bit in front of every character, to tell the
receiver that a character is coming (the 0 isn't part of the data,
it's in front of the data). We call this a "start bit;" it marks the
start of a data character.
Stop Bit(s)
Now, suppose we get some noise on the line. We can end up thinking a
data bit is the start bit, and read a character of nonsense. There are a
couple of things we can do about this. The first one is to use a stop
bit: we follow each data character with some fixed number of
1's. This means the receiver does the following to try to read a
character:
- Wait until it sees a 0.
- Read the bits following the 0.
- Make sure the character is followed by the right number of 1's.
If it isn't, we had an error.
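The three receiver steps above can be sketched in Python as follows. This is a simplified model under our own assumptions: the line is given as a string of already-sampled bits, the idle level is 1, and the function names are ours.

```python
def frame(data_bits, stop_bits=1):
    """Wrap a string of data bits in a start bit (0) and stop bits (1's)."""
    return "0" + data_bits + "1" * stop_bits


def receive(line, data_len=7, stop_bits=1):
    """Return a list of (data_bits, ok) pairs found on an idle-high line."""
    results = []
    i = 0
    while i < len(line):
        if line[i] == "1":  # idle line: keep waiting for a 0
            i += 1
            continue
        data = line[i + 1:i + 1 + data_len]
        stop = line[i + 1 + data_len:i + 1 + data_len + stop_bits]
        if len(data) < data_len or len(stop) < stop_bits:
            break  # ran out of bits in the middle of a character
        # ok is False when the stop bits are not all 1's (an error)
        results.append((data, stop == "1" * stop_bits))
        i += 1 + data_len + stop_bits
    return results
```

For example, `receive("111" + frame("1100001") + "11" + frame("1000011") + "1")` picks out both characters with `ok` set, while a character followed by a 0 where the stop bit should be comes back with `ok` cleared.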
Parity
Notice that this will miss a lot of errors, especially if there's a
lot of time between characters. If the noise ended halfway through a
character, we'll see everything after the next 0 in the character as
data, and then see the 1's that go between characters as more data.
We can do a little better than this by adding another bit, called a
parity bit. On the transmitter end, we'll take a look at a character
to be sent and ask, "are there an even or an odd number of 1's in
this character?" Then, we'll inject an extra bit after the character
(but before the stop bit) to force the number of 1's to be either even
or odd (our choice). So if we're going to send out "odd parity," we
make sure that the character+parity bit always contains an odd number
of 1's. If we're going to send out even parity, we'll make sure it's
even. There are five parity functions:
- No parity. Don't send out a parity bit at all.
- 0 parity. Make the parity bit always 0.
- 1 parity. Make the parity bit always 1.
- Even parity. Use the parity bit to make sure the parity of the
character+bit is even.
- Odd parity. Use the parity bit to make sure the parity of the
character+bit is odd.
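The five parity functions in the list above can be written out directly. In this Python sketch the mode names ("none", "zero", "one", "even", "odd") and the function names are our own labels, and the data is a string of bits.

```python
def parity_bit(data_bits, mode):
    """Return the parity bit ('0' or '1'), or '' when no parity bit is sent."""
    ones = data_bits.count("1")
    if mode == "none":
        return ""
    if mode == "zero":
        return "0"
    if mode == "one":
        return "1"
    if mode == "even":
        return str(ones % 2)  # add a 1 only when the count so far is odd
    if mode == "odd":
        return str((ones + 1) % 2)  # add a 1 only when the count so far is even
    raise ValueError(f"unknown parity mode: {mode}")


def parity_ok(data_bits, received_bit, mode):
    """Receiver side: does the received parity bit match what we expect?"""
    return received_bit == parity_bit(data_bits, mode)
```

Note that a single flipped bit changes the count of 1's from even to odd (or back) and is caught, but two flipped bits in the same character cancel out and slip through.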
A Few Last Notes on Parity, Character Length, and Stop Bits
Programming RS-232 ports would be hard enough if you actually had to
get the configuration right for it to work. Amazingly, there are a
bunch of ways you can come up with configurations that don't agree,
but which work - or even worse, that almost work.
For instance, you might have one end set to 8 bits of data, 1 parity,
and 1 stop bit, while the other end is set to 8 bits of data, no
parity, and two stop bits. It'll work!
Or, you could have one end set to 7 bits of data, 0 parity, and 1 stop
bit, with the other end set to 8 bits of data, no parity, and 1 stop bit. This
will work as long as you only send ASCII characters; as soon as the
end supporting 8 bits of data sends a character with the most
significant bit set the other end will report a parity error. This
will manifest itself as something like successfully running a text
editor over the wire (for hours at a time!), but failing very quickly
when you try to download a file.
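We can check both of these "accidentally compatible" setups by writing out the bits each configuration puts on the wire. The Python sketch below uses a hypothetical `encode` helper of our own, and assumes the data bits are sent least significant bit first.

```python
def encode(value, data_bits, parity, stop_bits):
    """Bits on the wire for one character: start, data (LSB first), parity, stop."""
    data = [(value >> i) & 1 for i in range(data_bits)]
    if parity == "none":
        extra = []
    elif parity == "zero":
        extra = [0]
    elif parity == "one":
        extra = [1]
    else:
        raise ValueError(f"unsupported parity in this sketch: {parity}")
    return [0] + data + extra + [1] * stop_bits


# 8 data bits, 1 parity, 1 stop looks exactly like 8 data bits, no parity, 2 stop.
same_a = all(encode(v, 8, "one", 1) == encode(v, 8, "none", 2) for v in range(256))

# 7 data bits, 0 parity, 1 stop matches 8 data bits, no parity, 1 stop,
# but only for values below 128 (plain ASCII).
same_b = all(encode(v, 7, "zero", 1) == encode(v, 8, "none", 1) for v in range(128))
```

For a value such as 0xC1, the 8-bit sender puts a 1 in the position where the 7-bit receiver expects its 0 parity bit, which is exactly the parity error described above.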
Real Life
At one time, the capabilities described for serial ports were all
needed to make sure data was not lost. People tended to work at dumb
computer terminals, which communicated over relatively long distances
to central computers. Where I went to school, they even economized by
using only two wires to connect the terminals to the department's
computer: one wire for data in each direction. They relied on the
building ground to provide a signal ground! This worked surprisingly
well for a long time; we just saw noise on the line whenever the air
conditioner started up or something. As time went on we kept adding
more terminals and computers; then one day the ground plane got noisy
enough that the whole scheme basically quit working completely, which
required an emergency rewiring of the whole building.
Networks
This idea of putting extra "stuff" around the data you're actually
interested in is called putting the data in a frame, and is a very
standard technique in networking.
For most networking, the frame is quite a bit more complicated than
this, because additional routing information is also required.
A data packet is typically sent out with information that encodes
things like what computer the packet started at, what computer it is
supposed to go to, what process on the destination computer it is
intended for, and so forth. We already know all this stuff in our
application, so we can just send a bunch of bytes.
An example of a more complicated data frame is how data is sent on an
ethernet. An Ethernet packet contains:
- A 62 bit preamble (consisting of alternating 1's and 0's) and a
two bit start of frame delimiter (this is two 1's). This
is an extension of the start bit, which serves to let all the
receivers on the wire know that a packet has started.
On an oscilloscope, this preamble is a 5MHz square wave (for 10 Mbit
ethernet). This gives the receiver a good long chance to sync
up its phase-locked-loop oscillator to the signal.
- A 48 bit destination and 48 bit source field. It turns out that
every NIC ever built has a "unique" 48 bit ID number, called
its MAC (Media Access Control) address (this is frequently
called its hardware address). So we are able to send a packet
from a sender to a receiver on an ethernet segment (if the
sender and receiver are on different segments, or have a
telephone link or something between them, things get more
complicated in a hurry). (I say "unique" because, while the
MAC address was originally supposed to be unique, it really
isn't any more, for two reasons. First, blocks of MAC addresses
have been assigned to various vendors; some of them have run out
of addresses and recycled them. Second, most modern NICs can
have the MAC address set by software. Still, it doesn't matter
much as long as all the NICs on a given ethernet segment have
unique MAC addresses).
- A 16 bit field encoding either the type of the packet or its length.
For instance, a TCP/IP packet will have a value of 0x08 0x00 in
this field.
- The actual data, which will be between 46 and 1500 bytes.
- A 32 bit CRC (Cyclic Redundancy Check) field, which is used to
check that the data was sent correctly. This is an extension of
the parity bit described before.
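As a rough illustration of framing at this level, the Python sketch below builds and takes apart a frame without the preamble. It assumes the usual Ethernet II layout (destination address first, then source, a two-byte type of 0x0800 for IP, a payload padded to at least 46 bytes, and a four-byte CRC-32). The helper names and the addresses in the usage note are made up, and storing the CRC least significant byte first is a simplifying assumption rather than a statement about the exact order of bits on the wire.

```python
import struct
import zlib

ETHERTYPE_IPV4 = 0x0800


def build_frame(dst, src, ethertype, payload):
    """Destination, source, type, padded payload, then a CRC-32 over all of it."""
    payload = payload.ljust(46, b"\x00")  # pad up to the 46-byte minimum
    body = dst + src + struct.pack("!H", ethertype) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def parse_frame(frame):
    """Split a frame into its fields and report whether the CRC matches."""
    body = frame[:-4]
    (crc,) = struct.unpack("<I", frame[-4:])
    dst, src = body[0:6], body[6:12]
    (ethertype,) = struct.unpack("!H", body[12:14])
    return dst, src, ethertype, body[14:], zlib.crc32(body) == crc
```

Building a frame from, say, `bytes.fromhex("ffffffffffff")` to `bytes.fromhex("020000000001")` and then flipping any single bit makes the last value returned by `parse_frame` come back false, which is the job the parity bit does for a single character.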
Practice
HC11 Serial Capabilities
The HC11 is capable of sending and receiving data at a wide variety of
speeds from 75 to 125,000 bps (with the clock rate we're using). It
supports one start bit, one stop bit, and eight or nine data bits
(notice that we can play some games to "fake" other parameters: we
can set it for nine data bits, calculate the parity by hand, and put
it in the ninth bit, for example). It can deliver interrupts on a
variety of conditions (which we'll describe in a minute).
A very nice feature of the SCI is that it provides some limited
buffering, which makes it easier to keep the transmitter line
full at all times, and gives some latitude in receiving. Here's a
conceptual picture of the SCI port:
The "Data In" and "Data Out" lines are the actual serial IO wires
going in to and out of the chip.
On the input side, bits are read into a "shift register," one bit at a
time. When all the bits for a character have come in, it's moved into
the input buffer and is ready to be read. A more complete diagram,
showing things like the start and stop bits, is in the reference
manual as figures 9-1 and 9-2 (pages 320 and 322).
On the output side, the programmer writes a character to the output
buffer. The character is transferred to another shift register, and
sent out one bit at a time.
An odd thing about this diagram is the way
I've represented the SCDR (serial communication data register). The
idea here is that it's really two separate registers; when you read
it (by reading address $102f), you read the input buffer. When you
write to it (by writing to the same address), you write to the output
buffer. This seems really weird, but isn't at all uncommon.
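One way to picture this is as an object with two hidden registers behind a single address. The Python class below is a toy model of our own, not HC11 code: reads always come from the input side and writes always go to the output side.

```python
class SCDR:
    """Two registers behind one address: reads see input, writes go to output."""

    ADDRESS = 0x102F

    def __init__(self):
        self.input_buffer = 0   # filled from the receiver's shift register
        self.output_buffer = 0  # emptied into the transmitter's shift register

    def read(self):
        """Reading the address returns the most recently received character."""
        return self.input_buffer

    def write(self, value):
        """Writing the address queues a character for transmission."""
        self.output_buffer = value & 0xFF
```

Writing a value and then reading the same address does not give the value back, because the write and the read touch different registers.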
Configuration
Configuring the port requires setting the speed, defining the
character format, and enabling interrupts on desired conditions.
Looking at these in turn:
BAUD register
The BAUD register (at $102b) determines the speed the SCI is running.
The SCP1-0 and SCR2-0 bits select the rate (in conjunction with the
system clock speed). The other bits in the baud register are not used.
Table 9-1 in the reference manual gives the maximum baud rates
possible for each setting of the SCP1-SCP0 bits; all we really care
about in this class is that setting SCP1-SCP0 to 11 lets us get 9600
bps. Likewise, table 9-2 (page 329) gives the actual baud rates
obtained by combining SCP1-SCP0 and SCR2-SCR0; again, what we care
about here is that setting SCR2-SCR0 to 000 will give us 9600 bps.
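The arithmetic behind those tables can be sketched in Python. This assumes a 2 MHz E clock (an 8 MHz crystal), prescaler divisors of 1, 3, 4 and 13 for SCP1-SCP0, and SCP in bits 5-4 with SCR in bits 2-0 of the BAUD register; check these against the reference manual for your own board. The function names are ours.

```python
E_CLOCK_HZ = 2_000_000  # assumed: 8 MHz crystal, so a 2 MHz E clock

PRESCALE = {0b00: 1, 0b01: 3, 0b10: 4, 0b11: 13}  # selected by SCP1-SCP0


def baud_rate(scp, scr, e_clock_hz=E_CLOCK_HZ):
    """Bit rate for the given SCP1-SCP0 and SCR2-SCR0 settings."""
    return e_clock_hz / (16 * PRESCALE[scp] * (1 << scr))


def baud_register(scp, scr):
    """Pack the two fields into one BAUD register value."""
    return (scp << 4) | scr


if __name__ == "__main__":
    print(f"BAUD = ${baud_register(0b11, 0b000):02x}")
    print(f"rate = {baud_rate(0b11, 0b000):.0f} bps")
```

With SCP1-SCP0 at 11 and SCR2-SCR0 at 000 the result is about 9615 bps, which is close enough to 9600 for the receiver at the other end not to notice.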
SCCR1 and SCCR2
The only important bit in SCCR1 ($102c) is the M bit (bit 4), which
selects 8 or 9 bit mode. The other bits are used when the
serial port is being used to implement a network, with a single wire
connecting a bunch of ports on a serial bus.
SCCR2 ($102d) is used to enable and disable virtually the whole
subsystem... transmitter, receiver, interrupts, etc. etc. The
bits are:
- TIE (bit 7): Transmit Interrupt Enable. When set to 1, the SCI will
request an interrupt when it's possible to write to the output buffer
without losing any characters.
- TCIE (bit 6): Transmit Complete Interrupt Enable. When set to 1,
the SCI will request an interrupt when we're completely done
sending data. This is important if we're going to change the
speed, so we don't do something like try to change the speed
halfway through sending a character.
- RIE (bit 5): Receive Interrupt Enable. When set to 1, the SCI will
request an interrupt when new data has arrived for us to read.
- ILIE (bit 4): Idle Line Interrupt Enable. When set to 1, the SCI will
request an interrupt when the serial line is quiescent. Again, we only
want to change speeds when there is nothing on the line.
- TE (bit 3): Transmit Enable. Turns on the transmitter, so we can send data.
- RE (bit 2): Receive Enable. Turns on the receiver, so we can receive data.
We don't really care about bits 1 and 0. Bit 1 is a "receiver wakeup
bit"; it's possible to set the receiver in a sleeping state and have
it wake up automatically. Bit 0 sends a "break" character; that's
implemented by sending enough 0's in a row on the line to guarantee
that the other side will get a framing error (what that means will be
described in a minute).
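The SCCR2 bits listed above are just masks that get ORed together. Here is a Python sketch of building a register value from them; the constant names follow the bit names in the list, and the helper function is our own.

```python
TIE = 0x80   # bit 7: transmit interrupt enable
TCIE = 0x40  # bit 6: transmit complete interrupt enable
RIE = 0x20   # bit 5: receive interrupt enable
ILIE = 0x10  # bit 4: idle line interrupt enable
TE = 0x08    # bit 3: transmit enable
RE = 0x04    # bit 2: receive enable


def sccr2_value(*bits):
    """OR together the requested SCCR2 bits."""
    value = 0
    for bit in bits:
        value |= bit
    return value
```

For plain polled I/O we would turn on just the transmitter and receiver, `sccr2_value(TE, RE)`, which is $0c; adding RIE for interrupt-driven receiving gives $2c.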
SCSR
The SCI Status Register ($102e) is used to report the current state
of the serial interface. This includes both data
presence/absence and error conditions. This register contains:
- TDRE (bit 7): Transmit Data Register Empty. This means the output
buffer in the figure is empty, so it's safe to feed a new character
to the system. If you put characters in before the port is ready, they
overwrite the characters already in the buffer and characters
are lost.
- TC (bit 6): Transmit Complete. This says the output buffer, and the
output shift register, are both empty. The reference manual talks in
terms of using this condition to see when a modem can be disabled; you
would also use it to see when you can safely change the transmission
speed.
- RDRF (bit 5): Receive Data Register Full. This says the input buffer
has data in it, and you can read the data. If you just blindly read a
lot of data without checking this, you'll get the same character over
and over again.
- IDLE (bit 4): This says the serial line is quiescent (no data is
being sent on it). Just in case we wanted to do something exotic like
have multiple transmitters and receivers on a single wire or
something.
- OR (bit 3): Overrun: oops. We didn't get around to reading a
character that had come in, and a second one came in while we were
waiting. The second character has been lost.
- NF (bit 2): Noise Flag: oops. The receive line is so noisy we can't
trust the data we got.
- FE (bit 1): Framing Error: oops. When we were done reading the
character, we got a 0 instead of our expected 1 for a stop bit.
One important thing to mention is that to clear these flags, you need
to read the SCSR, and then either read or write the SCDR as
appropriate (read it to clear the receiver-related flags, write it to
clear the transmitter-related flags).
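When debugging, it helps to turn a raw SCSR value into flag names. The Python sketch below does that using the bit positions given in the list above; the function names are ours, and on the real chip the value would come from reading $102e.

```python
SCSR_FLAGS = [
    (0x80, "TDRE"), (0x40, "TC"), (0x20, "RDRF"), (0x10, "IDLE"),
    (0x08, "OR"), (0x04, "NF"), (0x02, "FE"),
]
RECEIVE_ERRORS = {"OR", "NF", "FE"}


def decode_scsr(value):
    """Names of the status flags that are set, highest bit first."""
    return [name for mask, name in SCSR_FLAGS if value & mask]


def can_read(value):
    """True when a character is waiting and no receive error is flagged."""
    flags = decode_scsr(value)
    return "RDRF" in flags and not RECEIVE_ERRORS.intersection(flags)
```

A freshly reset, idle transmitter would typically show up as `decode_scsr(0xC0)`, that is TDRE and TC, while a value of $22 means a character arrived but its stop bit was wrong.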
SCDR
The SCI Data Register ($102f) is used to send/receive data. It is
actually two registers, which share a single address.
When you write to the register, you write to the transmitter's output
buffer. When you read from it, you read from the receiver's input
buffer.