Use the FAT File Format to Move Data
Jeff concludes his introduction to SD technology with information
about using the FAT file format to move data. He also describes how
to create directories and file entries, and delete, save, and
append data using ring buffers.
Whether they are audio, video, graphic, database, system, executable,
or any number of other file types, files can all be stored, moved,
and copied from one storage medium to another, thanks to the
operating system's ability to understand how each medium handles
the data. While we can't necessarily execute an OS-specific file
across multi-OS platforms, many platforms support multiple file
systems. So, we can at least exchange data via several supported
types of media.
When floppies took over from tapes as the storage medium of choice,
DOS introduced us to the FAT file system. At the time, clever
designers used shortcuts (space-saving data packing) to cram the
most data onto the available media. This inherently put maximum
physical limits on what the system could handle. With file sizes
measured in kilobytes, this limit was of no concern at the time.
Today's needs show how this may have been a bit shortsighted.
Improvements to the original FAT file system have met those needs
for now. Can we expect this to remain adequate? It serves its
intended purpose well, and while we should never say never, I believe
it will continue to be supported for many years because it is so
ingrained into today's media.
Last month, I started discussing how you can use SD solid-state
storage media in your embedded designs. Although the intended
nibble interface is open to those who join the SD Association, a
simplified version of the physical layer specification is available
at www.sdcard.com. This interface is a standard SPI. While the
throughput might be slower (1 bit versus 4 bits), it works well
with a microcontroller's standard SPI hardware. Although I consider
the USB thumb drive to be the most popular file transfer medium, most
cameras, phones, PDAs, MP3 players, and other portable electronics
use a smaller form of solid-state storage. The SD card is widely
used and available. This makes it a perfect match for embedded
products (see Photo 1).
Photo 1: Like everything else, even removable solid-state memory
devices have undergone a shrinking process. SD cards are great for
adding file storage to your project.
If you haven't read the first article in this series, please take a
few minutes to review the points I covered last month ("Access SD
Memory Cards (Part 1): Solid-State Storage Media in Embedded Apps,"
Circuit Cellar 222, 2009). The FAT file system is divided into four
sections: the reserved region, the FAT region, the root directory
region, and the file and directory data region. The reserved
region contains the boot sector, or BIOS parameter block, with
definitions of various media parameters, including where to find the
other regions. The FAT region is a list of tags indicating the
status of each cluster, or group of sectors. The root directory
region contains a list of file or directory names and
corresponding information. The file and (sub)directory region holds
the actual file data or additional subdirectories.
With the interfacing and protocols introduced last month, I had
room to describe only the first of four sections of the FAT file
system. This month, I will begin by digging into the root directory
region using the FAT16 format. From the reserved region, I
previously determined a number of important parameters. For your
reference, there are 512 entries in the root directory. There are 237
sectors in each of two FATs. The first FAT begins in sector 0x86.
There are 512 bytes per sector. There are 32 sectors per
cluster, and there are 32 bytes per directory entry. Because I know
that the first FAT begins at sector 0x86 and each of the two FATs is
237 (0xED) sectors long, the root directory must be at 0x86 + 0xED +
0xED, or sector 0x260. I'll start by looking at this sector, or
logical block address (LBA).
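The region arithmetic above can be sketched in a few lines of C. This is only an illustration using the parameters quoted for this particular card; the struct and function names are hypothetical and are not taken from the project's source code.

```c
#include <stdint.h>

/* Parameters the article reads from the reserved region of this card.
 * The struct and its field names are illustrative assumptions. */
typedef struct {
    uint32_t fat_start_lba;     /* first sector of the first FAT (0x86)   */
    uint32_t sectors_per_fat;   /* 237 (0xED) sectors in each FAT         */
    uint32_t fat_count;         /* two copies of the FAT                  */
    uint32_t root_entries;      /* 512 entries in the root directory      */
    uint32_t bytes_per_sector;  /* 512                                    */
    uint32_t bytes_per_entry;   /* 32                                     */
} fat16_layout_t;

/* The root directory starts right after the last copy of the FAT. */
static uint32_t root_dir_lba(const fat16_layout_t *v)
{
    return v->fat_start_lba + v->fat_count * v->sectors_per_fat;
}

/* Number of sectors occupied by the fixed-size root directory. */
static uint32_t root_dir_sectors(const fat16_layout_t *v)
{
    uint32_t bytes = v->root_entries * v->bytes_per_entry;
    return (bytes + v->bytes_per_sector - 1u) / v->bytes_per_sector;
}
```

With the values listed above (0x86, 237, 2, 512, 512, 32), root_dir_lba() gives 0x260 and root_dir_sectors() gives 32, so the root directory happens to be exactly one cluster's worth of sectors on this card.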
ROOT DIRECTORY, FAT REGIONS
A dump of LBA 0x260, the first root directory sector (see Figure 1),
shows nine entries: a volume ID, four files, two directories (one
deleted), and two entries associated with a long file name. All of
the directory entries, whether they are file or directory
entries, take on the same format. When they are initially formatted,
all directory entries contain 0x00s. A 0x00 as the first byte of
a directory entry indicates that the entry is empty and there are
no more used entries in the rest of the directory. (This fact saves
search time.) A file name (or directory name) must consist (for the
most part) of one or more alphanumeric characters. Although long
file names can exceed eight characters, I will not discuss them
here. (You can read about how they are handled in most overviews of
the FAT file system.) The directory structure begins with an
eight-character name and a three-character extension. A file name
has an implied "." between its name and extension. The twelfth byte
in the directory entry is the attribute byte, which contains flag
bits indicating, among other things, whether the entry is a file or
a directory. Other bytes contain various time and date information.
Besides the name and attribute parameters, the last two fields are
significant. The cluster value tells you where to look for the first
sector of the file or subdirectory and the FAT location associated
with the entry. The file length indicates the file size in bytes.
(Directories always have a zero file length, as do empty files.)
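As a minimal sketch of how those fields might be pulled out of a raw 32-byte entry, the following C uses the standard FAT offsets (name at 0x00, extension at 0x08, attribute at 0x0B, first cluster low word at 0x1A, file size at 0x1C). The type and function names are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

#define ATTR_VOLUME_ID 0x08u   /* entry is the volume label        */
#define ATTR_DIRECTORY 0x10u   /* entry is a (sub)directory        */

/* Illustrative, unpacked view of one 32-byte directory entry. */
typedef struct {
    char     name[9];        /* 8 characters plus terminating NUL  */
    char     ext[4];         /* 3 characters plus terminating NUL  */
    uint8_t  attr;           /* attribute flags (offset 0x0B)      */
    uint16_t first_cluster;  /* cluster low word (offset 0x1A)     */
    uint32_t size;           /* file length in bytes (offset 0x1C) */
} dir_entry_t;

/* Unpack a raw entry; multi-byte fields are stored little-endian. */
static void parse_dir_entry(const uint8_t *raw, dir_entry_t *out)
{
    memcpy(out->name, raw, 8);
    out->name[8] = '\0';
    memcpy(out->ext, raw + 8, 3);
    out->ext[3] = '\0';
    out->attr = raw[0x0B];
    out->first_cluster = (uint16_t)(raw[0x1A] | (raw[0x1B] << 8));
    out->size = (uint32_t)raw[0x1C]
              | ((uint32_t)raw[0x1D] << 8)
              | ((uint32_t)raw[0x1E] << 16)
              | ((uint32_t)raw[0x1F] << 24);
}

static int entry_is_directory(const dir_entry_t *e)
{
    return (e->attr & ATTR_DIRECTORY) != 0;
}
```

The name and extension come back space padded, exactly as they are stored, so an entry for FILE1.TXT yields "FILE1   " and "TXT".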

The first sector (of either copy) of a FAT region holds 256 16-bit
pointers for the cluster numbers 0x00 to 0xFF (with the following
sectors holding additional pointers for the remaining clusters).
The 16-bit value stored at each cluster position begins life as
zero. (The exceptions are cluster 0 and cluster 1, which are reserved
and cannot be used.) A value of zero means the cluster is not in use.
When a directory entry is created, its cluster low word value (at
offset 0x1A in the directory entry) points to the first cluster
used by the entry. In Figure 1, cluster 2 was assigned to directory
entry FILE1.TXT, a text file of 4 bytes. This refers to the cluster
where the data of the file begins and the FAT location where more
information is held. When the entry is a directory or a file that
is no larger than 16,384 bytes, it does not require more than a single
cluster (32 sectors) (see Table 1). Therefore, the 16-bit FAT entry
for cluster 0x0002 contains a value in the range 0xFFF8 to 0xFFFF,
which indicates that this is the last cluster in the file.
Otherwise, the 16-bit FAT entry for cluster 0x0002 would contain
the value of the next cluster used by the file. Unless a cluster is
the last, each FAT entry would then point to an additional cluster,
and so on.
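Because each FAT16 entry is 2 bytes and a sector holds 512 bytes, finding the entry for a given cluster is a matter of a divide and a remainder. The sketch below shows one way to express that, along with the tests for a free entry and an end-of-chain marker; the names are hypothetical.

```c
#include <stdint.h>

#define BYTES_PER_SECTOR   512u
#define FAT_ENTRY_BYTES    2u
#define ENTRIES_PER_SECTOR (BYTES_PER_SECTOR / FAT_ENTRY_BYTES)  /* 256 */

/* Where a cluster's 16-bit FAT entry lives on the card. */
typedef struct {
    uint32_t lba;          /* sector holding the entry            */
    uint16_t byte_offset;  /* byte offset of the entry in sector  */
} fat_pos_t;

static fat_pos_t fat_entry_pos(uint32_t fat_start_lba, uint16_t cluster)
{
    fat_pos_t p;
    p.lba = fat_start_lba + cluster / ENTRIES_PER_SECTOR;
    p.byte_offset = (uint16_t)((cluster % ENTRIES_PER_SECTOR) * FAT_ENTRY_BYTES);
    return p;
}

/* 0xFFF8 through 0xFFFF all mark the last cluster of a chain. */
static int fat16_is_end_of_chain(uint16_t value)
{
    return value >= 0xFFF8u;
}

/* A zero entry means the cluster is not in use. */
static int fat16_is_free(uint16_t value)
{
    return value == 0x0000u;
}
```

For the first FAT at sector 0x86, cluster 0x0002 maps to sector 0x86 at byte offset 4, and cluster 0x0100 is the first entry of sector 0x87.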

In the dump of the first sector of the FAT, you can see how this
works (see Figure 2). The sector data begins after the block byte
at address 0x90E. The first few words are 0xFFF8s and 0xFFFFs. The
third word is for cluster 0x0002. The 0xFFFF indicates that this is
the last cluster. I added a large file (HALFDO~1.JPG) so you can
see how the chaining works. In the root directory, this file was
assigned cluster 0x000C (decimal 12). The FAT word for cluster 0x000C
(the thirteenth word, because the numbering starts at cluster 0) is
not 0xFFFF, but 0x000D. This file's data does not end in cluster
0x000C but continues on into cluster 0x000D. Now look at the next
FAT word and you will see it doesn't end there either; each
one points to another cluster, up to cluster 0x00AD. That FAT entry
is 0xFFFF, indicating that this is the last cluster of the file.
In the last 4 bytes of this file's entry in the root directory
sector, you can see that the file's size is
0x00285915 (2,644,245 bytes), which requires many clusters.
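To make the chaining concrete, here is a small sketch that works on a copy of the FAT held in RAM (an assumption made for simplicity; on the card each lookup would be a sector read). It counts how many clusters a file of a given size needs and follows a chain from its first cluster to its last. The function names are illustrative.

```c
#include <stdint.h>
#include <stddef.h>

#define CLUSTER_BYTES (512u * 32u)   /* 512 bytes x 32 sectors = 16,384 */

/* Clusters required to hold file_size bytes (rounded up). */
static uint32_t clusters_needed(uint32_t file_size)
{
    return file_size / CLUSTER_BYTES + ((file_size % CLUSTER_BYTES) != 0u);
}

/* Follow a chain through an in-RAM FAT. Returns the number of clusters
 * visited and stores the final cluster number in *last. The count is
 * capped at fat_entries so a corrupted (looping) chain cannot hang. */
static uint32_t walk_chain(const uint16_t *fat, size_t fat_entries,
                           uint16_t first, uint16_t *last)
{
    uint32_t count = 0;
    uint16_t c = first;

    *last = first;
    while (c >= 2 && c < fat_entries && count < fat_entries) {
        count++;
        *last = c;
        if (fat[c] >= 0xFFF8u) {
            break;                   /* end-of-chain marker */
        }
        c = fat[c];                  /* next cluster in the chain */
    }
    return count;
}
```

For the 2,644,245-byte file described above, clusters_needed() returns 162. A sequential chain of 162 clusters that starts at 0x000C ends at 0x00AD, which agrees with the FAT dump.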

Additional clusters were allocated sequentially for this file, but
they don't have to be. The chaining process can jump to any unused
cluster. This is a good time to mention what happens when something
is deleted. If a single byte (the first character of a directory
entry) is changed to 0xE5, the file (or subdirectory) is considered
unavailable. Note that none of the other information, including the
actual data, has been altered in any way. This makes every deleted file
potentially recoverable, unless it is damaged by data that has been
subsequently written to the disk. Thus, remember that deleting a
file does not delete the information. To totally wipe the media,
apply a full format, not just a quick erase.
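The two special first-byte values (0x00 for "empty, and nothing follows" and 0xE5 for "deleted") are enough to classify any directory entry. The following sketch, with hypothetical names, shows the classification, the one-byte delete, and a scan of one 512-byte directory sector for a reusable slot.

```c
#include <stdint.h>

#define ENTRY_BYTES        32
#define ENTRIES_PER_SECTOR 16      /* 512 / 32 */

typedef enum { ENTRY_END, ENTRY_DELETED, ENTRY_IN_USE } entry_state_t;

/* Classify an entry by looking only at its first byte. */
static entry_state_t entry_state(const uint8_t *entry)
{
    if (entry[0] == 0x00) return ENTRY_END;      /* never used, end of list */
    if (entry[0] == 0xE5) return ENTRY_DELETED;  /* deleted, reusable       */
    return ENTRY_IN_USE;
}

/* Deleting changes one byte; the rest of the entry is left intact. */
static void mark_deleted(uint8_t *entry)
{
    entry[0] = 0xE5;
}

/* Index (0..15) of the first reusable entry in a sector, or -1 if none. */
static int find_free_slot(const uint8_t *sector)
{
    for (int i = 0; i < ENTRIES_PER_SECTOR; i++) {
        if (entry_state(sector + i * ENTRY_BYTES) != ENTRY_IN_USE) {
            return i;
        }
    }
    return -1;
}
```

Note that this sketch only touches the directory entry, as described above; it makes no attempt to release the file's clusters in the FAT.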
PROJECT OBJECTIVES
I could have wimped out and just added the necessary functions to
do logging to and dumping from a file already on the SD card. But
I wanted to make this project as helpful as possible. I added
functions that help show how things are done but certainly aren't
necessary for this project. The 2 × 20 LCD and three-button
interface really made this a challenge. Assuming your formatted SD
card has no files on it, you would need to be able to create an
entry in the root directory region. This project gives you three
choices: create a file, create a (sub)directory, or exit without
doing anything. I use the top line of the LCD to give you a choice
and the second line to indicate the function of the buttons.
Usually, button 1 displays the next item, button 2 displays
the previous item, and button 3 chooses the item. Let's create a
(sub)directory in the root directory region (see Figure 3).

Figure 3: This flowchart greatly simplifies the process for creating
a directory. When searching the FAT, for instance, you may need to
search every word in every sector of the FAT (up to the
maximum of 237 sectors) for a free entry just to find out that
there is no room left. Obviously, the search loops are exercised
more as the media fills up.
Entering a directory name is the first challenge. I can take
advantage of a previously required routine to move 20 characters
from RAM into row 1 of the LCD. The blinking cursor never shows
up on the LCD because the LCD print statement leaves it beyond the
last visible character position. When the cursor is moved to one of
character positions 1 through 20, it can serve as a prompt. Buttons 1
and 2 cycle forward and backward through legal characters (including a
blank space and a backspace arrow for correcting mistakes). Button 3
advances the cursor to the next position (or backspaces). The entry
routine exits after eight characters are entered.
All of this entry is worthless if we don't have any open
clusters, so we'd better check the FAT region. I use three basic
routines with the FAT: locate a FAT entry (cluster), read the
cluster entry, and write the cluster entry. I want to find an
unused FAT entry (read 0x0000) and claim it for the new directory
(write 0xFFFF). The offset into the FAT (counted in 16-bit words)
where the unused entry is found becomes the cluster number for the
new directory. Note that a new directory requires a single cluster
and therefore will not require chaining in the FAT. A second copy of
the FAT is kept as a safety measure. While this copy isn't normally
used, it's a good idea to keep it updated for compatibility.
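The search-and-claim step can be sketched as follows. To keep the example short, both FAT copies are modeled as arrays in RAM, which is an assumption; on the card, each read and write would go through the locate, read, and write routines mentioned above. The function name is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define FAT_FREE 0x0000u   /* cluster not in use          */
#define FAT_EOC  0xFFFFu   /* last (or only) cluster mark */

/* Find the first unused cluster, mark it as end-of-chain in both FAT
 * copies, and return its number. Returns 0 if no cluster is free
 * (clusters 0 and 1 are reserved, so 0 is never a valid result). */
static uint16_t claim_free_cluster(uint16_t *fat1, uint16_t *fat2,
                                   size_t entries)
{
    for (size_t c = 2; c < entries; c++) {
        if (fat1[c] == FAT_FREE) {
            fat1[c] = FAT_EOC;
            fat2[c] = FAT_EOC;     /* keep the backup copy in step */
            return (uint16_t)c;
        }
    }
    return 0;
}
```

The index at which the free entry is found is the new cluster number, so no separate bookkeeping is needed.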
Now the new cluster becomes the home for the new second-level
directory. You must add two directory entries to this clean slate:
the "dot" and "dot dot" directory entries. The "." directory uses
the new cluster number as its cluster pointer. The ".."
directory uses the parent cluster as its cluster pointer.
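A minimal sketch of building those two entries in the first sector of the new cluster might look like this. The helper names are hypothetical, time and date fields are simply left at zero, and only the first 512-byte sector is prepared here; the remaining sectors of the cluster would also need to be cleared.

```c
#include <stdint.h>
#include <string.h>

#define ENTRY_BYTES    32
#define ATTR_DIRECTORY 0x10u

/* Fill one 32-byte entry for a directory with the given short name. */
static void make_dir_entry(uint8_t *entry, const char *name, uint16_t cluster)
{
    size_t len = strlen(name);

    memset(entry, 0, ENTRY_BYTES);
    memset(entry, ' ', 11);                    /* name + ext are space padded */
    memcpy(entry, name, len > 11 ? 11 : len);
    entry[0x0B] = ATTR_DIRECTORY;              /* attribute byte              */
    entry[0x1A] = (uint8_t)(cluster & 0xFFu);  /* first cluster, low byte     */
    entry[0x1B] = (uint8_t)(cluster >> 8);     /* first cluster, high byte    */
    /* file length (offset 0x1C) stays zero for directories */
}

/* Prepare the first sector of a new directory's cluster. */
static void init_directory_sector(uint8_t *sector, uint16_t self,
                                  uint16_t parent)
{
    memset(sector, 0, 512);                    /* 0x00 marks unused entries   */
    make_dir_entry(sector, ".", self);
    make_dir_entry(sector + ENTRY_BYTES, "..", parent);
}
```

One detail worth checking against the FAT documentation: when the parent is the root directory, the ".." entry conventionally stores a cluster value of 0.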
One last operation is necessary before this new directory can be
used. You must go back to the parent directory and add the
directory name (and pertinent data) to an empty directory entry.
This completes the chain that can point us to the new directory's
cluster.
NEW COMMANDS
Adding a file to a directory is actually easier than adding a
directory, but there are a few differences. The file name includes
a three-character extension. This is indicated on the LCD by a "."
separating the eight-character name from the extension. The FAT is
handled the same way. (At this point, the file size is zero.)
Because the file size is zero, nothing needs to be placed anywhere
in the new cluster. An empty directory entry needs to be filled in
with the file name and pertinent data.
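Since the "." shown on the LCD is never stored, the name that goes into the directory entry is an 11-byte field: eight name characters followed by three extension characters, each part padded with spaces. The sketch below builds that field from separately entered name and extension strings; the function name and the choice to force uppercase are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <ctype.h>

/* Build the 11-byte name field of a directory entry.
 * Returns 0 on success, or -1 if the name is empty or either part
 * is too long for the 8.3 format. */
static int format_83(const char *name, const char *ext, uint8_t *out)
{
    size_t n = strlen(name);
    size_t e = strlen(ext);

    if (n == 0 || n > 8 || e > 3) {
        return -1;
    }
    memset(out, ' ', 11);                      /* pad both parts with spaces */
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)toupper((unsigned char)name[i]);
    }
    for (size_t i = 0; i < e; i++) {
        out[8 + i] = (uint8_t)toupper((unsigned char)ext[i]);
    }
    return 0;
}
```

For example, the name "file1" with the extension "txt" produces the 11 bytes "FILE1   TXT".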
Assuming a new directory and file have been created (exist) in the
root directory, a number of new commands become available.
The appropriate commands are offered when a directory or a file
name is chosen. When a directory is selected, you can change to the
new directory, delete the directory, add a new directory, add a new
file, or return to the present directory. When a file is selected,
you can delete the file, add a file, dump a file, log to a file, add
a new directory, or return to the present directory.
Execution spends most of its time in the operational mode. This is
where all the file and directory names in the present directory are
displayed (one at a time). When an SD card is inserted, this will
always be the root directory (in this case, that's LBA 0x0260). If
a directory name is selected (e.g., ONE) and you choose to switch
directories, the entry's first cluster pointer (0x0004) becomes
the cluster of the present directory. (Refer to Figure 1
and you will see that directory ONE uses cluster 0x0004.) Figure 4
shows a dump of this directory's first sector: LBA = 0x0003
(cluster - 1) × 0x20 (sectors per cluster) + 0x0260 (root directory),
or 0x02C0. You can see the "dot," "dot dot," and (sub)directory TWO
entries in the ONE (sub)directory. Note the first cluster word
locations (at offset 0x1A of each directory entry), which point to the
respective clusters of those entities. Changing the first
character (0x54) of directory TWO to 0xE5 would mark this entry as
deleted. When using this application to delete a directory, no
check is made as to which files or subdirectories might be deleted
as a result of this action. Many operating systems won't allow a
directory to be deleted unless it is completely empty!
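The cluster-to-LBA calculation used above can be written as a one-line helper. The first function below follows the article's formula directly; the second is the more general form, where the data region starts after the root directory and cluster 2 is the first data cluster. Both function names are hypothetical, and neither is valid for the reserved clusters 0 and 1.

```c
#include <stdint.h>

#define ROOT_DIR_LBA        0x0260u
#define ROOT_DIR_SECTORS    0x20u    /* 512 entries x 32 bytes / 512 */
#define SECTORS_PER_CLUSTER 0x20u

/* The article's formula: (cluster - 1) x 0x20 + 0x0260.
 * It works on this card because the root directory happens to be
 * exactly one cluster's worth of sectors long. */
static uint32_t cluster_to_lba(uint16_t cluster)
{
    return ((uint32_t)cluster - 1u) * SECTORS_PER_CLUSTER + ROOT_DIR_LBA;
}

/* General form: cluster 2 is the first cluster of the data region,
 * which begins right after the root directory. */
static uint32_t cluster_to_lba_general(uint16_t cluster)
{
    uint32_t data_start = ROOT_DIR_LBA + ROOT_DIR_SECTORS;   /* 0x0280 */
    return data_start + ((uint32_t)cluster - 2u) * SECTORS_PER_CLUSTER;
}
```

Both forms give 0x02C0 for cluster 0x0004, matching the location of directory ONE.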

On the file side, choosing FILE1.TXT in the root directory gives us
the opportunity to dump this file (see Figure 1). The first cluster
position (offset 0x1A) in the FILE1.TXT directory entry
(0x0002) points to where this file's data is stored. From the
File_Size position (0x1C) in the FILE1.TXT directory entry, this
file's length is 0x00000004. The dump of the first sector of cluster
0x0002 (LBA 0x0280) shows the first four characters of this file to
be "Test" (see Figure 5). This circuit's serial port is used as the
output device for the dump operation. The file length determines
how many characters will be sent. The characters are collected from
every LBA (every sector of every cluster in the file's chain) until
the appropriate number of characters have been transmitted. This is
easy for FILE1.TXT because it has only four characters. However,
the HALFDO~1.JPG file has multiple clusters. To dump this file, we
begin with the first sector of cluster 0x000C, LBA 0x03C0 (0x000B ×
0x20 + 0x0260), and send all 512 bytes. The LBA is incremented and
all bytes of each sector are sent until the whole cluster (0x20
sectors) has been sent (0x200 bytes × 0x20 sectors = 0x4000, or 16,384
bytes). The FAT entry at word offset 0x000C is then interrogated to
find out which cluster holds the next portion of data. An entire
cluster's worth of data is sent (another 16,384 bytes). The whole
FAT process is repeated until 2,644,245 bytes have been transmitted.
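The dump loop described above can be sketched against a RAM image standing in for the card (an assumption that keeps the example self-contained; the project reads sectors over SPI and sends bytes to the serial port instead). The volume_t type, its fields, and dump_file() are hypothetical names. The geometry is passed in so the same code works for 512-byte sectors and 32-sector clusters or for a tiny test image.

```c
#include <stdint.h>
#include <string.h>

/* A RAM stand-in for the card: the data region and one FAT copy. */
typedef struct {
    const uint8_t  *data;               /* data region; cluster 2 at offset 0 */
    const uint16_t *fat;                /* 16-bit FAT entries                 */
    uint32_t        sector_bytes;       /* 512 on the real card               */
    uint32_t        sectors_per_cluster;/* 32 on the real card                */
} volume_t;

/* Copy file_size bytes of a file to out, sector by sector, following
 * the cluster chain in the FAT. Returns the number of bytes copied. */
static uint32_t dump_file(const volume_t *v, uint16_t first_cluster,
                          uint32_t file_size, uint8_t *out)
{
    uint32_t sent = 0;
    uint16_t cluster = first_cluster;

    while (sent < file_size && cluster >= 2 && cluster < 0xFFF8u) {
        /* Byte offset of this cluster's first sector in the image. */
        uint32_t base = (uint32_t)(cluster - 2)
                      * v->sectors_per_cluster * v->sector_bytes;

        for (uint32_t s = 0; s < v->sectors_per_cluster && sent < file_size; s++) {
            uint32_t left = file_size - sent;
            uint32_t n = left < v->sector_bytes ? left : v->sector_bytes;
            memcpy(out + sent, v->data + base + s * v->sector_bytes, n);
            sent += n;                  /* last sector may be partial */
        }
        cluster = v->fat[cluster];      /* interrogate the FAT for the next one */
    }
    return sent;
}
```

The file length, not the end-of-chain marker, decides when to stop within the final sector, which is why the last sector is usually only partly sent.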

RING AROUND THE ROSIE
The last function, and the main reason for this project, is the
logging of data from the serial port. If you have been following
the processes up to this point, you should have a good
understanding of how this is accomplished. The serial port has been
implemented with output and input ring buffers. Each ring buffer
has a head and a tail pointer. When the buffers are empty, the head
pointer equals the tail pointer. Each pointer can point to any
address of the buffer from its lowest address to its highest
address (the buffer's length). The pointers are usually incremented
and must be repositioned to the beginning of the buffer if they
exceed the buffer's length; thus the term ring, or circular, buffer.
Once the ring buffers are implemented, you no longer have to deal
with the serial port hardware directly, just the loading of the
output buffer and unloading of the input buffer.
From the serial port side, any characters received cause an RX
interrupt. The interrupt routine handles taking the received
character and putting it into the input ring buffer. The routine
may add characters to the ring buffer only via the buffer's head
pointer. Received characters are placed at the head pointer (and
the head pointer is incremented) only if there is room in the
buffer. There is room until the head pointer + 1 equals the tail
pointer. At this point, adding a character (and incrementing the
head pointer) would make the head and tail pointers equal. This was
previously defined as an empty buffer, so this would produce a
buffer overrun condition and the buffer's data would be lost. One
of two things must happen at this point: either the serial port
must use flow control to stop the data from coming in, or the data
must be tossed out. This should not occur if the application can
remove the data from the ring buffer and store it in the SD media
faster than the data can be received.
On the serial output side, the output ring buffer is emptied
via the output ring buffer's tail pointer. Unless the output ring
buffer's tail pointer equals the head pointer, there is a character
available for transmission. The TX interrupt routine is responsible
for keeping the output ring buffer empty.
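The head and tail rules described for both directions fit in a few lines of C. This is a generic sketch, not the project's code: the buffer size, type, and function names are assumptions, and it uses array indexes rather than raw pointers. One slot is always left unused so that "head equals tail" can only ever mean "empty."

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 64u   /* illustrative size; holds RING_SIZE - 1 characters */

typedef struct {
    volatile uint8_t  data[RING_SIZE];
    volatile uint16_t head;   /* where the producer stores the next character */
    volatile uint16_t tail;   /* where the consumer takes the next character  */
} ring_t;

static bool ring_is_empty(const ring_t *r)
{
    return r->head == r->tail;
}

/* Producer side (e.g., the RX interrupt for the input buffer).
 * Returns false, and drops the character, if the buffer is full. */
static bool ring_put(ring_t *r, uint8_t c)
{
    uint16_t next = (uint16_t)((r->head + 1u) % RING_SIZE);  /* wrap around */

    if (next == r->tail) {
        return false;          /* head + 1 == tail: no room left */
    }
    r->data[r->head] = c;
    r->head = next;
    return true;
}

/* Consumer side (e.g., the TX interrupt for the output buffer).
 * Returns false if there is nothing to take. */
static bool ring_get(ring_t *r, uint8_t *c)
{
    if (r->head == r->tail) {
        return false;          /* empty */
    }
    *c = r->data[r->tail];
    r->tail = (uint16_t)((r->tail + 1u) % RING_SIZE);        /* wrap around */
    return true;
}
```

Because only the producer writes head and only the consumer writes tail, an interrupt routine on one side and the main loop on the other can share the buffer without stepping on each other, provided each index update is a single atomic store on the target.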
When the four characters of FILE1.TXT were dumped, the characters
were placed into the output ring buffer using the buffer's head
pointer. Because moving characters from the sector buffer to the
output ring buffer will be fast, the application may stall waiting
for the TX interrupt routine to empty the output ring buffer. So
the dump time will be directly related to the data rate.
When logging data, assuming the data rate to the card is sufficiently
high compared to the data input rate, any bottleneck will come from the
SD card's inability to write a block of data fast enough and get
back for more without allowing the ring buffer to wrap. I thought
I'd try logging a 0.5-MB file at a data rate of 19,200 bps for a
test. I expected approximately 2,000 characters per second. I saw a
sector write (lasting 35 ms) every 250 ms. That's four sectors, or
2,048 (i.e., 512 × 4) bytes per second. I pulled the SD card out and
put it in my PC to check the file. It was the proper length at
567,408 bytes and viewed correctly. So, while I was in Windows
Explorer, I used it to create an empty text file to try another
test at a higher data rate.
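The expected numbers in that test can be estimated with a little integer arithmetic. The sketch below assumes 10 bits on the wire per character (one start bit, eight data bits, one stop bit), which is an assumption about the serial framing rather than something stated above; the function names are hypothetical.

```c
#include <stdint.h>

#define BITS_PER_CHAR 10u    /* assumed framing: start + 8 data + stop */
#define SECTOR_BYTES  512u

/* Characters arriving per second at a given bit rate. */
static uint32_t chars_per_second(uint32_t bps)
{
    return bps / BITS_PER_CHAR;
}

/* Milliseconds needed to collect one full sector of characters. */
static uint32_t ms_per_sector(uint32_t bps)
{
    return (SECTOR_BYTES * 1000u) / chars_per_second(bps);
}

/* Characters that arrive while a sector write of write_ms is in progress;
 * the input ring buffer must be able to hold at least this many. */
static uint32_t chars_during_write(uint32_t bps, uint32_t write_ms)
{
    return (chars_per_second(bps) * write_ms) / 1000u;
}
```

Under that assumption, 19,200 bps works out to 1,920 characters per second and one full sector roughly every 266 ms, close to the 250 ms observed. During a 35-ms sector write, about 67 characters arrive, which gives a feel for how large the input ring buffer needs to be.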
I put the SD card back into my project board and repeated the test.
I saw a sector write (lasting 35 ms) every 125 ms. Looking good!
However, when I ended this logging session, the directory was
trashed. (It viewed as if it had lots of garbage entries.) Hmm. It
must have run into timing issues. But wait, that would have caused
a loss of data and not affected the directory. Hmm. The short story
is that the directory entry created using Windows Explorer didn't
have a FAT entry assigned (so its first cluster value was zero)
because an empty file has 0 bytes. When I began logging, I looked at
the directory entry's first cluster value and assumed it had been
assigned. (After all, that's what I do in this application.) Because
a cluster value of zero is used for the root directory, logging to it
causes the root directory's sector to be
overwritten, causing catastrophic results. With this incorrect
assumption corrected, logging at 38,400 bps worked as expected.
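One way to guard against that situation is to check the entry's first cluster value before writing and to allocate a cluster only when none has been assigned. The sketch below is an illustration of that idea, not the fix used in the project; it works on a raw 32-byte directory entry and an in-RAM FAT, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Return the cluster to log into. If the directory entry has no cluster
 * yet (first cluster value below 2), claim a free one, record it in the
 * entry, and return it. Returns 0 if the FAT has no free cluster. */
static uint16_t ensure_first_cluster(uint8_t *entry, uint16_t *fat,
                                     size_t fat_entries)
{
    uint16_t first = (uint16_t)(entry[0x1A] | (entry[0x1B] << 8));

    if (first >= 2) {
        return first;                      /* already assigned */
    }
    for (size_t c = 2; c < fat_entries; c++) {
        if (fat[c] == 0x0000u) {
            fat[c] = 0xFFFFu;              /* claim as last cluster */
            entry[0x1A] = (uint8_t)(c & 0xFFu);
            entry[0x1B] = (uint8_t)(c >> 8);
            return (uint16_t)c;
        }
    }
    return 0;
}
```

On the card, the modified directory entry and both FAT copies would then have to be written back before any data is logged.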
SO MUCH MORE
While this project succeeds in performing the tasks required to
explain how to use SD memory in a project, there is plenty more
that can be discussed. The Microchip Technology PIC24FJ64GA002 has
other useful hardware that you can explore. I purposely left time
and date stamping out of my directory entry routines (other than
making sure the entries were legal). This microcontroller has a
hardware real-time clock that you can use to implement accurate
time and date entries.
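If you do add time and date stamping, the values stored in a directory entry are two packed 16-bit words in the standard FAT layout: the date holds the year since 1980 in bits 15 to 9, the month in bits 8 to 5, and the day in bits 4 to 0, while the time holds hours in bits 15 to 11, minutes in bits 10 to 5, and seconds divided by two in bits 4 to 0. A minimal sketch of the packing follows; the function names are hypothetical, and reading the values from the real-time clock is left out.

```c
#include <stdint.h>

/* Pack a calendar date into the 16-bit FAT date format.
 * year is the full year (1980 or later), month 1-12, day 1-31. */
static uint16_t fat_pack_date(unsigned year, unsigned month, unsigned day)
{
    return (uint16_t)(((year - 1980u) << 9) | (month << 5) | day);
}

/* Pack a time of day into the 16-bit FAT time format.
 * Seconds are stored with 2-second resolution. */
static uint16_t fat_pack_time(unsigned hour, unsigned minute, unsigned second)
{
    return (uint16_t)((hour << 11) | (minute << 5) | (second / 2u));
}
```

For example, January 1, 1980 packs to 0x0021, the smallest legal date value.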
You will also find a programmable-length CRC generator, which
would make using CRCs a lot easier. I'll leave these and other
enhancements for you to experiment with. If you would like me to
devote additional space to any of this, drop me an e-mail. You'll
find C routines implementing the FAT file system offered by many
manufacturers. However, you won't learn much about it by just
calling someone else's code. I like to seize every opportunity to
expand my knowledge base. Time constraints don't always allow this
process, but I hope I've sparked your curiosity. Every so often,
you should try to take this less traveled path.
Jeff Bachiochi (pronounced BAH-key-AH-key) has been writing for
Circuit Cellar since 1988. His background includes product design
and manufacturing. You can reach him at
jeff.bachiochi@imaginethatnow.com or at www.imaginethatnow.com.
SOURCE
PIC24FJ64GA002 Microcontroller
Microchip Technology, Inc.
www.microchip.com