Filesystem Corruption with SD cards

From embeddedTS Manuals

Disk corruption is a common issue in embedded development, and a robust system must be designed with it in mind. The 4GB Sandisk MicroSD cards we test with and provide can support, on average, 8TB of writes without any corruption.

This graph shows our SD write endurance test for 40 TS-7553 boards running a doublestore stress test on 4GB Sandisk MicroSD cards. A card is counted as failed as soon as a single bit of corruption is found.

Counterfeit cards, or unsupported media

It is estimated that a third of all Sandisk branded flash cards are fake. We recommend Sandisk SD cards as that is what we use for our testing, but make sure you use a reputable source for acquiring any flash media. All SD cards supplied by us come in very controlled batches, and several SD cards from each order are tested to verify changes in the batches have not impacted compatibility with our boards.

If you use any other brand or supplier, we strongly recommend testing the cards thoroughly with our board before going into production. We recommend avoiding ATP flash media as well as "Industrial SD cards". We have experienced corruption with Industrial cards. They seem to work if multiwrite is disabled, but this makes writes to the card extremely slow.

You can use sdctl to evaluate a card. The test writes and verifies data until the card shows any corruption. It can take roughly 4 weeks with a Sandisk card, but obvious incompatibilities will show up almost immediately or within a few days.

# This will launch the sdctl stress test in the background
sdctl --dblstor-stresstest &

To see the progress of the stresstest you can view the log with:

sdctl --dmesg-follow

Once per second dmesg will output:

 <1>                     <2>       <3>             <4>   /<5>                    <6>           <7>     <8>/<9>     <10>
 Jul  2 15:53:06 dblstor 0: reqs:  864 writes:     3.53MB/583.48MB reads:        0KB seeks:    0 errs: 0/0 status: OK
  1. Date / Time
  2. SD Socket # (LUN)
  3. Write Requests
  4. MB Written since last log
  5. Write Total MB
  6. Read Total MB
  7. Seeks since last log
  8. Errors
  9. Unrecoverable Errors
  10. Current card status
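
For long runs it can help to reduce each status line to the fields of interest. The following is a minimal sketch, assuming the log format shown above. The function name parse_dblstor_line is made up for illustration and is not part of sdctl.

```shell
# Hypothetical helper (not part of sdctl): reads log lines on stdin and, for
# each once-per-second status line, prints the running write total, the two
# error counters, and the card status. Other lines are ignored.
parse_dblstor_line() {
    awk '/writes:/ {
        total = errs = unrec = status = "?"
        for (i = 1; i < NF; i++) {
            if ($i == "writes:") { split($(i + 1), w, "/"); total = w[2] }
            if ($i == "errs:")   { split($(i + 1), e, "/"); errs = e[1]; unrec = e[2] }
            if ($i == "status:") { status = $(i + 1) }
        }
        print "total_written=" total, "errs=" errs, "unrecoverable=" unrec, "status=" status
    }'
}

# Possible usage on the board:
# sdctl --dmesg-follow | parse_dblstor_line
```

Looking fields up by their label rather than by position keeps the sketch working if the date prefix changes width.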

Once sdctl detects a failure it will print messages with more details on the failures. For example:

 May  7 17:22:20 0: 515 more contiguous CRCFAILS
 May  7 17:22:20 0: FALLBACK_BADBLOCK: sector 0x1ad800, len 520
 May  7 17:22:20 0: CRCFAIL: sector 0x1ada08/900.99MB, expected crc 0x5af0ed37, got crc:0x5af0ed37/sector:0x361208 fallback=1
 May  7 17:22:20 0: CRCFAIL: sector 0x1ada09/900.99MB, expected crc 0x46aaeade, got crc:0x46aaeade/sector:0x361209 fallback=1
 May  7 17:22:20 0: CRCFAIL: sector 0x1ada0a/900.99MB, expected crc 0x55115f66, got crc:0x55115f66/sector:0x36120a fallback=1
 May  7 17:22:20 0: CRCFAIL: sector 0x1ada0b/900.99MB, expected crc 0xd7136c18, got crc:0xd7136c18/sector:0x36120b fallback=1
 May  7 17:22:20 0: CRCFAIL: sector 0x1ada0c/900.99MB, expected crc 0xd6d75d2d, got crc:0xd6d75d2d/sector:0x36120c fallback=1
 May  7 17:22:20 0: FALLBACK_BADBLOCK: sector 0x1ada08, len 1528
 May  7 17:22:21 0: 1523 more contiguous CRCFAILS
 May  7 17:22:21 0: FALLBACK_RECOVERED: sector 0x1ad800, len 2048

CRCFAIL errors are the most common errors when a card begins failing. A CRCFAIL indicates that the test read a block of data where the fallback and primary do not match. Contact us for more information on any other errors you receive.

Interrupting a Write

The most common cause of data corruption is powering off an SD card in the middle of a write. SD cards usually use MLC NAND flash for storage, coupled with manufacturer and model specific firmware. NAND flash has a limitation: to write even a single byte, the card must first read the 128KB to 256KB (or more) region containing that byte into memory and erase that region on the NAND chip. This region is the erase block, and its size varies by card. The card changes the single byte in its in-memory copy, erases the intended location on the flash, and then commits the copy back to the flash. If power is lost in the middle of this procedure, the data being written is lost, along with anything else in the erased block that has not yet been written back.
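
To put the read-modify-write cost in numbers, the sketch below computes how many bytes the card has to erase and rewrite for a small write. It assumes the write starts at the beginning of an erase block and ignores any caching or remapping the card firmware may do. The function name and the 128KB block size are illustrative only; the real erase block size depends on the card.

```shell
# Illustrative arithmetic only: bytes physically rewritten for a logical write
# of $1 bytes, assuming an erase block of $2 KB and a write that starts at the
# beginning of a block. Real cards may cache or remap writes.
bytes_rewritten() {
    write_bytes=$1
    block_bytes=$(( $2 * 1024 ))
    # Round the write up to a whole number of erase blocks.
    blocks=$(( (write_bytes + block_bytes - 1) / block_bytes ))
    echo $(( blocks * block_bytes ))
}

bytes_rewritten 1 128       # 1 byte write, 128KB erase block -> 131072
bytes_rewritten 200000 128  # spans two erase blocks          -> 262144
```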

Most SD card firmware also contains a wear leveling mechanism that maintains a logical to physical mapping. This means that a contiguous file may actually be stored in different areas of the NAND chip. If a write cycle is interrupted after a block has been erased but before the changes have been committed, data can be lost seemingly at random across the card.

There are several strategies you can adopt to avoid or limit your chance of corruption.

Using a Completely Read Only FS

The safest method is available if you do not need any writes to persist across reboots. If you are designing a data logger this is certainly not a good option, but if you're only responding to outside I/O this is your best choice. Once your application is developed and ready for deployment, you can run it from the initramfs. The initramfs is already a read only filesystem that never writes to the disk, and it mounts the Debian filesystem read only, so all of Debian is still available under /mnt/root. The largest difference is that the Debian init system is not used, so any daemons or services you need must be launched manually. See the Starting Automatically section for more details on the initramfs startup script at /ts/init.

In the initramfs none of the filesystems aside from a volatile ramdisk are mounted read/write so it is safe to power off at any time. You can still write data which will go to memory, but it will be lost on a poweroff.

If you require the full Debian filesystem, you can set soft jumper #4. This will mount the Debian filesystem on the SD card as read only, but will commit any attempted writes to a ramdisk using unionFS. This creates an environment where Debian applications think they can write, but since there are no writes to the SD card it is safe to disconnect power. Writes are not unlimited, however, since they consume RAM, so files written continuously by running applications should be cleaned up occasionally.
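
One way to keep RAM-backed writes bounded is to delete old files periodically. This is a minimal sketch, assuming your application writes to a single directory and that anything older than a given age can be discarded. The function name and the example path are hypothetical, and -mmin is an extension provided by GNU and BusyBox find rather than POSIX.

```shell
# Hypothetical cleanup helper: delete regular files under directory $1 that
# were last modified more than $2 minutes ago.
prune_old_files() {
    find "$1" -type f -mmin +"$2" -exec rm -f {} +
}

# Example (assumed path), run from cron or from a loop in your application:
# prune_old_files /var/log/myapp 60
```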

Calculated Risk

The next best option is to reduce the chance of corruption to an acceptable level with appropriate software changes. Depending on how much data you are writing and how often power outages occur, you can often bring the chance of SD corruption down to a manageable level. Three factors determine whether this option is viable: the number of power interruptions, the amount of data to be written, and how often this data is written.
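
As a rough way to weigh these factors, the sketch below estimates what share of the day the card is vulnerable and how many power interruptions per year would be expected to land inside a write. It assumes interruptions are equally likely at any time and independent of when writes happen, which may not hold in your installation. The function name and the example numbers are made up for illustration.

```shell
# Rough estimate only. Assumes power interruptions are equally likely at any
# time of day and independent of when writes happen.
#   $1 = writes per day
#   $2 = seconds the card is mounted read/write per write
#   $3 = power interruptions per year
estimate_write_risk() {
    awk -v w="$1" -v s="$2" -v p="$3" 'BEGIN {
        fraction = (w * s) / 86400   # share of each day spent vulnerable
        printf "vulnerable_fraction=%.6f\n", fraction
        printf "expected_interrupted_writes_per_year=%.6f\n", p * fraction
    }'
}

# Example: hourly writes of about 2 seconds each, 12 outages per year.
estimate_write_risk 24 2 12
```

With these example numbers the card is vulnerable for about 0.06% of the day, and fewer than 0.01 interruptions per year would be expected to hit a write.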

Some configuration is necessary to limit the amount of time during which the system is vulnerable to an interrupted write. By default, Linux is optimized for best throughput, which it achieves by keeping data in RAM and deciding automatically when to commit changes to disk. This makes controlling the writes very difficult, so the system must first be configured to behave more predictably.

The simplest way to accomplish this is to use a single filesystem to contain all of your writes. You can design your software application to write to /tmp, and periodically commit the writes to disk. For example, if you repartition your SD card to have a third partition for writes:

## Create the mount point if it does not already exist
mkdir -p /mnt/sd/
mount -o ro /dev/nbd0p3 /mnt/sd/

This keeps the data readable but protected until there is data ready to be written. When you want to write data:

mount -o remount,rw,sync /mnt/sd/
cp /tmp/datatowrite /mnt/sd/
mount -o remount,ro /mnt/sd/
sync

Note: A filesystem cannot be remounted read only while files on it are open for writing, so the root filesystem should not be used for the writes unless you control every write made by every running process. If any such files are open, the remount will fail and report that the filesystem is busy. You can use the "lsof" command to find which processes have open file handles.

This will perform the write and immediately switch the SD card back into a read only state. If the power to the device is human operated it may be a good idea to use an LED to indicate when it is writing and should not be powered off. This can also be useful for debugging. For example:

tshwctl --redledon
# This will give the user time to react to the LED
sleep .5
mount -o remount,rw,sync /mnt/sd/
cp /tmp/datatowrite /mnt/sd/
mount -o remount,ro /mnt/sd/
sync
tshwctl --redledoff
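
The same sequence can be wrapped in a function so that the partition is returned to read only and the LED is turned off even when the copy fails. This is a sketch rather than a tested implementation: commit_to_sd is a made-up name, while tshwctl and the mount options are the ones used above.

```shell
# Sketch of a wrapper around the sequence above. commit_to_sd is a made-up
# name; tshwctl and the mount options are the ones used earlier on this page.
#   $1 = file to copy, $2 = mount point of the write partition
commit_to_sd() {
    src=$1
    mnt=$2
    status=0

    tshwctl --redledon
    sleep .5    # give the user time to react to the LED

    if mount -o remount,rw,sync "$mnt"; then
        cp "$src" "$mnt"/ || status=1
        # Always try to return to read only, even if the copy failed.
        mount -o remount,ro "$mnt" || status=1
    else
        status=1
    fi
    sync

    tshwctl --redledoff
    return "$status"
}

# Possible usage:
# commit_to_sd /tmp/datatowrite /mnt/sd/ || echo "write failed, data kept in /tmp"
```

Returning a non-zero status lets the caller retry later or keep the data in /tmp instead of assuming the write succeeded.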

If you are booting to Debian, the root filesystem should be mounted read only by setting soft jumper #4. You can repartition the SD card to have a separate partition for the writes, which can be mounted when the writes actually need to occur, or use a second SD card to handle the writes. If you are operating from the initramfs, you may be able to use the root filesystem, depending on which file handles the running applications have open.

Battery Backup

If writes are continuous, a battery backup may be required. A battery backup should either be able to keep the system alive through all power outages, or should have a mechanism for reading the power level and, usually, for disconnecting power in software. Once the power level gets low, you can simply run "shutdown -h now" to safely unmount the filesystem. The power supply should reapply power to the board when power is available again.
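
A polling loop for this could look like the sketch below. How the charge level is read depends entirely on the battery backup hardware, so the reader command is passed in as a parameter and must be supplied by you. The function names, the 10 second interval, and the threshold are assumptions for illustration.

```shell
# Returns success when level $1 (integer percent) is at or below threshold $2.
battery_is_low() {
    [ "$1" -le "$2" ]
}

# Poll loop sketch.
#   $1 = command that prints the charge level as an integer percentage
#        (hardware specific, you must supply it)
#   $2 = threshold percentage at which to shut down
battery_watch() {
    while :; do
        level=$("$1") || return 1
        if battery_is_low "$level" "$2"; then
            shutdown -h now
            return 0
        fi
        sleep 10
    done
}

# Possible usage, with read_battery_percent being your own command:
# battery_watch read_battery_percent 15 &
```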

Other Resources