Technologic Systems DoubleStore



DoubleStore is a software layer unique to Technologic Systems that provides RAID-like filesystem redundancy on SD flash media. With DoubleStore, robust file storage can be achieved either with two separate cards or on a single card by splitting its capacity in half. Furthermore, TS has built the ability to boot from a DoubleStore SD card into the low-level TS-BOOTROM startup firmware, providing fault-tolerant bootup of the Linux kernel and initrd on TS boards.

The SD Card Problem

Realizing a highly reliable embedded system flash data store implemented on top of SD media has been an ongoing challenge for Technologic Systems engineers, due to the following issues:

  • Unlike hard drives, which suffer head crashes, SD cards rarely fail as a whole. Instead, they exhibit bit corruption or lack of programmability on a sector-by-sector basis.
  • SD transfers are protected by CRC checks, and a card is able to report that a requested read or write has failed. In practice, however, we have rarely found an SD card that properly reports media failures to the controller hardware; instead, cards tend to return every operation as a success. This makes strategies often adopted on high-end hardware, such as common RAID schemes, unreliable.
  • The commoditization and miniaturization of flash over the years has reduced reliability in general and allowed for not only sub-standard SD card designs but also counterfeits.

The Solution

TS has created a layer on top of the SD block device called DoubleStore. DoubleStore addresses both the detection and the correction of silent data corruption on SD media using CRCs, and it allows tried-and-tested filesystems such as EXT2 to be placed on top of it.

The CRC32 is used to detect corruption, and the sector number is used to detect SD failure modes in which correct data is written to the wrong sector. When corruption is detected, a copy of the sector is retrieved from the fallback copy on the same card, or from the second card if one is available. After the data has been recovered from the fallback sector, the correct data is written back to the original sector. Several SD failure modes are transient, and rewriting (scrubbing) the sector often corrects it permanently.
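The detect, recover, and scrub sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 512-byte payload, the trailer layout, and the names `pack_sector`, `unpack_sector`, and `read_with_fallback` are hypothetical, and the two stores are modeled as plain dictionaries rather than real SD block devices. It does not describe the actual DoubleStore on-disk format.

```python
import struct
import zlib

SECTOR_DATA = 512               # assumed payload size per sector (hypothetical)
TRAILER = struct.Struct("<II")  # assumed trailer: sector number, then CRC32


def pack_sector(sector_no: int, data: bytes) -> bytes:
    """Return the payload followed by its sector number and a CRC32 over both."""
    assert len(data) == SECTOR_DATA
    crc = zlib.crc32(data + struct.pack("<I", sector_no))
    return data + TRAILER.pack(sector_no, crc)


def unpack_sector(sector_no: int, raw: bytes):
    """Return the payload, or None if the record is corrupt or misplaced."""
    if len(raw) != SECTOR_DATA + TRAILER.size:
        return None
    data, trailer = raw[:SECTOR_DATA], raw[SECTOR_DATA:]
    stored_no, stored_crc = TRAILER.unpack(trailer)
    if stored_no != sector_no:
        return None  # valid data that landed in the wrong sector
    if zlib.crc32(data + struct.pack("<I", stored_no)) != stored_crc:
        return None  # bit corruption in the payload or trailer
    return data


def read_with_fallback(sector_no: int, primary: dict, fallback: dict) -> bytes:
    """Read from primary; on failure, recover from fallback and scrub primary."""
    data = unpack_sector(sector_no, primary.get(sector_no, b""))
    if data is not None:
        return data
    data = unpack_sector(sector_no, fallback.get(sector_no, b""))
    if data is None:
        raise IOError(f"sector {sector_no}: both copies are bad")
    primary[sector_no] = pack_sector(sector_no, data)  # rewrite ("scrub")
    return data


# Usage: write the same sector to both stores, then flip one bit in the primary.
payload = bytes(range(256)) * 2
primary = {7: pack_sector(7, payload)}
fallback = {7: pack_sector(7, payload)}
damaged = bytearray(primary[7])
damaged[0] ^= 0x01
primary[7] = bytes(damaged)
recovered = read_with_fallback(7, primary, fallback)
```

Covering the sector number with the CRC means a record copied intact into the wrong location still fails verification, which is the misdirected-write case the paragraph above mentions.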

It is worth noting that the DoubleStore storage scheme significantly reduces write speed and usable SD card capacity, to about 44% of the raw figures, since data plus CRC is written twice. Read speed is also reduced to 89% of the maximum. However, this is often of little consequence in an embedded system, since it is usually more important that data is safe from corruption.
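As a rough check on these numbers, the 89% and 44% figures are what one would get if each stored copy carried about 12.5% of extra metadata. That overhead value is an assumption chosen for illustration, not a documented DoubleStore parameter, and the function name below is hypothetical.

```python
def doublestore_efficiency(overhead: float, copies: int = 2):
    """Return (read_fraction, write_fraction) relative to the raw media.

    overhead is the assumed metadata size per copy, as a fraction of the data.
    """
    per_copy = 1.0 + overhead                   # bytes stored per data byte, one copy
    read_fraction = 1.0 / per_copy              # a read touches one copy plus metadata
    write_fraction = 1.0 / (copies * per_copy)  # a write stores every copy
    return read_fraction, write_fraction


read_f, write_f = doublestore_efficiency(overhead=0.125)
print(f"read: {read_f:.0%}, write/capacity: {write_f:.0%}")
# prints "read: 89%, write/capacity: 44%"
```

The same fraction applies to both write speed and usable capacity, because each is limited by the total number of bytes that must be stored per byte of user data.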

Pros and Cons of DoubleStore

Pros

  • Redundancy: Data stored using DoubleStore is stored twice. Should a CRC failure be detected, the fallback store is consulted and restored automatically.
  • Ultra-reliable bootup: DoubleStore isn’t just used for the filesystem; we also use it to store the kernel and initial ramdisk image.
  • File system flexibility: Since DoubleStore operates at the block level, any filesystem supported by the kernel can be used.
  • Self-healing: Any time a sector is found to be corrupt and is successfully restored, it is automatically rewritten, or "scrubbed". If a new, blank replacement fallback SD card is inserted, it is automatically rebuilt from the primary.
  • Diagnostics: DoubleStore keeps a count of all data written to the card so you know when to expect your flash to wear out. If DoubleStore catches a card behaving strangely or experiencing silent data corruption, it will blink an LED on the board letting an operator know a card should be replaced.
  • Automatic Health Checks: Every bootup starts a low priority background verification of all sectors on the card so errors can be discovered and fixed before being required by applications.
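The self-healing and background verification behavior listed above can be illustrated with a small verification pass over two stores. This is a minimal sketch, not the DoubleStore implementation: the stores are in-memory dictionaries, the record layout (data followed by a 4-byte CRC32) is assumed, and the names `seal`, `is_valid`, and `scrub_pass` are hypothetical.

```python
import zlib


def seal(data: bytes) -> bytes:
    """Append a CRC32 so corruption can be detected later (assumed layout)."""
    return data + zlib.crc32(data).to_bytes(4, "little")


def is_valid(raw: bytes) -> bool:
    """Check that the trailing 4 bytes match the CRC32 of the rest."""
    return len(raw) > 4 and zlib.crc32(raw[:-4]).to_bytes(4, "little") == raw[-4:]


def scrub_pass(primary: dict, fallback: dict) -> dict:
    """Verify every sector on both stores and repair from the good copy."""
    stats = {"ok": 0, "repaired": 0, "lost": 0}
    for n in sorted(set(primary) | set(fallback)):
        p_ok = is_valid(primary.get(n, b""))
        f_ok = is_valid(fallback.get(n, b""))
        if p_ok and f_ok:
            stats["ok"] += 1
        elif p_ok:
            fallback[n] = primary[n]  # rebuild a missing or bad fallback sector
            stats["repaired"] += 1
        elif f_ok:
            primary[n] = fallback[n]  # scrub the primary from the fallback
            stats["repaired"] += 1
        else:
            stats["lost"] += 1        # neither copy verifies
    return stats


# Usage: a blank replacement fallback is rebuilt entirely from the primary.
primary = {0: seal(b"a" * 512), 1: seal(b"b" * 512)}
fallback = {}
rebuild_stats = scrub_pass(primary, fallback)
```

A real implementation would run such a pass at low priority so that it does not compete with application I/O, and would also need a policy for the case where both copies verify but differ; this sketch leaves that case out.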

Cons

  • Reduced write speed: Because everything must be written twice, along with additional metadata containing CRCs, sequence numbers, etc., writes are slightly more than 2.125x slower.
  • Storage capacity: Using DoubleStore requires 2.125x the capacity to store data because of the overhead for fallback data storage and CRCs.
  • Resilvering CPU usage: On every bootup, the board will likely be busy for up to half an hour as it verifies every sector of data. This is done in the background at the lowest possible priority, but it may still affect the startup performance of some applications.

Platform support

Technologic Systems has introduced DoubleStore on the TS-7520-BOX and it will be a feature on several more new products in 2012.

Contact Information

Technologic Systems

16525 E Laser Dr
Fountain Hills, AZ,
USA

tele: 480.837.5200
fax: 480.837.5300
info@embeddedarm.com
www.embeddedARM.com
