Nandctl

Overview

The nandctl utility allows manipulation of the FPGA XNAND core. It allows you to read and write data and to present a network block device to the OS. See our white paper on the XNAND for more details.

Usage

In our scripts nandctl is typically invoked with these options:

nandctl -X -z 131072 --nbdserver lun0:disc,lun0:part1,lun0:part2,lun0:part3,lun0:part4

-X

This tells the command to use the XNAND layer. Currently using nandctl without XNAND is not supported.

-z 131072

This sets the block size for the XNAND. The number of blocks will be automatically detected.

--nbdserver lun0:disc,lun0:part1,lun0:part2,lun0:part3,lun0:part4

This sets up an nbd server with the various partitions and raw block devices.

lun0:disc will create the raw block device at port 7525.

lun0:part1 will create the first partition at port 7526.

lun0:part2 will create the second partition at port 7527.

...

You can set up as many partitions as you need this way. The network block device ports are accessed using the standard nbd-client, which is typically invoked like this:

  nbd-client 127.0.0.1 7525 /dev/nbd0
  nbd-client 127.0.0.1 7526 /dev/nbd1
  nbd-client 127.0.0.1 7527 /dev/nbd2
  nbd-client 127.0.0.1 7528 /dev/nbd3
  nbd-client 127.0.0.1 7529 /dev/nbd4

This way /dev/nbd0 will be the block device, /dev/nbd1 will be the first partition, and so on.
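
Once connected, these behave like ordinary block devices. For example, assuming the first partition already contains an ext3 filesystem and using /mnt/xnand as a placeholder mount point (both are assumptions for this sketch), it could be mounted with:

  mkdir -p /mnt/xnand
  mount /dev/nbd1 /mnt/xnand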

Help

General options:
  -R, --read=N            Read N blocks of flash to stdout
  -W, --write=N           Write N blocks to flash
  -x, --writeset=BYTE     Write BYTE as value (default 0)
  -i, --writeimg=FILE     Use FILE as file to write to NAND
  -t, --writetest         Run write speed test
  -r, --readtest          Run read speed test
  -n, --random=SEED       Do random seeks for tests
  -z, --blocksize=SZ      Use SZ bytes each read/write call
  -k, --seek=SECTOR       Seek to 512b sector number SECTOR
  -e, --erase=NSECTORS    Erase NSECTORS 512b sectors
  -d, --nbdserver=NBDSPEC Run NBD userspace block driver server
  -I, --bind=IPADDR       Bind NBD server to IPADDR
  -Q, --stats             Print NBD server stats
  -f, --foreground        Run NBD server in foreground
  -l, --lun=N             Use chip number N
  -X, --xnand             Use XNAND RAID layer
  -A, --autormw           Use AUTORMW layer
  -s, --stress=BLOCK      Stress block BLOCK until it breaks
  -H, --hwtest=BLOCK      Hardware profile block BLOCK
  -b, --break=SECTOR      Erase sector SECTOR for testing
  -I, --xnandinit=NSECT   Initialize flash chip for XNAND RAID
  -L, --listbb            List all factory bad blocks
  -a, --audit             Check integrity of XNAND data
  -Y, --yes               Answer yes to all audit repairs
  -N, --no                Answer no to all audit repairs
  -v, --verbose           Be verbose (-vv for maximum)
  -P, --printmbr          Print MBR and partition table
  -M, --setmbr            Write MBR from environment variables
  -h, --help              This help

When running a NBD server, NBDSPEC is a comma separated list of
devices and partitions for the NBD servers starting at port 7525.
e.g. "lun0:part1,lun1:disc" corresponds to 2 NBD servers, one at port
7525 serving the first partition of chip #0, and the other at TCP
port 7526 serving the whole disc device of chip #1.
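
For example, starting those two servers and connecting to them with nbd-client would look something like this (the -z value matches the one used in our scripts; the device nodes are only an illustration):

  nandctl -X -z 131072 --nbdserver lun0:part1,lun1:disc
  nbd-client 127.0.0.1 7525 /dev/nbd0   # first partition of chip #0
  nbd-client 127.0.0.1 7526 /dev/nbd1   # whole disc device of chip #1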

FAQ

Why is the XNAND driver implemented in userspace instead of as a kernel driver?

On previous FPGA devices we did implement kernel device drivers. That approach makes debugging considerably more difficult and turns a port to a new kernel into a much larger task, since all of the drivers must be updated for newer kernel APIs. We have compared userspace and kernel drivers, and the speed gain from moving into the kernel is minimal, especially relative to the maintenance it requires.

I'm seeing very low performance. Is this caused by the userspace driver?

The slow performance is not due to the userspace implementation but to the redundancy mechanism. The tradeoff for the slower speed is significantly increased reliability.

Can I still use a standard flash filesystem?

We currently do not support any flash filesystems such as jffs2.
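
Because the XNAND is exposed as a standard block device through the NBD server, an ordinary block filesystem can be used on it instead. A hypothetical example, assuming /dev/nbd1 is connected as shown in the Usage section (this will erase the partition's contents):

  mkfs.ext3 /dev/nbd1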

How do I tell if the XNAND is failing?

The stats option will tell you how many fallbacks there have been.

# nandctl --stats
nbdpid=452
nbd_readreqs=0
nbd_read_blks=0
nbd_writereqs=0
nbd_write_blks=0
nbd_seek_past_eof_errs=0
xnand_xfixs=0
xnand_scrubs=0
xnand_fallbacks=0
xnand_level2_fallbacks=0
xnand_level3_fallbacks=0
xnand_write_fails=0
xnand_data_losses=0
xnand_blk_erases=0
read_seeks=0
write_seeks=0

Fallbacks are fairly common; however, when failures land in just the right places you can reach level 3 fallbacks, which means there is no parity for some of the data. When there is no parity left you should consider replacing the NAND chip.
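
If you want to automate this check, the counters can be parsed out of the stats output. The following is only a sketch: it assumes the key=value format shown above and that the NBD server is already running.

  # warn if any level 3 fallbacks have been recorded
  LEVEL3=$(nandctl --stats | grep '^xnand_level3_fallbacks=' | cut -d= -f2)
  if [ "${LEVEL3:-0}" -gt 0 ]; then
    echo "WARNING: $LEVEL3 level 3 fallbacks; some data has no parity" >&2
  fi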

Newer builds of nandctl add an 'xnand_fail_danger' field to the --listbb output, reported as either 1 or 0. All new nandctl releases will include this, and older products will gain it if a new release becomes necessary in the future. The check looks at whether failures fall within the same XNAND RAID area. Multiple bad blocks in one area is a serious condition, so if xnand_fail_danger is non-zero the flash should be replaced before the unit is deployed.
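
A similar pre-deployment check could key off xnand_fail_danger. This is a hypothetical sketch: it assumes your nandctl build is new enough to report the field and that it appears in the --listbb output as a key=value pair.

  # refuse to deploy a unit whose XNAND has dangerous bad block placement
  DANGER=$(nandctl --listbb | grep xnand_fail_danger | cut -d= -f2)
  if [ -n "$DANGER" ] && [ "$DANGER" != "0" ]; then
    echo "xnand_fail_danger=$DANGER: replace the flash before deploying" >&2
    exit 1
  fi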

Does a failed repair from an audit mean failure?

# nandctl -X -a
Chip ID: 0xDCEC
Size: 524288 sectors, 2048 blocks of 131072 bytes
Auditing XNAND data..(2048 blocks)
Unable to read primary block 888. Repair? (y/n) y
Unable to read primary block 888.
Unable to fix.
Unable to read primary block 1458. Repair? (y/n) y
Unable to read primary block 1458.
Unable to fix.
Audit complete.
Bad blocks=2
Repaired blocks=0

When you run an audit you will very likely see bad blocks. This is common, as almost every flash chip leaves the factory with some blocks already marked bad. These do not indicate a failure, but you can have your system list all of the known bad blocks with:

nandctl --listbb
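
For unattended checks, the repair prompts shown in the audit example above can be pre-answered with the -Y or -N options from the help text. A hedged example; exact behavior may vary between nandctl releases:

  nandctl -X -a -N    # report problems only, never attempt repairs
  nandctl -X -a -Y    # attempt every repair automatically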