Linux filesystems. XFS
Last updated: May 2013
Experience has taught me that if you want a file system that can handle thousands of files, some of them very large, XFS is a very good choice.
I have found that xfs_check does not work well with large file systems (it fails with an "Out of memory" error). In those cases, use xfs_repair instead.
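A minimal sketch of that workflow (the device name /dev/sdb1 is a placeholder; adjust to your system). xfs_repair needs the file system unmounted, and its -n flag does a read-only check, which is what replaces xfs_check:

```shell
# The file system must be unmounted before checking or repairing.
umount /dev/sdb1

# Dry run: report problems without modifying anything
# (this replaces xfs_check, which can run out of memory on big FSs).
xfs_repair -n /dev/sdb1

# Actually repair.
xfs_repair /dev/sdb1

# If xfs_repair itself runs short of memory, you can cap its memory
# usage (-m, in MB) and disable prefetching (-P).
xfs_repair -m 2048 -P /dev/sdb1
```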
--
Here are some characteristics of different file systems. These are notes taken from "Guide to Linux File Systems" by Val Henson:
Choosing and tuning the right file system for your workload:
XFS only Linux FS to support more than 1TB reliably
'iostat' -- useful tool
No single best file system, workload-dependent
Factor of 10^6 in time between CPU/memory ops and I/O ops -- ns versus ms
How FS like to be treated:
-Mostly reads
-Large, contiguous IO
-Medium-sized files -- 4K-128K
-Medium-sized dirs -- 10-1000 entries
-Most IO near beginning of file
-Few metadata ops
-Clean unmount
How to abuse your FS:
-Fill one dir with a million files
-Simultaneously create one huge file with remaining space
-Randomly create and delete small files in same dir
-Randomly read and write single bytes of the large file
-Add and remove ACL/extended attribs
-Slowly yank the power plug
Diffs betw FS's:
-File system and file size
-Number of inodes
-Dir size and lookup algorithm
-File data R/W performance
-File create/delete performance
-Space efficiency
-Special features -- direct IO, execute in place, etc
Crash recovery method:
-Ease of repair
-Stability
-Support
ext2:
simple, fast, stable, slow recovery, easy to repair
ext3:
rock stable, fast recovery, slow metadata ops
reiser3:
lots of small files, big dirs, less stable, poor repair, less support
xfs:
large files, big dirs, big FS's, slow repair
jfs:
end-of-life'd by IBM
others less well tested, poor support
Common workloads:
embedded
avoid writing flash unless necessary
ext2 (for read-only) / ext3, minix for ramdisks
jffs2 for flash without write-balancing (modern flash _has_ write-bal)
laptop
withstanding frequent crashes
low performance demands
ext3 is best
eliminate writes as much as possible
mount -o noatime,nodiratime
group writes with laptop mode, read Documentation/laptop-mode.txt
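A sketch of those write-reducing settings (device, mount point and the laptop-mode value are placeholders; see Documentation/laptop-mode.txt for the full setup):

```shell
# /etc/fstab entry: skip access-time updates to avoid needless writes.
# /dev/sda2  /home  ext3  defaults,noatime,nodiratime  0  2

# Or remount an already-mounted file system on the fly:
mount -o remount,noatime,nodiratime /home

# Enable laptop mode so dirty pages are flushed in batches
# rather than spinning the disk up for every small write.
echo 5 > /proc/sys/vm/laptop_mode
```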
desktop
sweet spot of most FS's
ext3 or reiser
reiser notail option improves performance at cost of efficient storage
large file working set? increase # of inodes cached in memory
Documentation/sysctl/fs.txt
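One knob that affects how many inodes stay cached is vm.vfs_cache_pressure (my choice of knob and value here is an assumption, not from the talk; the talk only points at the sysctl docs):

```shell
# Lower vfs_cache_pressure so the kernel keeps more dentry/inode
# cache entries in memory (default is 100; lower means keep more).
sysctl -w vm.vfs_cache_pressure=50

# Make it persistent across reboots:
echo 'vm.vfs_cache_pressure = 50' >> /etc/sysctl.conf
```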
file server
ext3 for few metadata ops
reiser for more metadata ops, small files
xfs for large streaming reads/writes, large dirs
ext3: data=writeback trades speed for data integrity after a crash
faster
ext3: data=journal reduces latency of sync NFS writes
ext3: default is data=ordered
can tweak block size
for v high perf, consider ext2
some cluster FS's use ext2 as the per-node base
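The ext3 journaling modes and block size mentioned above can be sketched like this (device and mount point are placeholders; block size can only be set at mkfs time):

```shell
# Journal metadata only; file data goes straight to disk.
# Fastest, but weakest data-integrity guarantees after a crash.
mount -o data=writeback /dev/sdc1 /srv/files

# Journal file data as well as metadata; can reduce the latency
# of synchronous NFS writes.
mount -o data=journal /dev/sdc1 /srv/files

# data=ordered is the default, so it needs no explicit option.

# Block size is chosen when the file system is created
# (1024, 2048 or 4096 bytes; -j enables the ext3 journal):
mke2fs -j -b 4096 /dev/sdc1
```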
mail server
mbox format (one big file) -- ext3
maildir format (lots of small files) -- reiser
ext3 w small blocks, high inode-to-file ratio can be good for maildir
don't cut any corners on your mail server -- reliability is key
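For the maildir case, a hedged mke2fs sketch (device is a placeholder; the exact -b/-i values depend on your average message size): small blocks waste less space on tiny messages, and a low bytes-per-inode ratio creates enough inodes for the many small files.

```shell
# 1K blocks, one inode per 2KB of disk space
# (good for lots of small maildir files; -j enables the journal).
mke2fs -j -b 1024 -i 2048 /dev/sdd1
```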
database server
ocfs2 for cluster oracle databases
direct IO often important
tuning FS's for DB's is an arcane art
video server
large files, write-once, read-many
streaming access
XFS clear winner
ext3 with larger reservation could work
NFS tuning tips
raise r/w size to ~8192 (8K)
use NFSv3 and TCP (not UDP)
async raises write perf but could cause probs in crash --
no longer the default
Can't recommend NFSv4 yet
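The NFS tips above as a client mount sketch (server name, export and mount point are placeholders):

```shell
# 8K read/write sizes, NFSv3 over TCP:
mount -t nfs -o rsize=8192,wsize=8192,vers=3,proto=tcp \
    fileserver:/export/home /mnt/home

# Server side: 'async' in /etc/exports raises write performance but
# risks data loss if the server crashes; 'sync' is the safe default.
# /export/home  192.168.1.0/24(rw,sync)
```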
distributed FS's
tradeoff of latency vs consistency
most are buggy and slow
use optimized for one case
NFS - multi read, single write
OCFS2 - DB's