Panasas Storage Guideline for Huge Directory

Note: Taken from Knowledge Base at panasas.com

Under some circumstances, listed in the source article [1], the volume owner Director must perform directory lookups on behalf of a DirectFLOW™ client.

This article discusses minimizing the volume owner Director's lookup times within a directory.

Minimize directory load time

A volume owner Director has a memory cache set aside for directories.  The entire directory must be loaded into this cache before lookups can be performed.  Thus lookups in a very large directory can take much longer than lookups in smaller directories.

 

Directories can be optimized for best lookup performance by organizing files into functional groups.  Only files that a job will use during a given phase should be in the same directory.  Files that are used by other jobs, or during different time periods in the same job, should be located in other directories.  If the directory structure is organized this way, the Director only loads the entries for files the job may access in the near future.  This minimizes directory load time.
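The grouping described above can be planned on paper before any file is moved.  The following is a minimal Python sketch, not a Panasas tool: the function names plan_layout and entries_per_directory are hypothetical, and the rule that maps a file to its job and phase (job_and_phase) is an assumption that the caller must supply, since only the application owner knows which files belong together.  The sketch only computes target paths and per-directory entry counts; it does not touch the file system.

```python
from collections import Counter
from pathlib import PurePosixPath


def plan_layout(filenames, job_and_phase):
    """Map each flat file name to a job/phase/name target path.

    job_and_phase is a caller-supplied (hypothetical) function that
    returns a (job, phase) pair for a file name, e.g. ("job1", "input").
    Nothing is moved; the result is only a plan.
    """
    plan = {}
    for name in filenames:
        job, phase = job_and_phase(name)
        plan[name] = str(PurePosixPath(job) / phase / name)
    return plan


def entries_per_directory(plan):
    """Count how many planned files land in each target directory."""
    return Counter(str(PurePosixPath(target).parent) for target in plan.values())


if __name__ == "__main__":
    # Assumed grouping rule for illustration only: everything is job1 input.
    plan = plan_layout(["1000file", "1001file"], lambda name: ("job1", "input"))
    for directory, count in entries_per_directory(plan).items():
        print(directory, count)
```

Reviewing the per-directory counts before moving files shows whether the proposed grouping actually shrinks the directories a job has to touch.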

Example directory structure 1.  This directory contains over 800,000 files from multiple job runs.

$ ls -l

total 26151816600095

-rw-rw-r-- 1 ledmondson users  9367214 May 21 17:53 1000file

-rw-rw-r-- 1 ledmondson users 50148631 May 21 17:53 1001file

-rw-rw-r-- 1 ledmondson users 37359738 May 21 17:53 1002file

-rw-rw-r-- 1 ledmondson users 12348154 May 21 17:53 1003file

. . .

The capacity taken up by the directory structure and the entries is very large.

$ ls -ld .

drwxrwxr-x 6 ledmondson users 136536064 May 21 20:02 .

Example directory structure 2.  In this example, subdirectories have been created for each job run.  Subdirectories have been created under each job directory for input files, output files, and configuration settings.  An archive subdirectory has been created for files or subdirectories that are no longer actively used.  The application may be able to search in only one of these subdirectories at any moment, and thus reduce the size of the directories currently being read.  The number of irrelevant entries that must be searched through to find the relevant ones is reduced.

$ ls -ld .

drwxrwxr-x 6 ledmondson users 8192 May 22 01:04 .

$ ls -l

total 128

drwxrwxr-x 2 ledmondson users 4096 May 22 01:04 archive

drwxrwxr-x 5 ledmondson users 4096 May 21 22:06 job1

drwxrwxr-x 5 ledmondson users 4096 May 21 22:06 job2

drwxrwxr-x 5 ledmondson users 4096 May 21 22:07 job3

$ ls -l job1/

total 91232

drwxrwxr-x 2 ledmondson users 15171584 May 21 23:14 config

drwxrwxr-x 2 ledmondson users 16273408 May 21 22:18 input

drwxrwxr-x 2 ledmondson users 15171584 May 21 23:25 output

$ ls -l job1/input/

total 4452496215102

-rw-rw-r-- 1 ledmondson users 50148631 May 21 20:03 1000000file

-rw-rw-r-- 1 ledmondson users 37359748 May 21 20:03 1000001file

-rw-rw-r-- 1 ledmondson users 12348154 May 21 20:04 1000002file

. . .

$ /bin/ls -fa1 job1/input/ | wc -l

119179

The subdirectories that hold the files, such as job1/input/, are still large, but are much more manageable.  job1/input/ contains 119179 entries, and the capacity of the directory is 16273408 bytes, or 15.52 MB.  A directory the size of job1/input/ will be much faster to search through than the directory in the first example, which contained 888891 entries and had a capacity of 136536064 bytes, or 130.21 MB.
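The entry counts and capacities quoted above can be gathered with a short script as well as with ls.  The following is a minimal Python sketch under two assumptions: the helper names directory_stats and bytes_to_mib are hypothetical, and the directory size reported by stat is taken to be the same figure that ls -ld prints (how that figure relates to the entry count depends on the file system).  Note that os.scandir does not return the "." and ".." entries, while ls -fa1 does, so the script's count should be two lower than the ls -fa1 | wc -l figure for the same directory.

```python
import os


def bytes_to_mib(size_bytes):
    """Convert a byte count to units of 1024 * 1024 bytes."""
    return size_bytes / (1024 * 1024)


def directory_stats(path):
    """Return (entry_count, size_bytes, size_mib) for one directory.

    Only the directory itself is examined, not the tree below it.
    Entries are counted without sorting and without stat-ing each file.
    """
    with os.scandir(path) as entries:
        entry_count = sum(1 for _ in entries)
    size_bytes = os.stat(path).st_size
    return entry_count, size_bytes, bytes_to_mib(size_bytes)


if __name__ == "__main__":
    # The two directory capacities quoted in the examples above.
    for capacity in (16273408, 136536064):
        print(capacity, "bytes =", round(bytes_to_mib(capacity), 2), "MB")
```

Dividing by 1024 * 1024 reproduces the figures in the text: 15.52 for job1/input/ and 130.21 for the directory in the first example.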

Avoid directory cache thrashing

The size of a volume owner Director's directory cache is limited.  A 64-bit Director's cache can contain about 1.5 to 1.7 million total entries.

To limit possible metadata performance degradation from very large directories, PanFS OS versions 5.x and higher limit the number of directory entries in a single directory to one million.  (The limit on the number of directory entries in previous PanFS OS versions is 350,000.)  This limit applies to the number of entries in the directory itself, not to the number of entries in the directory tree.

 

As noted above, DirectFLOW clients must ask the File Manager (FM) on the volume owner Director to perform operations in directories on their behalf under a number of circumstances.  Since a number of clients may be asking the FM to perform directory operations in a number of large directories at the same time, the FM's directory cache may become full even if the number of entries in any one of those directories is well below the directory entry limit. 

If the directory cache is full and another directory needs to be loaded, the FM must first unload a directory already in the cache.  If the FM must repeatedly load and unload directories, the Director will experience high CPU utilization and directory lookups will become slow.  This problem is known as directory cache thrashing.
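The effect can be illustrated with a toy model.  The following Python sketch is not the FM's implementation: the class DirectoryCache and the function count_loads are hypothetical, and the choice to unload the least recently used directory is an assumption, since the article does not say which directory the FM unloads.  The cache capacity of 1.5 million entries is the low end of the range quoted above.

```python
from collections import OrderedDict


class DirectoryCache:
    """Toy entry-bounded cache; unloads least recently used directories.

    The LRU policy is an assumption made for illustration only.
    """

    def __init__(self, capacity_entries):
        self.capacity = capacity_entries
        self.loaded = OrderedDict()  # directory name -> number of entries
        self.used = 0                # entries currently held in the cache
        self.loads = 0               # how many times a directory was loaded

    def lookup(self, name, entries):
        """Perform a lookup in a directory, loading the whole directory if needed."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return
        # Unload older directories until the whole new directory fits.
        while self.loaded and self.used + entries > self.capacity:
            _, unloaded_entries = self.loaded.popitem(last=False)
            self.used -= unloaded_entries
        self.loaded[name] = entries
        self.used += entries
        self.loads += 1


def count_loads(capacity_entries, directory_sizes, rounds):
    """Look up once in every directory, round-robin, and count directory loads."""
    cache = DirectoryCache(capacity_entries)
    for _ in range(rounds):
        for name, entries in directory_sizes.items():
            cache.lookup(name, entries)
    return cache.loads


if __name__ == "__main__":
    large = {"a": 600_000, "b": 600_000, "c": 600_000}
    small = {"a": 400_000, "b": 400_000, "c": 400_000}
    print("loads with 600,000-entry directories:", count_loads(1_500_000, large, 10))
    print("loads with 400,000-entry directories:", count_loads(1_500_000, small, 10))
```

In this model, three directories of 600,000 entries each never fit in the cache together, so every one of the 30 lookups forces a full directory load, even though each directory is below the per-directory limit.  Three directories of 400,000 entries fit together and are loaded only once each.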

To avoid this slowness, it is recommended that the number of entries in a single directory be kept to about 150,000 or lower.  Best performance may be seen if the number of entries in directories is kept to about 50,000 or lower.
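One way to find directories that exceed these recommendations is to walk a directory tree and count the entries in each directory.  The following is a minimal Python sketch; the names classify and audit_tree and the label strings are hypothetical, and the two thresholds are simply the figures recommended above.  Because listing a directory requires reading it, an audit of very large directories is itself a metadata-heavy job and is probably best run when the system is quiet.

```python
import os

RECOMMENDED_MAX = 150_000        # "about 150,000 or lower"
BEST_PERFORMANCE_MAX = 50_000    # "about 50,000 or lower"


def classify(entry_count):
    """Label a directory's entry count against the two recommendations."""
    if entry_count > RECOMMENDED_MAX:
        return "over recommended limit"
    if entry_count > BEST_PERFORMANCE_MAX:
        return "above best-performance range"
    return "ok"


def audit_tree(root):
    """Yield (path, entry_count, label) for each directory that is not "ok".

    Entries are counted per directory, not per tree, which matches how
    the directory entry limit is defined.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        entry_count = len(dirnames) + len(filenames)
        label = classify(entry_count)
        if label != "ok":
            yield dirpath, entry_count, label


if __name__ == "__main__":
    for path, entry_count, label in audit_tree("."):
        print(path, entry_count, label)
```

Applied to the examples in this article, job1/input/ with 119179 entries falls between the two thresholds, and the 888891-entry directory from the first example is well over the recommended limit.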

References:

[1] Panasas Knowledge Base at panasas.com