Storagebenchmarks

From AdminWiki

What why how

We need information on how current RAID controllers (and software RAID solutions?) perform in various scenarios. As of now (February 2007) there is no such project to my knowledge. Current purchasing practice usually amounts to buying whatever brand the person in charge fancies; hardly any benchmarking is done, and results are often less than satisfactory.

Education about setting up RAID controllers might also be helpful. How many people can tell you by heart what the difference between Write-Back and Write-Through is, and how it affects application performance and data safety?
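To make the difference concrete, here is a toy model in Python. It is only an illustration, not how any real controller is implemented: the class name ToyController and its methods are made up for this page, and the cache is assumed to have no battery backup. With Write-Back the write is acknowledged once it sits in the controller cache; with Write-Through it is acknowledged only after it has reached the disk.

```python
class ToyController:
    """Toy model of a controller cache in front of a disk (hypothetical, for illustration)."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}  # volatile: contents are lost on power failure
        self.disk = {}   # persistent: survives power failure

    def write(self, block, data):
        """Accept a write and return where the data lives when it is acknowledged."""
        if self.write_back:
            self.cache[block] = data  # acknowledged before reaching the disk
            return "cache"
        self.disk[block] = data       # acknowledged only once it is on the disk
        return "disk"

    def flush(self):
        """Write all cached blocks to the disk and empty the cache."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_failure(self):
        """Drop volatile state, as an unprotected cache would."""
        self.cache.clear()


if __name__ == "__main__":
    for mode in (True, False):
        controller = ToyController(write_back=mode)
        controller.write(7, b"payload")
        controller.power_failure()
        name = "Write-Back" if mode else "Write-Through"
        print(name, "block survived:", 7 in controller.disk)
```

In this model Write-Back is faster to acknowledge because the disk is not touched on the write path, but an acknowledged block can vanish if power is lost before flush() runs.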

Benchmark methods

Types of benchmarks

I'd opt for block-device-level benchmarks and application-level benchmarks. Filesystem benchmarks (bonnie++, iozone, etc.) are too muddy for measuring raw controller performance: the many layers between the filesystem and the block device keep them from giving useful numbers.
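As a rough idea of what a block-device-level measurement looks like, the following Python sketch times random block-aligned reads directly against a path. It is a minimal sketch, not a proposed benchmark tool: the function name, block size, read count and seed are assumptions chosen for illustration, and it does not bypass the operating system's page cache, which a real run would have to deal with.

```python
import os
import random
import time


def random_read_benchmark(path, block_size=4096, reads=1000, seed=0):
    """Time `reads` random block-aligned reads from `path` and return a summary dict."""
    rng = random.Random(seed)  # fixed seed so the access pattern is repeatable
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # works for regular files and block devices
        blocks = size // block_size
        if blocks == 0:
            raise ValueError("target is smaller than one block")
        total = 0
        start = time.perf_counter()
        for _ in range(reads):
            offset = rng.randrange(blocks) * block_size
            total += len(os.pread(fd, block_size, offset))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return {
        "reads": reads,
        "bytes": total,
        "seconds": elapsed,
        "iops": reads / elapsed if elapsed > 0 else float("inf"),
    }
```

Pointing the function at a device node (which needs read permission on it) exercises the controller without a filesystem in between; pointing it at a regular file is only useful for trying the code out.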

Block Device Benchmarks

FIXME

Application Benchmarks

Databases

Research sane benchmark sets for various databases.

I'd use MySQL with MyISAM tuned for "speed" as the candidate for the relational-textfile style of database use, and PostgreSQL configured for reliability (the default) for a real database workload.

Fileserving

Use a webserver (lighttpd, apache2 mpm_worker) configured for maximum throughput to serve a given fileset. The client must be able to replay request patterns and report statistics.
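The client side could look roughly like the sketch below: it requests a recorded list of URLs in order and reports simple statistics. This is an assumption about how such a client might be structured, not an existing tool; the names fetch_url and replay are made up here, requests are issued one at a time, and the fetch function is a parameter so the replay logic can be tried without a running webserver.

```python
import statistics
import time
import urllib.request


def fetch_url(url):
    """Fetch one URL and return the number of body bytes received."""
    with urllib.request.urlopen(url) as response:
        return len(response.read())


def replay(urls, fetch=fetch_url):
    """Request each URL in order and return latency and volume statistics."""
    latencies = []
    total_bytes = 0
    start = time.perf_counter()
    for url in urls:
        t0 = time.perf_counter()
        total_bytes += fetch(url)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "requests": len(latencies),
        "bytes": total_bytes,
        "seconds": elapsed,
        "mean_latency": statistics.mean(latencies) if latencies else 0.0,
        "max_latency": max(latencies, default=0.0),
    }
```

A serious client would issue requests concurrently to saturate the server; the sequential loop is kept only to keep the sketch short.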

Etc

Reproducibility

The test setup must be documented in enough detail to reproduce it identically. This means:

  • Identical software setup

We must choose a set of software to do benchmarks with in a given time frame. This includes the kernel, the distribution and the software used (web servers, database servers, etc.).

  • Identical hardware setup

Apart from the RAID controller under test, the hardware shall remain unchanged. This includes the disk drives. If we have to use disks with a different interface, we should try to get ones with similar specifications.

  • Identical working sets

We must ensure that the working set of our data stays the same on the block device; a restructured table on disk (in case of database benchmarks) or file set (in case of webserver benchmarks) could change the results quite noticeably. Either use standardized methods for reproducing the working sets and hope that the on-disk results are deterministic, or use images for the working-set data.
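If images are used for the working-set data, one way to check that every run starts from the same bytes is to compare a checksum of the image against a recorded value. The sketch below assumes SHA-256 and a chunked read; the function names and the chunk size are illustrative choices, not something decided on this page.

```python
import hashlib


def image_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_working_set(path, expected_digest):
    """Return True if the image at `path` matches the recorded digest."""
    return image_digest(path) == expected_digest
```

The digest would be recorded once when the image is created and checked after restoring it onto the array under test, before each benchmark run.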

RAID Controller Setup

RAID controllers should always be tuned for maximum speed when running RAID0 and for maximum reliability at all other RAID levels.
