|
The hidden cost of command latency
With new
SAS- and SATA-based subsystems reaching the market,
developers are learning firsthand how difficult
it is to isolate tagged command queuing problems
across large subsystems. Random I/O operations
(commonly associated with transaction processing)
are more likely to stress device queue algorithms
during periods of peak activity. It is here that
subsystems that work flawlessly at low link utilization
rates may exhibit problems with stranded commands
during periods of high disk activity.
Multi-initiator environments found in clustering
applications create added complexity because each
SAS controller can transmit queued commands to
each device in the subsystem concurrently. Consider
a blade server with 128 HDDs, each with a queue
depth of 128 tags (typical for a SAS HDD): that is
128 x 128 = 16,384 potential outstanding queued
commands. Tracking every outstanding command to
completion requires an analyzer capable of maintaining
and monitoring timing on thousands of queued operations.
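
To make the bookkeeping concrete, here is a minimal sketch of the kind of tracking described above. It is not how any particular analyzer works; the class name CommandTracker, the (initiator, target, tag) key, and the 2-second timeout are all assumptions chosen for illustration. It records when each queued command is issued, removes it on completion, and reports commands that have been outstanding too long.

```python
from dataclasses import dataclass, field


@dataclass
class CommandTracker:
    """Hypothetical tracker for queued commands, keyed by (initiator, target, tag)."""
    timeout_s: float                                   # assumed stranded-command threshold
    outstanding: dict = field(default_factory=dict)    # key -> issue timestamp (seconds)

    def issue(self, initiator, target, tag, t):
        # Record the time at which the command was sent to the device.
        self.outstanding[(initiator, target, tag)] = t

    def complete(self, initiator, target, tag, t):
        # Remove the command and return its latency, or None if it was never seen.
        start = self.outstanding.pop((initiator, target, tag), None)
        return None if start is None else t - start

    def stranded(self, now):
        # Commands still outstanding for longer than timeout_s.
        return sorted(k for k, t0 in self.outstanding.items()
                      if now - t0 > self.timeout_s)


# Worst case from the text: 128 tags per HDD across 128 HDDs.
QUEUE_DEPTH = 128
NUM_HDDS = 128
worst_case = QUEUE_DEPTH * NUM_HDDS

tracker = CommandTracker(timeout_s=2.0)
tracker.issue("init0", "hdd5", 17, t=0.0)
tracker.issue("init0", "hdd5", 18, t=0.1)
latency = tracker.complete("init0", "hdd5", 17, t=0.004)   # completes normally
stuck = tracker.stranded(now=3.0)                           # tag 18 never completed
```

Including the initiator in the key matters in multi-initiator setups, since two controllers may use the same tag value against the same device at the same time.
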
Commands
that are ACKed by a device but fail to complete
or are slow to complete are surprisingly difficult
to isolate. SAS- and SATA-based disk drives configured
in RAID environments aggravate the problem because
they can generate data rates up to 12 Gbps. Even
with maximum filtering techniques, these high
sustained data rates drastically reduce the amount
of elapsed time that can be recorded with conventional
analyzers.
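
A rough back-of-the-envelope calculation shows why recording time shrinks at these rates. The sketch below is only an estimate under stated assumptions: the 4 GiB trace buffer is a hypothetical figure, the link is assumed to be saturated at its raw bit rate (encoding overhead is ignored), and kept_fraction is an assumed share of traffic that survives filtering.

```python
def capture_seconds(buffer_bytes, link_gbps, kept_fraction=1.0):
    """Estimate how long a trace buffer lasts on a saturated link.

    buffer_bytes:  size of the capture buffer (hypothetical value below)
    link_gbps:     raw link rate in gigabits per second
    kept_fraction: share of traffic stored after filtering (assumption)
    """
    if not 0 < kept_fraction <= 1:
        raise ValueError("kept_fraction must be in (0, 1]")
    stored_bits_per_second = link_gbps * 1e9 * kept_fraction
    return buffer_bytes * 8 / stored_bits_per_second


BUFFER_BYTES = 4 * 2**30                                   # hypothetical 4 GiB buffer

unfiltered = capture_seconds(BUFFER_BYTES, 12.0)           # a few seconds at 12 Gbps
filtered = capture_seconds(BUFFER_BYTES, 12.0, 0.1)        # keep 10% of the traffic
slower_link = capture_seconds(BUFFER_BYTES, 3.0)           # same buffer at 3 Gbps
```

Under these assumptions the unfiltered buffer fills in under three seconds, and even discarding 90% of the traffic stretches that to only about half a minute, which may be far shorter than the time a stranded command takes to surface.
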
|