HISTORY OF THE HARD DISK
THE FIRST HARD DISK WAS INTRODUCED IN 1956
A hard
disk drive
(HDD), hard disk, hard drive,
or fixed disk, is an electro-mechanical data storage
device that stores and retrieves digital
data using magnetic storage with one or more rigid rapidly
rotating platters coated with magnetic material. The platters are
paired with magnetic heads, usually arranged on a
moving actuator arm, which read and write data to the platter
surfaces. Data is accessed in a random-access manner, meaning
that individual blocks of data can be stored and retrieved in any order.
HDDs are a type of non-volatile storage, retaining stored data even when powered off. Modern HDDs are typically in the form of a small rectangular box.
Introduced
by IBM in 1956, HDDs were the dominant secondary
storage device for general-purpose computers beginning in the
early 1960s. HDDs maintained this position in the modern era
of servers and personal computers, though personal computing
devices produced in large volumes, like cell phones and tablets, rely
on flash memory storage devices. More than 224 companies
have produced HDDs historically, though, after extensive industry
consolidation, most units are manufactured by Seagate, Toshiba,
and Western Digital. HDDs dominate the volume of storage produced
(exabytes per year) for servers. Though production is growing slowly (by
exabytes shipped), sales revenues and unit shipments are declining
because solid-state drives (SSDs) have higher data-transfer rates,
higher areal storage density, somewhat better reliability, and much lower
latency and access times.
The revenues for
SSDs, most of which use NAND flash memory, slightly exceeded those for
HDDs in 2018. Flash storage products had more than twice the revenue of
hard disk drives as of 2017. Though SSDs have four to nine times higher
cost per bit, they are replacing HDDs in applications where speed, power
consumption, small size, high capacity, and durability are important. As of
2019, the cost per bit of SSDs is falling, and the price premium over HDDs has
narrowed.
The primary
characteristics of an HDD are its capacity and performance. Capacity is
specified in unit prefixes corresponding to powers of 1000: a
1-terabyte (TB) drive has a capacity of 1,000 gigabytes (GB;
where 1 gigabyte = 1 billion (10⁹) bytes). Typically, some of an HDD's capacity is
unavailable to the user because it is used by the file system and the
computer operating system, and possibly inbuilt redundancy for error
correction and recovery. There can be confusion regarding storage capacity since capacities are stated in decimal gigabytes (powers of 1000) by HDD
manufacturers, whereas the most commonly used operating systems report
capacities in powers of 1024, which results in a smaller number than
advertised. Performance is specified as the time required to move the heads to
a track or cylinder (average access time), the time it takes for the desired
sector to move under the head (average latency, which is a function of the
physical rotational speed in revolutions per minute), and
finally the speed at which the data is transmitted (data rate).
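As a concrete illustration of the capacity arithmetic described above, the short Python sketch below (the 1 TB figure is a hypothetical example, not a specific product) shows why a drive marketed with decimal prefixes is reported with a smaller number by an operating system that counts in powers of 1024.

# Minimal sketch of the decimal-versus-binary capacity discrepancy.
advertised_tb = 1
total_bytes = advertised_tb * 1000**4   # manufacturers use powers of 1000
reported_gb = total_bytes / 1024**3     # many operating systems use powers of 1024
print(f"Advertised: {advertised_tb} TB = {total_bytes:,} bytes")
print(f"Reported:   {reported_gb:.0f} GB")  # about 931 GB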
The two most common form factors for modern HDDs are 3.5-inch, for desktop computers, and 2.5-inch, primarily for laptops. HDDs are connected to systems by standard interface cables such as PATA (Parallel ATA), SATA (Serial ATA), USB, or SAS (Serial Attached SCSI) cables.
HISTORY
The first production IBM hard disk drive, the 350 disk storage, shipped in 1957 as a component of the IBM 305 RAMAC system. It was approximately the size of two medium-sized refrigerators and stored five million six-bit characters (3.75 megabytes) on a stack of 52 disks (100 surfaces used). The 350 had a single arm with two read/write heads, one facing up and the other down, that moved both horizontally between a pair of adjacent platters and vertically from one pair of platters to a second set. Variants of the IBM 350 were the IBM 355, IBM 7300, and IBM 1405.
In 1961 IBM
announced, and in 1962 shipped, the IBM 1301 disk storage unit, which
superseded the IBM 350 and similar drives. The 1301 consisted of one (for
Model 1) or two (for Model 2) modules, each containing 25 platters, each
platter about 1⁄8-inch (3.2 mm) thick and 24 inches (610 mm) in
diameter. While the earlier IBM disk drives used only two read/write heads
per arm, the 1301 used an array of 48 heads (a comb), each array moving horizontally as a
single unit, one head per surface used. Cylinder-mode read/write
operations were supported, and the heads flew about 250 micro-inches (about
6 µm) above the platter surface. The motion of the head array depended upon a
binary adder system of hydraulic actuators which assured repeatable
positioning. The 1301 cabinet was about the size of three home refrigerators
placed side by side, storing the equivalent of about 21 million eight-bit bytes
per module. Access time was about a quarter of a second.
Also in 1962, IBM
introduced the model 1311 disk drive, which was about the size of a
washing machine and stored two million characters on a removable disk
pack. Users could buy additional packs and interchange them as needed, much
like reels of magnetic tape. Later models of removable pack drives, from
IBM and others, became the norm in most computer installations and reached
capacities of 300 megabytes by the early 1980s. Non-removable HDDs were called
"fixed disk" drives.
In 1963 IBM introduced the 1302, with twice the track capacity and twice as many tracks per cylinder as the 1301. The 1302 had one (for Model 1) or two (for
Model 2) modules, each containing a separate comb for the first 250 tracks and
the last 250 tracks.
Some
high-performance HDDs were manufactured with one head per track, e.g., the Burroughs B-475 in 1964 and the IBM 2305 in 1970, so that no time was lost physically moving the heads to a track and the only latency was the time
for the desired block of data to rotate into position under the
head. Known as fixed-head or head-per-track disk drives, they were very expensive
and are no longer in production.
In 1973, IBM
introduced a new type of HDD code-named "Winchester". Its primary
distinguishing feature was that the disk heads were not withdrawn completely
from the stack of disk platters when the drive was powered down. Instead, the
heads were allowed to "land" on a special area of the disk surface
upon spin-down, "taking off" again when the disk was later powered
on. This greatly reduced the cost of the head actuator mechanism but precluded
removing just the disks from the drive as was done with the disk packs of the
day. Instead, the first models of "Winchester technology" drives
featured a removable disk module, which included both the disk pack and the
head assembly, leaving the actuator motor in the drive upon removal. Later
"Winchester" drives abandoned the removable media concept and
returned to non-removable platters.
In 1974 IBM introduced the swinging arm actuator, made feasible because the Winchester recording heads function well when skewed to the recorded tracks. The simple
design of the IBM GV (Gulliver) drive, invented at IBM's UK Hursley Labs,
became IBM's most licensed electro-mechanical invention of all time, the
actuator and filtration system eventually being adopted in the 1980s for all HDDs and remaining universal nearly 40 years and 10 billion arms later.
Like
the first removable pack drive, the first "Winchester" drives used
platters 14 inches (360 mm) in diameter. In 1978 IBM introduced a swing
arm drive, the IBM 0680 (Piccolo), with eight-inch platters, exploring the
possibility that smaller platters might offer advantages. Other eight-inch
drives followed, then 5+1⁄4 in
(130 mm) drives, sized to replace the contemporary floppy disk
drives. The latter were primarily intended for the then-fledgling personal computer
(PC) market.
Over time, as
recording densities were greatly increased, further reductions in disk diameter
to 3.5" and 2.5" were found to be optimum. Powerful rare earth magnet
materials became affordable during this period and were complementary to the
swing arm actuator design to make possible the compact form factors of modern
HDDs.
As the 1980s
began, HDDs were a rare and very expensive additional feature in PCs, but by
the late 1980s, their cost had been reduced to the point where they were standard
on all but the cheapest computers.
Most HDDs in the
early 1980s were sold to PC end users as an external, add-on subsystem. The
subsystem was not sold under the drive manufacturer's name but under the
subsystem manufacturer's name such as Corvus Systems and Tallgrass
Technologies, or under the PC system manufacturer's name such as the Apple
ProFile. The IBM PC/XT in 1983 included an internal 10 MB HDD,
and soon thereafter internal HDDs proliferated on personal computers.
External HDDs
remained popular for much longer on the Apple Macintosh. Many Macintosh
computers made between 1986 and 1998 featured a SCSI port on the
back, making external expansion simple. Older compact Macintosh computers did
not have user-accessible hard drive bays (indeed, the Macintosh
128K, Macintosh 512K, and Macintosh Plus did not feature a hard
drive bay at all), so on those models, external SCSI disks were the only
reasonable option for expanding upon any internal storage.
HDD improvements have been driven by steadily increasing areal density.
Applications expanded through the 2000s, from the mainframe
computers of the late 1950s to most mass storage applications
including computers and consumer applications such as the storage of entertainment
content.
In the 2000s and
2010s, NAND began supplanting HDDs in applications requiring portability or
high performance. NAND performance is improving faster than HDDs, and
applications for HDDs are eroding. In 2018, the largest hard drive had a
capacity of 15 TB, while the largest capacity SSD had a capacity of
100 TB. As of 2018, HDDs were forecast to reach 100 TB
capacities around 2025, but as of 2019, the expected pace of
improvement was pared back to 50 TB by 2026. Smaller form factors, 1.8 inches and below, were discontinued around 2010. The cost of solid-state
storage (NAND), represented by Moore's law, is improving faster than HDDs.
NAND has a higher price elasticity of demand than HDDs, and this
drives market growth. During the late 2000s and 2010s, the product life
cycle of HDDs entered a mature phase, and slowing sales may indicate the
onset of the declining phase.
The 2011 Thailand floods damaged the manufacturing plants and impacted hard disk drive costs adversely between 2011 and 2013. In 2019, Western Digital closed its last Malaysian HDD factory due to decreasing demand, to focus on SSD production. All three remaining HDD manufacturers have had decreasing demand for their HDDs since 2014.
CAPACITY
Two Seagate Barracuda drives, from 2003 and 2009, with capacities of 160 GB and 1 TB respectively. As of 2022, Seagate offers capacities of up to 20 TB.
The
highest-capacity HDDs shipping commercially in 2022 are 20 TB. The
capacity of a hard disk drive, as reported by an operating system to the end
user, is smaller than the amount stated by the manufacturer for several
reasons, e.g., the operating system using some space, use of some space for
data redundancy, space used for file system structures. Confusion
about decimal prefixes and binary prefixes can also lead to
errors.
Modern hard disk
drives appear to their host controller as a contiguous set of logical blocks,
and the gross drive capacity is calculated by multiplying the number of blocks
by the block size. This information is available from the manufacturer's
product specification, and from the drive itself through the use of operating
system functions that invoke low-level drive commands. Older IBM and
compatible drives, e.g., IBM 3390, using the CKD record format
have variable length records; such drive capacity calculations must take into
account the characteristics of the records. Some newer DASD simulates CKD, and
the same capacity formulae apply.
The gross
capacity of older sector-oriented HDDs is calculated as the product of the
number of cylinders per recording zone, the number of bytes per
sector (most commonly 512), and the count of zones of the drive. Some
modern SATA drives also report cylinder-head-sector (CHS) capacities,
but these are not physical parameters because the reported values are constrained
by historic operating system interfaces. The C/H/S scheme has been replaced
by logical block addressing (LBA), a simple linear addressing scheme
that locates blocks by an integer index, which starts at LBA 0 for the first
block and increments thereafter. When using the C/H/S method to describe
modern large drives, the number of heads is often set to 64, although a typical
modern hard disk drive has between one and four platters. In modern HDDs, spare
capacity for defect management is not included in the published
capacity; however, in many early HDDs, a certain number of sectors were reserved
as spares, thereby reducing the capacity available to the operating system.
Furthermore, many HDDs store their firmware in a reserved service zone, which
is typically not accessible by the user and is not included in the capacity
calculation.
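To make the block arithmetic concrete, here is a brief Python sketch; the logical block count is a hypothetical value for a nominal 2 TB drive rather than a figure from any particular specification sheet.

# Gross capacity from the logical-block view a modern HDD presents to its host.
logical_block_size = 512          # bytes per logical block (most commonly 512)
block_count = 3_907_029_168       # hypothetical LBA count for a nominal 2 TB drive
gross_bytes = block_count * logical_block_size
print(f"{gross_bytes:,} bytes")                      # 2,000,398,934,016 bytes
print(f"~{gross_bytes / 1000**4:.2f} TB (decimal)")  # ~2.00 TB
print(f"~{gross_bytes / 1024**4:.2f} TiB (binary)")  # ~1.82 TiB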
For RAID subsystems,
data integrity and fault-tolerance requirements also reduce the realized
capacity. For example, a RAID 1 array has about half the total capacity as
a result of data mirroring, while a RAID 5 array with n drives loses 1/n of capacity
(which equals the capacity of a single drive) due to storing parity
information. RAID subsystems are multiple drives that appear to be one drive or
more drives to the user but provide fault tolerance. Most RAID vendors
use checksums to improve data integrity at the block level. Some
vendors design systems using HDDs with sectors of 520 bytes to contain 512
bytes of user data and eight checksum bytes, or by using separate 512-byte
sectors for the checksum data.
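The RAID arithmetic above can be sketched in a few lines of Python; the drive sizes and counts are hypothetical examples, not recommendations from any RAID vendor.

# Usable capacity after redundancy, as described above (illustrative only).
def raid1_usable_tb(drive_tb, drive_count):
    # RAID 1 mirrors data, so usable capacity equals a single drive,
    # i.e. about half of the total for a two-drive mirror.
    return drive_tb

def raid5_usable_tb(drive_tb, drive_count):
    # RAID 5 with n drives loses 1/n of the total (one drive's worth) to parity.
    return drive_tb * (drive_count - 1)

print(raid1_usable_tb(4, 2))   # 4 TB usable from two 4 TB drives (8 TB raw)
print(raid5_usable_tb(4, 5))   # 16 TB usable from five 4 TB drives (20 TB raw)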
Some systems may use hidden partitions for system recovery, reducing the capacity available to the end user without knowledge of special disk-partitioning utilities such as diskpart in Windows.
UNITS
Units of computer memory measurement:
- 1 bit = one binary digit (0 or 1)
- 8 bits = 1 byte
- 1024 bytes = 1 KB (kilobyte)
- 1024 KB = 1 MB (megabyte)
- 1024 MB = 1 GB (gigabyte)
- 1024 GB = 1 TB (terabyte)
- 1024 TB = 1 PB (petabyte)
- 1024 PB = 1 EB (exabyte)
- 1024 EB = 1 ZB (zettabyte)
- 1024 ZB = 1 YB (yottabyte)
- 1024 YB = 1 brontobyte (informal unit)
- 1024 brontobytes = 1 geopbyte (informal unit)
The geopbyte is the largest unit in this list.
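The 1024-step ladder above can be expressed as a small helper function. The sketch below is illustrative Python, not a standard library routine, and it uses the binary interpretation of KB/MB/GB shown in the list.

# Illustrative conversion along the 1024-based ladder listed above.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_readable(num_bytes):
    for unit in UNITS:
        if num_bytes < 1024 or unit == UNITS[-1]:
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024

print(human_readable(500_000_000_000))  # a decimal 500 GB drive -> '465.66 GB'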
In the early days of computing, the total capacity of HDDs was specified in 7 to 9 decimal digits, frequently truncated with the idiom "millions". By the 1970s,
the total capacity of HDDs was given by manufacturers
using SI decimal prefixes such
as megabytes (1 MB =
1,000,000 bytes), gigabytes (1 GB =
1,000,000,000 bytes), and terabytes (1 TB =
1,000,000,000,000 bytes). However, capacities of memory are
usually quoted using a binary interpretation of the prefixes, i.e.
using powers of 1024 instead of 1000.
Software reports
hard disk drive or memory capacity in different forms using either decimal or
binary prefixes. The Microsoft Windows family of operating systems
uses the binary convention when reporting storage capacity, so an HDD offered
by its manufacturer as a 1 TB drive is reported by these operating systems
as a 931 GB HDD. Mac OS X 10.6 ("Snow Leopard") uses the decimal convention when reporting HDD capacity. The default behavior of
the df command-line
utility on Linux is to report the HDD capacity as a number of 1024-byte
units.
The difference
between the decimal and binary prefix interpretation caused some consumer
confusion and led to class action suits against HDD manufacturers. The
plaintiffs argued that the use of decimal prefixes effectively misled consumers
while the defendants denied any wrongdoing or liability, asserting that their
marketing and advertising complied in all respects with the law and that no
class member sustained any damages or injuries. In 2020, a California
court ruled that the use of decimal prefixes with a decimal meaning was not
misleading.
PERFORMANCE CHARACTERISTICS
The factors that
limit the time to access the data on an HDD are mostly related to the
mechanical nature of the rotating disks and moving heads, including:
- Seek time is a measure of how long it takes the head assembly to travel to the track of the disk that contains data.
- Rotational latency is incurred because the desired disk sector may not be directly under the head when data transfer is requested. On average, rotational latency is one-half the rotational period, a function of the physical rotational speed in revolutions per minute (see the sketch after this list).
- The bit rate or data transfer rate (once the head is in the right position) creates delay that is a function of the number of blocks transferred; it is typically relatively small, but can be quite long with the transfer of large contiguous files.
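A minimal Python sketch of the delay arithmetic in the list above; the 7200 RPM spindle speed and 200 MB/s transfer rate are assumed example figures, not a particular drive's specification.

# Hedged example: average rotational latency and transfer time
# for an assumed 7200 RPM drive with a 200 MB/s sustained data rate.
rpm = 7200
transfer_rate = 200e6                         # bytes per second (assumed)
rotation_ms = 60_000 / rpm                    # one full revolution, in ms
avg_latency_ms = rotation_ms / 2              # average latency = half a revolution
transfer_1mb_ms = 1e6 / transfer_rate * 1000  # time to move 1 MB, in ms
print(f"Average rotational latency: {avg_latency_ms:.2f} ms")  # 4.17 ms
print(f"Time to transfer 1 MB:      {transfer_1mb_ms:.2f} ms") # 5.00 ms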
Delays may also
occur if the drive disks are stopped to save energy.
Defragmentation is
a procedure used to minimize delay in retrieving data by moving related items
to physically proximate areas on the disk. Some computer operating systems
perform defragmentation automatically. Although automatic defragmentation is
intended to reduce access delays, performance will be temporarily reduced while
the procedure is in progress.
Time to access
data can be improved by increasing the rotational speed (thus reducing latency) or
by reducing the time spent seeking. Increasing areal density
increases throughput by increasing data rate and by increasing the
amount of data under a set of heads, thereby potentially reducing seek activity
for a given amount of data. The time to access data has not kept up with
throughput increases, which themselves have not kept up with growth in bit
density and storage capacity.
ACCESS AND INTERFACES
Inner view of a 1998 Seagate HDD that used the Parallel ATA interface
2.5-inch SATA drive on top of 3.5-inch SATA drive, showing a close-up of (7-pin) data and (15-pin) power connectors
Current hard
drives connect to a computer over one of several bus types, including
parallel ATA, Serial ATA, SCSI, Serial Attached
SCSI (SAS), and Fibre Channel. Some drives, especially external portable drives, use IEEE 1394 or USB. All of these interfaces
are digital; electronics on the drive process the analog signals from the
read/write heads. Current drives present a consistent interface to the rest of
the computer, independent of the data encoding scheme used internally, and
independent of the physical number of disks and heads within the drive.
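That consistent interface can be seen from the host side. The Linux-only Python sketch below reads the sysfs attributes a typical kernel exposes; the paths and attribute names assume a standard sysfs layout, and it is an illustration rather than a portable tool.

# Linux-only sketch: every drive, whatever its interface (SATA, SAS, USB,
# NVMe), appears as a generic block device under /sys/block.
import os

def read_attr(dev, rel_path):
    try:
        with open(f"/sys/block/{dev}/{rel_path}") as f:
            return f.read().strip()
    except OSError:
        return "unknown"

for dev in sorted(os.listdir("/sys/block")):
    model = read_attr(dev, "device/model")            # drive model string, if exposed
    rotational = read_attr(dev, "queue/rotational")   # "1" = spinning platters
    kind = "HDD" if rotational == "1" else "SSD/other"
    print(f"{dev}: {model} ({kind})")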
Typically
a DSP in the electronics inside the drive takes the raw analog
voltages from the read head and uses PRML and Reed–Solomon error
correction to decode the data, then sends that data out to the standard
interface. The DSP also watches the error rate detected by error
detection and correction and performs bad sector remapping, data
collection for Self-Monitoring, Analysis and Reporting Technology (SMART), and
other internal tasks.
Modern interfaces
connect the drive to the host interface with a single data/control cable. Each
drive also has an additional power cable, usually direct to the power supply
unit. Older interfaces had separate cables for data signals and for drive
control signals.
- Small Computer System Interface (SCSI), originally named SASI for Shugart Associates System Interface, was standard on servers, workstations, Commodore Amiga, Atari ST, and Apple Macintosh computers through the mid-1990s, by which time most models had been transitioned to newer interfaces. The length limit of the data cable allows for external SCSI devices. The SCSI command set is still used in the more modern SAS interface.
- Integrated Drive Electronics (IDE), later standardized under the name AT Attachment (ATA, with the alias PATA (Parallel ATA) retroactively added upon the introduction of SATA), moved the HDD controller from the interface card to the disk drive. This helped to standardize the host/controller interface, reduce the programming complexity in the host device driver, and reduce system cost and complexity. The 40-pin IDE/ATA connection transfers 16 bits of data at a time on the data cable. The data cable was originally a 40-conductor cable, but later higher-speed requirements led to an "Ultra DMA" (UDMA) mode using an 80-conductor cable with additional wires to reduce crosstalk at high speed.
- EIDE was an unofficial update (by Western Digital) to the original IDE standard, with the key improvement being the use of direct memory access (DMA) to transfer data between the disk and the computer without the involvement of the CPU, an improvement later adopted by the official ATA standards. By directly transferring data between memory and disk, DMA eliminates the need for the CPU to copy byte per byte, therefore allowing it to process other tasks while the data transfer occurs.
- Fibre Channel (FC) is a successor to the parallel SCSI interface on the enterprise market. It is a serial protocol. In disk drives, the Fibre Channel Arbitrated Loop (FC-AL) connection topology is usually used. FC has much broader usage than mere disk interfaces, and it is the cornerstone of storage area networks (SANs). Recently, other protocols for this field, such as iSCSI and ATA over Ethernet, have been developed as well. Confusingly, drives usually use copper twisted-pair cables for Fibre Channel, not fibre optics. The latter are traditionally reserved for larger devices, such as servers or disk array controllers.
- Serial Attached SCSI (SAS). SAS is a new-generation serial communication protocol for devices designed to allow for much higher-speed data transfers and is compatible with SATA. SAS uses a data and power connector that is mechanically compatible with standard 3.5-inch SATA1/SATA2 HDDs, and many server-oriented SAS RAID controllers are also capable of addressing SATA HDDs. SAS uses serial communication instead of the parallel method found in traditional SCSI devices but still uses SCSI commands.
- Serial ATA (SATA). The SATA data cable has one data pair for differential transmission of data to the device and one pair for differential receiving from the device, just like EIA-422. That requires that data be transmitted serially. A similar differential signaling system is used in RS-485, LocalTalk, USB, FireWire, and differential SCSI. SATA I to III are designed to be compatible with, and use, a subset of SAS commands and compatible interfaces. Therefore, a SATA hard drive can be connected to and controlled by a SAS hard drive controller (with some minor exceptions, such as drives/controllers with limited compatibility). However, they cannot be connected the other way round: a SATA controller cannot be connected to a SAS drive.
COMPETITION FROM SSDs
HDDs are being superseded by solid-state drives (SSDs) in markets where their higher speed (up to 7 gigabytes per second for M.2 (NGFF) NVMe SSDs, or 2.5 gigabytes per second for PCIe expansion card drives), ruggedness, and lower power consumption are more important than price, since the bit cost of SSDs is four to nine times higher than that of HDDs. As of 2016, HDDs are reported to have a failure rate of 2–9% per year, while SSDs have fewer failures: 1–3% per year. However, SSDs have more uncorrectable data errors than HDDs.
SSDs offer larger
capacities (up to 100 TB) than the largest HDD and/or higher storage
densities (100 TB and 30 TB SSDs are housed in 2.5-inch HDD cases but
with the same height as a 3.5-inch HDD), although their cost remains
prohibitive.
A laboratory
demonstration of a 1.33-Tb 3D NAND chip with 96 layers (NAND commonly used
in solid-state drives (SSDs)) had 5.5 Tbit/in² as of 2019, while the maximum areal density for HDDs is 1.5 Tbit/in². The areal
density of flash memory is doubling every two years, similar to Moore's
law (40% per year) and faster than the 10–20% per year for HDDs. As of
2018, the maximum capacity was 16 terabytes for an HDD and
100 terabytes for an SSD. HDDs were used in 70% of the desktop and
notebook computers produced in 2016, and SSDs were used in 30%. The usage share
of HDDs is declining and could drop below 50% in 2018–2019 according to one
forecast because SSDs are replacing smaller-capacity (less than one terabyte)
HDDs in desktop and notebook computers and MP3 players.
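Those growth rates compound quickly, as the back-of-the-envelope Python sketch below illustrates. The starting densities are the 2019 figures mentioned above, and the projection is purely illustrative, not a forecast.

# Illustrative compounding of the areal-density growth rates quoted above.
flash_density = 5.5   # Tbit/in^2 (2019 3D NAND laboratory demonstration)
hdd_density = 1.5     # Tbit/in^2 (maximum HDD figure quoted above)
flash_growth = 0.40   # ~40% per year, roughly doubling every two years
hdd_growth = 0.15     # midpoint of the 10-20% per year range for HDDs
for year in range(2019, 2025):
    print(f"{year}: flash ~{flash_density:.1f}, HDD ~{hdd_density:.1f} Tbit/in^2")
    flash_density *= 1 + flash_growth
    hdd_density *= 1 + hdd_growth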