HISTORY OF HARD DISK

 


THE FIRST HARD DISK WAS INTRODUCED IN 1956

A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid, rapidly rotating platters coated with magnetic material. The platters are paired with magnetic heads, usually arranged on a moving actuator arm, which read and write data to the platter surfaces. Data is accessed in a random-access manner, meaning that individual blocks of data can be stored and retrieved in any order. HDDs are a type of non-volatile storage, retaining stored data when powered off. Modern HDDs are typically in the form of a small rectangular box.

Introduced by IBM in 1956, HDDs were the dominant secondary storage device for general-purpose computers beginning in the early 1960s. HDDs maintained this position in the modern era of servers and personal computers, though personal computing devices produced in large volumes, like cell phones and tablets, rely on flash memory storage devices. More than 224 companies have produced HDDs historically, though, after extensive industry consolidation, most units are manufactured by Seagate, Toshiba, and Western Digital. HDDs dominate the volume of storage produced (exabytes per year) for servers. Though production is growing slowly (by exabytes shipped), sales revenues and unit shipments are declining because solid-state drives (SSDs) have higher data-transfer rates, higher areal storage density, somewhat better reliability, and much lower latency and access times.

The revenues for SSDs, most of which use NAND flash memory, slightly exceeded those for HDDs in 2018. Flash storage products had more than twice the revenue of hard disk drives as of 2017. Though SSDs have four to nine times higher cost per bit, they are replacing HDDs in applications where speed, power consumption, small size, high capacity, and durability are important. As of 2019, the cost per bit of SSDs is falling, and the price premium over HDDs has narrowed.

The primary characteristics of an HDD are its capacity and performance. Capacity is specified in unit prefixes corresponding to powers of 1000: a 1-terabyte (TB) drive has a capacity of 1,000 gigabytes (GB; where 1 gigabyte = 1 billion (10^9) bytes). Typically, some of an HDD's capacity is unavailable to the user because it is used by the file system and the computer operating system, and possibly inbuilt redundancy for error correction and recovery. There can be confusion regarding storage capacity since capacities are stated in decimal gigabytes (powers of 1000) by HDD manufacturers, whereas the most commonly used operating systems report capacities in powers of 1024, which results in a smaller number than advertised. Performance is specified as the time required to move the heads to a track or cylinder (average access time), the time it takes for the desired sector to move under the head (average latency, which is a function of the physical rotational speed in revolutions per minute), and finally the speed at which the data is transmitted (data rate).

The two most common form factors for modern HDDs are 3.5-inch, for desktop computers, and 2.5-inch, primarily for laptops. HDDs are connected to systems by standard interface cables such as PATA (Parallel ATA), SATA (Serial ATA), USB, or SAS (Serial Attached SCSI) cables.

HARD DISK PRICE


First HDD Price 10 MegaByte




HISTORY

The first production IBM hard disk drive, the 350 disk storage, shipped in 1957 as a component of the IBM 305 RAMAC system. It was approximately the size of two medium-sized refrigerators and stored five million six-bit characters (3.75 megabytes) on a stack of 52 disks (100 surfaces used). The 350 had a single arm with two read/write heads, one facing up and the other down, that moved both horizontally between a pair of adjacent platters and vertically from one pair of platters to a second set. Variants of the IBM 350 were the IBM 355, IBM 7300, and IBM 1405.

In 1961 IBM announced, and in 1962 shipped, the IBM 1301 disk storage unit, which superseded the IBM 350 and similar drives. The 1301 consisted of one (for Model 1) or two (for Model 2) modules, each containing 25 platters, each platter about 1/8 inch (3.2 mm) thick and 24 inches (610 mm) in diameter. While the earlier IBM disk drives used only two read/write heads per arm, the 1301 used an array of 48 heads (comb), each array moving horizontally as a single unit, one head per surface used. Cylinder-mode read/write operations were supported, and the heads flew about 250 micro-inches (about 6 µm) above the platter surface. The motion of the head array depended upon a binary adder system of hydraulic actuators which assured repeatable positioning. The 1301 cabinet was about the size of three home refrigerators placed side by side, storing the equivalent of about 21 million eight-bit bytes per module. Access time was about a quarter of a second.

Also in 1962, IBM introduced the model 1311 disk drive, which was about the size of a washing machine and stored two million characters on a removable disk pack. Users could buy additional packs and interchange them as needed, much like reels of magnetic tape. Later models of removable pack drives, from IBM and others, became the norm in most computer installations and reached capacities of 300 megabytes by the early 1980s. Non-removable HDDs were called "fixed disk" drives.

In 1963 IBM introduced the 1302, with twice the track capacity and twice as many tracks per cylinder as the 1301. The 1302 had one (for Model 1) or two (for Model 2) modules, each containing a separate comb for the first 250 tracks and the last 250 tracks.

Some high-performance HDDs were manufactured with one head per track, e.g., Burroughs B-475 in 1964, and IBM 2305 in 1970 so that no time was lost physically moving the heads to a track and the only latency was the time for the desired block of data to rotate into position under the head. Known as fixed-head or head-per-track disk drives, they were very expensive and are no longer in production.

In 1973, IBM introduced a new type of HDD code-named "Winchester". Its primary distinguishing feature was that the disk heads were not withdrawn completely from the stack of disk platters when the drive was powered down. Instead, the heads were allowed to "land" on a special area of the disk surface upon spin-down, "taking off" again when the disk was later powered on. This greatly reduced the cost of the head actuator mechanism but precluded removing just the disks from the drive as was done with the disk packs of the day. Instead, the first models of "Winchester technology" drives featured a removable disk module, which included both the disk pack and the head assembly, leaving the actuator motor in the drive upon removal. Later "Winchester" drives abandoned the removable media concept and returned to non-removable platters.

In 1974 IBM introduced the swinging arm actuator, made feasible because the Winchester recording heads function well when skewed to the recorded tracks. The simple design of the IBM GV (Gulliver) drive, invented at IBM's UK Hursley Labs, became IBM's most licensed electro-mechanical invention of all time, the actuator and filtration system being adopted in the 1980s eventually for all HDDs, and still universal nearly 40 years and 10 billion arms later.

Like the first removable pack drive, the first "Winchester" drives used platters 14 inches (360 mm) in diameter. In 1978 IBM introduced a swing arm drive, the IBM 0680 (Piccolo), with eight-inch platters, exploring the possibility that smaller platters might offer advantages. Other eight-inch drives followed, then 5¼ in (130 mm) drives, sized to replace the contemporary floppy disk drives. The latter were primarily intended for the then-fledgling personal computer (PC) market.

Over time, as recording densities were greatly increased, further reductions in disk diameter to 3.5" and 2.5" were found to be optimum. Powerful rare earth magnet materials became affordable during this period and were complementary to the swing arm actuator design to make possible the compact form factors of modern HDDs.

As the 1980s began, HDDs were a rare and very expensive additional feature in PCs, but by the late 1980s, their cost had been reduced to the point where they were standard on all but the cheapest computers.

Most HDDs in the early 1980s were sold to PC end users as an external, add-on subsystem. The subsystem was not sold under the drive manufacturer's name but under the subsystem manufacturer's name such as Corvus Systems and Tallgrass Technologies, or under the PC system manufacturer's name such as the Apple ProFile. The IBM PC/XT in 1983 included an internal 10 MB HDD, and soon thereafter internal HDDs proliferated on personal computers.

External HDDs remained popular for much longer on the Apple Macintosh. Many Macintosh computers made between 1986 and 1998 featured a SCSI port on the back, making external expansion simple. Older compact Macintosh computers did not have user-accessible hard drive bays (indeed, the Macintosh 128K, Macintosh 512K, and Macintosh Plus did not feature a hard drive bay at all), so on those models, external SCSI disks were the only reasonable option for expanding upon any internal storage.

HDD improvements have been driven by increasing areal density, listed in the table above. Applications expanded through the 2000s, from the mainframe computers of the late 1950s to most mass storage applications including computers and consumer applications such as the storage of entertainment content.

In the 2000s and 2010s, NAND began supplanting HDDs in applications requiring portability or high performance. NAND performance is improving faster than HDDs, and applications for HDDs are eroding. In 2018, the largest hard drive had a capacity of 15 TB, while the largest capacity SSD had a capacity of 100 TB. As of 2018, HDDs were forecast to reach 100 TB capacities around 2025, but as of 2019, the expected pace of improvement was pared back to 50 TB by 2026. Smaller form factors, 1.8 inches and below, were discontinued around 2010. The cost of solid-state storage (NAND), represented by Moore's law, is improving faster than HDDs. NAND has a higher price elasticity of demand than HDDs, and this drives market growth. During the late 2000s and 2010s, the product life cycle of HDDs entered a mature phase, and slowing sales may indicate the onset of the declining phase.

The 2011 Thailand floods damaged the manufacturing plants and impacted hard disk drive costs adversely between 2011 and 2013. In 2019, Western Digital closed its last Malaysian HDD factory due to decreasing demand, to focus on SSD production. All three remaining HDD manufacturers have had decreasing demand for their HDDs since 2014.

 

CAPACITY

Two Seagate Barracuda drives from 2003 and 2009, respectively 160 GB and 1 TB. As of 2022, Seagate offers capacities of up to 20TB.

The highest-capacity HDDs shipping commercially in 2022 are 20 TB. The capacity of a hard disk drive, as reported by an operating system to the end user, is smaller than the amount stated by the manufacturer for several reasons, e.g., the operating system using some space, use of some space for data redundancy, space used for file system structures. Confusion about decimal prefixes and binary prefixes can also lead to errors.

Modern hard disk drives appear to their host controller as a contiguous set of logical blocks, and the gross drive capacity is calculated by multiplying the number of blocks by the block size. This information is available from the manufacturer's product specification, and from the drive itself through the use of operating system functions that invoke low-level drive commands. Older IBM and compatible drives, e.g., IBM 3390, using the CKD record format have variable length records; such drive capacity calculations must take into account the characteristics of the records. Some newer DASD simulates CKD, and the same capacity formulae apply.
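
The block-count calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool for querying a real drive: the function name and the example block count are assumptions chosen for the example.

```python
def gross_capacity_bytes(block_count: int, block_size: int = 512) -> int:
    """Gross capacity = number of logical blocks x block size (in bytes)."""
    if block_count < 0 or block_size <= 0:
        raise ValueError("block_count must be >= 0 and block_size must be > 0")
    return block_count * block_size


# Illustrative values only: a hypothetical drive exposing
# 1,953,525,168 logical blocks of 512 bytes each.
capacity = gross_capacity_bytes(1_953_525_168, 512)
print(capacity)           # total bytes
print(capacity / 10**12)  # the same figure in decimal terabytes
```

With these assumed numbers the result is slightly more than 10^12 bytes, which is the kind of drive a manufacturer would label as 1 TB.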

The gross capacity of older sector-oriented HDDs is calculated as the product of the number of cylinders per recording zone, the number of bytes per sector (most commonly 512), and the count of zones of the drive. Some modern SATA drives also report cylinder-head-sector (CHS) capacities, but these are not physical parameters because the reported values are constrained by historic operating system interfaces. The C/H/S scheme has been replaced by logical block addressing (LBA), a simple linear addressing scheme that locates blocks by an integer index, which starts at LBA 0 for the first block and increments thereafter. When using the C/H/S method to describe modern large drives, the number of heads is often set to 64, although a typical modern hard disk drive has between one and four platters. In modern HDDs, spare capacity for defect management is not included in the published capacity; however, in many early HDDs, a certain number of sectors were reserved as spares, thereby reducing the capacity available to the operating system. Furthermore, many HDDs store their firmware in a reserved service zone, which is typically not accessible by the user and is not included in the capacity calculation.
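
To make the relationship between C/H/S addresses and LBA concrete, the sketch below uses the conventional mapping in which sectors are numbered from 1 and cylinders and heads from 0. The function names and the 16-head, 63-sector geometry are assumptions for illustration; as noted above, such geometry values are logical, not physical.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int, sectors_per_track: int) -> int:
    """Map a logical C/H/S address to a linear block index (LBA 0 is the first block)."""
    if cylinder < 0 or not 0 <= head < heads_per_cylinder:
        raise ValueError("cylinder or head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sectors are numbered from 1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba: int, heads_per_cylinder: int, sectors_per_track: int):
    """Inverse mapping: recover (cylinder, head, sector) from a linear block index."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Assumed logical geometry: 16 heads, 63 sectors per track.
print(chs_to_lba(0, 0, 1, 16, 63))   # first block of the drive
print(chs_to_lba(1, 0, 1, 16, 63))   # first block of the second cylinder
print(lba_to_chs(1008, 16, 63))      # back to (cylinder, head, sector)
```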

For RAID subsystems, data integrity and fault-tolerance requirements also reduce the realized capacity. For example, a RAID 1 array has about half the total capacity as a result of data mirroring, while a RAID 5 array with n drives loses 1/n of capacity (which equals the capacity of a single drive) due to storing parity information. RAID subsystems are multiple drives that appear to the user as one or more drives but provide fault tolerance. Most RAID vendors use checksums to improve data integrity at the block level. Some vendors design systems using HDDs with sectors of 520 bytes to contain 512 bytes of user data and eight checksum bytes, or by using separate 512-byte sectors for the checksum data.
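
The RAID capacity figures above can be checked with a short sketch. It assumes identical drives, treats RAID 1 as a mirror in which every drive holds the same data (so a two-drive mirror yields half the total), and ignores any vendor-specific overhead; the function names are hypothetical.

```python
def raid_usable_capacity(level: int, drive_count: int, drive_capacity: float) -> float:
    """Usable capacity of an array of identical drives (same unit as drive_capacity)."""
    if level == 1:
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least two drives")
        # All drives mirror the same data, so only one drive's worth is usable.
        return drive_capacity
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space (1/n of the total) holds parity.
        return (drive_count - 1) * drive_capacity
    raise ValueError("only RAID 1 and RAID 5 are modeled in this sketch")


def user_data_fraction(sector_bytes: int = 520, user_bytes: int = 512) -> float:
    """Share of each sector left for user data when checksum bytes are stored inline."""
    return user_bytes / sector_bytes


print(raid_usable_capacity(1, 2, 4.0))   # two 4 TB drives mirrored
print(raid_usable_capacity(5, 4, 4.0))   # four 4 TB drives with parity
print(round(user_data_fraction(), 4))    # 512 of 520 bytes carry user data
```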

Some systems may use hidden partitions for system recovery, reducing the capacity available to end users who lack knowledge of special disk partitioning utilities such as diskpart in Windows.


UNITS


Units of computer memory measurement (binary interpretation)

Binary digit       = 0 or 1
1 bit              = 1 binary digit
8 bits             = 1 byte
1024 bytes         = 1 KB (kilobyte)
1024 KB            = 1 MB (megabyte)
1024 MB            = 1 GB (gigabyte)
1024 GB            = 1 TB (terabyte)
1024 TB            = 1 PB (petabyte)
1024 PB            = 1 EB (exabyte)
1024 EB            = 1 ZB (zettabyte)
1024 ZB            = 1 YB (yottabyte)
1024 YB            = 1 brontobyte (informal name, not a standard prefix)
1024 brontobytes   = 1 geopbyte (informal name, not a standard prefix)

The geopbyte is the largest unit in this list.

 

In the early days of computing the total capacity of HDDs was specified in 7 to 9 decimal digits frequently truncated with the idiom millions. By the 1970s, the total capacity of HDDs was given by manufacturers using SI decimal prefixes such as megabytes (1 MB = 1,000,000 bytes), gigabytes (1 GB = 1,000,000,000 bytes), and terabytes (1 TB = 1,000,000,000,000 bytes). However, capacities of memory are usually quoted using a binary interpretation of the prefixes, i.e. using powers of 1024 instead of 1000.

Software reports hard disk drive or memory capacity in different forms using either decimal or binary prefixes. The Microsoft Windows family of operating systems uses the binary convention when reporting storage capacity, so an HDD offered by its manufacturer as a 1 TB drive is reported by these operating systems as a 931 GB HDD. Mac OS X 10.6 ("Snow Leopard") uses the decimal convention when reporting HDD capacity. The default behavior of the df command-line utility on Linux is to report the HDD capacity as a number of 1024-byte units.
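
The 1 TB versus 931 GB discrepancy follows directly from dividing by powers of 1024 instead of powers of 1000. The sketch below shows the arithmetic; the function name is an assumption, and it labels binary units with the IEC names (KiB, MiB, GiB) for clarity, whereas the operating systems mentioned above may print "GB" for the same quantity.

```python
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def to_binary_units(num_bytes: int):
    """Express a byte count in the largest binary unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in BINARY_UNITS[:-1]:
        if value < 1024:
            return value, unit
        value /= 1024
    return value, BINARY_UNITS[-1]


# A drive sold as 1 TB (decimal): 10**12 bytes.
value, unit = to_binary_units(10**12)
print(f"{value:.2f} {unit}")
```

Dividing 10^12 bytes by 1024 three times gives roughly 931, which is the figure a binary-convention operating system displays.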

The difference between the decimal and binary prefix interpretation caused some consumer confusion and led to class action suits against HDD manufacturers. The plaintiffs argued that the use of decimal prefixes effectively misled consumers while the defendants denied any wrongdoing or liability, asserting that their marketing and advertising complied in all respects with the law and that no class member sustained any damages or injuries. In 2020, a California court ruled that the use of decimal prefixes with a decimal meaning was not misleading.


PERFORMANCE CHARACTERISTICS

The factors that limit the time to access the data on an HDD are mostly related to the mechanical nature of the rotating disks and moving heads, including:

  • Seek time is a measure of how long it takes the head assembly to travel to the track of the disk that contains data.
  • Rotational latency is incurred because the desired disk sector may not be directly under the head when data transfer is requested. Average rotational latency is shown in the table, based on the statistical relation that the average latency is one-half the rotational period.
  • The bit rate or data transfer rate (once the head is in the right position) creates delay which is a function of the number of blocks transferred; typically, relatively small, but can be quite long with the transfer of large contiguous files.
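
The statistical relation mentioned above, that average rotational latency is one-half the rotational period, can be written out as a small calculation. The function name and the sample spindle speeds are assumptions for illustration.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds: half of one full revolution."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    period_ms = 60_000.0 / rpm   # time for one revolution, in milliseconds
    return period_ms / 2


# Sample spindle speeds (illustrative).
for rpm in (5400, 7200, 10_000, 15_000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
```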

Delays may also occur if the drive disks are stopped to save energy.

Defragmentation is a procedure used to minimize delay in retrieving data by moving related items to physically proximate areas on the disk. Some computer operating systems perform defragmentation automatically. Although automatic defragmentation is intended to reduce access delays, performance will be temporarily reduced while the procedure is in progress.

Time to access data can be improved by increasing the rotational speed (thus reducing latency) or by reducing the time spent seeking. Increasing areal density increases throughput by increasing data rate and by increasing the amount of data under a set of heads, thereby potentially reducing seek activity for a given amount of data. The time to access data has not kept up with throughput increases, which themselves have not kept up with growth in bit density and storage capacity.
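
A rough way to see why faster rotation or shorter seeks help more than raw throughput for small requests is to add the three delays together. The model below is a simplification under stated assumptions: it ignores caching, command overhead, and queuing, and the seek time and data rate used in the example are hypothetical values, not measurements.

```python
def estimated_access_time_ms(seek_ms: float, rpm: float,
                             transfer_bytes: int, data_rate_mb_s: float) -> float:
    """Seek time + average rotational latency + transfer time, in milliseconds."""
    latency_ms = (60_000.0 / rpm) / 2                       # half a revolution
    transfer_ms = transfer_bytes / (data_rate_mb_s * 1_000_000) * 1000
    return seek_ms + latency_ms + transfer_ms


# Hypothetical drive: 8 ms average seek, 150 MB/s sustained data rate.
small = estimated_access_time_ms(8.0, 7200, 4096, 150.0)         # one 4 KiB block
large = estimated_access_time_ms(8.0, 7200, 1_000_000_000, 150.0)  # 1 GB contiguous
print(round(small, 2), round(large, 1))
```

For the small request, nearly all of the time is seek and rotational latency; for the large contiguous transfer, the data rate dominates.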


ACCESS AND INTERFACES

Inner view of a 1998 Seagate HDD that used the Parallel ATA interface

2.5-inch SATA drive on top of 3.5-inch SATA drive, showing a close-up of (7-pin) data and (15-pin) power connectors

Current hard drives connect to a computer over one of several bus types, including parallel ATA, Serial ATA, SCSI, Serial Attached SCSI (SAS), and Fibre Channel. Some drives, especially external portable drives, use IEEE 1394 or USB. All of these interfaces are digital; electronics on the drive process the analog signals from the read/write heads. Current drives present a consistent interface to the rest of the computer, independent of the data encoding scheme used internally, and independent of the physical number of disks and heads within the drive.

Typically a DSP in the electronics inside the drive takes the raw analog voltages from the read head and uses PRML and Reed–Solomon error correction to decode the data, then sends that data out to the standard interface. The DSP also watches the error rate detected by error detection and correction and performs bad sector remapping, data collection for Self-Monitoring, Analysis, Reporting Technology, and other internal tasks.

Modern interfaces connect the drive to the host interface with a single data/control cable. Each drive also has an additional power cable, usually direct to the power supply unit. Older interfaces had separate cables for data signals and for drive control signals.

  • Small Computer System Interface (SCSI), originally named SASI for Shugart Associates System Interface, was standard on servers, workstations, Commodore Amiga, Atari ST, and Apple Macintosh computers through the mid-1990s, by which time most models had been transitioned to newer interfaces. The length limit of the data cable allows for external SCSI devices. The SCSI command set is still used in the more modern SAS interface.
  • Integrated Drive Electronics (IDE), later standardized under the name AT Attachment (ATA, with the alias PATA (Parallel ATA) retroactively added upon the introduction of SATA), moved the HDD controller from the interface card to the disk drive. This helped to standardize the host/controller interface, reduce the programming complexity in the host device driver, and reduce system cost and complexity. The 40-pin IDE/ATA connection transfers 16 bits of data at a time on the data cable. The data cable was originally 40-conductor, but later higher speed requirements led to an "ultra DMA" (UDMA) mode using an 80-conductor cable with additional wires to reduce crosstalk at high speed.
  • EIDE was an unofficial update (by Western Digital) to the original IDE standard, with the key improvement being the use of direct memory access (DMA) to transfer data between the disk and the computer without the involvement of the CPU, an improvement later adopted by the official ATA standards. By directly transferring data between memory and disk, DMA eliminates the need for the CPU to copy data byte by byte, therefore allowing it to process other tasks while the data transfer occurs.
  • Fibre Channel (FC) is a successor to the parallel SCSI interface on the enterprise market. It is a serial protocol. In disk drives, the Fibre Channel Arbitrated Loop (FC-AL) connection topology is usually used. FC has much broader usage than mere disk interfaces, and it is the cornerstone of storage area networks (SANs). Recently other protocols for this field, like iSCSI and ATA over Ethernet, have been developed as well. Confusingly, drives usually use copper twisted-pair cables for Fibre Channel, not fiber optics. The latter are traditionally reserved for larger devices, such as servers or disk array controllers.
  • Serial Attached SCSI (SAS). SAS is a new-generation serial communication protocol for devices, designed to allow for much higher speed data transfers, and is compatible with SATA. SAS uses a data and power connector that is mechanically compatible with standard 3.5-inch SATA1/SATA2 HDDs, and many server-oriented SAS RAID controllers are also capable of addressing SATA HDDs. SAS uses serial communication instead of the parallel method found in traditional SCSI devices but still uses SCSI commands.
  • Serial ATA (SATA). The SATA data cable has one data pair for differential transmission of data to the device, and one pair for differential receiving from the device, just like EIA-422. That requires that data be transmitted serially. A similar differential signaling system is used in RS-485, LocalTalk, USB, FireWire, and differential SCSI. SATA I to III are designed to be compatible with, and use, a subset of SAS commands, and compatible interfaces. Therefore, a SATA hard drive can be connected to and controlled by a SAS hard drive controller (with some minor exceptions such as drives/controllers with limited compatibility). However, they cannot be connected the other way round: a SATA controller cannot be connected to a SAS drive.


COMPETITION FROM SSDs

HDDs are being superseded by solid-state drives (SSDs) in markets where the SSDs' higher speed (up to 7 gigabytes per second for M.2 (NGFF) NVMe SSDs, or 2.5 gigabytes per second for PCIe expansion card drives), ruggedness, and lower power are more important than price, since the bit cost of SSDs is four to nine times higher than that of HDDs. As of 2016, HDDs are reported to have a failure rate of 2–9% per year, while SSDs have fewer failures: 1–3% per year. However, SSDs have more uncorrectable data errors than HDDs.

SSDs offer larger capacities (up to 100 TB) than the largest HDD and/or higher storage densities (100 TB and 30 TB SSDs are housed in 2.5-inch HDD cases but with the same height as a 3.5-inch HDD), although their cost remains prohibitive.

A laboratory demonstration of a 1.33-Tb 3D NAND chip with 96 layers (NAND commonly used in solid-state drives (SSDs)) had 5.5 Tbit/in² as of 2019, while the maximum areal density for HDDs is 1.5 Tbit/in². The areal density of flash memory is doubling every two years, similar to Moore's law (40% per year) and faster than the 10–20% per year for HDDs. As of 2018, the maximum capacity was 16 terabytes for an HDD and 100 terabytes for an SSD. HDDs were used in 70% of the desktop and notebook computers produced in 2016, and SSDs were used in 30%. The usage share of HDDs is declining and could drop below 50% in 2018–2019 according to one forecast because SSDs are replacing smaller-capacity (less than one terabyte) HDDs in desktop and notebook computers and MP3 players.
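
The link between an annual growth rate and a doubling period is compound growth: a quantity growing by a fraction r per year doubles after ln 2 / ln(1 + r) years. The sketch below applies that standard relation to the growth rates quoted above; the function name is an assumption.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant compound annual growth rate (0.40 = 40%)."""
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(2) / math.log(1 + annual_growth)


# Growth rates quoted in the text: 40% per year for flash, 10-20% per year for HDDs.
for rate in (0.40, 0.20, 0.10):
    print(f"{rate:.0%} per year -> doubles in {doubling_time_years(rate):.1f} years")
```

At 40% per year the doubling period comes out at about two years, consistent with the statement above, while 10–20% per year corresponds to a doubling period of several years.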

