Why you need to make archives, and how to

We back up to ensure that we can recover files, whole volumes, or our complete Mac if needed. When that crucial document you were working on earlier has vanished, or becomes damaged, or disaster strikes a disk, backups are essential. But how do you preserve all those documents that used to come on paper: records, correspondence and certificates? How will you or your successors be able to retrieve them in ten or thirty years’ time? This brief article considers how you should archive them safely, which isn’t the same as backing them up.

By archiving, I mean putting precious files somewhere they can be retrieved in at least ten years’ time. They may include financial, business, employment and personal records, as well as all finished work that you want to record for posterity. For most, they’ll also include a careful selection of still images, movies, and the more important documents you might create, such as books, theses and papers. They’re what you and the law want you to keep in perpetuity, and to be able to retrieve even after you’re gone.

To see how this can be achieved, I consider: the storage medium to be used, file formats that will be retrievable, how to index them for access, physical storage conditions, and the checks of their integrity that are needed.

Storage medium

While backups are most likely to be kept on hard disks or SSDs, neither of those is in the least suitable for archives, as they have relatively short lifetimes and are too sensitive to storage conditions. Instead, you need a removable medium, today probably Blu-ray disks intended for archival use, such as M-DISC.

For those with copious archives of importance beyond their family, Sony used to offer Optical Disk Archive systems, but those products were discontinued last year and don’t appear to have a suitable replacement. This illustrates one of the problems with planning for the more distant future: today’s technology can all too easily become orphaned.

Businesses are increasingly turning to cloud services to store their archives, but for the great majority of us the recurring cost makes this impractical. In any case, best practice should be to use cloud services as a supplement to a physical archive. iCloud is more affordable for the storage of most important documents, but requires a Legacy Contact to be appointed.

File formats

While it’s fine to archive documents in their original format, as you do in your backups, it’s also important to extract their contents into more permanent formats. Among those most likely to prove durable for the next 50-100 years are:

  • UTF-8 (and formerly ASCII) for text files,
  • JPEG and PNG for still images,
  • audio, video and rich media using one of the widely-used compression standards and file formats,
  • XML-based open document standards,
  • CSV for data,
  • PDF provided that it complies with one of the archival standards PDF/A-1 to /A-4.

You may find it worthwhile tarring together large collections of smaller files, but don’t use an unusual compression or ‘archive’ format, which might prove inaccessible in the future.
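
As a minimal sketch of tarring a collection without any compression, the following uses Python's standard tarfile module. The function name make_plain_tar and any folder names you pass to it are hypothetical, chosen only for illustration; this is one way of doing it, not a recommended tool.

```python
import tarfile


def make_plain_tar(source_dir: str, tar_path: str) -> list[str]:
    """Bundle source_dir into an uncompressed POSIX (pax) tar file.

    Returns the member names stored, so they can be checked or listed.
    """
    # Use the last path component as the top-level folder inside the tar.
    top_level = source_dir.rstrip("/").split("/")[-1]
    # Mode "w" writes a plain tar with no compression layer at all.
    with tarfile.open(tar_path, mode="w", format=tarfile.PAX_FORMAT) as tar:
        tar.add(source_dir, arcname=top_level)
    # Reopen the finished tar to confirm that it reads back.
    with tarfile.open(tar_path, mode="r") as tar:
        return tar.getnames()
```

Keeping the tar uncompressed means that the files inside it remain readable by any tool that understands the tar format, with no dependence on a particular compression scheme.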

Indexing and access

For larger collections, even when structured carefully, a thorough list of contents in UTF-8 text format is essential. While there are index and search tools that could help, in this respect too archives are different from backups. If you’re going to be gathering terabytes of files, look at some of the commercial solutions. Although some are free to use, like the long-established Greenstone, they aren’t intended for casual users and might prove demanding.
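
A list of contents like that can be generated rather than typed. This is a rough sketch, assuming a layout of one line per file giving its relative path, a tab, and its size in bytes; that layout and the function name write_contents_list are my assumptions, not an established format.

```python
import os
from pathlib import Path


def write_contents_list(root: str, index_path: str) -> int:
    """Write one line per file under root: relative path, tab, size in bytes.

    Returns the number of files listed.
    """
    root_path = Path(root)
    count = 0
    # Explicit UTF-8 and "\n" line endings keep the list plain and portable.
    with open(index_path, "w", encoding="utf-8", newline="\n") as index:
        for folder, subfolders, files in os.walk(root_path):
            subfolders.sort()  # visit folders in a stable, sorted order
            for name in sorted(files):
                file_path = Path(folder) / name
                relative = file_path.relative_to(root_path).as_posix()
                index.write(f"{relative}\t{file_path.stat().st_size}\n")
                count += 1
    return count
```

Write the list somewhere outside the folder being listed, then copy it onto the disk alongside the files it describes.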

Physical storage conditions

Never print on the disk itself, which can result in its degradation. Keep paper records alongside disks in the same container, but not inside the disk cases themselves, where the paper could damage the disks.

Archive optical disks should be stored in cases with centre hub security, not in sleeves. They must be kept in a cool, dry and dark container, in which there is no mould or fungus. They also need to be protected from physical threats such as flood and fire. Firesafes are popular furniture for this, but you must then ensure that their combination or keys are readily available and not separated from the safe.

There used to be a vogue for commercial data repositories, often underground storage sites that had been repurposed. Not only were those expensive, but many failed to take the care that they promised, and plenty went bankrupt and put their contents at risk. If you can arrange it, store one copy with you, and another at a friend’s or relative’s at least a few miles away.

Integrity checks

If you’re serious about maintaining your archives, some form of integrity checking, such as that provided by my free utilities Dintch, Fintch and cintch, is essential. Check a sample on each disk once a year, to ensure that none has started to deteriorate. If you do detect errors, that’s the time to burn a replacement before the original is lost to decay.
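
To illustrate the general idea of integrity checking, here is a sketch that records a SHA-256 digest for every file and later rechecks a random sample. It isn't how Dintch, Fintch or cintch work internally, which I make no claim about here; the JSON manifest and the function names are assumptions for illustration only.

```python
import hashlib
import json
import random
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_digests(root: str, manifest_path: str) -> dict[str, str]:
    """Store a digest for every file under root in a UTF-8 JSON manifest."""
    root_path = Path(root)
    digests = {
        p.relative_to(root_path).as_posix(): sha256_of(p)
        for p in sorted(root_path.rglob("*"))
        if p.is_file()
    }
    text = json.dumps(digests, indent=2, ensure_ascii=False)
    Path(manifest_path).write_text(text, encoding="utf-8")
    return digests


def check_sample(root: str, manifest_path: str, sample_size: int) -> list[str]:
    """Recheck a random sample; return names that are missing or changed."""
    digests = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    names = random.sample(sorted(digests), min(sample_size, len(digests)))
    failed = []
    for name in names:
        path = Path(root) / name
        if not path.is_file() or sha256_of(path) != digests[name]:
            failed.append(name)
    return failed
```

Record the digests before burning, keeping the manifest outside the folder it describes, and run the sample check against the mounted disk each year. Any name returned is a signal to burn a replacement.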

Conclusion

Backups are for recovery, while archives are for posterity. Start building your archives now, and keep them safe for the future.

Further reading

How to burn a Blu-ray disc in Monterey
Wikipedia point of entry

Postscript

Some of you are reporting widespread claims that some Blu-ray burners no longer work in Sequoia. I have therefore repeated the process that I described in Monterey, using exactly the same Pioneer burner connected to a Mac Studio M1 Max running macOS 15.1. I’m delighted to report that it still works perfectly, and I see no reason that any other recent Pioneer optical drive should prove incompatible. All you need to do is follow the instructions.

Happy archiving!

What performance should you get from different types of storage?

External storage is invariably sold with ‘up-to’ performance figures. In practice, you’ll seldom realise anything like some of the write or read speeds claimed. And when it comes to prolonged tasks like that first full Time Machine backup, no matter how fast you thought that drive would be, it always takes longer than expected.

Over the last few years I have tested and reviewed many examples of different types of external storage, from basic USB 3 hard drives, to the latest USB4 SSD enclosures, and NAS packed with fast SSDs. This article draws on all those test results to give you a better idea of what to expect when they’re being used with your Mac.

Results quoted here are typical for those tests performed mostly using a Mac Studio M1 Max, but unless otherwise indicated should be similar for recent Intel models. They’re summarised in this table.

[Table storage1: typical write and read speeds in MB/s for each type of storage]

Write speeds are given for:

  • the single 50 MB write test performed by Time Machine before each backup;
  • 500 multiple concurrent writes of 4 KB each, performed in those same Time Machine tests;
  • calculated net write speed over a first full backup to APFS of at least 400 GB;
  • general write speed measurement using my app Stibium, which gives broadly similar results to other leading benchmarking apps.

General read speeds are also obtained using Stibium, and similar to other apps. All speeds are given as MB/s for consistency.

Before looking at individual types of storage, one obvious and important result is the effect of throttling by macOS on Time Machine backup performance. Considering Time Machine’s own tests, writing a single 50 MB file is performed consistently at around 200-225 MB/s to local storage of whatever type, and multiple concurrent writes of 4 KB files reach around 20-23 MB/s regardless of local storage type. Those hold good even when you back up to a fast Thunderbolt 3 SSD, and backing up to a NAS is little quicker unless it’s over 2.5GbE to an NVMe SSD. Local transfer speeds only differ more substantially in general tests, when they aren’t throttled as they are in Time Machine.

Hard disks

When writing to or reading from a local hard disk, performance varies substantially according to which sectors on the hard disk are being accessed. This is a well-known phenomenon, and the result of geometry: the platter spins at a constant rate, so sectors at its periphery pass under the heads faster than those in the inner part, and transfer more quickly. Ranges given here take that into account: the lower figure is for inner sectors, and the higher for outer ones. Some users compensate for this effect, and only ever use the outer half of a disk’s sectors to obtain better performance, but that reduces their available capacity, and effectively doubles their cost per TB.

SSDs

SATA SSDs may be cheapest, but they’re also slowest, and with Macs they generally don’t enjoy Trim or SMART health indicator support. Of the two, Trim support is usually the more important, as without that, they can accumulate blocks waiting to be erased and returned for further use, and as a result their write (but not read) speed can fall as low as 100 MB/s. Unless used for largely static storage, this is a significant risk.

NVMe SSDs deliver twice the performance of SATA models, and generally enjoy Trim but not SMART indicator support. This makes them far better suited to general use, as their write speeds should be sustained from new throughout their working life.

USB 3.2 Gen 2, Thunderbolt 3, USB4

Translating commonly quoted transfer speeds for these three protocols into real-world speeds turns out to be complex. In practice, these are what you can expect to see:

  • USB 3.2 Gen 2 at 10 Gb/s is slightly less than 1 GB/s
  • Thunderbolt 3 at 32 Gb/s is up to 3 GB/s
  • USB4 at 40 Gb/s is up to 3.4 GB/s.

All recent models of Mac, both Intel and Apple silicon, should realise full performance over USB 3.2 Gen 2 and Thunderbolt 3, but support for USB4 is limited to Apple silicon. Unless a drive or enclosure specifically includes Thunderbolt 3 as a fallback, when connected to an Intel Mac, you should expect it to fall back to USB 3.2 Gen 2 at just under 1 GB/s, less than a third of the speed of USB4.
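
The arithmetic behind those figures can be sketched as follows. Dividing a nominal rate in Gb/s by 8 gives its raw equivalent in GB/s, and comparing that with the real-world figures above shows how much is lost to overheads. The function names are hypothetical, and 1.0 GB/s is used as an upper bound for USB 3.2 Gen 2, which is quoted as slightly less than that.

```python
# (nominal Gb/s, real-world ceiling in GB/s), both taken from the list above.
# USB 3.2 Gen 2 is "slightly less than" 1 GB/s, so 1.0 is an upper bound.
PROTOCOLS = {
    "USB 3.2 Gen 2": (10, 1.0),
    "Thunderbolt 3": (32, 3.0),
    "USB4": (40, 3.4),
}


def raw_gb_per_s(gigabits_per_second: float) -> float:
    """Convert a nominal rate in Gb/s to GB/s at 8 bits per byte."""
    return gigabits_per_second / 8


def efficiency(protocol: str) -> float:
    """Fraction of the raw rate that the real-world ceiling represents."""
    nominal, real = PROTOCOLS[protocol]
    return real / raw_gb_per_s(nominal)


for name, (nominal, real) in PROTOCOLS.items():
    print(
        f"{name}: {raw_gb_per_s(nominal):.2f} GB/s raw, "
        f"{real:.1f} GB/s expected, {efficiency(name):.0%} of raw"
    )

# Falling back from USB4 to USB 3.2 Gen 2 on an Intel Mac:
fallback_ratio = PROTOCOLS["USB 3.2 Gen 2"][1] / PROTOCOLS["USB4"][1]
print(f"Fallback delivers at most {fallback_ratio:.0%} of USB4 speed")
```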

NAS

Although I haven’t made any systematic comparison between AFP and SMB network protocols, I can see no consistent difference in their performance, when used with the latest versions of macOS and NAS software. The latter, though, can be critical: older versions of NAS software can perform poorly when used over SMB with recent macOS. Keeping your NAS software up to date is important.

Throttling of Time Machine backup writing isn’t supposed to occur when backing up over a network, and there is some evidence here to support that, with significantly better results for 50 MB test files. However, those are only apparent when using NVMe SSDs in the NAS, with a wired Ethernet 2.5GbE connection to provide sufficient bandwidth.

Check TM performance

Provided that your Mac is running a recent version of macOS and backing up to APFS, it’s simple to read the two write performance tests that occur at the start of each Time Machine backup using my free T2M2. Alternatively, you can read them using the Time Machine custom log extract in Mints. In T2M2 they should look something like:
Destination IO performance measured:
Wrote 1 50 MB file at 238.02 MB/s to "/Volumes/ThunderBay2" in 0.210 seconds
Concurrently wrote 500 4 KB files at 35.58 MB/s to "/Volumes/ThunderBay2" in 0.058 seconds
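
If you want to track those figures over time, lines in that form can be picked apart with a regular expression. This is only a sketch that assumes the wording stays exactly as in the sample above; the WriteTest type and parse_write_test function are hypothetical names, not part of T2M2 or Mints.

```python
import re
from typing import NamedTuple, Optional


class WriteTest(NamedTuple):
    file_count: int
    file_size: int
    size_unit: str       # "KB" or "MB", as written in the log
    speed_mb_s: float
    destination: str
    seconds: float


# Matches both "Wrote 1 50 MB file ..." and "Concurrently wrote 500 4 KB files ..."
PATTERN = re.compile(
    r'[Ww]rote (\d+) (\d+) (KB|MB) files? at ([\d.]+) MB/s '
    r'to "([^"]+)" in ([\d.]+) seconds'
)


def parse_write_test(line: str) -> Optional[WriteTest]:
    """Return the figures from one write test line, or None if it doesn't match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    count, size, unit, speed, destination, seconds = match.groups()
    return WriteTest(
        int(count), int(size), unit, float(speed), destination, float(seconds)
    )


sample = 'Wrote 1 50 MB file at 238.02 MB/s to "/Volumes/ThunderBay2" in 0.210 seconds'
print(parse_write_test(sample))
```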

Check general performance

Although there are other apps that will do this, I developed Stibium for this purpose. Follow the ‘gold standard’ procedure detailed in its Help Reference to obtain the most accurate and reproducible results. Stibium can test any storage you can access in the Finder, including all local devices and networked systems such as NAS.
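
For a very rough idea of what such a measurement involves, the sketch below times the writing of a single file. It is not Stibium's method or its ‘gold standard’ procedure, and caching by the system or the drive can easily flatter the result, so treat its output as indicative at best. The function name and temporary file name are hypothetical.

```python
import os
import time
from pathlib import Path


def rough_write_speed(folder: str, size_mb: int = 100) -> float:
    """Time writing one file of size_mb megabytes into folder; return MB/s."""
    chunk = os.urandom(1_000_000)  # 1 MB of incompressible data
    test_file = Path(folder) / "write_speed_test.tmp"  # hypothetical name
    start = time.perf_counter()
    try:
        with open(test_file, "wb") as f:
            for _ in range(size_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push buffered data out
        elapsed = time.perf_counter() - start
    finally:
        # Always remove the test file, even if writing failed part-way.
        test_file.unlink(missing_ok=True)
    return size_mb / elapsed
```

A single run like this tells you little. Proper benchmarking repeats the test over a range of file sizes, which is what a dedicated app is for.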

Further reading

Which external drives have Trim and SMART support?
How to evaluate an external SSD
You can read my reviews in MacFormat and MacLife magazines, available in the App Store.
