Updates to Apfelstrudel (Unicode), AppexIndexer (Appexes), Ulbow (logs) and Versatility (versions)

In this last batch of updates to my apps for the next few weeks, there are four more popular tools, covering Unicode normalisation, appexes, logs, and document versions.

Unicode normalisation

Perhaps the earliest problem with APFS was its lack of Unicode normalisation for file and folder names. This has been a standard way to address accented and other characters that appear identical but have different codes. Apple addressed that, first in providing a normalisation layer on top, then by incorporating it into APFS. However, it can still prove a problem, both within apps and when working with other file systems. Apfelstrudel is a simple app that reveals any potential problems with normalisation, and helps you use the form most appropriate. Version 1.6 has an overhauled interface, and has been rebuilt with a new app icon ready for macOS 26 Tahoe. This version supports macOS from Big Sur onwards.

Apfelstrudel 1.6 is now available from here: apfelstrudel16, from its Product Page, and via its auto-update mechanism.

Appexes

App extensions, or appexes, are numerous in recent versions of macOS, and widely used by apps. This simple utility shows all those managed by PlugInKit, complete with their UUIDs, to help you manage them. Version 1.1 has an overhauled interface, and has been rebuilt with a new app icon ready for macOS 26 Tahoe. This version supports macOS from Sonoma 14.6 onwards.

AppexIndexer 1.1 is now available from here: appexindexer11, and from its Product Page. It doesn’t yet support auto-update.

Logs

Until I started development of LogUI, Ulbow was my preferred app for browsing the Unified log. It has extensive features, with full support for the use of predicates, a chart showing the most frequent sources of log entries, and support for creating and using logarchives, including those from iOS and iPadOS. Unlike LogUI, it uses the log command to obtain log extracts, enabling it to show entry times in nanoseconds. It also displays extracts in Rich Text rather than as a list. Version 1.11 fixes a crashing bug when handling some logarchives, has an overhauled interface, and has been rebuilt with a new app icon ready for macOS 26 Tahoe. This version supports macOS from Big Sur onwards, and is recommended for all users.

Ulbow 1.11 is now available from here: ulbow111, from its Product Page, and via its auto-update mechanism.

Document versions

While Revisionist (also recently updated) provides a suite of tools to work with macOS document versions, Versatility handles one of those tasks with greater ease, creating version archives, and reconstituting them into documents. Simply drop a file onto its window and it will be converted into a folder containing each saved version as a separate document. Drop one of those archive folders onto its window and it will be reconstituted into a document with all those previous versions. This makes it simple to preserve versions when moving documents between volumes or computers, and for archival purposes. Version 1.1 has been rebuilt with a new app icon ready for macOS 26 Tahoe, and supports macOS from Big Sur onwards.

Versatility 1.1 is now available from here: versatility11, from its Product Page, and via its auto-update mechanism.

Next updates

Most of my other apps that haven’t yet been updated for Tahoe should still run perfectly well, although their app icons won’t appear the same as before. I’m now turning my attention to the successor to SilentKnight and Skint, and my virtualisers Viable, ViableS, Vimy and Liviable. Once I’m done with those, I’ll return and complete my other apps.

Enjoy!

A brief history of text on the Mac

When the Mac 128K was launched, the computing world was quite happy working with text composed using single-byte characters, and the full 256 characters of Extended ASCII seemed quite sufficient. In those days, encoding text for each language was based on its code page, a different set of 256 characters according to that language’s needs and conventions.

The Mac’s initial version of Extended ASCII became its standard Mac OS Roman encoding by System 6.0.4 in 1989. It was modified in 1998 to add support for the euro currency symbol, and is still supported in macOS. Other single-byte code pages included Mac OS Icelandic, for example, which formed the basis of Macintosh Latin used by the popular Kermit file-transfer software.
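As a rough illustration of why code pages mattered, the sketch below decodes the same single byte under two different code pages. It assumes Python's built-in mac_roman and cp1252 codecs, which are not mentioned above and are used here only as convenient stand-ins for a Mac and a PC code page.

```python
# The same single byte stands for different characters in different code pages.
raw = bytes([0x8E])

as_mac_roman = raw.decode("mac_roman")  # Mac OS Roman
as_cp1252 = raw.decode("cp1252")        # a Windows Western European code page

print(repr(as_mac_roman), repr(as_cp1252))

# Encoding text therefore only round-trips if both ends agree on the code page.
text = "caf\u00e9"                      # "café"
mac_bytes = text.encode("mac_roman")
print(mac_bytes.hex(" "))
```

Text saved in one code page and opened in another keeps its plain ASCII letters but garbles accented characters, which is the problem that Unicode later set out to solve.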

Many languages can’t be encoded in such small character sets, and required 2-byte encodings instead. Dealing with these complexities and support for different writing directions became the task of the Script Manager, introduced in System 4.1 in 1987.

Another fundamental concept in Mac OS has been that text isn’t just a character set, but has to be drawn on the display with other graphics content. Text handling thus became integrated with its rendering and features such as word breaking and ligatures. Support for handling text using mixed scripts came in two optional extensions: WorldScript I for single-byte encodings, and WorldScript II for 2-byte encodings such as Chinese, Japanese and Korean.

There were two more mundane complications for the Classic Mac OS user: line termination, and string handling in code.

While MS-DOS and PCs used the combination of carriage return and line feed characters (\r\n) to terminate lines, Classic Mac OS used carriage return (\r) alone, and Mac OS X then followed the Unix convention of using line feed (\n) alone. Although the better text editors supported all three, and would convert text files between them, that became tedious.
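The conversion that those text editors performed can be sketched in a few lines. The function name normalise_line_endings is hypothetical, and this is only a minimal approach that works on text already held in memory.

```python
def normalise_line_endings(text: str, newline: str = "\n") -> str:
    """Convert CRLF, lone CR and lone LF line endings to a single convention."""
    # Handle CRLF first, so the CR of each pair isn't converted a second time.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", newline)


mixed = "DOS line\r\nClassic Mac line\rUnix line\n"
print(repr(normalise_line_endings(mixed)))        # Unix and Mac OS X style
print(repr(normalise_line_endings(mixed, "\r")))  # Classic Mac OS style
```

The order of the replacements matters: converting lone carriage returns before the \r\n pairs would turn each DOS line ending into two line breaks.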

Much application development for Classic Mac OS was performed in Apple’s Macintosh Programmer’s Workshop (MPW) using the extended implementation of Pascal known as Object Pascal. This had adopted the UCSD Pascal string format, in which the first byte(s) in its native strings contained the length of the string in bytes, rather than its first character. This was all the more confusing when combining projects with C, whose native string format didn’t preface its characters with length, but terminated every string with a null byte.
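The difference between the two layouts is easiest to see as raw bytes. The sketch below assumes the common case of a single length byte, which limits a Pascal string to 255 bytes, and assumes Mac OS Roman as the encoding. All four function names are hypothetical.

```python
def to_pascal_string(text: str, encoding: str = "mac_roman") -> bytes:
    """Length-prefixed layout: one length byte, then the characters."""
    data = text.encode(encoding)
    if len(data) > 255:
        raise ValueError("a single length byte can only count up to 255")
    return bytes([len(data)]) + data


def from_pascal_string(raw: bytes, encoding: str = "mac_roman") -> str:
    length = raw[0]  # the first byte is the length, not a character
    return raw[1:1 + length].decode(encoding)


def to_c_string(text: str, encoding: str = "mac_roman") -> bytes:
    """Null-terminated layout: the characters, then a zero byte."""
    data = text.encode(encoding)
    if b"\x00" in data:
        raise ValueError("a C string cannot contain an embedded null byte")
    return data + b"\x00"


def from_c_string(raw: bytes, encoding: str = "mac_roman") -> str:
    end = raw.index(b"\x00")  # scan for the terminator to find the end
    return raw[:end].decode(encoding)


print(to_pascal_string("Mac"))  # b'\x03Mac'
print(to_c_string("Mac"))       # b'Mac\x00'
```

Passing one kind of string to code expecting the other gives plausible-looking but wrong results, as the length byte is read as a character or the terminator is never found.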

In 1985, while working on KanjiTalk, the heart of the Mac’s Japanese localisation, Mark Davis and Ken Krugler developed ideas that eventually led to Unicode. When Davis hired Lee Collins to join him at Apple from Xerox, they developed their proposals further, and in 1987 Apple was one of the founders of the Unicode Consortium. The following year, Apple decided to build Unicode support into TrueType, the new font standard it released in System 7 in 1991.

In 1998 Mac OS 8.5 integrated support for Unicode text in Apple Type Services for Unicode Imaging (ATSUI), which was still supported until 2022, and was finally removed altogether in macOS 14 Sonoma the following year. Initial support for Unicode included UTF-16 encoding to the Unicode Standard version 2.1. Conversion between text encodings was provided by the Text Encoding Conversion Manager.

Core Text superseded ATSUI in Mac OS X 10.5 Leopard in 2007, and is part of the Cocoa text system inherited from NeXTSTEP.

One unexpected new feature of Unicode was the LastResort, the symbol shown for a code point that doesn’t exist yet, and the product of garbled text, seen here in 2007.

Even in familiar languages like Greek, Unicode offers exotics such as GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI, whatever that might be used for.

However, Unicode has brought its own problems, among them its acceptance of multiple code points (character encodings) for visually identical characters. In normal text use this can impede searching, but becomes more critical with the naming of files and directories.

The letter Å can be represented in UTF-8 as either C3 85 (Form C) or 41 CC 8A (Form D). Search for the word Ångström using Form C, and you won’t find the same word using Form D instead. A file system that allows both forms to appear independently in file and directory names appears to the user to allow items with duplicate names, and that poses further problems for search.
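The two forms can be inspected directly. This is a minimal sketch using Python's standard unicodedata module, where Form C and Form D are named NFC and NFD; the helper same_text is hypothetical and simply normalises both strings before comparing them.

```python
import unicodedata

composed = "\u00c5ngstr\u00f6m"                      # Ångström in Form C
decomposed = unicodedata.normalize("NFD", composed)  # the same word in Form D

print(composed[0].encode("utf-8").hex(" "))     # c3 85
print(decomposed[:2].encode("utf-8").hex(" "))  # 41 cc 8a

# A plain comparison treats the two forms as different strings.
print(composed == decomposed)                   # False


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalising both to Form D."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


print(same_text(composed, decomposed))          # True
```

Normalising both sides to the same form before comparing or searching is what makes the two spellings match, whichever form is chosen.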

In Apple’s Macintosh Extended (HFS+) file system, Unicode normalisation is used to map characters to Unicode Form D, but when Apple developed APFS it intended to leave any normalisation to apps. Early releases of APFS thus didn’t perform normalisation, resulting in many problems for app developers and users. This was rectified by incorporating a normalisation layer into macOS to return to the relative sanity of Form D.

It would perhaps be better to close without mentioning the annual additions to emoji supported in Unicode, as announced prominently in macOS updates. It has been a long and sometimes arduous journey from Extended ASCII to the 😁 of 🤷.

Apple Inside Macintosh: Text (1993)
Pascal string types in the Free Pascal and Lazarus Wiki
Unicode – the beginnings, Mark Davis and others
Apple Core Text Programming Guide (2007-2014)
Apple Core Text, current documentation
