
Who called git, and how Claude was caught red-handed

When the same unusual dialog appears twice within a few days for two different people, you begin to suspect a pattern. This article explores a rabbit hole that involves git, the log and the fickleness of AI.

On 8 March, Guy wondered whether an XProtect update earlier this month could have been responsible for a dialog reading The “git” command requires the following command line developer tools. Would you like to install the tools now? As the request seemed legitimate but its cause remained unknown, we mulled a couple of possible culprits, and he went off to investigate.

Five days later, after he had installed the update to SilentKnight 2.13, Greg emailed me and asked whether that might be responsible for exactly the same request appearing on his Mac. This time, Greg had consulted Claude, which asked him to obtain a log extract using the pasted command
log show --start "2026-03-13 07:07:00" --end "2026-03-13 07:10:00" --style compact --info | grep -E "14207|spawn|exec|git|python|ruby|make"

Armed with that extract, Claude suggested that SilentKnight had been the trigger for that dialog.

I reassured Greg that, while SilentKnight does rely on some command tools, it only uses those bundled with macOS, and never calls git even when it’s feeling bored. While I was confident that my app couldn’t have been responsible, I wondered if its reliance on making connections to databases on my GitHub might somehow be confounding this.

While I knew Claude was wrong over its attribution, the log extract it had obtained proved to be conclusive. Within a few minutes of looking through the entries, I had found the first recording the request for command line tools:
30.212 git Command Line Tools installation request from '[private]' (PID 14205), parent process '[private]' (parent PID 14161)
30.212 git Command Line Tools installation request from '[private]' (PID 14206), parent process '[private]' (parent PID 14161)

As ever, the log chose to censor the most important information in those entries, but it’s dumb enough to provide that information elsewhere. All I had to do was look back to discover which process had the ID 14161, given as their parent. Less than six seconds earlier came:
24.868 launchd [pid/14161 [Claude]:] uncorking exec source upfront
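That lookup can be reduced to a one-line filter over a saved log extract. Here’s a minimal sketch; the helper name and extract file are hypothetical, and the pattern matches the form of the launchd entries quoted above:

```shell
# Given a saved log extract, find the launchd entry naming the process
# behind a given PID. The "pid/NNNN [Name]" form is taken from the
# launchd entries quoted above; who_is_pid is a hypothetical helper.
who_is_pid() {
  grep "pid/$1 \[" "$2"
}
# e.g. who_is_pid 14161 extract.txt
```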

Just to be sure, I found matching entries for SilentKnight and the system_profiler tool it called after the attempt to run git:
30.153 launchd [pid/14137 [SilentKnight]:] uncorking exec source upfront
30.336 launchd [pid/14139 [system_profiler]:] uncorking exec source upfront

There was one small mystery remaining, though: why did Claude’s log show command also look for process ID 14207? That was the PID of the installondemand process that caused the dialog to be displayed:
30.215 launchd [gui/502/com.apple.dt.CommandLineTools.installondemand [14207]:] xpcproxy spawned with pid 14207

Following its previous denial, when Claude was confronted with my reading of the log, it accepted that its desktop app had triggered this dialog. Its explanation, though, isn’t convincing:
“the Claude desktop app calls git at launch — likely for one of a few mundane reasons like checking for updates, querying version information, or probing the environment. It’s not malicious, but it’s poorly considered behavior for an app that can’t assume developer tools are present on every Mac.”

In fact, it was Guy who had probably found the real reason: the Claude app has GitHub as one of its four external connectors. However, that shouldn’t give it cause to try running the git command, resulting in this completely inappropriate request.

Conclusions

  • Claude might know how to use the log show command, but it still can’t understand the contents of the Unified log.
  • If you’re ever prompted to install developer command tools to enable git to be run, suspect Claude.
  • What a fickle and ever-changing thing is an AI.*

I’m very grateful to Greg and Guy for providing the information about this curious problem.

* This is based on a well-known English translation of a line from Virgil’s Aeneid, Book 4: “Varium et mutabile semper femina”, “what a fickle and ever-changing thing is a woman”. While all of us should dispute that, there’s abundant evidence that it’s true of Claude and other AI.

How long does the log keep entries?

One of the most contentious questions arising from yesterday’s critical examination of ChatGPT’s recommendations is how long the Unified log keeps entries before they’re purged. ChatGPT seemed confident that some at least can be retained for more than a month, even as long as a year. Can they?

Traditional text logs are removed after a fixed period of time. One popular method is to archive the past day’s log in the early hours of each morning, as part of routine housekeeping. Those daily archives are then kept for several days before being deleted during housekeeping. That’s far too simple and restrictive for the Mac’s Unified log.

Apple’s logs, in macOS and all its devices, are stored in proprietary tracev3 files, sorted into three folders:

  • Persist, containing the bulk of log entries, retained to keep their total size to about 525 MB in about 50-55 files;
  • Special, including fault and error categories, whose entries are slowly purged over time until none remain in the oldest log files, so they have a variable total size and number;
  • Signpost, used for performance measurements, whose entries also undergo slow purging until they vanish.

One simple way to estimate the period for which log entries are retained is to find the date of creation of the oldest log file in each of those folders. On a Mac mini M4 Pro run largely during the daytime, those dates were

  • Persist, earliest date of creation 7 March 2026 at 16:54
  • Special, 9 February 2026 at 19:41
  • Signpost, 3 March 2026 at 16:41

when checked on 10 March. Those indicate a full log record is available for the previous 3 days, followed by a steady decline with age to the oldest entry 31 days ago. That compares with statistical data available in my app Logistician going back as far as 14 January, although all entries between then and 9 February have now been removed and lost.
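You can repeat that check from the command line. Here’s a sketch, assuming the standard /var/db/diagnostics layout, which requires sudo to read; note that ls -t sorts by modification time, a reasonable proxy here as tracev3 files aren’t changed once closed, while on macOS ls -tU would sort by creation time instead:

```shell
# Report the oldest file in each Unified log folder, as a proxy for how
# far back that folder's entries go. ls -t sorts newest first, so the
# last name listed is the oldest file.
oldest_logs() {
  for d in Persist Special Signpost; do
    [ -d "$1/$d" ] || continue
    printf '%s: %s\n' "$d" "$(ls -t "$1/$d" | tail -n 1)"
  done
}
# e.g. on a Mac: oldest_logs /var/db/diagnostics
```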

Retrieving old log entries

The real test of how many log entries have been retained is to try to retrieve them. Although the oldest Special log file was created on 9 February, the oldest log entry I could retrieve was the start of the boot process on 11 February, from Special log files that returned a total of over 44,000 entries for that day. However, no further log entries could be found after those until the morning of 24 February, a gap of over ten days.

This chart shows the numbers of log entries that could be found and read at intervals over previous days. Where a total of 500,000 is shown, that means over 500,000 for that 24-hour period. I checked these using two different methods of access: the OSLog API in LogUI, and the log show command in Ulbow. In all cases, log show returned slightly fewer entries than OSLog.

It’s clear that with only 3 days of full Persist log files, very few entries have been retained from earlier than 7 days ago, and beyond that retention numbers are erratic.

Over the period prior to the oldest Persist file, when entries could only be coming from Special log files, those entries included both regular and boundary types, and their categories were diverse, spanning fault, error, notice and info rather than being confined to fault and error. Most subsystems were represented, but very few entries were made by the kernel. There is thus no obvious pattern to the longer retention of entries in Special files.

Ephemeral entries

Log entries are initially written to memory, before logd writes most of them to permanent storage in tracev3 log files on disk.

[Diagram: the flow of log entries from memory through logd to tracev3 files on disk]

The first substantial purging of entries thus occurs when logd decides which are ephemeral and won’t be retained on disk. This can be seen by following the number of entries in a short period of high activity in the log, over time, and is shown in the chart below for a sample period of 3 seconds.

When fetched from the log within a minute of their being written, a total of 22,783 entries were recovered. Five minutes later only 82% of those remained. Attrition then continued more slowly, leaving 80% after 8 hours. Analysis suggests that over this period about 6,100 log entries per second were written to disk, while approximately 1,700 per second were only ever kept in memory and never reached disk. That suggests about 22% were ephemeral, a proportion that’s likely to vary according to the origin and nature of log entries.
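That proportion follows directly from the two rates; as a sketch of the arithmetic (the helper name is mine):

```shell
# Percentage of entries that were ephemeral, given the rate written to
# disk and the rate only ever held in memory (both in entries per second).
ephemeral_pct() {
  awk -v disk="$1" -v mem="$2" 'BEGIN { printf "%.0f\n", 100 * mem / (disk + mem) }'
}
# ephemeral_pct 6100 1700 gives 22, the "about 22%" quoted above
```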

Summary

  • A fifth of log entries are likely to be ephemeral, and lost from the log within the first minutes after they’re written.
  • Most retained log entries are written in Persist logs, where tracev3 files are removed by age to keep their total size to just over 500 MB. Those should preserve the bulk of log entries for hours or days after they’re written.
  • Entries stored in Special log files may be retained for significantly longer, here up to a maximum of 29 days. Although those may contain fault and error categories, retention doesn’t follow an obvious pattern, making their period of retention impossible to predict.
  • In practice, the period in which a fairly complete log record can be expected is that applied to Persist files, which varies according to the rate of writing log entries. In most cases now that’s unlikely to be longer than 5 days, and could be less than 12 hours.
  • You can’t draw conclusions from the apparent absence of certain log entries from the log prior to the earliest entries in Persist log files, as it’s likely that those entries will have been removed.
  • Expecting to retrieve log entries from earlier than 5 days ago is almost certain to fail.

Why does AI tell you to use Terminal so much?

There’s a striking difference between troubleshooting recommendations made by AI and those of humans. If you’ve tried using AI to help solve a problem with your Mac, you’ll have seen how heavily it relies on commands typed into Terminal. Look through advice given by humans, though, and you’ll see they rely more on apps with GUI interfaces. Rather than sending you straight to fsck_apfs, for instance, most humans will prefer to direct you to Disk Utility and its First Aid feature.

This is because the most popular AI assistants, like ChatGPT, Claude and Grok, are based on LLMs, Large Language Models, built on tokens for words. The great majority of humans using Macs work primarily in the GUI, using apps rather than Terminal commands. Verbalising the use of GUI apps is relatively difficult, and with the popularity of YouTube rather than written blogs, is becoming a dying art.

Command tools are all very well, but in this context they have disadvantages, among them being:

  • Few understand the commands used, and AI seldom explains those provided unless explicitly asked to do so. This results in the human learning little other than a long list of incomprehensible commands, and acquiring neither understanding nor insight.
  • Commands often lack the safeguards normally built into apps, so can more readily cause damage or mislead.
  • Commands can readily generate overwhelmingly large output, even when piped to a file for later browsing.
  • Pasting commands into Terminal has become a favourite way to trick the unsuspecting into installing malware.

Even for an advanced user, confident assertions made by an AI can mislead. Here’s an example taken from ChatGPT’s advice on potential detections of malicious software. For this I have set ChatGPT’s output in italics, together with its commands as code. It’s worth noting that each command was explicitly given for the bash command shell, although in macOS Catalina (in 2019) Terminal changed to use zsh rather than bash as its default.

ChatGPT starts this exchange with a teaser that few could resist:
If you want, I can also show you how to tell in 10 seconds whether a Mac actually has malware despite these logs (there are 3 very specific indicators in Console).

Vague predicate

Here’s the most reliable way to see if macOS has ever detected malware on your Mac using Apple’s internal security logs.
1. Check the XProtect malware history
Open Terminal and run:

log show --predicate 'subsystem == "com.apple.XProtectFramework"' --last 30d

This log show command doesn’t access “Apple’s internal security logs”, which don’t exist, but the Unified log. Nor does the recommendation use Console to access the log, as promised. The predicate given is too non-specific to be useful for the stated purpose, and the time period of 30 days is absurdly long and doesn’t match the claimed period either.

On my Mac with the longest log records, its oldest Special tracev3 log file was created on 9 Feb 2026, and now has only 43 KB left from its original size of 11,298 KB, and the next oldest from 10 Feb has 34 KB left from 11,148 KB. On an older Intel Mac, the oldest creation date of a Special log file was 6 March, just 4 days previously, not 30.

When that log show command was run on 10 March on the first of those Macs, the one with the longer log records, it returned two Fault entries from 11 Feb, and more complete records from Special logs starting on 2 March, but no full log records until 6 March.

The predicate used is too general to select useful entries, and its output is overwhelming: that command returned a total of 1,505 log entries.

Log entries a year old

2. Specifically check for remediation events
To filter just real detections:

log show --predicate 'eventMessage CONTAINS "remediat"' --last 365d
If nothing prints, no malware has been removed in the last year.

ChatGPT here defines “real detections” as those for which remediation was attempted. I know of no evidence from research to support that assumption. Given the limitations in retrieving log entries from even 30 days ago, noted above, it seems absurd to assume that any will be retained from over 360 days ago.

It’s malware only if blocked

3. Check Gatekeeper malware blocks
Run:

log show --predicate 'process == "syspolicyd" AND eventMessage CONTAINS "malware"' --last 365d

ChatGPT has here redefined its original claim to limit it to malware that was blocked by Gatekeeper, and once again assumes it can retrieve log entries from over 360 days ago.

No verification

4. Verify XProtect definitions are current
Run:

system_profiler SPInstallHistoryDataType | grep -i xprotect
You should see recent updates like:
XProtectPlistConfigData
XProtectPayloads
Apple updates these silently in the background.

This is one of its most curious recommendations, as system_profiler is the command line interface to System Information, a familiar and far more accessible app. What that command does is look for the case-insensitive string “xprotect” in the Installations list. Unfortunately, it proves useless, as all you’ll see is a long list containing those lines, without any dates of installation or version numbers. On my older Mac, piping the output to a file wrote those two names on 6,528 lines, without any other information about those updates.
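Had ChatGPT wanted that command to be useful, system_profiler can also emit JSON (with -json, on recent macOS), which preserves the dates and versions the plain grep discards. Here’s a hedged sketch of a filter; the JSON field names (_name, install_version, install_date) are assumptions based on typical SPInstallHistoryDataType output, so check them against your Mac’s actual JSON:

```shell
# Filter System Information's install history JSON down to XProtect
# entries, keeping name, version and date. Field names are assumptions.
xprotect_installs() {
  python3 -c '
import json, sys
for item in json.load(sys.stdin).get("SPInstallHistoryDataType", []):
    name = item.get("_name", "")
    if "xprotect" in name.lower():
        print(name, item.get("install_version", "?"), item.get("install_date", "?"))
'
}
# On a Mac: system_profiler SPInstallHistoryDataType -json | xprotect_installs
```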

I know of two ways to determine whether XProtect and XProtect Remediator data are current, one being SilentKnight and the other Skint, both freely available from this site. You could also perhaps construct your own script to check the catalogue on Apple’s software update server against the versions installed on your Mac, and there may well be others. But ChatGPT’s command simply doesn’t do what it claims.

How not to verify system security

Finally, ChatGPT makes another tempting offer:
If you want, I can also show you one macOS command that lists every XProtect Remediator module currently installed (there are about 20–30 of them and most people don’t realize they exist). It’s a good way to verify the system security stack is intact.

This is yet another unnecessary command. To see the scanning modules in XProtect Remediator, all you need do is look inside its bundle at /Library/Apple/System/Library/CoreServices/XProtect.app. The MacOS folder there should currently contain exactly 25 scanning modules, plus the XProtect executable itself. How listing those can possibly verify anything about the “system security stack” and whether it’s “intact” escapes me.
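For those who do prefer Terminal, the command-line equivalent of “look inside the bundle” is a plain directory listing; list_modules here is a hypothetical helper:

```shell
# List the executables inside an app bundle's MacOS folder, which for
# XProtect.app holds the scanning modules plus the XProtect executable.
list_modules() {
  ls "$1/Contents/MacOS"
}
# On a Mac: list_modules /Library/Apple/System/Library/CoreServices/XProtect.app
```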

Conclusions

  • Of the five recommended procedures, all were Terminal commands, despite two of them being readily performed in the GUI. AI has an unhealthy preference for using command tools even when an action is more accessible in the GUI.
  • None of the five recommended procedures accomplished what was claimed, and the fourth to “verify XProtect definitions are current” was comically incorrect.
  • Using AI to troubleshoot Mac problems neither instructs nor builds understanding.
  • AI is training the unsuspecting to blindly copy and paste Terminal commands, which puts them at risk of being exploited by malicious software.

Previously

Claude diagnoses the log

Lost in the log? Here’s Logistician 1.1

If you’re still struggling to find your way around the log, or not even prepared to try, I have a new version of my log statistics and navigation utility Logistician that should help. This enhances its list of log files by adding further details, and adds a completely new graphical view to help identify periods of unusual log activity.

Log list

As I showed here a couple of days ago, Logistician opens the JSONL statistics files maintained by logd in /var/db/diagnostics, alongside the folders containing the tracev3 log files. The list of those originally gave minimal information, and has now been expanded to contain:

  • the start date and time of each file, in addition to the date and time it was closed
  • the period during which that file had entries added to it, in seconds
  • the size of log data within the file, in KB
  • the average rate at which log data was written to that file, in B/s
  • the path to that file, which reveals whether its type is Persist, Special or Signpost, hence the nature of its contents.

Start date and time are taken from those for the closing of its predecessor, so can’t be given for the first file of each type. They can also span a period during which the Mac was shut down, although that’s usually obvious from the low rate at which log data was written.
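The average rate column is simply the log data size divided by the collection period; as a sketch of that arithmetic (the helper name is mine):

```shell
# Average write rate for a log file: size in KB over the collection
# period in seconds, reported in B/s as in Logistician's list.
log_rate() {
  awk -v kb="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", kb * 1024 / secs }'
}
# e.g. a 10240 KB file collected over 600 seconds: log_rate 10240 600
```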

Point plot

A new window plots point values for the whole series of log files in the current list.

This displays any of three different plots:

  • rate of log data written to Persist log files over the period for which log files are listed, in B/s;
  • amount of log data written to Persist log files over that period, in KB;
  • amount of log data written to Special log files over that period, in KB.

For the latter two, quantities shown are for the three processes that wrote the most data in that period. I have looked at identifying the processes concerned, but that’s far too complex to do here.

Signpost log files contain special types of entry intended for assessing performance, and contribute little to other analyses, so are excluded from these plots. Regular log entries are saved to either Persist or Special files, although it’s unclear which entries go to each. Some processes appear to use only one, although entries from many processes can be saved to either. Although there are similarities in the patterns of Persist and Special files, they also differ in other respects. These three plots appear most suitable when looking for anomalies in the log.

Although these plots make it easy to identify the date of an anomaly such as the high outliers at the far right, for 11 February, they can’t tell you the time of the file you should analyse. For that, Logistician reports the time and date of the location that the pointer is hovering over. Place the pointer over the high rate value, for example, and you’ll see it occurred at about 20:14:00. This helps you identify which of the listed log files has that high peak rate, hence the time period to inspect using LogUI.

Traditionally, the moment you move the pointer away from a chart, hover information like that is removed. If that were done here, it would make it infuriatingly hard to refer to the list of log files. So the date and time shown are those from the last moment the pointer was over the point plot. The knack is to hover over the point of interest, then move the pointer off the chart vertically, so as not to alter the time indicated. I’m looking at alternative methods of locking the time shown to make that easier, but that presents more complex coding challenges, as do methods of zooming in on smaller periods of time.

In case you’re wondering, the overall period covered by these point plots, divided across the two log statistics files maintained, is approximately 6 weeks, as indicated by the X scales shown here.

Logistician version 1.1 is now available for Sonoma and later from here: logistician11a
and will shortly be getting its place in a Product Page and other listings here.

Enjoy!

Update: thanks to Jake for finding a divide-by-zero bug that could crash Logistician when opening a JSONL file. I have fixed this in build 14, now available above. Please download that to replace copies of the original build 12, so that you don’t encounter that crash. My apologies.

How long will my Mac’s SSD last?

It’s not that long ago that our Macs came with internal storage that could readily be replaced when it failed. Memories of big hard disks that died almost as soon as their warranty ran out, and of keeping a bootable clone ready in a Mac Pro, aren’t easily forgotten. So isn’t it high risk to buy a modern Mac that won’t even boot if its internal SSD has failed? Are you left wondering whether that SSD will last five years, or even three?

SSDs aren’t like hard disks

Hard disks are amazingly engineered electro-mechanical devices that spin platters at high speeds incredibly close to read-write heads. Before you even consider all the faults that can occur in their magnetic storage, there are many horrible ways they can die through mechanical disaster. Visit a data recovery shop and they’ll show you heads fused to platters, and shards of what had been storing terabytes of data before the platter shattered. And like all mechanical devices they wear out physically, no matter how carefully you care for them.

By comparison, an SSD in a Mac that has good mains power filtering, ideally a proper uninterruptible power supply (UPS), leads a sheltered life. Like other solid-state devices, so long as its power supply is clean and it doesn’t get too hot, it’s most likely to fail in the first few weeks of use, and as it’s reaching the end of its working life, in a U-shaped curve. Modern quality control has greatly reduced the number of early failures, so what we’re most concerned about is how long it will be until it wears out, as it approaches its maximum number of erase-write cycles.

Predicting wear

The theory goes that the memory cells used in SSDs can only work normally for a set number of erase-write cycles. This appears to hold good in practice, although there’s always a small number that suffer unpredictable electronic failure before they reach that. What’s more controversial is how many erase-write cycles each SSD should be capable of. Manufacturers make various claims based on accelerated ageing tests, and I suspect most come with a large dash of marketing sauce. Apple doesn’t offer figures for the SSDs it equips Macs with, but conservative estimates are around 3,000 cycles in recent models.

To work out how long you can expect your Mac’s internal SSD to last before it reaches that cycle limit, all you need do is measure how much data is written to it; once that reaches 3,000 times the capacity of the SSD, you should expect it to fail through wear. Fortunately, SSDs keep track of the amount of data written to them over their lifetime. This can be accessed through better SSD utilities like DriveDx, and I even have a feature in Mints that will do that for most internal SSDs.

Example

My iMac Pro is now well over 7 years old, as it was bought new in December 2018. It has a 1 TB internal SSD (I wanted 2 TB, but couldn’t wait for a BTO), and has run pretty well 24/7 since I got it. As I work every day, even over Christmas, and it has been my main production system, it has probably been in use for over 2,500 days now.

According to the SSD’s records, over that period its 1 TB SSD has written about 150 TB in total, from its total expected lifetime of 3,000 TB, if it reaches 3,000 erase-write cycles. At current usage rates that would take another century, or 133 years if you want to be precise. In reality, it’s generally believed that most SSDs will cease functioning after about 10 years in any case.
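That estimate is easy to reproduce; here’s a sketch of the arithmetic, using the conservative 3,000-cycle assumption, with writes assumed to continue at the average rate so far:

```shell
# Rough years of life left in an SSD, assuming 3,000 erase-write cycles.
# Arguments: capacity in TB, total TB written so far, age in years.
ssd_years_left() {
  awk -v cap="$1" -v written="$2" -v age="$3" 'BEGIN {
    total = cap * 3000          # expected lifetime writes, TB
    rate  = written / age       # average TB written per year
    printf "%.0f\n", (total - written) / rate
  }'
}
# the iMac Pro above: ssd_years_left 1 150 7 gives 133
```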

It’s worth noting here that, had I got the iMac Pro with my preferred 2 TB SSD, its total expected lifetime would have been 6,000 TB, and instead of lasting a total of 140 years it would in theory have gone twice that period before it wore out.

What wears out SSDs?

For an SSD to wear out when it reaches its limit of erase-write cycles, wear across its memory must be even. If that memory were largely full of static data, and the SSD were only able to write to 10% of its memory, then that 10% would wear out ten times quicker than the whole SSD would. To ensure that doesn’t happen, all modern SSDs incorporate wear-levelling, which incurs its own overhead in erase-write cycles, but should ensure that the whole SSD wears at the same rate. You can help that, and maintain faster write speeds, by keeping ample storage space free. My current target for my iMac Pro is an absolute minimum of 10% free, and 15% wherever possible.

Given that my iMac Pro has averaged about 21 TB written to its SSD each year, that works out at just under 60 GB per day. For those worried that the Unified log adds significantly to SSD wear, it’s not hard to estimate that it’s only likely to write around 250-500 MB each day, even if you leave your Mac awake and running 24/7, less than 1% of my Mac’s daily write load.

Unless you work with huge media files, by far your worst enemy is swap space used for virtual memory. When the first M1 Macs were released, base models with just 8 GB of memory and 128 GB internal SSDs were most readily available, with custom builds following later. As a result, many of those who set out to assess Apple’s new Macs ended up stress-testing those with inadequate memory and storage for the tasks they ran. Many noticed rapid changes in their SSD wear indicators, and some were getting worryingly close to the end of their expected working life after just three years.

So the best way to get a long working life from your Mac’s internal SSD is to ensure that it has sufficient memory that it never needs to use swap space in its VM volume. Although my iMac Pro only has a 1 TB internal SSD, which is more cramped than I’d like, it has 32 GB of memory, and almost never uses swap.

Key points

  • SSDs wear out differently from hard disks.
  • Protect your Mac and its internal SSD with good mains power filtering, preferably using a UPS.
  • Expect modern Mac internal SSDs to wear out after at least 3,000 erase-write cycles.
  • To monitor wear, measure the total data written to the SSD.
  • Expect an internal SSD to wear out when that total reaches 3,000 times the total capacity of the SSD.
  • For a given amount of data written to an SSD, the larger the total capacity of the SSD, the slower it will wear out.
  • Keep at least 10% of the SSD free at all times, with 15-25% even better.
  • Ensure your Mac has sufficient memory to never use VM swap space.

Investigate a past event in the log

We don’t always notice something is wrong within a few hours of the event that caused a problem. Sometimes it can take days or weeks before we realise that we need to check something in the log. By that time all trace has vanished, as the active log will have rolled those log entries long before we go looking for them. This article shows how to recover and analyse events from the more distant past, using a Time Machine backup and my free utilities LogUI and Logistician. My target is the macOS 26.3 Tahoe update installed on my Mac mini M4 Pro on 11 February, and I performed this analysis 11 days later, on 22 February.

When was the event?

In this case I remember updating at around 18:30-19:30 on 11 February, but I don’t even need to recall the date. I first copied the logdata.statistics.1.jsonl file from my active log in /var/db/diagnostics to a working folder in ~/Documents, then opened it using Logistician.

The log file listing between 18:10:39 and 19:26:47 on 11 February 2026 shows a remarkably rapid turnover of log files that’s an obvious marker of that update. Highlighted here is a Persist file that’s exceptionally large at 139 MB of log entries for a collection period of just 37 seconds, although like other tracev3 log files in the Persist folder that only takes 10.5 MB of disk space.

Retrieve the log

Although I’m confident those logs were removed many days ago, I open LogUI, then select its Diagnostics Tool from the Window menu. I click the Get Info tool and select my active log in /var/db/diagnostics. That tells me that the oldest log entry there dates from 17 February, so there’s no point in trying to find those entries in that log.

Like all good backup utilities, Time Machine also backs up the whole of the log folders, and I can use those to create a logarchive file for analysis. I therefore locate the next backup made after those log entries were written, on 12 February, and copy the /var/db/diagnostics and /var/db/uuidtext folders into a new folder in my working folder, ready to turn them into a logarchive.

In LogUI, I open its Logarchive Tool from the Window menu and use that to turn those folders into a logarchive I can access using LogUI. I check that freshly created logarchive using the Catalogue tool to confirm that it contains the log files I want to browse.

Identify the event

With the historical log safely preserved in a logarchive and a defined time of interest, my next task is to identify the event I want to investigate. In this case, I could probably go straight ahead and look at all entries for a few seconds, but in other circumstances you may need to know which entries to look for.

Back in Logistician, I select that extraordinary Persist log file and view it in a Chart. Most of the other log files over this period look like this:

with large quantities of entries from softwareupdated, com.apple.MobileSoftwareUpdate and similar processes. But the huge Persist file that filled in only 37 seconds is exceptional.

Almost all its entries are from audiomxd, and all other entries are dwarfed by its size.

Browse the event

By default when you click on LogUI’s Get Log tool it will fetch those log entries from the active log. To switch that source to my logarchive file, I click on the Use Logarchive tool and select the logarchive I just created in my Documents folder. To remind me that it’s no longer looking in the active log, that window then displays a red-letter caution of !! Logarchive to the left of the Start control. That also reminds me to use dates and times within the range covered by that logarchive.

I set the Start to ten seconds into the collection period of that large Persist file, a period of 1 second, and the maximum number of entries to 100,000, then click on the Get Log tool.

This is one of the most remarkable log extracts I have ever seen: in this 1 second period, the audiomxd process in com.apple.coremedia wrote about 53,000 entries to the log. Over the 37 seconds of log records in that single Persist file, audiomxd must have written at least 1.5 million log entries. These are all apparently the result of the ‘death’ of the AudioAccessory service audioaccessoryd, and its recovery after updating macOS.

Summary

  1. Identify approximate time of event from /var/db/diagnostics/logdata.statistics.1.jsonl using Logistician.
  2. Check in LogUI whether that falls within the period of the active log.
  3. If not, retrieve /var/db/diagnostics and /var/db/uuidtext from the next backup made after the event.
  4. Convert those folders into a logarchive using LogUI’s Logarchive tool, and check it contains the period of the event.
  5. Identify the processes involved using Logistician’s chart.
  6. Set LogUI to use that logarchive, enter the correct date and time, and get log entries for the required processes.

How does macOS keep its log?

One of the mysteries of the Unified log since its introduction almost ten years ago in macOS Sierra has been how it keeps such extensive records of what happens between the start of kernel boot and the user logging in. This was challenging enough before Catalina, but since then the separate Data volume, where log files are stored, has been locked away by FileVault until the user’s password enables it to be accessed.

Log entries are initially stored in memory, and from there most are written to tracev3 files in their respective directories in /var/db/diagnostics. logd is responsible for that, and for maintaining the contents of that directory. Ironically, logd maintains its own old-style log in text files inside that directory, but its terse entries there reveal little about how it does its job. What they do reveal is that logd has an assistant, logd_helper, with its own old-style text logs that are a bit more informative.

In Apple silicon Macs, logd_helper apparently connects every few minutes to five coprocessors, SMC (System Management Controller), AOP (Always-On Processor), DCPEXT0, DCPEXT1 and DCPEXT2 (Display Co-Processors). There appears to be nothing equivalent in Intel Macs. It also conducts ‘harvests’ shortly after it has started up, thus solving the mystery of where all the log entries are saved prior to unlocking the Data volume.

Soon after the start of kernel boot, once the Preboot volume is generally accessible and logd is running, logd starts writing log entries to temporary storage in the Preboot volume. You can see that in the path /System/Volumes/Preboot/[UUID]/PreLoginData, where there’s a complete diagnostics directory, and a uuidtext directory whose warren of subdirectories stores longer log entries. Those are identical in layout to permanent log storage in /var/db.

Shortly after user login, with the Data volume unlocked at last, logd_helper is started, and it merges log files from Preboot/[UUID]/PreLoginData/diagnostics and those files in Preboot/[UUID]/PreLoginData/uuidtext into permanent log storage in the Data volume in /var/db/diagnostics and /var/db/uuidtext, a process it refers to as harvesting. logd_helper can also harvest from entries stored in memory.

Once this merger has been completed, log directories in Preboot/[UUID]/PreLoginData/diagnostics are left empty, as are logdata.statistics files there, making the log record of kernel boot complete, right up to unlocking of the Data volume.

That explains how tens of thousands of log entries can still be recorded faithfully in a Data volume that can’t be unlocked for some time yet.

Once normal logging to /var/db/diagnostics is running, logd maintains the tracev3 files containing log entries there. Its goals appear to be:

  • in the Persist folder, file size is normally 10.4-10.5 MB, although it may be smaller when truncated by shutdown;
  • Persist files are removed with age to maintain a typical total size for that folder of just under 530 MB in just over 50 files, bringing the size of the whole diagnostics folder to between 1 and 2 GB;
  • in Special and Signpost, log file size is normally 2.0-2.1 MB when closed, but entries are weeded progressively until each file is empty and can be deleted;
  • timesync files are less than 1 KB;
  • HighVolume is seldom if ever used.
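Those apparent goals can be turned into a quick health check on the Persist folder. This is only a sketch based on the observations above, not on any documented limits; in practice the sizes would come from the tracev3 files in /var/db/diagnostics/Persist, which require root access to read:

```python
def persist_health(sizes_mb):
    """Compare Persist tracev3 file sizes (in MB) with logd's apparent goals:
    individual files of about 10.4-10.5 MB when closed, and a folder total
    of just under 530 MB. Thresholds are observed behaviour, not documented
    limits, so treat the results as indicative only."""
    total = sum(sizes_mb)
    return {
        "files": len(sizes_mb),
        "total_mb": round(total, 1),
        "oversized": [s for s in sizes_mb if s > 10.5],  # beyond normal closure size
        "near_cap": total > 530,  # approaching the point where logd removes old files
    }

# 50 files of 10.4 MB each: within the expected envelope
print(persist_health([10.4] * 50))
```

A folder reported as near the cap suggests that older Persist files are about to be removed, and with them any historical entries you haven’t yet preserved.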

The overall effect on historical log entries is that those in Persist files are rate-sensitive, being removed sooner when log entries are written more frequently. Selected entries in Special files, however, may last considerably longer, although they become less frequent with age. A few of those may be retained for hours or days longer than the oldest in Persist files. I have no insight into the rules that logd follows when deciding when to weed entries from Special files.

Extended entries stored in the warren of folders in /var/db/uuidtext are purged periodically on request from CacheDelete, as with other purgeable storage, at least once a day. That should ensure that the contents are only retained while they’re still referred to by entries in the log files.

As far as I’m aware, the user gets no say in the size limits imposed on log storage, and there’s no option to increase them to allow logs to be retained for longer. However, as both /var/db/diagnostics and /var/db/uuidtext folders should be backed up by Time Machine and most third-party utilities, you can always analyse those backups when you need to check older log entries.

Last Week on My Mac: A log statistician

If you don’t know exactly what you’re looking for, or when it happened, the log is a hostile place. Doom-scrolling through tens of thousands of log entries in the hope of stumbling across a clue is tedious, and the odds are stacked against you. So last week I did something to redress the balance and shorten those odds, and I’m delighted to offer the first version of Logistician. This has nothing to do with logistics, but is all about log statistics.

Alongside folders containing your Mac’s Unified log files, in /var/db/diagnostics, you’ll see files with names starting with logdata.statistics. A couple are text files that only go back a day or two, and others have the extension jsonl. If you were privileged to test some beta-releases of macOS Tahoe, you may have some database files as well, but here it’s those jsonl files I’m concerned with.

Inside them are basic statistical summaries of every log file that’s been saved in your Mac for the last few weeks or months. Even though the original log files have long since been deleted, summaries of their contents are still available in files like logdata.statistics.1.jsonl, and those are opened up by Logistician.

As the files in /var/db/diagnostics are still live, and may be changed as logd does its housekeeping, copy those jsonl files to somewhere in your Home folder, like a folder in ~/Documents. Open Logistician, click on its Read JSONL tool, select one of those copies and open it.
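Those jsonl files are straightforward to decode outside Logistician too. This minimal sketch assumes only the jsonl convention, one self-contained JSON object per line; the field names inside the records vary between macOS versions, so the ones in the demonstration below are hypothetical:

```python
import json
import os
import tempfile
from pathlib import Path

def read_statistics(path):
    """Decode a logdata.statistics jsonl file: one JSON object per line,
    each summarising a single closed log file. Blank lines are skipped."""
    records = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            records.append(json.loads(line))
    return records

# Demonstration with a synthetic two-record file (field names hypothetical)
demo = ('{"file": "Persist/0001.tracev3", "size": 10485760}\n'
        '{"file": "Special/0002.tracev3", "size": 2097152}\n')
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write(demo)
records = read_statistics(f.name)
os.remove(f.name)
print(len(records))  # 2
```

Once decoded into a list of dictionaries, the records can be sorted or filtered however you wish, much as Logistician does for you.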

Logistician’s window displays the file’s contents in a list, with the oldest at the top. It gives the date and time that file was saved, just after the last log entry was written to it, its size in KB, whether it was a Persist (regular log), Special (longer supplementary log entries) or Signpost (performance measurements) collection, and the name of the file.

Select one of those file entries and click on the Chart selection tool at the top right to see its data plotted out in the Chart view.

Data provided for each log file listed includes a breakdown of the total size of log entries by process or subsystem, and Logistician’s Chart view displays those data as a bar chart. The height of each bar represents the total size in KB of log entries made by that process in that specific log file. As there are up to 50 bars available, two sliders set the size and location of that window on the data:

  • Start sets the number of the first bar at the left, beginning at 1 for the process with the greatest total size, usually the kernel, and rising to 40 for processes with very few log entries, leaving just the ten bars from the smallest.
  • Width sets the number of bars to display, ranging from 6 to 25. The more shown, the harder it is to read the names of processes at the foot of each bar, and the less precisely you can read the size of their log data at the right.

These sliders are set to show 9 bars from number 6 at the left (the sixth highest log data, written by launchd) to number 14 at the right (14th highest, written by ContinuityCaptureAgent). Of interest here are around 400 KB of log entries from NeptuneOneWallpaper.

Here are 8 bars from 17 to 24, each with smaller quantities of around 200 KB written to the log. They include the DAS service dasd and cloudd for iCloud.

It’s easy to flip quickly through a series of log files: click on the next file you want to view in the main list, click on the Chart selection tool and values will be displayed immediately.

Fascinating though that might be, it doesn’t in itself answer many questions. Add a log browser like LogUI, though, and the combination helps you locate and identify unusual activity, problems, and specific events.

I happened to notice that one Special log file closed at 19:11:17 on 19 February had a large amount of log data from softwareupdated. The previous Special log file was closed at 18:20:04, so somewhere between those times my Mac checked for software updates.

To ensure the full entries were still available in the log, I opened LogUI’s Diagnostics Tool to confirm that its earliest entries were a couple of days earlier.

I then set LogUI to a Start time of 18:20:04 with a Period of 600 seconds, and a Predicate set to a processImagePath of softwareupdated, to look for entries from that process. My first check located all the softwareupdated entries around 18:29:25, when I had apparently run SilentKnight. As a bonus, I discovered from those that SilentKnight was stuck in app translocation, so have been able to fix that (again).

Logistician version 1.0 build 7 for macOS Sonoma and later is now available from here: logistician106
I will add it to other pages here when I’m more confident that this initial version is stable and does what it claims in its detailed Help book.

Enjoy!

Friday Magic: See real log entries

One of the features introduced in the new Unified log back in macOS Sierra was its ability to protect privacy by redacting potentially sensitive contents. Although that’s a good thing, an extraordinary mistake in High Sierra, which revealed an encryption password in plain text, has led to many entries being so heavily redacted by <private> that they’re gutted of all meaning.

Another bone of contention has been the protection provided to key information about network connections. Originally that could be removed by setting the CFNETWORK_DIAGNOSTICS environment variable in Terminal. Following a vulnerability addressed in Ventura 13.4, that setting was protected by SIP, raising the barrier there as well.

This Friday’s magic trick is one of the most complicated I have attempted yet, and is going to show how you can put meaning back into your log and discover where all those network connections are going. Because of the changes necessary, this is easiest to perform in a macOS VM, allowing you to discard the VM when you’re done.

Setting up

You don’t have to use a VM, but if you use a Mac it shouldn’t be your production system, and you’ll need to set it back to its original settings when you’ve finished.

I took a freshly updated VM with macOS Tahoe 26.3, duplicated that in the Finder, and used the duplicate so I could easily trash it.

I then installed the profile I have made available here to remove privacy in the log. Double-click the profile, then confirm in System Settings > General > Device Management that you want to add and enable it. From then until you remove that profile, all redactions in the log should cease.

To disable SIP, I started the VM up in Recovery mode, opened Startup Security Utility and downgraded boot security there. I then opened Terminal and disabled SIP using the command
csrutil disable

If you want to, while you’re in Terminal you can run the command to enable network diagnostics
launchctl setenv CFNETWORK_DIAGNOSTICS 3
noting that, in Recovery, there’s no sudo required or available. If you do this now, it should also apply when you restart.

Once that has been completed, restart back into normal mode and check the profile is still enabled. If you didn’t enable network diagnostics there, open Terminal and enter
sudo launchctl setenv CFNETWORK_DIAGNOSTICS 3

Testing

Ensure the menu bar clock is displaying seconds, and just as it turns those to 00 seconds, run an app like SilentKnight that connects to remote sites. View the log for that period using LogUI (or whatever), and you should see the effects of both privacy removal and network diagnostics. The log is now a very different place, and far more informative.

Results

These are comparable log entries, before and after pulling this trick.

Privacy removal

Normal log entry:
00.541160 com.apple.launchservices Found application <private> to open application <private>

Privacy removed:
00.540882 com.apple.launchservices Found application SilentKnight to open application file:///Applications/SilentKnight.app/
restoring the app name and location that had been redacted to render the log entry meaningless.

Network diagnostics

Normal log entry:
01.240305 com.apple.network [C5 752CDB24-4E91-40B0-A837-9D7B9DE41B9E Hostname#7c4edf26:443 tcp, url hash: b62568a6, tls, definite, attribution: developer, context: com.apple.CFNetwork.NSURLSession.{AA60FF41-BA48-4332-B223-0C76A78CCEA7}{(null)}{Y}{2}{0x0} (private), proc: 9FC457E5-3273-37FA-BAEE-749A710F48E5, delegated upid: 0] start
which obfuscates the URL in a hash of b62568a6.

Network diagnostics:
01.103602 com.apple.network [C1 8BF615A6-CBEF-48D8-BE2F-CEF861B70BEE Hostname#99dda594:443 quic-connection, url: https://raw.githubusercontent.com/hoakleyelc/updates/master/applesilicon.plist, definite, attribution: developer, context: com.apple.CFNetwork.NSURLSession.{58709C77-3924-44EA-8563-4B44F0223AB6}{(null)}{Y}{2}{0x0} (private), proc: 06DF065F-71F6-36D9-BBAE-533B2D327BF4, delegated upid: 0] start
which reveals the full URL of https: // raw.githubusercontent.com/hoakleyelc/updates/master/applesilicon.plist, the property list on my Github containing firmware versions for Apple silicon Macs.
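With network diagnostics enabled, those revealed URLs can be pulled out of a log extract mechanically. This is a small sketch, assuming only the entry format shown above, where the destination appears in a url: field ended by a comma or closing bracket:

```python
import re

# Matches the url: field in com.apple.network entries; entries that only
# carry a 'url hash:' (diagnostics disabled) are deliberately not matched.
URL_FIELD = re.compile(r"\burl:\s*(\S+?)[,\]]")

def urls_from_entries(lines):
    """Return every revealed URL found in the given log entry lines."""
    return [m.group(1) for line in lines for m in URL_FIELD.finditer(line)]

# Abbreviated versions of the two entries quoted above
before = "[C5 ... Hostname#7c4edf26:443 tcp, url hash: b62568a6, tls, ...] start"
after = ("[C1 ... Hostname#99dda594:443 quic-connection, url: "
         "https://raw.githubusercontent.com/hoakleyelc/updates/master/applesilicon.plist,"
         " definite, ...] start")
print(urls_from_entries([before, after]))
```

Only the entry with diagnostics enabled yields its URL; the redacted entry, carrying just a url hash, produces nothing.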

Remember

If you did this on a physical Mac, don’t forget to remove the profile, to enable SIP and return Startup Security Utility to Full Security, which should automatically disable network diagnostics.

How does an Apple silicon Mac tell the time?

Anyone familiar with Doctor Who will be aware of the power brought by control over time. Although there have been sporadic reports of problems with Apple silicon Macs keeping good time, and they may not synchronise sufficiently accurately for some purposes, they appear to have generally good control over time.

Last year I explained how macOS now uses the timed service with network time (NTP) to perform adjustments while running. This article looks at what happens before that, during startup, when the Mac has only its own devices to tell the time. Although the user sees little of this period, anyone accessing the log recorded during startup could find the timestamps of entries affected by adjustments. It may also provide insights into how Apple silicon Macs tell the time.

Methods

To investigate clock initialisation and adjustment during startup, I analysed approximately 100,000 consecutive log entries freshly recorded in the log of a Mac mini M4 Pro booting cold into macOS 26.2, following a period of about 14 hours shut down. These entries covered approximately 27 seconds from the initial boot message, until after I had logged in and the Desktop and Finder were displayed. This Mac is configured to Set time and date automatically from the default source of time.apple.com in Date & Time settings, and has both WiFi and Ethernet connections.

Multiple adjustments to wallclock time were recorded during that period, resulting in significant discontinuities in log timestamps. For example,
11:40:15.888717 void CoreAnalyticsHub::handleNagTimerExpiry(IOTimerEventSource *)::838:messageClients of 37 available events
11:40:10.045582 === system wallclock time adjusted
11:40:10.053309 000009.664065 Sandboxing init issue resolved: "Success"
11:40:10.053447 com.apple.sandbox.reporting Sandbox: wifiFirmwareLoader(49) deny(1) file-read-metadata /Library
11:40:10.112333 === system wallclock time adjusted
11:40:10.127559 com.apple.Installer-Progress Progress UI App Starting

In the first of those adjustments, the wallclock time was retarded by up to 5.84 seconds, and in the second it was advanced by at most 0.0589 seconds.

Because of those wallclock adjustments, times recorded in the log are discontinuous. Although it isn’t possible to correct completely for those adjustments, if each adjustment is assumed to bring the clock to standard time (such as UTC), the adjustments can be applied backwards through times recorded in the log to bring them closer to that standard. This can result in brief periods where entries bear times earlier than those preceding them, but it’s as accurate an estimate as is possible given the data.
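That back-correction can be sketched in a few lines. It assumes that each ‘=== system wallclock time adjusted’ entry is stamped after its adjustment, so the apparent jump across it (its own timestamp minus the preceding entry’s) is applied to all earlier entries; the timestamps below are taken from the extract above, as seconds past 11:40:

```python
ADJ = "=== system wallclock time adjusted"

def back_correct(entries):
    """entries: list of (timestamp_in_seconds, message) in log order.
    Returns timestamps re-expressed in the time base left by the final
    wallclock adjustment, by applying each apparent jump backwards."""
    corrected = []
    offset = 0.0
    for i in range(len(entries) - 1, -1, -1):
        t, msg = entries[i]
        corrected.append(t + offset)
        if msg == ADJ and i > 0:
            # apparent size of this adjustment: an upper bound, as some
            # real time also elapsed between the two entries
            offset += t - entries[i - 1][0]
    corrected.reverse()
    return corrected

log = [
    (15.888717, "CoreAnalyticsHub ..."),
    (10.045582, ADJ),           # clock retarded by up to 5.84 s
    (10.053309, "Sandboxing init issue resolved"),
    (10.053447, "Sandbox: wifiFirmwareLoader ..."),
    (10.112333, ADJ),           # clock advanced by at most 0.0589 s
    (10.127559, "Progress UI App Starting"),
]
print([round(t, 6) for t in back_correct(log)])
```

After correction the sequence is close to monotonic, with the first entry mapped back to just after 11:40:10, confirming that the apparent jump to 11:40:15 was an artefact of the first adjustment.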

Adjustments found

The narrative constructed from the log is summarised in the following table.

This starts with the initial record of the boot process, giving its UUID, within a second of the Power button being pressed (at approximately 0.0 seconds) to initiate the startup. That’s followed by a gap of just over 5 seconds before the second log entry.

The first two wallclock adjustments were made at 10 seconds, before there was any evidence of network connectivity. Those took place one second before the loginwindow was launched.

Two subsequent adjustments were made shortly after 24 seconds, immediately following the removal of the loginwindow after successful authentication. A further three adjustments followed in the 2.5 seconds after the user was logged in, while the Desktop and Finder were being prepared and displayed.

Log entries reporting that the timed service was running didn’t occur until shortly before the last of those wallclock adjustments, which was recorded in the log 0.0003 seconds before timed obtained its first external time.

Multiple internal clocks

A total of seven wallclock time adjustments were made before timed was able to obtain a time from any external reference. Over the first 10 seconds, before the initial wallclock adjustment, those were substantial, amounting to 5.8 seconds. For those changes to be made to wallclock time, there must be another source of time deemed more accurate, against which wallclock time can be compared and adjusted.

I’ve been unable to find any trustworthy information about internal clocks and timekeeping in Apple silicon Macs. It has been suggested (and Google AI is confident) that local reference time is obtained from the Secure Enclave. However, Apple’s only detailed account of features of the Secure Enclave fails to mention this. Initialisation of the Secure Enclave Processor also occurs relatively late during kernel boot, in this case at around the same time as the first two adjustments were made to wallclock time.

Conclusions

  • Apple silicon Macs may make multiple adjustments to wallclock time during startup, resulting in several discontinuities in log timestamps, which can cause discrepancies in event times.
  • Several of those can occur before the Mac has access to external time references, and before timed is able to obtain an external time against which to adjust wallclock time.
  • Wallclock time can’t be the only local source of time, and appears to be adjusted against another local source.
  • Time isn’t as simple as it might appear.

Last Week on My Mac: Signs of distress

Over the last few weeks, I’ve been posed a series of questions that can only be resolved using the log, such as why Time Machine backups are failing to complete. The common confounding factor is that something has gone wrong at an unknown time, but if you don’t know exactly what you’re looking for, nor when to look, the Unified log is a hostile place, no matter which tool you use to browse its constant deluge of entries.

This is compounded by the fact that errors seldom come singly, but often propagate into tens of thousands of consequential log entries, and those not only make the cause harder to find, but can shorten the period covered by the log. In the worst case, by the time you get to look for them, the entries you needed to find are lost and gone forever.

In late 2017, I experimented with a log browser named Woodpile that approached the log differently, starting from a histogram of frequencies of log entries from different processes.


Viewed across the whole of a log file, this could draw attention to periods with frequent entries indicating potential problems.


The user could then zoom into finer detail before picking a time period to browse in full detail. Of course, at that time the Unified log was in its early days, and entries were considerably less frequent than they are today.


A related feature in my later Ulbow log browser provides similar insights over the briefer periods covered, but like Woodpile that view seems little used, and isn’t offered in LogUI.

Another concern is that a great deal of important information can be recorded in the log, but the user is left in the dark unless they hunt for it. I documented that recently for Time Machine backups, and note from others that this isn’t unusual. Wouldn’t it be useful to have a utility that could monitor the log for signs of distress, significant errors or failure? To a degree, that’s what The Time Machine Mechanic (T2M2) sets out to do, only its scope is limited to Time Machine backups, and you have to run it manually.

Given the sheer volume and frequency of log entries, trying to monitor them continuously in real time would have a significant impact on a Mac, even if this were to be run as a background process on the E cores of an Apple silicon Mac. In times of distress, when this would be most critical, the rate of log entries can rise to thousands per second, and any real-time monitor would be competing for resources just when they were most needed for other purposes.

A better plan, less likely to affect either the user or the many background services in macOS, would be to survey log events in the background relatively infrequently, then to look more deeply at periods of concern, should they have arisen over that time. The log already gives access to analysis, either through the log statistics command, or in the logdata.statistics files stored alongside the folders containing the log’s tracev3 files. Those were used by Woodpile to derive its top-level overviews.

Those logdata.statistics files are provided in two different formats, plain text and JSON (as JSON lines, or jsonl). Text files are retained for a shorter period, such as the last four days, but JSON data are more extensive and might go back a couple of weeks. I don’t recall whether JSON was provided eight years ago when I was developing Woodpile, but that app parses the current text format.

Keeping an eye on the log in this way overlaps little with Activity Monitor or other utilities, which can tell you which processes are most active and using most memory, but nothing about why, unless you run a spindump. Those also only show current figures, or (at additional resource cost) a few seconds into the past. Reaching further back, several hours perhaps, would require substantial data storage. For log entries, that’s already built into macOS.

I can already see someone at the back holding up a sign saying AI, and I’m sure that one day LLMs may have a role to play in interpreting the log. But before anyone starts training their favourite model on their Mac Studio M3 Ultras with terabytes of log entries, there’s a lot of basic work to be done. It’s also worth bearing in mind Claude’s recent performance when trying to make sense of log entries.

What do you think?

Which cryptexes does macOS Tahoe load?

Since macOS Ventura, if not in late releases of Monterey, macOS has been loading Safari and other parts of the operating system, including dyld caches, in cryptexes, instead of installing them in the Data volume. In addition to those, Apple silicon Macs with AI enabled load additional cryptexes to support its features. I detailed those for macOS 15.5 last summer; this article updates that information for macOS Tahoe 26.2.

Cryptexes

These first appeared on Apple’s customised iPhone, its Security Research Device, which uses them to load a personalised trust cache and a disk image containing corresponding content. Without the cryptex, engineering those iPhones would have been extremely difficult. According to its entry in the File Formats Manual from five years ago (man cryptex), ‘A cryptex is a cryptographically-sealed archive which encapsulates a well-defined filesystem hierarchy. The host operating system recognizes the hierarchy of the cryptex and extends itself with the content of that hierarchy. The name cryptex is a portmanteau for “CRYPTographically-sealed EXtension”.’

In practice, a cryptex is a sealed disk image containing its own file system, mounted at a chosen location within the root file system during the boot process. Prior to mounting the cryptex, macOS verifies it matches its seal, thus confirming it hasn’t been tampered with. Managing these cryptexes is the task of the cryptexd service with cryptexctl. Because cryptexes aren’t mounted in the usual way, they’re not visible in mount lists such as that produced by mount(8).

System cryptexes

Once kernel boot is well under way, APFS mounts containers and volumes in the current boot volume group, followed by others to be mounted at startup. When those are complete, it turns to mounting and grafting the three standard system cryptexes: os.dmg containing system components such as dyld caches, app.dmg containing Safari and its supporting components including WebKit, and os.clone.dmg, a clone of os.dmg that shares its data blocks with it. Grafting all three takes around 0.034 seconds, and typically occurs over 15 seconds after APFS is started, and around 25 seconds after the start of boot.

AI cryptex collection

About 5 seconds after the system cryptexes have been grafted, APFS checks and grafts a series of cryptexes primarily involved with Apple Intelligence features. These are handled one at a time in succession, and are listed in the Appendix. Typical time required to complete this collection is less than 0.5 seconds.

Ten new AI cryptexes have been added in Tahoe, and five of Sequoia’s have been removed, bringing the total including the PKI trust store from 23 to 28. Notable among the additions are:

  • language instruction support for image tokenisation
  • support for drafting replies in Messages
  • suggesting action items in Reminders
  • support for Shortcuts
  • suggesting recipe items.

Conclusions

  • Apple silicon Macs running macOS 26.2 with AI enabled load 28 additional cryptexes to support AI.
  • One cryptex is a secure PKI trust store, whose volume name starts with Creedence.
  • These cryptexes are installed and updated as part of macOS updates, although they could also be installed or updated separately, for example when AI is enabled.
  • If a Mac shows an unusual mounted volume with a name starting with Creedence or Revival, that’s almost certainly the respective disk image, which should normally be hidden and not visible in the Finder.

Appendix

Disk image names for the main AI cryptex collection in macOS 26.2 (Apple silicon):

  • UC_FM_CODE_GENERATE_SAFETY_GUARDRAIL_BASE_GENERIC_H16S_Cryptex.dmg
  • UC_FM_CODE_GENERATE_SMALL_V1_BASE_GENERIC_H16_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_300M_ADM_PROMPT_REWRITING_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_300M_BASE_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_300M_IMAGE_TOKENIZER_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_AUTONAMING_MESSAGES_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_BASE_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_CONCISE_TONE_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_FM_API_GENERIC_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_FRIENDLY_TONE_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MAGIC_REWRITE_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MAIL_REPLY_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MESSAGES_ACTION_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_MESSAGES_REPLY_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PHOTOS_MEMORIES_ASSET_CURATION_OUTLIER_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PHOTOS_MEMORIES_TITLE_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PROFESSIONAL_TONE_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_PROOFREADING_REVIEW_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_REMINDERS_SUGGEST_ACTION_ITEMS_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_SHORTCUTS_ASK_AFM_ACTION_3B_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_SHORTCUTS_ASK_AFM_ACTION_3B_V2_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_SUGGEST_RECIPE_ITEMS_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_SUMMARIZATION_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_EVENT_EXTRACTION_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_PERSON_EXTRACTION_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_VISUAL_IMAGE_DIFFUSION_V1_BASE_GENERIC_H16S_Cryptex.dmg
  • UC_IF_PLANNER_NLROUTER_BASE_EN_GENERIC_H16S_Cryptex.dmg

New cryptexes are shown in bold. When these are mounted, their volume names add the prefix RevivalB13M202xxx where xxx are ID digits for that cryptex. That prefix replaces RevivalB13M201xxx used in macOS 15.5.

Additionally, a volume is mounted as a PKI trust store, as Creedence11M6270.SECUREPKITRUSTSTOREASSETS_SECUREPKITRUSTSTORE_Cryptex.

The following cryptexes found in macOS 15.5 appear to have been removed from 26.2:

  • UC_FM_LANGUAGE_INSTRUCT_3B_DRAFTS_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_EVENT_EXTRACTION_MULTILINGUAL_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_TEXT_PERSON_EXTRACTION_MULTILINGUAL_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_INSTRUCT_3B_URGENCY_CLASSIFICATION_DRAFT_GENERIC_GENERIC_H16S_Cryptex.dmg
  • UC_FM_LANGUAGE_SAFETY_GUARDRAIL_BASE_GENERIC_GENERIC_H16S_Cryptex.dmg

How Time Machine backups can fail silently

If anyone claims they have engineered something so you can set it and forget it, remind them of the third step in that process, where you come to regret it. Without attentive monitoring, any system is likely to fail without your being aware. Time Machine is a good example, and this article illustrates how it can fail almost totally without you being aware that anything has changed.

Scenario

I’ve recently been testing a security product on my Mac mini M4 Pro. One of its more novel features is device control similar to Apple’s Accessory Security for Apple silicon Mac laptops. I’m sure I’m not the only person who has wondered why that feature hasn’t been incorporated into desktop models, and here is a third-party developer doing just that and more. For their implementation lets you specify precisely which peripherals can be connected, down to the serial number of each SSD.

When I installed and went through the onboarding of this product, I naturally assumed that my Time Machine backup storage, an external SSD, would be allowed by this control, as it was connected at the time. At that time I was offered no option to manually allow it, and as its three volumes remained listed in the Finder’s Locations I didn’t check its settings.

When I started that Mac up the following day, I discovered that I had lost access to that external SSD. Its three volumes were still there in the Finder, but any attempt to open them failed with an error. I quickly disabled the device control feature, just in time to allow Time Machine to make its first hourly backup since 12:35 the previous day, just after I had installed the security software. Time Machine had happily gone that long without backing up or warning me that it had no backup storage.

Compare that with what would have happened with any other backup utility, such as Carbon Copy Cloner, which would have informed me of the error in terms loud and clear. I think this results from Time Machine’s set and forget trait, and its widespread use by laptop Macs that are often disconnected from their backup storage. Thankfully I hadn’t come to regret it, this time.

Evidence

Not only does Time Machine not draw the attention of the user to this error, but it can be hard to discover from the log. Run T2M2, for example, and the evidence is subtle:
Times taken for each auto backup were 4.3, 0.5, 0.6, 10.5, 2.5, 0.8, 0.9, 0.7, 0.6 minutes,
intervals between the start of each auto backup were 86.6, 29.3, 1197.7, 55.5, 60.3, 60.1, 60.1, 60.3 minutes.

(Emphasis added.)

A gap of almost 20 hours between backups far exceeds the nine hours it was shut down overnight. But at the end, T2M2 reports
✅No error messages found.
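That 1197.7-minute interval is easy to sanity-check; a minimal shell sketch (assuming awk is available) converts it to hours:

```shell
# Convert the anomalous backup interval from minutes to hours
awk 'BEGIN { printf "%.2f hours\n", 1197.7 / 60 }'
# → 19.96 hours
```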

I know from the security software’s log that it had blocked access to the backup storage after Time Machine had completed the last of its hourly backups at around 12:35. This is what the next attempt to back up reported:
13:23:54.382 Failed to find any mounted disk matching volume UUIDs: {("A3A3DADA-D88E-499B-8175-CC826E0E3DE4")}
13:23:54.382 Skipping scheduled Time Machine backup: No destinations are potentially available
13:23:54.487 attrVolumeWithMountPoint 'file:///Volumes/VMbackups/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
13:23:54.488 attrVolumeWithMountPoint 'file:///Volumes/OWCenvoyProSX2tb/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
13:23:54.489 attrVolumeWithMountPoint 'file:///Volumes/Backups%20of%20MacStudio%20(7)/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
13:23:59.103 attrVolumeWithMountPoint 'file:///Volumes/VMbackups/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
13:23:59.104 attrVolumeWithMountPoint 'file:///Volumes/OWCenvoyProSX2tb/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
13:23:59.104 attrVolumeWithMountPoint 'file:///Volumes/Backups%20of%20MacStudio%20(7)/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"

Time Machine then fell back to what it has long been intended to do in such circumstances, making a local snapshot of the volume it should have backed up. This starts with a full sync of buffers to disk storage,
13:23:59.876 FULLFSYNC succeeded for '/System/Volumes/Data'
in preparation for that snapshot
13:24:00.024 Created Time Machine local snapshot with name 'com.apple.TimeMachine.2026-01-12-132359.local' on disk '/System/Volumes/Data'

These were repeated at 14:24:06.638, and hourly thereafter until the Mac was shut down. Accompanying those few entries were tens of thousands of errors from APFS and Time Machine.

The only clue as to the cause was a single log entry
14:24:06.805543 EndpointSecurity ES_AUTH_RESULT_DENY: event 1
in which Endpoint Security is logging the event that access to the external device was denied.
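If you keep a saved extract like the entries quoted above, a rough shell sketch (the file name and sample lines here are illustrative, not from a real Mac) pulls out the denial-related entries in one pass:

```shell
# Write a small sample extract (two representative lines from above),
# then filter for Endpoint Security denials and permission errors
cat > extract.log <<'EOF'
13:23:54.487 attrVolumeWithMountPoint 'file:///Volumes/VMbackups/' failed, error: Error Domain=NSPOSIXErrorDomain Code=1 "Operation not permitted"
14:24:06.805543 EndpointSecurity ES_AUTH_RESULT_DENY: event 1
EOF
grep -E "ES_AUTH_RESULT_DENY|Operation not permitted" extract.log
```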

Lesson

If you do just set it and forget it, you will come to regret it. Attentive monitoring is essential, and when anything does go wrong, don’t pass over it in silence.

Rebuilding my Hexo blog, again

Yesterday I wanted to update the blog, and the migration went wrong in the afternoon; I then spent a whole evening troubleshooting. It turned out to be embarrassingly simple... Even though most of the debugging was useless, I'm recording it so I don't fall into the same hole again.

Falling into the hole (useless debugging)

Running git pull origin main failed:

luckye@Lucky-Mac-mini lucky-blog % git pull origin main

ERROR: Repository not found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists.

The error said the remote repository didn't exist. ChatGPT suggested it was an SSH key problem, but regenerating and re-adding the key pair didn't help.

ls -al showed a .git directory, so I checked whether the command line was going through the proxy:

curl -I www.google.com

That returned normally. Next, could I reach GitHub at all?

ssh -T -p 443 git@ssh.github.com

After typing yes, that worked too. git branch confirmed I was on main.

I copied the repository path directly from GitHub (Code – Local – SSH) and ran

git remote set-url origin git@github.com:xxx/xxx.github.io.git

No output here is expected. Then

git pull
returned
fatal: couldn't find remote ref main

Double-checking the branch:

git branch -a
returned
* main

Next, inspecting the local Git configuration:

cat .git/config
returned
[core]
repositoryformatversion = 0
filemode = true
bare = false
[remote "origin"]
url = git@github.com:username/repo.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
remote = origin
merge = refs/heads/main

Then the details of the current HEAD (latest commit):

git show --summary
returned
commit 5d0ca94755d1dbe39f5be0983 (HEAD -> main, origin/main, origin/HEAD)
Author: xxx <xxx@gmail.com>
Date:   Fri Oct 17 23:47:06 2025 +0800
"update"

Everything looked fine, so the only option left was a fresh clone. Pushing again created a new main branch:

git push -u origin main

I set Default to main and deleted master, then re-cloned the main branch after changing the default branch.

Move, then clone

Step one, move the old directory out of the way:

mv ~/本地文稿/lucky-blog ~/本地文稿/lucky-blog-bak

("bak" is short for backup.)

Step two, clone:

git clone git@github.com:xxx/xxx.github.io.git ~/xxx-blog

Reinstalling Hexo

ls /opt/homebrew

Strange: Homebrew wasn't working either. It hadn't actually disappeared; the current PATH just didn't point at /opt/homebrew/bin. I decided to reinstall everything anyway. (Quantumult X's SS server feature has to be on first; without a proxy the download fails.) Install Homebrew:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then these three commands finish the setup:

echo >> /Users/lucky/.zprofile

echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> /Users/lucky/.zprofile

eval "$(/opt/homebrew/bin/brew shellenv)"

After cloning, rename the folder and reinstall Hexo:

brew install node
brew install git
npm install -g hexo-cli

The truth comes out

Just when I thought the reinstall would fix things, it still failed, and everything kept getting worse. In the end I asked Yuguang for help, adding him as a collaborator on the private repository. One look and he found the problem.

The repository wasn't the one linked to Cloudflare at all. Half a year earlier I had migrated to Cloudflare and created a new repo, and I had completely forgotten that the old repository was no longer in use. I had been talking to the wrong repo all along 😅
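The mismatch could have been caught in one command. A minimal sketch (the directory name and repository URL are placeholders): create a throwaway repo, set a remote, and read it back; comparing this URL against the repo Cloudflare actually builds from exposes the mix-up immediately.

```shell
# Throwaway repo with a placeholder remote; 'remote get-url' shows
# exactly which repository 'origin' points at
rm -rf remote-check && mkdir remote-check
git -C remote-check init -q
git -C remote-check remote add origin git@github.com:xxx/old-blog.git
git -C remote-check remote get-url origin
```

In a real checkout, `git remote get-url origin` (or `git remote -v`) is all you need.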

Cloning the correct repository

The rest was simple: back up again (already done the night before, so skipped), then clone. Much like the turn-it-off-and-on-again trick. Since I had never touched the correct repository, there was no need to reinstall Homebrew or Hexo, so it was unusually quick and easy.

Using the GitHub GUI

What changed is obvious at a glance.

This time I finally abandoned those fiddly commands I never really understood. Download the GitHub desktop client, sign in to your GitHub account, then set the email associated with git config, because the git standard requires an email (it has nothing to do with the GitHub account you sign in with; an anonymous address works, but to avoid surprises I used my own). Select the repository (don't pick the wrong one again!), click Clone, choose a save path.

After that there are only two steps: Commit to main, then Push origin. Done. To preview the site locally it's still

cd ~/blog 
hexo s

If it complains about missing project dependencies, install them; no global install needed:

npm install

I deployed the same setup on my old MacBook Pro too; it's far more convenient than before. On the other machine, just Push origin (⌘ + P) first before making changes next time.

Done! Thanks to Yuguang and FriendsA for spending so long troubleshooting with me 🙏

Tricky points

While troubleshooting earlier, the old repo's default branch online was master, but it also had a main, and I assumed the error came from a master/main mismatch. (The two differ only in name; main has simply become the convention recently.)

  • branch here means the remote branch Hexo pushes to

  • If the Git repository's default branch is main, _config.yml must say main, otherwise Hexo pushes to master (even if your local default is main)

  • Editing _config.yml alone commits nothing; it's an unsaved local Git change. Hexo pushes the latest folder contents, but Git will still detect uncommitted local changes

  • So if you want Hexo to push to main, commit or save the _config.yml change first, to avoid being blocked by Git.

Change this in _config.yml:

deploy:
  type: 'git'
  repo: git@github.com:luckylele666/luckylele666.github.io.git
  branch: master

setting master to main.

This works for local deployment, but it's no longer how I build the site.
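The one-line edit above can be scripted; a hedged sketch (the file contents here are a minimal stand-in for a real _config.yml):

```shell
# Write a minimal illustrative _config.yml, then flip the deploy branch
cat > _config.yml <<'EOF'
deploy:
  type: 'git'
  repo: git@github.com:luckylele666/luckylele666.github.io.git
  branch: master
EOF
sed -i.bak 's/branch: master/branch: main/' _config.yml
grep 'branch:' _config.yml
```

(`sed -i.bak` works on both GNU and BSD/macOS sed; the .bak backup can be deleted afterwards.)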

How it works

Traditional Hexo deployment

The classic flow: local Hexo → hexo g (generate static files) → hexo d (push to the remote server/repository)

  • The deploy section of _config.yml is the key; it specifies the repository and branch
  • Hexo itself generates public/ and pushes it to GitHub or another remote
  • You must generate locally before uploading

Drawbacks:

  • Every change requires a local build
  • The full Hexo environment (Node, theme, plugins) must be installed locally to build

Cloudflare Pages deployment

My current model: GitHub repository (source only, private, better for privacy) → Cloudflare Pages (automatic build & publish)

Highlights:

  1. Automated builds: on every push to main, Cloudflare runs Hexo generate, producing public/ and publishing it straight to the site
  2. No dependence on Hexo deploy: no more running hexo d locally, and no branch configuration in _config.yml
  3. A pure Git workflow: just git add → commit → push to manage versions; Cloudflare handles build and deployment (with the GitHub GUI you don't even need Git commands)

Why not hexo g and hexo d?

Because hexo d actually does two things:

  1. Generate the static files (hexo g)
  2. Push the generated public/ to the specified remote branch

Cloudflare has already replaced step 2:

  • Cloudflare runs the build in the cloud (equivalent to hexo g)
  • It automatically publishes the result to your Pages site

So running hexo g and hexo d locally is redundant, and can even mess up the Git history (especially the nested Git repository inside _deploy_git/).


In my experience with the free tiers, for coding help ChatGPT > Grok > Gemini. Don't trust AI answers too much; at this stage they still don't beat a human brain.

💡 Notes

  • ~ is an alias for Users > xxx
  • cd is short for change directory
  • ~ means the current user's home directory
  • /blog is a subfolder under the home directory

Setting up a Hugo blog on GitHub (browser-only)

I'm not actually familiar with Hugo, and I don't know whether this counts as reinventing the wheel. But when I searched for "how to set up your own blog on GitHub with Hugo", every tutorial I found required installing software locally and running assorted git and hugo commands, which isn't friendly to newcomers. So I've tried to write a workflow that lets a newcomer generate their own blog entirely in a web browser.

Everything happens in this GitHub project:
https://github.com/fivestone/hugo-papermod-beginning

The project is essentially a blank Hugo site that users fork into their own account; after a little configuration it just works. If you want extra features or a different look, go learn from the more advanced Hugo tutorials. (And then you'll no longer be the newcomer this project is for!)

  • The project is based on the Hugo blog engine and the popular PaperMod theme
  • Note that blogs hosted on GitHub are not directly accessible from inside the Great Firewall

1. Create a GitHub account

First, register your own GitHub account (details omitted). The username you choose usually becomes the final site address, username.github.io, though you can later map your own domain onto it.

2. Set up your blog project

Note that there are two ways to do this step.

  • Method 1: create a brand-new project, download the .zip file I provide, unzip it, and upload the files to your project manually. Slightly more tedious than method 2, but if you can, please use this one.
  • Method 2: fork my project into your own account. This is easier for newcomers and needs no local file handling at all; a phone or tablet is enough. But forked projects share a single quota for the automatic blog-publishing action, so if very many people fork it, GitHub might throttle it some day. (That would take a few thousand simultaneous users, so don't worry too much.)
2.1. Method 1

After registering and signing in, create a new Repository.

The repository name determines the blog's final URL. Assuming your GitHub username is username:

  • If you name the repository username.github.io, the site address will be
    https://username.github.io/
  • If you give it any other name, such as new-name, the site address will be
    https://username.github.io/new-name

Make sure the repository is Public. Leave everything else unchanged and click the green create button.

After creating the repository, click "uploading an existing file".

Download the pre-configured Hugo bundle from my GitHub project and unzip the .zip locally. Then drag all the files into your repository to upload them.

Once all 80-odd files have uploaded, don't forget to click Commit changes at the bottom of the page.

2.2. Method 2

After registering and signing in, open the project:
https://github.com/fivestone/hugo-papermod-beginning
and click Fork to copy the template into your own account.

As in method 1, you set your own repository name here. Assuming your GitHub username is username:

  • If you name the repository username.github.io, the site address will be
    https://username.github.io/
  • If you give it any other name, such as new-name, the site address will be
    https://username.github.io/new-name

Then click Create fork.


3. Configure automatic publishing

After creating the project, go to its Settings – Pages page and change Build and deployment – Source to GitHub Actions.

Once Source is switched from Deploy from a branch to GitHub Actions, open the Actions page at the top.

On first visit, the Actions page shows GitHub's preset workflows. Search for hugo, then click Configure on the hugo workflow.

A configuration file is generated automatically; no changes are needed, just click the green Commit changes.

The publishing action / workflow now starts running. After a minute or two you can see the blog's initial page at
https://username.github.io

From then on, every change to the project's articles or configuration files triggers this action / workflow and regenerates the site. You can watch each run on the Actions page.


4. Change the site's basic information

On the Code page, edit config.yml and replace the preset site information with your own.

For newcomers, the items to change in config.yml are roughly:

baseURL: https://username.github.io/ # change to your own address
title: site name
params:
  author: somebody # the author's byline

  homeInfoParams:
    Title: site title, shown only on the home page
    Content: >
      Some text shown under the title on the home page.</br>
      Supports simple html and markdown.

After editing, click the green Commit Changes…, then click the green Commit Changes again in the dialog that appears. A minute or two after saving, the updated content appears on the blog.


5. Adding and managing posts

All blog posts live in the content / posts directory. On the Code page, enter content / posts and click the top-right button to create a new file.

Every post is a markdown file ending in .md. The filename becomes the post's URL; for example, the post post-20241111.md gets the link
https://username.github.io/posts/post-20241111/

At the top of the file, as shown, write the post's title and publication date, delimited by --- lines:

---
title: "Title of the new post"
date: "2024-11-11"
---
Then write the body in markdown.

As before, click the green Commit changes… to save. The new post appears on the blog a minute or two later.

Files in the content / posts directory can be freely added, deleted, edited, and renamed, which correspondingly adds, removes, or edits posts and changes their URL links.

Images embedded in posts are best placed under the static directory and referenced with markdown. For example, for the file static / aa.jpg, the markdown to insert is:

![](https://username.github.io/aa.jpg)

Experienced users can of course organise things differently and more effectively.
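For readers who do have a terminal handy, the browser steps above boil down to creating one markdown file with front matter; a sketch (file name and date are the example values used above):

```shell
# Create a new post with the front matter shown above (example values)
mkdir -p content/posts
cat > content/posts/post-20241111.md <<'EOF'
---
title: "Title of the new post"
date: "2024-11-11"
---
Body text in markdown.
EOF
head -n 4 content/posts/post-20241111.md
```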


6. Other notes

  • The generated blog's RSS feed address is
https://username.github.io/index.xml
  • If you used method 2 (a direct fork), your blog project will keep showing a message like the one pictured, prompting you to contribute your changes back to my original project. Just ignore it.

Output is a form of excretion

On various platforms you regularly see content creators asking: my work clearly has depth, I prepare thoroughly, I produce it with care, so why doesn't it get good traffic or any other reward? Sometimes the question is genuine; sometimes it's a way of expressing dissatisfaction with, or disdain for, that kind of traffic.

Whenever I see this I put myself in their shoes. I too have plenty of work I made carefully and thoughtfully that hardly anyone watches or listens to. A few years ago it genuinely puzzled me, but now I frankly accept that I'm simply not good at making the kind of content mass-market traffic favours.

No disdain for mass-market traffic intended; I honestly just don't get it, hahaha~

Because 99% of my motivation for making things is just to empty out what's in my head; it's the externalisation of a thinking process. If someone finds resonance or inspiration in it, wonderful; if not, that's fine too. So I've truly never put effort into studying how to make the kind of content everyone likes, because I've never sincerely wanted to treat self-media as a so-called career track.

So not getting that kind of traffic is entirely normal, and as it should be.

A cover image that triggered an engine swap

I've changed my blog's template: simpler, but better to use.

It all started with a podcast cover output accident.

A 1600×1600 cover image I uploaded in the podcast backend, synced via RSS feed to Xiaoyuzhou and Apple Podcasts, either failed to display or was flagged as "not provided". Xiaoyuzhou's backend showed the image had been fetched, but it didn't appear on the playback page; Apple Podcasts Connect flatly reported it as "not provided". I fed the same source to YouTube as well: fetched, displayed, but very blurry.

My first thought was that it was a setting on the blog side, but which one? It had been far too many years since I last tinkered with the blog's template. Going through every menu, I did find an "excerpt" switch turned on, which limits whether the feed shares a full post or only part of it. That did have an effect: Xiaoyuzhou hadn't fetched the full text, showing only the first paragraph on the episode page followed by an unclickable plain-text "read more". After I turned the switch off, Xiaoyuzhou updated immediately and showed the full text, but the cover still didn't appear; Apple Podcasts showed no movement at all. Never mind updating, the show couldn't even be found in search, despite having been published.

I couldn't work out which setting was wrong, so I asked on Jike. Luckily, people from Xiaoyuzhou immediately started helping me investigate. After some back and forth I reached one of their engineers, who showed me that the hosted feed was outputting an image of only 180×200 px. That made it clear!

https://suithink.files.wordpress.com/2024/04/vol-0000.jpg?w=180&h=200&crop=1

But I still didn't know why, because nowhere in the backend is there any setting for the RSS feed's cover size, and none of my friends had ever hit this. Besides, a square image should produce a square thumbnail; why had the aspect ratio changed?

Then I realised something:

This is a standard industry practice, so it should be universal. If nobody else's podcast has this problem, and the blog backend offers no interface to adjust it, the most likely cause is that the template my blog uses is simply too old.

Why did that angle occur to me?

Because my current template dates from 2013. In eleven years I adjusted it only once, in 2022, and its technical core remained the original one. Meanwhile, my blog backend switched to the block editor years ago; the old template I was still using had been delisted many, many years earlier and only kept working because I'd never replaced it.

To verify this, I first studied the site structure of a friend's podcast host to determine how the "cover image" appears in a common template, then browsed my own backend, picked some structurally similar templates I also liked, applied them to my podcast posts, and watched the result. Finally I opened those pages in the block editor to see how they express and handle the cover image, and which settings are available.

At that point I had it pinned down: the root cause was the gap between this long-delisted, unmaintained old template and current technology, which turned the output cover into a wrongly proportioned thumbnail. Switching to a new template would fix it.

But in truth I had been considering a template change for a long time.

Even before this cover incident I wanted a new one. Partly, the generational gap really shows in daily writing and when adding pages; I was just too lazy to act while things still worked. I suspect most programmers think the same; that's how mountains of legacy code accumulate. But a mere template swap wouldn't need that much deliberation, so the more central reason is that I was simultaneously considering upgrading the blog's plan to the Explorer tier and buying several more years of my domain in advance. In my mind, the template update, the plan upgrade, and the domain were one bundled decision.

This little accident felt like a push from above.

So I made up my mind and handled all three at once: upgrade, domain, template. My blog is now in a leaner but more usable state, and the podcast feed is being fetched normally again.

From 2013 and 2022 to today, I've watched my blog grow ever simpler while its content grows fuller, and it makes me happy. That's my state these years: the fuller I get, the less decoration I need, and all form yields to content. A comfortable T-shirt is all I need. The "individuality" I liked to talk about when young isn't expressed through web pages, phone cases, or clothes and shoes; individuality is how you act, and needs no visual presentation.

It's another kind of "I went bald, but I got stronger".

Doing a podcast on my blog, then overhauling the blog because of the podcast, then recording the whole process on the blog: if I didn't write it down, telling people this story would make me sound unhinged! And although Blog and Podcast are completely unrelated things, their simplified-Chinese names, 博客 and 播客, are as alike as a tongue-twister. Funny, that.

An alien in the rabbit kingdom_0.ylog

Welcome to 荒野楼阁 WildloG, the personal podcast of designer Su Zhibin. In this episode 0, I share my motivation for making this podcast, what its themes will be, the origin of the name, the design thinking behind this episode's cover and the podcast logo, and the plan for future episodes.

In this episode you'll hear:

—- Who am I? From the zoo at home, through my work and theatre experience, to my twenty-plus years of writing habit.

—- My attitude to video content, and why the 设以观复 series went unupdated for so long.

—- I originally didn't want to do a podcast; why the change of mind? What's the motivation?

—- The podcast's theme: an alien.

—- What does the name 荒野楼阁 WildloG mean? Because it's teeming with life!

—- Why not, and WildloG.

—- Cover design: a hidden wilderness and the rabbit kingdom.

—- Two main formats: gathering sticks alone, and gathering sticks with others.

—- Stopping video updates?

| Cast |

Su Zhibin: industrial designer, product manager / co-founder in connected-car smart hardware, author of 设以观复

| Links |

If your podcast client doesn't show the full artwork, or playback fails due to network problems, please visit:

荒野楼阁 WildloG: https://suithink.me/zlink/podcast/

Design-related articles: https://suithink.me/zlink/idea/

| Other social media |

苏志斌 @ Zhihu | SUiTHiNK @ Jike / Weibo

苏志斌SUiTHiNK @ Bilibili / YouTube / Xiaohongshu

| Contact |

suithink.su@gmail.com

You're welcome to listen on Xiaoyuzhou, Spotify, YouTube, or Apple Podcasts. I look forward to your comments.


Taxonomy

Time to sort the blog's posts into categories.

This task has been put off for over a decade, long past the point of meaning anything. The blog hasn't displayed a category list for many years. I once planned to dump every post into uncategorized and use tags for classification instead; thankfully I didn't, or it would have been yet another abandoned half-measure.

Still, let me at least build the category skeleton. Even with the skeleton in place, the old posts haven't been properly sorted into it; that can keep waiting.


yy – /yy

That is, 意淫 ("fantasising"), one of the catchphrases of the first-generation Chinese internet. In a sense every post on this blog is yy, so it has been a category since the site was first built. The content leans toward all sorts of odd fantasies, not necessarily erotic ones. I keep feeling the word has aged badly, even gone greasily male, and think about replacing it with something loftier (Bourdieu's illusio, say...); but for now it stays.

Days like water – /current

Daily-life bookkeeping. In the heyday of blogging, a whole crowd wrote in the what-I-ate-played-bought-today style. I meant to learn that kind of down-to-earthness, but never wrote much of it. The English slug comes from Raymond Carver's "The Current".

nowhere man – /nowhere

Travel, or fragments of daily life with a travelling air. The title is from the Beatles.

norwegian wood – /wood

Matters of the heart. From Haruki Murakami, and of course also the Beatles.

Not serious – /funny

Pure jokes.

偶知道 – /seriously

Somewhat serious thinking. The name puns on "happened to know" and "I know" (偶 = "I" in ancient internet slang, embarrassing enough to curl your toes).

Them – /ille

Mostly descriptions of other people, some quoted from the net, some from actual interviews. ille is the third-person pronoun. Writing about others ultimately reflects back on oneself.

Anarchist aesthetics – /aesthetic

Originally gripes about current affairs. Later it also took in views, from an anarchist angle, that differ from those inside my own bubble. I'm not going to labour at using this blog to spread dissident talk, though, so in the end it rests lightly on the word "aesthetics": things that simply look wrong to me.

love me do – /tech

Nothing to do with humanistic feeling; technical guides across various hobby domains: climbing, archery, languages, IT...

Snapshots – /photography

A childhood hobby. This blog once had "photography" at the top of a two-level hierarchy with many fine subcategories: photos, notes, theoretical discussions... There was even a separate photography blog, later shut down and never merged back. Bits and pieces remain here, so they're piled into this category. So there's still love in it, clearly; note that even original IT writing doesn't get its own category and is stuffed into /tech /fyi /misc...

FYI – /fyi

For Your Information: resources that feel useful to many people. Similar to /tech but with far less of the hobbyist element; organised as mere tools, and not necessarily original.

misc – /misc

Miscellany.

Recommending your daily reading to friends, via RSS

We may all blog less often now, but there are still other games to play with RSS!

Many people use read-it-later or bookmarking tools to save interesting pages. In these tools, with a few tricks, you can turn the pages you'd like to share into an RSS feed. Friends who subscribe to that feed automatically see the articles you recommend!

Below are the ways some common bookmarking sites generate RSS. But first:

  • Most of these generated RSS URLs embed the site's access token, so they leak secrets and are long and ugly. Strong recommendation: once you have the feed, run it through something like Feedburner to mint a new RSS URL before sharing, which strips the private information from the original address.
  • Many sites have no dedicated "share" category, so you can only bend the archived or starred categories to the purpose, which may disturb your existing habits.
  • Some sites' generated RSS has no full text, sometimes only a title and link. That's fine; just seeing each other's recommendations is already good, and you can click through yourself.
  • Most RSS readers can also send items to a read-it-later service, so you can roundaboutly share the articles you subscribe to with others.

My RSS sharing address is:

https://feed.fivest.one/readings
or
https://feeds.feedburner.com/fivestone/readings

If you'd like to swap feeds with me, leave a comment or message!


Instapaper (free)

Instapaper's RSS generation feels the best: you can make a specific folder public and get its RSS directly, of the form:

https://instapaper.com/folder/1234567/rss/123456/Vciysdfsd7mod9B
  • Create a new Folder in the left sidebar for the articles you want to share;
  • Enter the Folder, choose Edit at the top of the page, and set it to Public;
  • Open the drop-down menu at the top right and choose Download;
  • Download as RSS Feed, and you have the RSS address.

Pocket (free)

It seems Pocket free accounts can only be public (speechless...): anyone who knows the username can read everything via RSS (hence the generated feed needs relaying to be safe). Nor can you define categories; there are only the defaults:

https://getpocket.com/users/USERNAME/feed/all
https://getpocket.com/users/USERNAME/feed/unread
https://getpocket.com/users/USERNAME/feed/read

The last one, .../read, returns all Archived articles, which can just about serve as a means of categorisation.

Readwise

Paid only, so I haven't tried it. Any Readwise user care to share how it generates RSS?

Wallabag

I use Wallabag, which can be self-hosted and also offers a paid service. Its RSS output is full-text and works well. Under Config – Feed, generate a token, then open any tag and click the RSS icon above the list to get that tag's feed (the generated address carries your single shared token, so it needs relaying to be safe):

https://wallabag.your.domain/feed/USERNAME/asdfghjkl/tags/t:share

Sport without borders, is it?

🎥 Click the cover to play the video

Starting from a question, I talk, from a designer's and creator's perspective, about nationalism in sport, and about guarding against the effect of "categorising" habits of mind on one's thinking.

Today's outdoor walking log: a new quarter-marathon route!

🎥 Bilibili: https://www.bilibili.com/video/BV1yv421k7Mg/

🎥 YouTube: https://youtu.be/w2KQoic8LAg

This is the first video since Chinese New Year, mainly just to open a conversation. If you have any thoughts, you're welcome to discuss them rationally with others in the video's danmaku or comments.

A complete record of adding RSS to my blog

My personal site LRD.IM launched on New Year's Day 2018 and has been maintained continuously for four years. From swapping text and images into a template at the start, to today's fully "self-developed" site, it's the work I'm proudest of. The next step is to output my blog content as an RSS feed.

This post isn't about design. It covers my first impressions of the internet, why I turned the blog into an RSS source, and the process of generating my own site's feed with RSSHub.

Chit-chat: my internet memoir

Impressions of the early internet

We got a computer at home in 2005, and as I recall the tide was just turning from portal sites to "social networking".

Discuz!, Tieba, Qzone, Kaixin001 and the rest were all Web 2.0 icons, and "back in the day" I took part in those ecosystems intensely. The internet of that era encouraged people to create, and emphasised interaction and the content itself, unlike today's so-called "algorithms" endlessly spoon-feeding content, with sites full of content farms, ads, manufactured conflict (and even censorship 🤫)...

I remember online content being genuinely interesting then, because it was self-motivated and felt real. Nowadays, when I see something "interesting" online it feels like the product of some MCN agency: everything has a purpose, endlessly packaged and incubated, pointing at eventual monetisation...

So, as a netizen who grew up against that Web 2.0 backdrop and who happened to pick up some professional skills, I tinkered my way to my own site, LRD.IM, and the 40-plus design blog posts accumulated so far. Partly to consolidate my professional knowledge, and even more to leave some traces of my own (and belonging to me) on the internet.

An RSS obsession

Blogs and news sites built on WordPress always used to have an RSS subscription channel. At first I had no idea what it meant; I just remembered the inconspicuous orange icon with the rainbow stripes.

Only at university, after learning about ways of sourcing and categorising information, did I really understand RSS. It's rather magical: it reformats content, separating it from the page, so readers can receive and read new items in whatever third-party reader they prefer.

I rather agree with the concept, even if it's a product of 20 years ago.

Decoupling content from the original page means readers can read in a third-party reader of their own choosing, or even build their own reader, instead of being confined to the page owner's design.

And third-party readers usually offer new-item notifications, read/unread marks, categories, tags and so on, meaning readers can own their way of receiving information, free of algorithmic interference, and get new content promptly.

This year my design posts were picked up by quite a few WeChat accounts and newsletters, and I realised that having my own content-delivery channel was becoming urgent. RSS or email subscriptions would both satisfy the need, so I decided to start with RSS.

In short: I endorse RSS's neutrality, purity, and freedom from algorithms, and I need a push channel. As a producer of internet content, it was time to build an RSS service for my own blog!

Hands-on: generating an RSS feed from my own site

Expectations

First, my expected outcome:

  • Crawl all posts on the Blog page and detect content changes (additions, deletions, edits);
  • Output the crawled content in the format RSS requires, i.e. XML;
  • On updates, guarantee at least reasonable timeliness (no long delays) and accuracy (no scrambled ordering) in third-party readers;
  • Use my own domain for the subscription link.

Attempt 1: third-party generation services

First I pinned my hopes on third-party RSS generation services such as Feed43, Feedburner, and RSS.app, expecting them to meet the goals above by automatically crawling my new posts into an RSS file. They did work, but each had a deal-breaker.

  • Feed43: fast, accurate updates, but the free tier inserts their branding into the generated feed, and access speed from China is mediocre;
  • Feedburner: average update speed, accurate, but blocked in China;
  • RSS.app: fast, accurate, but it only keeps the latest few items unless you pay to upgrade. Whether it's blocked I never tested.

Having tried the generators recommended around the forums and discussion groups, none perfectly met my expectations; each fell just short. I'd have to do it myself.

Attempt 2: a hand-made static file

With external services out, I went and read the RSS syntax specification and tried the dumbest method: hand-writing an XML document to spec and giving that to readers to subscribe to.

After laboriously finishing the XML I uploaded two test copies, one to my Alibaba Cloud Simple Application Server and one to Alibaba Cloud OSS, then subscribed to the RSS links in third-party readers.

The result: scrambled ordering, wrong timestamps, and very sluggish updates. I had thought the dumb method would at least work, just tediously; I hadn't expected so many flaws.

Final attempt: generating the feed with RSSHub

Getting to know RSSHub
With the first two methods failing and no good ideas left, I posted on V2EX to ask for help. Only two people replied, but that was enough to point the way.

The first reply suggested filing a request with RSSHub, so that someone would write a rule to crawl my site, merge it into the RSSHub project, and the resulting link would become the subscription URL.

Looking into RSSHub, I learned it's an open-source RSS feed generator: write a rule (a script, in project terms) and it outputs the content as a spec-compliant feed, which any third-party reader can then subscribe to and display. You can deploy it on your own server, or use the rules others have already contributed to the RSSHub project; the effect is the same.

A quick look at RSSHub's capabilities, and at feeds generated with it, suggested it could solve my problem functionally, though timeliness and accuracy could only be judged after deploying it myself.

The second reply argued that a static XML file as a feed is subject to caching mechanisms beyond my control, which made me all the more determined to try RSSHub: dynamic generation, not a dumb static file.

Installing RSSHub
Log in to the server. Mine runs on an Alibaba Cloud Simple Application Server, so I can get a remote shell right in the browser; on any other server, ssh in, then run these commands in turn to install and start RSSHub:

git clone https://github.com/DIYgod/RSSHub.git
cd RSSHub
npm install
npm start

Once it's installed and running, visiting port 1200 in a browser should show the page below.

# 设置常驻
npm install pm2 -g
pm2 start lib/index.js
# 设置开机自启
pm2 save
pm2 startup

Now you can visit a few of RSSHub's built-in routes and explore other interesting RSS feeds. Then it's time to write your own route script.

Writing the route script
My approach here was to follow the documentation, find an existing route in the RSSHub project whose page structure resembled my own, and imitate it.

According to the docs, /lib/router.js lists every route, i.e. it maps incoming URLs to scripts. So I first add a route there, pointing to the script I'll create next:

# add a route
router.get('/feed', lazyloadRouteHandler('./routes/lrdim/blogs'));

Here /feed means I want my feed reachable at domain/feed, i.e. lrd.im/feed. The ./routes/lrdim/blogs part means requests to that path are handled by /routes/lrdim/blogs.js, whose rules generate the RSS feed.

Then create the lrdim folder under /lib/routes and the blog.js file inside it; the crawling rules go in there.

Since my site is static HTML, I followed the second method in the docs: extracting data by element tags and styles. I also referenced a built-in RSSHub route with a similarly simple page structure, 十年之约. I won't detail the specifics here; just follow the documentation and you'll be fine, and the cheerio syntax is intuitive and easy to grasp.

A few minutes after saving, the generated feed was available at lrd.im:1200/feed. At this point, check the name, description, titles, summaries, dates, and so on.

Note: dates must follow the required format, with no illegal content, or the item will not appear in the output. I wrote mine as YYYY-MM-DD.

Adjusting the render template
The data fetched by the RSSHub script passes through the rss.art template before becoming the RSS feed, so open /lib/views/rss.art and make the corresponding adjustments; for instance, the preset <webMaster> wasn't what I wanted, so I changed it to my own email address.

Setting up the proxy
At this point my RSS feed was complete, with content and format in order; only the domain was still off.

So far the feed still had to be reached through a port: lrd.im:1200/feed. That clearly can't be the final product; who exposes a port directly? Too embarrassing. So one last step: configure a proxy.

My server runs nginx. Open nginx/conf/nginx.conf and configure a proxy mapping lrd.im/feed to the page on port 1200:

location /feed {
    proxy_pass http://127.0.0.1:1200;
}

Testing lrd.im/feed now serves the RSSHub-generated feed normally.

Acceptance testing
I took the subscription link to the mainstream readers, searching for and adding it, then added, deleted, and edited the crawled pages, confirming that updates arrive promptly with nothing scrambled.

Done! Here is my design blog's official RSS subscription link: https://lrd.im/feed

Paste it into an RSS reader to get my latest design posts promptly. Reader choice is personal; currently I find both Inoreader and Reeder pleasant to use.

Going further: surfacing RSS on the site

After all the trouble of building this new feature, it deserves exposure on my own site at the very least. So I did three more things along the way.

Supporting RSSHub Radar

RSSHub has a browser extension, RSSHub Radar, which detects the RSS feeds a site offers so readers can subscribe right from the extension instead of hunting the page for an RSS button.

Supporting it isn't complicated: I added a subscription address to the <head> of every page, and RSSHub Radar can then detect it.
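The autodiscovery address in <head> is conventionally a <link rel="alternate"> tag; a sketch of writing such a snippet (the title text and snippet file name are assumptions, not taken from the site):

```shell
# Write the RSS autodiscovery tag into an illustrative snippet file
cat > head-snippet.html <<'EOF'
<link rel="alternate" type="application/rss+xml" title="LRD.IM Blog" href="https://lrd.im/feed">
EOF
grep 'application/rss+xml' head-snippet.html
```

Tools like RSSHub Radar and many readers look for exactly this `type="application/rss+xml"` link when probing a page for feeds.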

A new-feature notice on the blog page

Since this is an RSS feed of my blog content, naturally lrd.im/blog.html should get a slightly eye-catching banner announcing that RSS is now supported!

I borrowed styling cues from Instagram and Apple here: the highlight fades after a few seconds and blends into the page, done with a CSS keyframe animation.

A button group on post pages

Google Analytics shows that much of my traffic lands directly on post detail pages rather than arriving via the list page. So, also to raise exposure there, I added a few buttons to the right of the author line: copy article link, RSS subscription, and contact Xiaodong.

"Copy article link" was originally responsive and shown only at mobile sizes, but GA also shows at least 75% of my blog's visits come from desktop, so I enabled it everywhere.

"RSS subscription" jumps straight to lrd.im/feed; you know what to do from there. Having the entry point on every post greatly increases exposure.

"Contact Xiaodong" pops up my WeChat QR code. My design posts usually carry my opinions and views, and the site has no comment feature, so I put a QR code there; readers who want to discuss something can add me directly and talk.

I deliberately didn't make these buttons float at the sides of the page, because I don't want distractions while people are reading.

Oh, and along the way I closed an Issue, the only one so far.

Summary

Generating an RSS feed for the blog had been on my mind for half a year, but from actually starting to the final result took about a week, all in evenings and weekends.

Time to dig new holes for myself. On blog subscriptions, I'm going to:

  • Find a way to submit my feed link to the blog category of the RSSHub project;
  • Study how mainstream RSS readers pick an article's lead image. The image picked now doesn't always match my expectations; my post detail pages may need small structural adjustments;
  • Explore whether an email subscription feature is necessary. Medium can already email subscribers my latest posts; I'm wondering whether to build an email subscription with something like Revue or Zhubai, embedded into the site. My worries are whether such platforms' text editors are good enough, and the overlap with Medium.

One obsession fulfilled; a weight off my mind. But another obsession has been sitting there for at least a year or two, still waiting...


Hosting Ghost Blog with Docker on NixOS


As previously mentioned, I have successfully deployed NixOS on my Oracle ARM machine. You can find the original post here:

How to Install NixOS on Oracle ARM machine
The steps I undertook to install NixOS on an Oracle ARM machine.

In the past, my blog was hosted on Tencent Cloud using Typecho. Unfortunately, due to unforeseen circumstances, I lost ownership of that machine along with all my previous posts. Consequently, I took a hiatus from blogging, remaining silent for a few years. However, I now realize the importance of reviving my blog before lethargy engulfs me.

After extensive research into platforms such as Ghost, WordPress, Typecho, Hugo, and others, I finally settled on Ghost. Its remarkable speed, wealth of custom themes, pleasing web interface, and integrated membership system decided it for me.

Check out all the cool stuff Ghost has to offer on their website below:

Ghost: The Creator Economy Platform
The world’s most popular modern publishing platform for creating a new media platform. Used by Apple, SkyNews, Buffer, Kickstarter, and thousands more.

Due to the absence of Ghost in the NixOS packages, and the cumbersome nature of adapting it into a NixOS service, Docker has emerged as an excellent solution for hosting Ghost. Here, I have provided a comprehensive breakdown of the steps I followed to set up a blog using Ghost with Docker on NixOS. This can be modified to use on other platforms.

Step 0: Enable Docker on NixOS

Enabling Docker (or Podman) on NixOS is straightforward, requiring modification of just one configuration file. I personally prefer the vim editor, but feel free to use your preferred tool, such as nano, emacs, or VS Code.

The initial step involves logging into the machine, particularly if it is being used as a server.

ssh ${username}@${server IP}

Then, we can start to modify the configuration file:

sudo vim /etc/nixos/configuration.nix

There are two ways of adding Docker to the NixOS system: for all users:

environment.systemPackages = with pkgs; [
  docker
];

And for one user only:

users.users.${username}.packages = with pkgs; [
  docker
];

You can choose either way based on your needs. The next step is to enable the Docker service.

virtualisation.docker.enable = true;
virtualisation.oci-containers.backend = "docker";

Note that we're using oci-containers to manage the containers; if you chose Podman instead, remember to adjust the backend accordingly. Some may ask why we aren't using docker-compose. The answer is simple: we embrace the capabilities of NixOS, and that suffices.

Last, remember to create a directory for docker to use. Here's my example:

mkdir ~/.docker

Step 1: Set up Docker Network

Using the Docker CLI command docker network will indeed create the network, but it may not be the optimal approach. Since we're operating within the context of NixOS, we can add it as a service. Add the following code snippet to your configuration.nix file, ensuring to customize the name according to your requirements. In my case, I'm utilizing npm as an example since I'm employing nginx-proxy-manager as my Nginx reverse proxy service.

systemd.services.init-docker-ghost-network-and-files = {
  description = "Create the network npm for nginx proxy manager using reverse proxy.";
  after = [ "network.target" ];
  wantedBy = [ "multi-user.target" ];

  serviceConfig.Type = "oneshot";
  script =
    let dockercli = "${config.virtualisation.docker.package}/bin/docker";
    in ''
      # Put a true at the end to prevent getting non-zero return code, which will
      # crash the whole service.
      check=$(${dockercli} network ls | grep "npm" || true)
      if [ -z "$check" ]; then
        ${dockercli} network create npm
      else
        echo "npm already exists in docker"
      fi
    '';
};

Step 2: Set up Mysql for Ghost

We will now proceed with crafting Docker configurations. The initial step involves creating an external directory for MySQL to store its data, ensuring that we can modify MySQL without accessing the Docker environment directly. At present, this MySQL database is exclusively intended for Ghost; however, you have the freedom to tailor it according to your specific requirements.

mkdir ~/.docker/ghost-blog/mysql -p

Please add the following snippet to your configuration file as well:

virtualisation.oci-containers.containers."ghost-db" = {
  image = "mysql:latest";
  volumes = [ "/home/hua/.docker/ghost-blog/mysql:/var/lib/mysql" ];
  environment = {
    MYSQL_ROOT_PASSWORD = "your_mysql_root_password";
    MYSQL_USER = "ghost";
    MYSQL_PASSWORD = "ghostdbpass";
    MYSQL_DATABASE = "ghostdb";
  };
  extraOptions = [ "--network=npm" ];
};

Please note that Ghost no longer supports SQLite or MariaDB as database options.

Step 3: Set up Ghost Docker

Finally, It's time for Ghost.

Basic Setup Configuration

Following the previous instructions, we will proceed to create the content folder:

mkdir ~/.docker/ghost-blog/content

Now, let's move on to configuring Ghost:

virtualisation.oci-containers.containers."ghost-blog" = {
  image = "ghost:latest";
  volumes =
    [ "/home/hua/.docker/ghost-blog/content:/var/lib/ghost/content" ];
  dependsOn = [ "ghost-db" ];
  ports = [ "3001:3001" ];
  environment = {
    NODE_ENV = "development";
    url = "http://${server IP}:3001";
    database__client = "mysql";
    database__connection__host = "ghost-db";
    database__connection__user = "ghost";
    database__connection__password = "ghostdbpass";
    database__connection__database = "ghostdb";
  };
  extraOptions = [ "--network=npm" ];
};

Within this section, we configure the port mapping, environment variables, and volume mapping. Please note that you should customize the MySQL configurations in accordance with your specific setup in the final step.

Mail Server Set Up

Taking Gmail as an example, please note that you can modify this configuration according to your specific needs.

virtualisation.oci-containers.containers."ghost-blog".environment = {
  mail__transport = "SMTP";
  mail__options__service = "Google";
  mail__options__auth__user = "username@gmail.com";
  mail__options__auth__pass = "your google app password";
  mail__options__host = "smtp.gmail.com";
  mail__options__port = "587";
  mail__options__secure = "false";
  mail__from = "username@gmail.com";
  tls__rejectUnauthorized = "true";
};

Please remember that the Google app password mentioned here is different from your actual Google account password. You can generate a Google app password by following the steps outlined in the Sign in with app passwords guide.

By configuring these settings, visitors will be able to sign up and leave comments on our website.

More Custom Options

Please refer to the instructions provided on the Ghost website at the following link:

Configuration - Adapt your publication to suit your needs
Find out how to configure your Ghost publication or override Ghost’s default behaviour with robust config options, including mail, storage, scheduling and more!

Step 4: Set up Nginx Reverse Proxy

There are numerous articles available on the internet that explain how to set up Nginx as a system service or utilize nginx-proxy-manager as a Docker service. For the purpose of this example, I will demonstrate the Docker service approach. Remember to create the necessary folders as well.

virtualisation.oci-containers.containers."nginx-proxy-manager" = {
  image = "jc21/nginx-proxy-manager:latest";
  dependsOn = [ "ghost-blog" "chatgpt-next-web" ];
  volumes = [
    "/home/hua/.docker/nginx-proxy-manager/data:/data"
    "/home/hua/.docker/nginx-proxy-manager/letsencrypt:/etc/letsencrypt"
  ];
  ports = [ "80:80" "443:443" "81:81" ];
  extraOptions = [ "--network=npm" ];
};

Step 5: Rebuild System

sudo nixos-rebuild switch

Step 6: Start to Use

After rebuilding the system, you can proceed to open the web pages for both Ghost and nginx-proxy-manager.

For information and usage details about Ghost, please visit:

Ghost: The Creator Economy Platform
The world’s most popular modern publishing platform for creating a new media platform. Used by Apple, SkyNews, Buffer, Kickstarter, and thousands more.

To learn more about nginx-proxy-manager, please visit:

Nginx Proxy Manager
Docker container and built in Web Application for managing Nginx proxy hosts with a simple, powerful interface, providing free SSL support via Let’s Encrypt

Please note that once you have set up the nginx reverse proxy for Ghost, it's necessary to modify the Docker configuration for Ghost as follows:

virtualisation.oci-containers.containers."ghost-blog".environment = {
  NODE_ENV = "production";
  url = "https://your-website-address";
};

Please replace your-website-address with the actual address of your website. After making this modification, rebuild the system again.

In conclusion, if you have any further questions, please feel free to leave a comment without hesitation.

Daily renewal

Someone suddenly asked me: how have you kept blogging for so long, with sustained output?

(Honoured, picking up the mic:) Ah, I wouldn't call what I do "sustained output". I can't even manage a post a month any more, and in my own mind the technical posts don't count as "updating the blog"; padding the count with them is beneath me...

But the first thing that came to mind when I saw the question was an important factor: probably, that the site has simply always been here. My technical ability keeps this blog alive without any extra effort. So whenever I want to write something, there is always a place where I can.

There have also been long stretches when I couldn't write at all, couldn't face or ruminate on my own life; but there was no need to shut the site down over that. Let the blog live on; it remains an outlet for expression. Probably I also hoped I could gradually come out of those can't-sort-myself-out states and return to a state where I could write. So the site's continued existence matters a great deal: you really can feel that, when the urge to write something comes, if there were no such site, or if I had to set one up again from scratch, I might just not write...


This state of "being able to write on the site at any moment" also shapes my choice of blogging platform (how did this turn into a technical post again? fine, I've long wanted to rant about this anyway, so here goes). For years there has been the argument over which is better, WordPress or the various static blogs. Both genuinely have pros and cons. Overall, the static blog's greatest advantage is... saving money, by freeloading on hosts like GitHub and Vercel. On the other hand, publishing or editing a post on a static blog is quite a production. Typically it needs

  • a fixed computer with the static-site generator installed, plus that machine's dedicated credentials to publish to GitHub, rather than opening any computer or phone and posting from a browser;
  • a series of dedicated steps at every publish.

I've more than once seen someone who hasn't updated in ages suddenly want to write a post, forget the steps, and dig out a guide to relearn them, or even forget the ssh-key for connecting to GitHub... Maybe others find such friction no big deal, or are organised enough never to hit it. But personally I think it subjectively affects one's readiness to post. Being able to post intuitively from anywhere, on any computer, matters quite a lot.

It does seem possible, through a series of contraptions, to edit posts on some website in a browser and have them automatically built and published to the hosting site. I haven't looked closely. But if you put a blog's life cycle on the scale of 5–10 years, those complex dependencies between sites are largely unreliable. I've already seen several static blogs' bolt-on comment systems mysteriously stop working... All in all, by comparison, I might sooner use the free, ad-supported blog platforms.

My standing recommendation to people new to blogging has always been:

  • If you have the technical skills and a server, self-host WordPress;
  • Or cadge one off someone. If we know each other well, you can buy a domain and hang your blog on my server; it's not much of a burden. (P.S. a small personal WordPress site can run without the resource-hungry MySQL database);
  • If neither works, I'd first recommend registering at wordpress.com or blogspot.com; right now those look like the only two reliable over the long term. The free tiers are ugly and carry ads, but for writing over the long haul they're fine;
  • Of course, I won't pour cold water on anyone keen to try a static blog. But depending on your skills and temperament, I'll quietly worry about:
    • how long you'll keep writing;
    • whether much of what you write will be about how you set up the site...

Here's a repost that anyone familiar with the past decade-plus of blog-platform churn should find funny: people blogging with different tools, (writing blog posts) vs (writing posts about configuring the blog). The jargon in the bottom-right corner belongs to the static-blog schemes of various eras, each demanding its own degree of fiddling: gatsby, org mode, jekyll, hugo, git workflow...


P.S. Two months ago, using this code-based approach, I imported all my Twitter posts into my self-hosted Mastodon. Twitter I'll probably let go of gradually amid Elon Musk's erratic antics. And with the per-post limit going from Twitter's 140 characters to Mastodon's 500, whether those few-hundred-character musings still deserve a proper post over here on the blog is now even harder to decide than before. I haven't worked out what to do.
