The Robotics Revolution

8VC is hosting a meetup for ChinaTalk this coming Thursday. Sign up here if you can make it!


Ryan Julian is a research scientist in embodied AI. He worked on large-scale robotics foundation models at DeepMind and got his PhD in machine learning in 2021.

In our conversation today, we discuss…

  • What makes a robot a robot, and what makes robotics so difficult,

  • The promise of robotic foundation models and strategies to overcome the data bottleneck,

  • Why full labor replacement is far less likely than human-robot synergy,

  • China’s top players in the robotic industry, and what sets them apart from American companies and research institutions,

  • How robots will impact manufacturing, and how quickly we can expect to see robotics take off.

Listen now on your favorite podcast app.

Robotic arms at Tesla’s factory in Fremont, California. Source.

Embodying Intelligence

Jordan Schneider: Ryan, why should we care about robotics?

Ryan Julian: Robots represent the ultimate capital good. Just as power tools, washing machines, or automated factory equipment augment human labor, robots are designed to multiply human productivity. The hypothesis is straightforward — societies that master robotics will enjoy higher labor productivity and lower costs in sectors where robots are deployed, including in logistics, manufacturing, transportation, and beyond. Citizens in these societies will benefit from increased access to goods and services.

The implications become even more profound when we consider advanced robots capable of serving in domestic, office, and service sectors. These are traditionally areas that struggle with productivity growth. Instead of just robot vacuum cleaners, imagine robot house cleaners, robot home health aides, or automated auto mechanics. While these applications remain distant, they become less far-fetched each year.

Looking at broader societal trends, declining birth rates across the developed world present a critical challenge — How do we provide labor to societies with shrinking working-age populations? Robots could offer a viable solution.

From a geopolitical perspective, robots are dual-use technology. If they can make car production cheaper, they can also reduce the cost of weapon production. There’s also the direct military application of robots as weapons, which we’re already witnessing with drones in Ukraine. From a roboticist’s perspective, current military drones represent primitive applications of robotics and AI. Companies developing more intelligent robotic weapons using state-of-the-art robotics could have enormous implications, though this isn’t my area of expertise.

Fundamentally, robots are labor-saving machines, similar to ATMs or large language models. The key differences lie in their degree of sophistication and physicality. When we call something a robot, we’re describing a machine capable of automating physical tasks previously thought impossible to automate — tasks requiring meaningful and somewhat general sensing, reasoning, and interaction with the real world.

This intelligence requirement distinguishes robots from simple machines. Waymo vehicles and Roombas are robots, but dishwashers are appliances. This distinction explains why robotics is so exciting — we’re bringing labor-saving productivity gains to economic sectors previously thought untouchable.

Jordan Schneider: We’re beginning to understand the vision of unlimited intelligence — white-collar jobs can be potentially automated because anything done on a computer might eventually be handled better, faster, and smarter by future AI systems. But robotics extends this to the physical world, requiring both brain power and physical manipulation capabilities. It’s not just automated repetitive processes, but tasks requiring genuine intelligence combined with physical dexterity.

Ryan Julian: Exactly. You need sensing, reasoning, and interaction with the world in truly non-trivial ways that require intelligence. That’s what defines an intelligent robot.

I can flip your observation — robots are becoming the physical embodiment of the advanced AI you mentioned. Current large language models and vision-language models can perform incredible digital automation — analyzing thousands of PDFs or explaining how to bake a perfect cake. But that same model cannot actually bake the cake. It lacks arms, cannot interact with the world, and doesn’t see the real world in real time.

However, if you embed that transformer-based intelligence into a machine capable of sensing and interacting with the physical world, then that intelligence could affect not just digital content but the physical world itself. The same conversations about how AI might transform legal or other white-collar professions could equally apply to physical labor.

Today’s post is brought to you by 80,000 Hours, a nonprofit that helps people find fulfilling careers that do good. 80,000 Hours — named for the average length of a career — has been doing in-depth research on AI issues for over a decade, producing reports on how the US and China can manage existential risk, scenarios for potential AI catastrophe, and examining the concrete steps you can take to help ensure AI development goes well.

Their research suggests that working to reduce risks from advanced AI could be one of the most impactful ways to make a positive difference in the world.

They provide free resources to help you contribute, including:

  • Detailed career reviews for paths like AI safety technical research, AI governance, information security, and AI hardware,

  • A job board with hundreds of high-impact opportunities,

  • A podcast featuring deep conversations with experts like Carl Shulman, Ajeya Cotra, and Tom Davidson,

  • Free, one-on-one career advising to help you find your best fit.

To learn more and access their research-backed career guides, visit 80000hours.org/ChinaTalk.

To read their report about AI coordination between the US and China, visit http://80000hours.org/chinatalkcoord.

Jordan Schneider: Ryan, why is robotics so challenging?

Ryan Julian: Several factors make robotics exceptionally difficult. First, physics is unforgiving. Any robot must exist in and correctly interpret the physical world’s incredible variation. Consider a robot designed to work in any home — it needs to understand not just the visual aspects of every home worldwide, but also the physical properties. There are countless doorknob designs globally, and the robot must know how to operate each one.

The physical world also differs fundamentally from the digital realm. Digital systems are almost entirely reversible unless intentionally designed otherwise. You can undo edits in Microsoft Word, but when a robot knocks a cup off a table and cannot retrieve it, it has made an irreversible change to the world. This makes robot failures potentially catastrophic. Anyone with a robot vacuum has experienced it consuming a cable and requiring rescue — that’s an irreversible failure.

The technological maturity gap presents another major challenge. Systems like ChatGPT, Gemini, or DeepSeek process purely digital inputs — text, images, audio. They benefit from centuries of technological development that we take for granted — monitors, cameras, microphones, and our ability to digitize the physical world.

Today’s roboticist faces a vastly more complex challenge. While AI systems process existing digital representations of the physical world, roboticists must start from scratch. It’s as if you wanted to create ChatGPT but first had to build CPUs, wind speakers, microphones, and digital cameras.

Robotics is just emerging from this foundational period, where we’re creating hardware capable of converting physical world perception into processable data. We also face the reverse challenge — translating digital intent into physical motion, action, touch, and movement in the real world. Only now is robotics hardware reaching the point where building relatively capable systems for these dual processes is both possible and economical.

Jordan Schneider: Let’s explore the brain versus body distinction in robotics — the perception and decision-making systems versus the physical mechanics of grasping, moving, and locomotion. How do these two technological tracks interact with each other? From a historical perspective, which one has been leading and which has been lagging over the past few decades?

Ryan Julian: Robotics is a fairly old field within computing. Depending on who you ask, the first robotics researchers were probably Harry Nyquist and Norbert Wiener. These researchers were interested in cybernetics in the 1950s and 60s.

Norbert Wiener, founder of cybernetics, in an MIT classroom, ~1949. Source.

Back then, cybernetics, artificial intelligence, information theory, and control theory were all one unified field of study. These disciplines eventually branched off into separate domains. Control theory evolved to enable sophisticated systems like state-of-the-art fighter plane controls. Information theory developed into data mining, databases, and the big data processing that powers companies like Google and Oracle — essentially Web 1.0 and Web 2.0 infrastructure.

Artificial intelligence famously went into the desert. It had a major revolution in the 1980s, then experienced the great AI winter from the 80s through the late 90s, before the deep learning revolution emerged. The last child of this original unified field was cybernetics, which eventually became robotics.

The original agenda was ambitious — create thinking machines that could fully supplant human existence, human thought, and human labor — that is, true artificial intelligence. The founding premise was that these computers would need physical bodies to exist in the real world.

Robotics as a field of study is now about 75 years old. From its origins through approximately 2010-2015, enormous effort was devoted to creating robotic hardware systems that could reliably interact with the physical world with sufficient power and dexterity. The fundamental questions were basic but challenging — Do we have motors powerful enough for the task? Can we assemble them in a way that enables walking?

A major milestone was the MIT Cheetah project, led by Sangbae Kim around 2008-2012. This project had two significant impacts — it established the four-legged form factor now seen in Unitree’s quadrupedal robots and Boston Dynamics’ systems, and it advanced motor technology that defines how we build motors for modern robots.

Beyond the physical components, robots require sophisticated sensing capabilities. They need to capture visual information about the world and understand three-dimensional space. Self-driving cars drove significant investment in 3D sensing technology like LiDAR, advancing our ability to perceive spatial environments.

Each of these technological components traditionally required substantial development time. Engineers had to solve fundamental questions — Can we capture high-quality images? What resolution is possible? Can we accurately sense the world’s shape and the robot’s own body position? These challenges demanded breakthroughs in electrical engineering and sensor technology.

Once you have a machine with multiple sensors and actuators, particularly sensors that generate massive amounts of data, you need robust data processing capabilities. This requires substantial onboard computation to transform physical signals into actionable information and generate appropriate motion responses — all while the machine is moving.

This is where robotics historically faced limitations. Until recently, robotics remained a fairly niche field that hadn’t attracted the massive capital investment seen in areas like self-driving cars. Robotics researchers often had to ride the waves of technological innovation happening in other industries.

A perfect example is robotic motors. A breakthrough came from cheap brushless motors originally developed for electric skateboards and power drills. With minor modifications, these motors proved excellent for robotics applications. The high-volume production for consumer applications dramatically reduced costs for robotics.

The same pattern applies to computation. Moore’s Law and GPU development have been crucial for robotics advancement. Today, robots are becoming more capable because we can pack enormous computational power into small, battery-powered packages. This enables real-time processing of cameras, LiDAR, joint sensors, proprioception, and other critical systems — performing most essential computation onboard the robot itself.

Jordan Schneider: Why does computation need to happen on the robot itself? I mean, you could theoretically have something like Elon’s approach where you have a bartender who’s actually just a robot being controlled remotely from India. That doesn’t really count as true robotics though, right?

Ryan Julian: This is a fascinating debate and trade-off that people in the field are actively grappling with right now. Certain computations absolutely need to happen on the robot for physical reasons. The key framework for thinking about this is timing — specifically, what deadlines a robot faces when making decisions.

If you have a walking robot that needs to decide where to place its foot in the next 10 milliseconds, there’s simply no time to send a query to a cloud server and wait for a response. That sensing, computation, and action must all happen within the robot because the time constraints are so tight.

The critical boundary question becomes: what’s the timescale at which off-robot computation becomes feasible? This is something that many folks working on robotics foundation models are wrestling with right now. The answer isn’t entirely clear and depends on internet connection quality, but the threshold appears to be around one second.

If you have one second to make a decision, it’s probably feasible to query a cloud system. But if you need to make a decision in less than one second — certainly less than 100 milliseconds — then that computation must happen on-board. This applies to fundamental robot movements and safety decisions. You can’t rely on an unreliable internet connection when you need to keep the robot safe and prevent it from harming itself or others.

Large portions of the robot’s fundamental motion and movement decisions must stay local. However, people are experimenting with cloud-based computation for higher-level reasoning. For instance, if you want your robot to bake a cake or pack one item from each of ten different bins, it might be acceptable for the robot to query DeepSeek or ChatGPT to break that command down into executable steps. Even if the robot gets stuck, it could call for help at this level — but it can’t afford to ask a remote server where to place its foot.
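To make the timing argument concrete, here is a rough Python sketch of that routing logic: decisions with tight deadlines stay on the robot, slower ones may go to a remote model, and the local policy is the fallback whenever the network fails. The thresholds and the onboard_policy / cloud_planner callables are illustrative placeholders, not any real robot stack.

```python
# A rough sketch of the compute-placement tradeoff described above. The
# deadline threshold and the onboard_policy / cloud_planner callables are
# illustrative stand-ins, not a real robot API.

CLOUD_DEADLINE_S = 1.0  # roughly where a cloud round trip starts to be feasible


def plan_action(task, deadline_s, onboard_policy, cloud_planner):
    """Route a decision onboard or to the cloud based on its time budget."""
    if deadline_s < CLOUD_DEADLINE_S:
        # Tight deadline (e.g. foot placement in 10 ms): never wait on the
        # network. The onboard policy must produce a safe action by itself.
        return onboard_policy(task)
    try:
        # Loose deadline (e.g. "bake a cake"): ask a larger remote model to
        # break the task down, but keep a local fallback if connectivity drops.
        return cloud_planner(task, timeout=deadline_s)
    except (TimeoutError, ConnectionError):
        return onboard_policy(task)


# Usage: a footstep choice stays local; a multi-step chore could go remote.
action = plan_action("place left foot", 0.01,
                     onboard_policy=lambda t: "local reflex action",
                     cloud_planner=lambda t, timeout: "remote high-level plan")
```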

One crucial consideration for commercial deployment is that we technologists and software engineers love to think of the internet as ubiquitous, always available, and perfectly reliable. But when you deploy real systems — whether self-driving cars, factory robots, or future home robots — there will always be places and times where internet access drops out.

Given the irreversibility we discussed earlier, it’s essential that when connectivity fails, the robot doesn’t need to maintain 100% functionality for every possible feature, but it must remain safe and be able to return to a state where it can become useful again once connectivity is restored.

Jordan Schneider: You mentioned wanting robots to be safe, but there are other actors who want robots to be dangerous. This flips everything on its head in the drone context. It’s not just that Verizon has poor coverage — it’s that Russia might be directing electronic warfare at you, actively trying to break that connection.

This creates interesting questions about the balance between pressing go on twenty drones and letting them figure things out autonomously versus having humans provide dynamic guidance — orienting left or right, adjusting to circumstances. There are both upsides and downsides to having robots make these decisions independently.

Ryan Julian: Exactly right. The more autonomy you demand, the more the difficulty scales exponentially from an intelligence perspective. This is why Waymos are Level 4 self-driving cars rather than Level 5 — because Level 5 represents such a high bar. Yet you can provide incredibly useful service with positive unit economics and game-changing safety improvements with just a little bit of human assistance.

Jordan Schneider: What role do humans play in Waymo operations?

Ryan Julian: I don’t have insider information on this, but my understanding is that when a Waymo encounters trouble — when it identifies circumstances where it doesn’t know how to navigate out of a space or determine where to go next — it’s programmed to pull over at the nearest safe location. The on-board system handles finding a safe place to stop.

Then the vehicle calls home over 5G or cellular connection to Waymo’s central support center. I don’t believe humans drive the car directly because of the real-time constraints we discussed earlier — the same timing limitations that apply to robot movement also apply to cars. However, humans can provide the vehicle with high-level instructions about where it should drive and what it should do next.

Jordan Schneider: We have a sense of the possibilities and challenges — the different technological trees you have to climb. What is everyone in the field excited about? Why is there so much money and energy being poured into this space over the past few years to unlock this future?

Ryan Julian: People are excited because there’s been a fundamental shift in how we build software for robots. I mentioned that the hardware is becoming fairly mature, but even with good hardware, we previously built robots as single-purpose machines. You would either buy robot hardware off the shelf or build it yourself, but then programming the robot required employing a room full of brilliant PhDs to write highly specialized robotic software for your specific problem.

These problems were usually not very general — things like moving parts from one belt to another. Even much more advanced systems that were state-of-the-art from 2017 through 2021, like Amazon’s logistics robots, were designed to pick anything off a belt and put it into a box, or pick anything off a shelf. The only variations were where the object was located, how to position the gripper around it, what shape it was, and where to move it.

From a human perspective, that’s very low variation — this is the lowest of low-skilled work. But even handling this level of variation required centuries of collective engineering work to accomplish with robots.

A pick-and-place robot aligns wafer cookies during the packaging process. Source.

Now everyone’s excited because we’re seeing a fundamental change in how we program robots. Rather than writing specific applications for every tiny task — which obviously doesn’t scale and puts a very low ceiling on what’s economical to automate — we’re seeing robotics follow the same path as software and AI. Programming robots is transforming from an engineering problem into a data and AI problem. That’s embodied AI. That’s what robot learning represents.

The idea is that groups of people develop robot learning software — embodied AI systems primarily composed of components you’re already familiar with from the large language model and vision-language model world. Think large transformer models, data processing pipelines, and related infrastructure, plus some robot-specific additions. You build this foundation once.

Then, when you want to automate a new application, rather than hiring a big team to build a highly specialized robot system and hope it works, you simply collect data on your new application and provide it to the embodied AI system. The system learns to perform the new task based on that data.
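As a rough illustration of what “collect data on your new application” means in practice, here is a hedged Python sketch of demonstration episodes that pair sensor observations with teleoperated actions and a language instruction; the field names and shapes are hypothetical, not any particular company’s data format.

```python
from dataclasses import dataclass
from typing import List

import numpy as np

# Illustrative only: what task-specific demonstration data might look like
# before it is handed to a pretrained embodied-AI model for fine-tuning.

@dataclass
class Step:
    image: np.ndarray             # camera frame, e.g. shape (224, 224, 3)
    joint_positions: np.ndarray   # proprioception for each joint
    action: np.ndarray            # commanded end-effector or joint deltas

@dataclass
class Episode:
    instruction: str              # natural-language description of the task
    steps: List[Step]

def make_finetuning_dataset(episodes: List[Episode]):
    """Flatten demonstrations into (instruction, observation, action) examples
    that a pretrained generalist policy can be fine-tuned on."""
    examples = []
    for ep in episodes:
        for step in ep.steps:
            examples.append((ep.instruction, step.image,
                             step.joint_positions, step.action))
    return examples
```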

This would be exciting enough if it worked for just one task. But we’re living in the era of LLMs and VLMs — systems that demonstrate something remarkable. When you train one system to handle thousands of purely digital tasks — summarizing books, writing poems, solving math problems, writing show notes — you get what we call a foundation model.

When you want that foundation model to tackle a new task in the digital world, you can often give it just a little bit of data, or sometimes no data at all — just a prompt describing what you want. Because the system has extensive experience across many different tasks, it can relate its existing training to the new task and accomplish it with very little additional effort. You’re automating something previously not automated with minimal effort.

The hope for robotics foundation models is achieving the same effect with robots in the physical world. If we can create a model trained on many different robotic tasks across potentially many different robots — there’s debate in the field about this — we could create the GPT of robotics, the DeepSeek of robotics.

Imagine a robot that already knows how to make coffee, sort things in a warehouse, and clean up after your kids. You ask it to assemble a piece of IKEA furniture it’s never seen before. It might look through the manual and then put the furniture together. That’s probably a fantastical vision — maybe 10 to 20 years out, though we’ll see.

But consider a softer version: a business that wants to deploy robots only needs to apprentice those robots through one week to one month of data collection, then has a reliable automation system for that business task. This could be incredibly disruptive to the cost of introducing automation across many different spaces and sectors.

That’s why people are excited. We want the foundation model for robotics because it may unlock the ability to deploy robots in many places where they’re currently impossible to use because they’re not capable enough, or where deployment is technically possible but not economical.

Jordan Schneider: Is all the excitement on the intelligence side? Are batteries basically there? Is the cost structure for building robots basically there, or are there favorable curves we’re riding on those dimensions as well?

Ryan Julian: There’s incredible excitement in the hardware world too. I mentioned earlier that robotics history, particularly robotics hardware, has been riding the wave of other industries funding the hard tech innovations necessary to make robots economical. This remains true today.

You see a huge boom in humanoid robot companies today for several reasons. I gave you this vision of robotics foundation models and general-purpose robot brains. To fully realize that vision, you still need the robot body. It doesn’t help to have a general-purpose robot brain without a general-purpose robot body — at least from the perspective of folks building humanoids.

Humanoid robots are popular today as a deep tech concept because pairing them with a general-purpose brain creates a general-purpose labor-saving machine. This entire chain of companies is riding tremendous progress in multiple areas.

Battery technology has become denser, higher power, and cheaper. Actuator technology — motors — has become more powerful and less expensive. Speed reducers, the gearing at the end of motors or integrated into them, traditionally represented very expensive components in any machine using electric motors. But there’s been significant progress making these speed reducers high-precision and much cheaper.

Sensing has become dramatically cheaper. Camera sensors that used to cost hundreds of dollars are now the same sensors in your iPhone, costing two to five dollars. What was once among the most expensive components you can imagine is now cheap enough that it’s totally economical to place them all over a robot.

Computation costs have plummeted. The GPUs in a modern robot might be worth a couple hundred dollars, which represents an unimaginably low cost for the available computational power.

Robot bodies are riding this wave of improving technologies across the broader economy — all dual-use technologies that can be integrated into robots. This explains why Tesla’s Optimus humanoid program makes sense: much of the hardware in those robots is already being developed for other parts of Tesla’s business. But this pattern extends across the entire technology economy.

Jordan Schneider: Ryan, what do you want to tell Washington? Do you have policy asks to help create a flourishing robotics ecosystem in the 21st century?

Ryan Julian: My policy ask would be for policymakers and those who inform them to really learn about the technology before worrying too much about the implications for labor. There are definitely implications for labor, and there are also implications for the military. However, the history of technology shows that most new technologies are labor-multiplying and labor-assisting. There are very few instances of pure labor replacement.

I worry that if a labor replacement narrative takes hold in this space, it could really hold back the West and the entire field. As of today, a labor replacement narrative isn’t grounded in reality.

The level of autonomy and technology required to create complete labor replacement in any of the job categories we’ve discussed is incredibly high and very far off. It’s completely theoretical at this point.

My ask is, educate yourself and think about a world where we have incredibly useful tools that make people who are already working in jobs far more productive and safer.

China’s Edge and the Data Flywheel

Jordan Schneider: On the different dimensions you outlined, what are the comparative strengths and advantages of China and the ecosystem outside China?

Ryan Julian: I’m going to separate this comparison between research and industry, because there are interesting aspects on both sides. The short version is that robotics research in China is becoming very similar to the West in quality.

Let me share an anecdote. I started my PhD in 2017, and a big part of being a PhD student — and later a research scientist — is consuming tons of research: reams of dense 20-page PDFs packed with information. You become very good at triaging what’s worth your time and what’s not. You develop heuristics for what deserves your attention, what to throw away, what to skim, and what to read deeply.

Between 2017 and 2021, a reliable heuristic was that if a robotics or AI paper came from a Chinese lab, it probably wasn’t worth your time. It might be derivative, irrelevant, or lacking novelty. In some cases, it was plainly plagiarized. This wasn’t true for everything, but during that period it was a pretty good rule of thumb.

Over the last two years, I’ve had to update my priors completely. The robotics and AI work coming out of China improves every day. The overall caliber still isn’t quite as high as at US, EU, and other Western institutions, but the best work in China — particularly in AI and my specialization in robotics — is rapidly catching up.

Today, when I see a robotics paper from China, I make sure to read the title and abstract carefully. A good portion of the time, I save it because I need to read it thoroughly. In a couple of years, the median quality may be the same. We can discuss the trends driving this — talent returning to China, people staying rather than coming to the US, government support — but it’s all coming together to create a robust ecosystem.

Moving from research to industry, there’s an interesting contrast. Due to industry culture in China, along with government incentives and the way funding works from provinces and VC funds, the Chinese robotics industry tends to focus on hardware and scale. They emphasize physical robot production.

Xiaomi’s “Dark Factory” 黑灯工厂 autonomously produces smartphones. Source.

When I talk to Chinese robotics companies, there’s always a story about deploying intelligent AI into real-world settings. However, they typically judge success by the quantity of robots produced — a straightforward industrial definition of success. This contrasts with US companies, which usually focus on creating breakthroughs and products that nobody else could create, where the real value lies in data, software, and AI.

Chinese robotics companies do want that data, software, and AI capabilities. But it’s clear that their business model is fundamentally built around selling robots. Therefore, they focus on making robot hardware cheaper and more advanced, producing them at scale, accessing the best components, and getting them into customers’ hands. They partner with upstream or downstream companies to handle the intelligence work, creating high-volume robot sales channels.

Take Unitree as a case study — a darling of the industry that’s been covered on your channel. Unitree has excelled at this approach. Wang Xingxing and his team essentially took the open-source design for the MIT Cheetah quadruped robot and perfected it. They refined the design, made it production-ready, and likely innovated extensively on the actuators and robot morphology. Most importantly, they transformed something you could build in a research lab at low scale into something manufacturable on production lines in Shenzhen or Shanghai.

They sold these robots to anyone willing to buy, which seemed questionable at the time — around 2016 — because there wasn’t really a market for robots. Now they’re the go-to player if you want to buy off-the-shelf robots. What do they highlight in their marketing materials? Volume, advanced actuators, and superior robot bodies.

This creates an interesting duality in the industry. Most American robotics companies — even those that are vertically integrated and produce their own robots — see the core value they’re creating as intelligence or the service they deliver to end customers. They’re either trying to deliver intelligence as a service (like models, foundation models, or ChatGPT-style queryable systems where you can pay for model training) or they’re pursuing fully vertical solutions where they deploy robots to perform labor, with value measured in hours of replaced work.

On the Chinese side, companies focus on producing exceptionally good robots.

Jordan Schneider: I’ve picked up pessimistic energy from several Western robotics efforts — a sense that China already has this in the bag. Where is that coming from, Ryan?

Ryan Julian: That’s a good question. If you view AI as a race between the US and China — a winner-take-all competition — and you’re pessimistic about the United States’ or the West’s ability to maintain an edge in intelligence, then I can see how you’d become very pessimistic about the West’s ability to maintain an edge in robotics.

As we discussed, a fully deployed robot is essentially a combination of software, AI (intelligence), and a machine. The challenging components to produce are the intelligence and the machine itself. The United States and the West aren’t particularly strong at manufacturing. They excel at design but struggle to manufacture advanced machines cheaply. They can build advanced machines, but not cost-effectively.

If you project this forward to a world where millions of robots are being produced — where the marginal cost of each robot becomes critical and intelligence essentially becomes free — then I can understand why someone would believe the country capable of producing the most advanced physical robot hardware fastest and at the lowest cost would have a huge advantage.

If you believe there’s no sustainable edge in intelligence — that intelligence will eventually have zero marginal cost and become essentially free — then you face a significant problem. That’s where the pessimism originates.

Jordan Schneider: Alright, we detoured but we’re coming back to this idea of a foundation model unlocking the future. We haven’t reached the levels of excitement for robotics that we saw in October 2022 for ChatGPT. What do we need? What’s on the roadmap? What are the key inputs?

Ryan Julian: To build a great, intelligent, general-purpose robot, you need the physical robot itself. We’ve talked extensively about how robotics is riding the wave of advancements elsewhere in the tech tree, making it easier to build these robots. Of course, it’s not quite finished yet. There are excellent companies — Boston Dynamics, 1X, Figure, and many others who might be upset if I don’t mention them, plus companies like Apptronik and Unitree — all working to build great robots. But that’s fundamentally an engineering problem, and we can apply the standard playbook of scale, cost reduction, and engineering to make them better.

The key unlock, assuming we have the robot bodies, is the robot brains. We already have a method for creating robot brains — you put a bunch of PhDs in a room and they toil for years creating a fairly limited, single-purpose robot. But that approach doesn’t scale.

To achieve meaningful impact on productivity, we need a robot brain that learns and can quickly learn new tasks. This is why people are excited about robotics foundation models.

How do we create a robotics foundation model? That’s the crucial question. Everything I’m about to say is hypothetical because we haven’t created one yet, but the current thinking is that creating a robotics foundation model shouldn’t be fundamentally different from creating a purely digital foundation model. The strategy is training larger and larger models.

However, the model can’t just be large for its own sake. To train a large model effectively, you need massive amounts of data — data proportionate to the model’s size. In large language models, there appears to be a magical threshold between 5 and 7 billion parameters where intelligence begins to emerge. That’s when you start seeing GPT-2 and GPT-3 behavior. We don’t know what that number is for robotics, but those parameters imply a certain data requirement.

What do we need to create a robotics foundation model? We need vast amounts of diverse data showing robots performing many useful tasks, preferably as much as possible in real-world scenarios. In other words, we need data and diversity at scale.

This is the biggest problem for embodied AI. How does ChatGPT get its data? How do Claude or Gemini get theirs? Some they purchase, especially recently, but first they ingest essentially the entire internet — billions of images and billions of sentences of text. Most of this content is free or available for download at low cost. While they do buy valuable data, the scale of their purchases is much smaller than the massive, unstructured ingestion of internet information.

There’s no internet of robot data. Frontier models train on billions of image-text pairs, while today’s robotics foundation models with the most data train on tens of thousands of examples — requiring herculean efforts from dozens or hundreds of people.

This creates a major chicken-and-egg problem. If we had this robotics foundation model, it would be practical and economical to deploy robots in various settings, have them learn on the fly, and collect data. In robotics and AI, we call this the data flywheel: you deploy systems in the world, those systems generate data through operation, you use that data to improve your system, which gives you a better system that you can deploy more widely, generating more data and continuous improvement.

We want to spin up this flywheel, but you need to start with a system good enough to justify its existence in the world. This is robotics’ fundamental quandary.
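Here is a minimal sketch of that flywheel loop, with deploy_and_collect and train as hypothetical stand-ins for real deployment and training infrastructure:

```python
# Illustrative only: deploy the current model, log what the robots experience,
# fold that data back into training, and redeploy. The deploy_and_collect and
# train callables are hypothetical placeholders.

def data_flywheel(model, train, deploy_and_collect, rounds=5):
    dataset = []
    for _ in range(rounds):
        new_episodes = deploy_and_collect(model)  # robots working in the real world
        dataset.extend(new_episodes)              # diversity matters more than sheer volume
        model = train(model, dataset)             # a better model justifies wider deployment
    return model
```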

I want to add an important note about scale. Everyone talks about big data and getting as much data as possible, but a consistent finding for both purely digital foundation models and robotics foundation models is that diversity is far more important than scale. If you give me millions of pairs of identical text or millions of demonstrations of a robot doing exactly the same thing in exactly the same place, that won’t help my system learn.

The system needs to see not only lots of data, but data covering many different scenarios. This creates another economic challenge, because while you might consider the economics of deploying 100 robots in a space to perform tasks like package picking...

Jordan Schneider: Right, if we have a robot that can fold laundry, then it can fold laundry. But will folding laundry teach it how to assemble IKEA furniture? Probably not, right?

Ryan Julian: Exactly. Economics favor scale, but we want the opposite — a few examples of many different things. This is the most expensive possible way to organize data collection.

Jordan Schneider: I have a one-year-old, and watching her build up her physics brain — understanding the different properties of things and watching her fall in various ways, but never the same way twice — has been fascinating. She learns fast when you put a new object in front of her. For instance, we have a Peloton, and she fell once because she put her weight on the Peloton wheel, which moved. She has never done that again.

Ryan Julian: I’m sure she’s a genius.

Jordan Schneider: Human beings are amazing. They’re really good at learning — the ability to acquire language, for example, which robots still can’t match. Maybe because we have ChatGPT, figuring out speech seems like less of a marvel now, but the fact that evolution and our neurons enable this, particularly because you come into the world not understanding everything... watching the data ingestion happen in real time has been a real treat. Do people study toddlers for this kind of research?

Ryan Julian: Absolutely. In robot learning research, the junior professor who just had their first kid and now bases all their lectures on watching how their child learns is such a common trope. It’s not just you — but we can genuinely learn from this observation.

First, children aren’t purely blank slates. They do know some things about the world. More importantly, kids are always learning. You might think, “My kid’s only one or two years old,” but imagine one or two years of continuous, waking, HD stereo video with complete information about where your body is in space. You’re listening to your parents speak words, watching parents and other people do things, observing how the world behaves.

This was the inspiration for why, up through about 2022, myself and other researchers were fascinated with using reinforcement learning to teach robots. Reinforcement learning is a set of machine learning tools that allows machines, AIs, and robots to learn through trial and error, much like you described with your one-year-old.

What’s been popular for the last few years has been a turn toward imitation learning, which essentially means showing the robot different ways of doing things repeatedly. Imitation learning has gained favor because of the chicken-and-egg problem: if you’re not very good at tasks, most of what you try and experience won’t teach you much.

If you’re a one-year-old bumbling around the world, that’s acceptable because you have 18, 20, or 30 years to figure things out. I’m 35 and still learning new things. But we have very high expectations for robots to be immediately competent. Additionally, it’s expensive, dangerous, and difficult to allow a robot to flail around the world, breaking things, people, and itself while doing reinforcement learning in real environments. It’s simply not practical.

Having humans demonstrate tasks for robots is somewhat more practical than pure reinforcement learning. But this all comes down to solving the chicken-and-egg problem I mentioned, and nobody really knows the complete solution.
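For readers who want to see the mechanics, here is a minimal behavior-cloning sketch in PyTorch: supervised learning on observation-action pairs from demonstrations, which is the basic recipe behind the imitation learning described above, in contrast to reinforcement learning’s trial and error. The network size, observation and action dimensions, and data are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Minimal behavior cloning: fit a policy to (observation, action) pairs from
# human demonstrations rather than learning by trial and error.
obs_dim, act_dim = 64, 7  # e.g. flattened state features, a 7-DoF arm command
policy = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                       nn.Linear(256, act_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

def bc_update(observations: torch.Tensor, expert_actions: torch.Tensor) -> float:
    """One gradient step pushing the policy toward the demonstrated actions."""
    predicted = policy(observations)
    loss = nn.functional.mse_loss(predicted, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with random stand-in data for a batch of demonstration steps.
loss = bc_update(torch.randn(32, obs_dim), torch.randn(32, act_dim))
```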

There are several approaches we can take. First, we don’t necessarily have to start from scratch. Some recent exciting results that have generated significant enthusiasm came from teams I’ve worked with, my collaborators, and other labs. We demonstrated that if we start with a state-of-the-art vision-language model and teach it robotics tasks, it can transfer knowledge from the purely digital world — like knowing “What’s the flag of Germany?” — and apply it to robotics.

Imagine you give one of these models data showing how to pick and place objects: picking things off tables, moving them to other locations, putting them down. But suppose it’s never seen a flag before, or specifically the flag of Germany, and it’s never seen a dinosaur, but it has picked up objects of similar size. You can say, “Please pick up the dinosaur and place it on the flag of Germany.” Neither the dinosaur nor the German flag were in your robotics training data, but they were part of the vision-language model’s training.

My collaborators and I, along with other researchers, showed that the system can identify “This is a dinosaur” and use its previous experience picking up objects to grab that toy dinosaur, then move it to the flag on the table that it recognizes as Germany’s flag.
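Here is a hedged sketch of how that kind of transfer can be wired together: a vision-language model grounds phrases it has only ever seen in web data to locations in the scene, and a pick-and-place skill learned from robot data does the physical work. The vlm.ground and policy.act calls are hypothetical interfaces, not a real library.

```python
# Illustrative only: semantic grounding comes from web-scale pretraining,
# the motion comes from robot demonstrations. Interfaces are hypothetical.

def locate(vlm, image, phrase):
    """Ask the vision-language model where a described object is in the scene."""
    return vlm.ground(image=image, text=phrase)        # e.g. pixel coordinates

def pick_and_place(policy, image, source_xy, target_xy):
    """Reuse a pick-and-place skill learned from robot data."""
    return policy.act(image=image, pick=source_xy, place=target_xy)

def follow_instruction(vlm, policy, image):
    """'Pick up the dinosaur and place it on the flag of Germany.'"""
    src = locate(vlm, image, "the toy dinosaur")
    dst = locate(vlm, image, "the flag of Germany")
    return pick_and_place(policy, image, src, dst)
```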

One tactic — don’t start with a blank slate. Begin with something that already has knowledge.

Another approach — and this explains all those impressive dancing videos you see from China, with robots running and performing acrobatics — involves training robots in simulation using reinforcement learning, provided the physical complexity isn’t too demanding. For tasks like walking (I know I say “just” walking, but it’s actually quite complex) or general body movement, it turns out we can model the physics reasonably well on computers. We can do 99% of the training in simulation, then have robots performing those cool dance routines.
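A hedged sketch of that recipe: reinforcement learning inside a physics simulator with randomized dynamics (domain randomization), so the controller that emerges still works once it is moved onto real hardware. The simulator and policy interfaces below are hypothetical placeholders, not a specific framework.

```python
import random

# Illustrative only: sim and policy are hypothetical stand-ins for a physics
# simulator and an RL training loop (e.g. PPO-style updates).

def randomize_physics(sim):
    """Domain randomization: vary the things we cannot model exactly."""
    sim.set_friction(random.uniform(0.5, 1.5))
    sim.set_mass_scale(random.uniform(0.9, 1.1))
    sim.set_motor_latency(random.uniform(0.0, 0.02))

def train_locomotion(sim, policy, episodes=100_000):
    """Do the vast majority of trial-and-error in simulation, not on hardware."""
    for _ in range(episodes):
        randomize_physics(sim)
        obs = sim.reset()
        done = False
        while not done:
            action = policy.act(obs)
            obs, reward, done = sim.step(action)
            policy.record(obs, action, reward)
        policy.update()
    return policy
```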

We might be able to extend this framework to much more challenging physical tasks like pouring tea, manipulating objects, and assembling things. Those physical interactions are far more complex, but you could imagine extending the simulation approach.

Jordan Schneider: Or navigating around Bakhmut or something.

Ryan Julian: Exactly, right. The second approach uses simulation. A third tactic involves getting data from sources that aren’t robots but are similar. This has been a persistent goal in robot learning for years — everyone wants robots to learn from watching YouTube videos.

There are numerous difficult challenges in achieving this, but the basic idea is extracting task information from existing video data, either from a first-person perspective (looking through the human’s eyes) or third-person perspective (watching a human perform tasks). We already have extensive video footage of people doing things.

What I’ve described represents state-of-the-art frontier research. Nobody knows exactly how to accomplish it, but these are some of our hopes. The research community tends to split into camps and companies around which strategy will ultimately succeed.

Then there’s always the “throw a giant pile of money at the problem” strategy, which represents the current gold standard. What we know works right now — and what many people are increasingly willing to fund — is building hundreds or even thousands of robots, deploying them in real environments like factories, laundries, logistics centers, and restaurants. You pay people to remotely control these robots to perform desired tasks, collect that data, and use it to train your robotics foundation model.

The hope is that you don’t run out of money before reaching that magic knee in the curve — the critical threshold we see in every other foundation model where the model becomes large enough and the data becomes sufficiently big and diverse that we suddenly have a model that learns very quickly.

There’s a whole arms race around how to deploy capital quickly enough and in the right way to find the inflection point in that curve.

Jordan Schneider: Is Waymo an example of throwing enough money at the problem to get to the solution?

Ryan Julian: Great example.

Jordan Schneider: How do we categorize that?

Ryan Julian: Waymo and other self-driving cars give people faith that this approach might work. When you step into a Waymo today, you’re being driven by what is, at its core, a robotics foundation model. There’s a single model where camera, LiDAR, and other sensor information from the car comes in, gets tokenized, decisions are made about what to do next, and actions emerge telling the car where to move.

That’s not the complete story. There are layers upon layers of safety systems, decision-making processes, and other checks and balances within Waymo to ensure the output is sound and won’t harm anyone. But the core process remains: collect data on the task (in this case, moving around a city in a car), use it to train a model, then use that model to produce the information you need.
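A simplified sketch of that layering: the learned model proposes an action, and independent checks can veto it and trigger a conservative fallback. Every function here is an illustrative placeholder rather than anything Waymo has published.

```python
# Illustrative only: the model, checks, and fallback are hypothetical.

def safe_drive_step(model, sensors, safety_checks, fallback):
    """Run the learned driving model, then apply rule-based safety layers."""
    proposed = model(sensors)             # tokenized camera/LiDAR in, trajectory out
    for check in safety_checks:
        if not check(sensors, proposed):  # e.g. collision margins, speed limits
            return fallback(sensors)      # e.g. slow down and pull over safely
    return proposed
```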

Self-driving cars have been a long journey, but their success using this technique gives people significant confidence in the approach.

Let me temper your enthusiasm a bit. There’s hope, but here’s why it’s challenging. From a robotics perspective, a self-driving car is absolutely a robot. However, from that same perspective, a self-driving car has an extremely simple job — it performs only one task.

The job of a self-driving car is to transport you, Jordan, and perhaps your companions from point A to point B in a city according to a fairly limited set of traffic rules, on a relatively predictable route. The roads aren’t completely predictable, but they follow consistent patterns. The car must accomplish this without touching anything. That’s it — get from point A to point B without making contact with anything.

The general-purpose robots we’re discussing here derive their value from performing thousands of tasks, or at least hundreds, without requiring extensive training data for each one. This represents one axis of difficulty: we must handle many different tasks rather than just one.

The other challenge is that “don’t touch anything” requirement, which is incredibly convenient because every car drives essentially the same way from a physics perspective.

Jordan Schneider: Other drivers are trying to avoid you — they’re on your side and attempting to avoid collisions.

Ryan Julian: Exactly — just don’t touch anything. Whatever you do, don’t make contact. As soon as you start touching objects, the physics become far more complicated, making it much more difficult for machines to decide what to do.

The usefulness of a general-purpose robot lies in its ability to interact with objects. Unless it’s going to roam around your house or business, providing motivation and telling jokes, it needs to manipulate things to be valuable.

These are the two major leaps we need to make from the self-driving car era to the general robotics era — handling many different tasks and physically interacting with the world.

Jordan Schneider: Who are the companies in China and the rest of the world that folks should be paying attention to?

Ryan Julian: The Chinese space is gigantic, so I can only name a few companies. There are great online resources if you search for “Chinese robotics ecosystem.”

In the West, particularly the US, I would divide the companies really pushing this space into two camps.

The first camp consists of hardware-forward companies that think about building and deploying robots. These tend to be vertically integrated. I call them “vertical-ish” because almost all want to build their own embodied AI, but they approach it from a “build the whole robot, integrate the AI, deploy the robot” perspective.

In this category, you have Figure AI, a vertical humanoid robot builder that also develops its own intelligence. There’s 1X Technologies, which focuses on home robots, at least currently. Boston Dynamics is the famous first mover in the space, focusing on heavy industrial robots with the Atlas platform. Apptronik has partnered with Google DeepMind and focuses on light industrial logistics applications.

Tesla Optimus is probably the most well-known entry in the space, with lots of rhetoric from Elon about how many robots they’ll make, where they’ll deploy them, and how they’ll be in homes. But it’s clear that Tesla’s first value-add will be helping automate Tesla factories. Much of the capital and many prospective customers in this space are actually automakers looking to create better automation for their future workforce.

Apple is also moving into the space with a very early effort to build humanoid robots.

The second camp focuses on robotics foundation models and software. These tend to be “horizontal-ish” — some may have bets on making their own hardware, but their core focus is foundation model AI.

My former employer, Google DeepMind, has a robotics group working on Gemini Robotics. NVIDIA also has a group doing this work, which helps them sell chips.

Among startups, there’s Physical Intelligence, founded by several of my former colleagues at Google DeepMind and based in San Francisco. Skild AI features some CMU researchers. Generalist AI includes some of my former colleagues. I recently learned that Mistral has a robotics group.

A few other notable Western companies — there’s DYNA, which is looking to automate small tasks as quickly as possible. They’re essentially saying, “You’re all getting too complicated — let’s just fold napkins, make sandwiches, and handle other simple tasks.”

There are also groups your audience should be aware of, though we don’t know exactly what they’re doing. Meta and OpenAI certainly have embodied AI efforts that are rapidly growing, but nobody knows their exact plans.

In China, partly because of the trends we discussed and due to significant funding and government encouragement (including Made in China 2025), there’s been an explosion of companies seeking to make humanoid robots specifically.

The most well-known is Unitree with their H1 and G1 robots. But there are also companies like Fourier Intelligence, AgiBot, RobotEra, UBTECH, EngineAI, and Astribot. There’s a whole ecosystem of Chinese companies trying to make excellent humanoid robots, leveraging the Shenzhen and Shanghai-centered manufacturing base and incredible supply chain to produce the hardware.

When Robots Learn

Jordan Schneider: How do people in the field of robotics discuss timelines?

Ryan Julian: It’s as diverse as any other field. Some people are really optimistic, while others are more pessimistic. Generally, it’s correlated with age or time in the field. But I know the question you’re asking: when is it coming?

Let’s ground this discussion quickly. What do robots do today? They sit in factories and do the same thing over and over again with very little variation. They might sort some packages, which requires slightly more variation. Slightly more intelligent robots rove around and inspect facilities — though they don’t touch anything, they just take pictures. Then we have consumer robots. What’s the most famous consumer robot? The Roomba. It has to move around your house in 2D and vacuum things while hopefully not smearing dog poop everywhere.

That’s robots today. What’s happening now and what we’ll see in the next three to five years falls into what I call a bucket of possibilities with current technology. There are no giant technological blockers, but it may not yet be proven economical. We’re still in pilot phases, trying to figure out how to turn this into a product.

The first place you’re going to see more general-purpose robots — maybe in humanoid form factors, maybe slightly less humanoid with wheels and arms — is in logistics, material handling, and light manufacturing roles. For instance, machine tending involves taking a part, placing it into a machine, pressing a button, letting the machine do its thing, then opening the machine and pulling the part out. You may also see some retail and hospitality back-of-house applications.

What I’m talking about here is anywhere a lot of stuff needs to be moved, organized, boxed, unboxed, or sorted. This is an easy problem, but it’s a surprisingly large part of the economy and pops up pretty much everywhere. Half or more of the labor activity in an auto plant is logistics and material feed. This involves stuff getting delivered to the auto plant, moved to the right place, and ending up at a production line where someone picks it up and places it on a new car.

More than half of car manufacturing involves this process, and it’s actually getting worse because people really want customized cars these days. Customizations are where all the profit margin is. Instead of Model Ts running down the line where every car is exactly the same, every car running down the line now requires a different set of parts. A ton of labor goes into organizing and kitting the parts for each car and making sure they end up with the right vehicle.

Ten to twelve percent of the world economy is logistics. Another fifteen to twenty percent is manufacturing. This represents a huge potential impact, and all you’re asking robots to do is move stuff — pick something up and put it somewhere else. You don’t have to assemble it or put bolts in, just move stuff.

Over the next three to five years, you’re going to see pilots starting today and many attempts, both in the West and in China, to put general-purpose robots into material handling and show that this template with robotics foundation models can work in those settings.

Now, if that works — if the capital doesn’t dry up, if researchers don’t get bored and decide to become LLM researchers because someone’s going to give them a billion dollars — then maybe in the next seven to ten years, with some more research breakthroughs, we may see these robots moving into more dexterous and complex manufacturing tasks. Think about placing bolts, assembling things, wings on 747s, putting wiring harnesses together. This is all really difficult.

You could even imagine at this point we’re starting to see maybe basic home tasks: tidying, loading and unloading a dishwasher, cleaning surfaces, vacuuming...

Jordan Schneider: When are we getting robotic massages?

Ryan Julian: Oh man, massage. I don’t know. Do you want a robot to press really hard on you?

Jordan Schneider: You know... no. Maybe that’s on a fifteen-year horizon then?

Ryan Julian: Yeah, that’s the next category. Anything that has a really high bar for safety, interaction with humans, and compliance — healthcare, massage, personal services, home health aides — will require not only orders of magnitude more intelligence than we currently have and more capable physical systems, but you also really start to dive into serious questions of trust, safety, liability, and reliability.

Having a robot roving around your house with your one-year-old kid and ensuring it doesn’t fall over requires a really high level of intelligence and trust. That’s why I say it’s a question mark. We don’t quite know when that might happen. It could be in five years — I could be totally wrong. Technology changes really fast these days, and people are more willing than I usually expect to take on risk. Autopilot and full self-driving are good examples.

One thing the current generation of generalist robotics researchers, startups, and companies are trying to learn from the self-driving car era is this: maybe one reason to be optimistic comes from the safety element. Self-driving cars are moving multi-ton machines around lots of people and things they could kill or break, with people inside whom they could kill. The bar is really high — it’s almost aviation-level reliability. The system needs to be incredibly reliable with so much redundancy, and society, regulators, and governments have to have so much faith that it is safe and represents a positive cost-benefit tradeoff.

This makes it really difficult to thread the needle and make something useful. In practice, it takes you up the difficulty and autonomy curve we talked about and pushes you way up to really high levels of autonomy to be useful. It’s kind of binary — if you’re not autonomous enough, you’re not useful.

But these generalist robots we’re talking about don’t necessarily need to be that high up the autonomy difficulty curve. If they are moderately useful — if they produce more than they cost and save some labor, but not all — and you don’t need to modify your business environment, your home, or your restaurant too much to use them, and you can operate them without large amounts of safety concerns, then you have something viable.

For instance, if you’re going to have a restaurant robot, you probably shouldn’t start with cutting vegetables. Don’t put big knives in the hands of robots. There are lots of other things that happen in a restaurant that don’t involve big knives.

One of the bright spots of the current generalist robotics push and investment is that we believe there’s a much more linear utility-autonomy curve. If we can be half autonomous and only need to use fifty percent of the human labor we did before, that would make a huge difference to many different lives and businesses.
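[Editor’s note: a toy sketch of the contrast Ryan is drawing here. The numbers, and the idea of measuring “utility” simply as the fraction of labor cost saved, are our own illustrative assumptions, not his.]

```python
# Illustrative toy model: a near-binary utility-autonomy curve (self-driving)
# versus a roughly linear one (material handling). "Utility" is just the
# fraction of human labor cost saved, which is an assumption for illustration.

def self_driving_utility(autonomy: float, threshold: float = 0.99) -> float:
    """Near-binary: below an aviation-like reliability threshold,
    the system can't be deployed at all, so it saves nothing."""
    return 1.0 if autonomy >= threshold else 0.0

def material_handling_utility(autonomy: float) -> float:
    """Roughly linear: a robot that handles half the picks still saves
    roughly half the labor, assuming humans cover the remainder."""
    return autonomy

for a in (0.5, 0.9, 0.99):
    print(f"autonomy={a:.2f}  self-driving={self_driving_utility(a):.2f}  "
          f"material handling={material_handling_utility(a):.2f}")
```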

Jordan Schneider: Is that a middle-of-the-road estimate? Is it pessimistic? When will we get humanoid robot armies and machines that can change a diaper?

Ryan Julian: It’s a question of when, not if. We will see lots of general-purpose robots landing, especially in commercial spaces — logistics, manufacturing, maybe even retail back of house, possibly hospitality back of house. The trajectory of AI is very good. The machines are becoming cheaper every day, and there are many repetitive jobs in this world that are hazardous to people. We have difficulty recruiting people for jobs that are not that difficult to automate. Personally, I think that’s baked in.

If, to you, that’s a robot army — if you’re thinking about hundreds of thousands, maybe even millions of robots over the course of ten years working in factories, likely in Asia, possibly in the West — I think we will see it in the next decade.

The big question mark is how advanced we’ll be able to make the AI automation. How complicated are the jobs these machines could do? Because technology has a habit of working really well and advancing really quickly until it doesn’t. I’m not exactly sure where that stopping point will be.

If we’re on the path to AGI, then buckle up, because the robots are getting really good and the AGI is getting really good. Maybe it’ll be gay luxury space communism for everybody, or maybe it’ll be I, Robot. But the truth is probably somewhere in between. That’s why I started our discussion by talking about how robots are the ultimate capital good.

If you want to think about what would happen if we had really advanced robots, just think about what would happen if your dishwasher loaded and unloaded itself or the diaper changing table could change your daughter’s diaper.

A good dividing line to think about is that home robots are very difficult because the cost needs to be very low, the capability level needs to be very diverse and very high, and the safety needs to be very high. We will require orders of magnitude more intelligence than we have now to do home robots if they do happen. We’re probably ten-plus years away from really practical home robots. But in the industrial sector — and therefore the military implications we talked about — it’s baked in at this point.

Jordan Schneider: Confession — I have never worked in a warehouse or in logistics. It’s a sector of the economy that a lot of the Washington policymaking community just doesn’t have a grasp on. Automating truckers and automating cars doesn’t take many intellectual leaps, but thinking about the gradations of different types of manual labor that are more or less computationally intensive is hard to wrap your head around if you haven’t seen it in action.

Ryan Julian: This is why, on research teams, we take people to these places. We go on tours of auto factories and logistics centers because your average robotics researcher has no idea what happens in an Amazon warehouse. Not really.

For your listeners who might be interested, there are also incredible resources for this provided by the US Government. O*NET has this ontology of labor with thousands of entries — every physical task that the Department of Labor has identified that anybody does in any job in the United States. It gets very detailed down to cutting vegetables or screwing a bolt.
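[Editor’s note: if you want to poke at O*NET yourself, here is a minimal sketch. It assumes you have downloaded the Task Statements file from onetcenter.org as tab-separated text; the exact filename and column names (“Title”, “Task”) vary by database release, so treat them as placeholders.]

```python
# Minimal exploration of O*NET's task ontology: load the task statements and
# search for occupations whose tasks mention fastening bolts, as one example
# of how granular the data gets.
import pandas as pd

tasks = pd.read_csv("Task Statements.txt", sep="\t")  # placeholder filename

bolt_tasks = tasks[tasks["Task"].str.contains("bolt", case=False, na=False)]
print(bolt_tasks[["Title", "Task"]].head(10))
```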


Jordan Schneider: How can people follow this space? What would you recommend folks read or consume?

Ryan Julian: Well, of course you should subscribe to ChinaTalk. Lots of great coverage. The SemiAnalysis guys also seem to be getting into it a little bit. Other than that, I would join Twitter or Bluesky; that’s where the rest of the AI community is. That’s the best place to find original, raw content from people doing the work every day.

If you follow a couple of the right accounts and start following who they retweet over time, you will definitely build a feed where, when the coolest new embodied AI announcement comes out, you’ll know in a few minutes.

[Some accounts! Chris Paxton, Ted Xiao, C Zhang, and The Humanoid Hub. You can also check out the General Robots and Learning and Control Substacks, Vincent Vanhoucke on Medium, and IEEE’s robotics coverage.]

Jordan Schneider: Do you have a favorite piece of fiction or movie that explores robot futures?

Ryan Julian: Oh, I really love WALL-E and Big Hero 6. I prefer friendly robots.

Enjoy this deleted scene from WALL-E:

Mood Music:

#102 真假伟人秀 (Great Men, Real and Fake)

Looking back over the past few generations in China, there is a very clear pattern: whenever China produces a “great man” of grand talent and bold vision, hard times arrive for ordinary Chinese. When my generation was born, China had a great man surnamed Mao, endlessly great, at once the “red sun” and the “great helmsman.” As a child in the countryside I had never seen a big ship and had no idea what a helmsman was, but the red sun came out on every day that wasn’t overcast. And whenever the sun came up, my first sensation was hunger. We didn’t have enough to eat, we were malnourished, and the natural response was hunger.

Later, Chairman Mao finally died. The moment the old man died, we quickly had enough to eat. Life got better day by day, and our horizons widened day by day. From subsistence to modest prosperity, from the countryside to the city, the hard years receded further and further. Then, unexpectedly, in middle age, China produced another “great man,” and one who was hoisted into the job by his superiors at that. I don’t know of a single great man in human history who was “promoted” into greatness by his bosses.

A person with no real ability, who wins over the leadership through flattery, family connections, and playing the obedient underling, and who is then planted in that position: Texas slang has a name for this kind of person, a “post turtle” (桩上龟公).

I’ve told the post-turtle story on Weibo before. For many years after coming to America I didn’t write anything serious in Chinese, and my listening, speaking, reading, and writing all atrophied a bit. The first time I told the story on Weibo, I translated it directly as 桩上乌龟, “turtle on a post,” which sounded plain but a little crude. A reader re-translated it as 桩上龟公, roughly “Lord Turtle on the post.” That rendering lit up my eyes; it achieved faithfulness, expressiveness, and elegance all at once. 龟公 still contains 公, an honorific for the turtle, and when the term is applied to a “great man,” an honorific is exactly what classical Chinese usage calls for.

Speaking of 公, one thinks of Duke Huan of Cai 蔡桓公. At the Bu Mingbai Festival in DC, an audience member asked whether it is still worth “running” (润, emigrating). I remember bringing up Duke Huan of Cai. When Bian Que 扁鹊 met Duke Huan of Cai, what good would it have done him not to run? Bian Que was the ancient forerunner of us Chinese “runners.”

This story used to appear in Chinese middle-school textbooks. Bian Que was a famous physician of the Warring States era who practiced mainly in the state of Qi. He went to see the king, Duke Huan of Cai (who was in fact Duke Huan of Qi). Duke Huan summoned Bian Que, and when a king summons a doctor, it is obviously to be examined. Bian Que did his duty as a physician and said, “Judging by your complexion, you have a minor illness. Treat it promptly and it won’t become a serious problem.” But Duke Huan was obsessed with face. Feigning his “five confidences,” he declared, “We have no illness.” In other words: “I am not sick.”

Once the king has said he isn’t sick, the doctor can only shut up. As soon as Bian Que left, Duke Huan told his attendants, “Doctors love treating people who aren’t sick, just to show off how capable they are.”

Some time later, Bian Que saw Duke Huan again and again did his duty as a physician: “Your illness has spread from the skin into the muscle. Without timely treatment it will get worse.” This wounded the duke’s face and his confidence, and of course he was displeased.

Some time after that, Bian Que saw Duke Huan a third time and said, “Your illness has reached the stomach and bowels. If you still don’t treat it, things will get serious.” The duke was, of course, even less pleased.

Not long afterward, Bian Que saw Duke Huan a fourth time. He said nothing, turned, and walked away.

People who fake confidence and cling to face share a trait: when you voice a professional opinion, they seem brimming with confidence; they are right and you are wrong. But the moment you stop talking and let them thrash about on their own, they suddenly aren’t so confident anymore. Why? Simple: their so-called confidence is an act, put on for the sake of face. Duke Huan of Cai was exactly this sort.

Bian Que saw him three times, each time doing his duty as a doctor and telling him he was ill, and each time the duke was displeased. The fourth time, Bian Que said nothing and walked away.

Saying nothing and walking away is the secret recipe for puncturing fake confidence. Whether they are faking five confidences or eight, if you are a professional and you run into this kind of posturing fool, say nothing and walk away. The moment you do, the confidence they have been faking falls apart. Reading these little ancient stories, we can learn quite a few lessons like this.

As soon as Bian Que walked away, Duke Huan hurriedly sent someone after him to ask why he had gone silent. Bian Que, ever the professional, explained from a physician’s standpoint why he would no longer speak in front of the duke. The reason was simple: the illness had already penetrated to the bone marrow, and there was nothing he could do.

Five days later, Duke Huan’s body began to ache. He sent for Bian Que, but the doctor was nowhere to be found. Bian Que had long since fled; he had already “run” from Qi to Qin. Before long, Duke Huan died of his illness.

In recent years, China seems to have had quite a few Bian Ques “run,” and those who haven’t run have stopped talking. By the ancient Chinese reckoning, Duke Huan’s illness has roughly reached the final stage, or the one just before it.

I started out talking about “Lord Turtle” and somehow wandered off to Duke Huan of Cai. A bit of a digression.

Back to the turtle. “Post turtle” (桩上龟公) is Texas slang for disparaging politicians. Where does it come from? Rural Texas is covered in ranches, many of them fenced with wooden posts strung with barbed wire to keep livestock off the roads. The ranches have plenty of turtles. Some kids like to pull a prank: they pick out a big turtle and set it on top of a fence post. It has become a familiar sight in the Texas countryside.

If there is a turtle sitting on top of a fence post, it certainly didn’t climb up there by itself; it doesn’t have the ability. Someone put it there. The problem is that once it has been placed in that position, it cannot get down. It lacks the ability to climb up on its own and the ability to climb down on its own. All it can do is flail about on instinct atop the post and wait for free fall. That is the origin of the Texas expression “post turtle.”



作为哈利波特和成为哈利波特 (Being Harry Potter and Becoming Harry Potter)

The other day I sat down for the first time with a real past paper for the Dutch writing exam. The exam runs 100 minutes, and friends who had taken it all told me the time is brutally tight, that they were still writing in the final second and barely finished. When I did the practice paper, though, I finished everything within 60 minutes. Afterward I felt a little dazed, unsure whether I should even keep studying. It felt as though all my earlier cramming and stress had been spent on the wrong thing: I had put on armor and braced for a fearsome enemy, and the other side sent out Pleasant Goat 喜羊羊.

So I started snacking and surfing the web, like a balloon that had been stretched taut and was suddenly floating up into the air. (The day this post goes out is the day I take the B1 Dutch writing exam. Wish me luck; every day until then I am still preparing carefully and doing past papers.) While surfing, I suddenly came across an image:

And a comment under the image read: “A child who was never loved by a happy family grows up to look like an East Asian.”

I shared it with Bawanghua 霸王花.

She replied in seconds: “I am Harry.”

How can a person decide, in an instant, that she is Harry?

I thought about what the two of them have in common: Bawanghua and Harry both spent their childhoods living under someone else’s roof. In episode 56 of our podcast 放学以后 (After School), we discussed how frightening and far-reaching the effects of a childhood “under someone else’s roof” can be. It leaves a person without a sense of security, unable to take her own existence for granted, unable to respect and voice her own needs, terrified that her existence and her needs are a burden to others, and therefore it becomes very hard for her to have initiative or genuine, self-generated motivation.

If this pattern goes on long enough, a person slips easily into catastrophic thinking: even the smallest things register as personal disasters, and she falls again and again into despair and isolation, feeling utterly without support.

And if I shared that image with Chinese women friends who never lived under someone else’s roof, who grew up in their own families, there is still a very good chance I would get back the same reply: “I am Harry.”

Because even inside their own families, many women are forced to live like lodgers. Parents see a daughter as a nuisance, a burden, a “money-loser,” a “shame,” a tool, a financial product, a retirement plan. Without a home that ever truly offered love, support, and safety, a person very easily blames every problem on herself instead of looking outward for the causes.

Very often the root of the problem is abusive adults, an unjust system, an authoritarian power that devours everything. But a child raised in a home with wind whistling through every crack will pin everything on herself: her parents’ fights, and her own depressed, suffocated, exhausted self.

But the problem does not lie with the self.

The fortunate thing is that the answer does.

A self that is willing to ask where all these problems actually come from, and then to find ways of responding, can only come from oneself.

So many people love Harry Potter, especially the first two or three books, precisely because Harry does not stay forever in the cupboard under the stairs at his aunt and uncle’s house. He walks out. Summoned by the owls’ letters, he leaves the cupboard and goes to Hogwarts, growing from someone isolated, doubting himself and licking his own wounds, into someone as brave and self-directed as Hermione.

At Hogwarts, Harry has the love of friends, teachers, and the headmaster. He gets to learn magic, to compete, cooperate, and go on adventures, to save himself and to save others. The cupboard under the stairs recedes in his life, and the space of his life fills up with this new love and sense of accomplishment.

But the frightening thing for Chinese women is that after briefly escaping a suffocating family, many walk straight into schools that feel like purgatory. The middle school Bawanghua has described to me; the high school Li Xueqin 李雪琴 has described; the school in Hengshui, Hebei, that a classmate of mine attended, its classrooms blanketed with cameras. Teachers more dreadful, more loathsome, even more hateful than Umbridge; students trained into seamlessly fitted machines, into the smallest units of obedience, encouraged to inform on and trample one another, until every student becomes an island cut off from connection.

In an environment like that, thinking the way Harry did in the cupboard under the stairs is all but impossible.

It is all too easy simply to be Harry, trapped in a world that has walled the self in, letting the spiral of catastrophe take root in the heart.

But it is very hard to become the later Harry. Garbage soil offers no earth, water, or air for courage and brave exploration.

Living in East Asia, living in the Chinese-language internet, being Harry Potter comes as naturally as water flowing downhill; becoming Harry Potter is as hard as climbing to heaven.

So a lot of people just let it go.

I remember Bawanghua asking me in episode 56: doesn’t it exhaust you to think about these things every day?

To ask “why” and “by what right” of every oppression, every act of domestication, every indifference and injustice; to ask myself every day what I truly want, right now and afterward: these are the real questions of being alive in this world. I cannot not ask them.

Because I don’t want to let it go.

“Letting it go” is not going easy on yourself; it is giving yourself up entirely.

Not only will I not give up on myself, I don’t intend to let the world off either. I want to make things less comfortable for the people and forces that so placidly bully and oppress others. I want them to feel like they are sitting on pins, with thorns at their backs and a fishbone stuck in their throats. Much of the time, my heart brimming with “malice,” I think: I will be that pin, that thorn, that sharp fishbone. I will be sharpness itself.

But walking through beautiful places and lush nature, the “malice” recedes, and I decide once again that I cannot waste my time on those vile people. I want to give my time to the trees, the flowers, the clouds, to myself, and to the flying horses and wonders I am able to draw.

Defeating Voldemort and stealing a flying car to roam the wizarding world are equally important parts of becoming Harry Potter.

So how do you become Harry Potter?

Walk out of the cupboard under the stairs. Walk out of Umbridge’s classroom. Find the true authors of the tragedy of your own fate, see them clearly, and stop fobbing yourself off with countless false explanations. Fight back against them; get far away from them. So often the problem is that people love to “take the thief for a father, worship the father-thief as a god, take the male thief for a lover, take the family thief for blood kin, and treat whoever plainly names the abuser as a mortal enemy.” As long as a person stays loyal to her suffering and keeps worshipping, yielding to, appeasing, and indulging the abuser, the cupboard and Umbridge will be there forever. And by the same token, the moment a person stops deceiving herself, stops tolerating the harm done to her, and resolves to walk out of the room the abuser built, to run, she begins to hold herself in esteem and respect, and then she has both the reason and the courage to protect herself and to defend the world she wants to keep.

At the same time, don’t be afraid of the world. Even if you have to roam in a stolen car, go out. Go out and look for what you long for; see what brings you joy, freedom, fantasy, what makes you lose track of time.

I wish you, having been Harry Potter, the resolve to become Harry Potter.

I wish you, having been suppressed and compressed, the longing to rebound, to expand, and to set yourself free.

Finally: after I finished this piece and sent it to Bawanghua, the bone-deep hatred of her abuser that she had suppressed for nearly twenty years erupted that night, and she finally chose, nearly twenty years on, to expose the garbage man’s deeds on her WeChat Moments, a long-overdue arrow of revenge. I am also planning to build a public-interest website, a “Trash Laodeng Exposure Network” (垃圾中老登曝光网; I plan to use the domain boomlaodeng), where women across the Chinese-speaking world can anonymously expose the misdeeds of trashy middle-aged and older creeps: vent their anger, socially bury the garbage, and spare future victims.

Let’s see whether people want this. If there is demand, I will buy the domain and have ChatGPT and Cursor help me build the site. Of course, if any friends know how to build websites, you are welcome to take this idea and build it yourselves, or get in touch and build it with me; either way, I will help spread the word far and wide. And if we want the site to last, I can also help launch a crowdfunding campaign to cover the domain purchase, annual renewal, and maintenance costs, so that we can pool our efforts and get this public-interest website off the ground!

This is the tiny Order of the Phoenix I am founding. Gryffindors, you are welcome to join!

China's New AI Plan

The world’s two greatest superpowers released action plans for AI only 34 days apart. Back in July, the Trump Administration released America’s AI Action Plan to cautious fanfare. And on August 28, China’s State Council published its “Opinion on In-Depth Implementation of the ‘Artificial Intelligence +’ Initiative” (关于深入实施“人工智能+”行动的意见, hereafter abbreviated to “AI+ Plan”).

The two documents both come from the highest echelons of government in their respective countries, and both are high-level roadmaps issued as guidance for departments and ministries to implement. The ground they cover and the policy intentions behind the measures give us the clearest picture yet of how these two governments are making sense of the future of AI in their respective countries and around the world. Comparing how the two documents address overlapping issues is an instructive and incredibly revealing exercise. Below is an executive summary of similarities and differences.

At the 21st China (Shenzhen) International Cultural Industries Fair, a robot playing the guzheng attracts visitors. Photo by Chen Jiming, China News Service. (Cyberspace Administration of China)

Note: Side-by-side comparisons of the Chinese original and English translation were created in Claude, with thanks to Matt Sheehan!

Origins, leadership, and competing priorities

The US AI Action Plan was a product of Executive Order 14179, part of the flurry of EOs signed during President Trump’s first few days in office, and was jointly led by the White House Office of Science and Technology Policy (OSTP), Trump’s AI Czar David Sacks, and the National Security Advisor (NSA).

The Chinese plan, on the other hand, is a directive straight from the State Council, with no additional credits to specialized ministries. The final paragraph tasks the National Development and Reform Commission with coordination rather than any specific policy portfolio. This means it was a comprehensive effort by China’s highest state administrative organ. The State Council is technically the organ that executes decisions by the National People’s Congress (NPC), China’s unicameral legislature. As is expected in an autocracy, NPC delegates have little actual leverage. Instead, the State Council is better understood as the supreme coordinating body for the country’s 26 ministries and 31 province-level governments, only one step below the Communist Party’s Politburo. As illustrated by the Congressional Research Service’s org chart for the CCP:

Image: China’s national-level political structure. (Congressional Research Service)

A huge variety of input from all corners of the Chinese bureaucracy likely went into the Chinese AI plan. And it shows: the document is comprehensive to the point of being overstretched, covering AI’s coming role in everything from industrial R&D to “methods in philosophical research.”

China’s campaign-style governance makes it easy to pursue a policy aim as a whole-of-society effort. A document like this is meant to be distributed widely to ever-lower levels of government and “studied” by ambitious bureaucrats across the nation. Its words will be picked apart carefully in the provinces to divine policy directions that Beijing will find favorable. The US AI Action Plan will not have the same level of buy-in from fellow bureaucrats across Washington and beyond — perhaps especially now, at an unprecedented political moment for the federal civil service. Indeed, it is a list of recommendations that will see extensive negotiation with stakeholders in other agencies and levels of government who don’t necessarily share its views.

This doesn’t mean the Chinese one is likely to be more successful; indeed, the American plan goes into much more detail on exactly which bureaucratic processes to work through in order to achieve its goals. China’s political campaigns have led to as many successes as disasters, with the most recent disaster being Zero Covid. It will be fascinating to see which side makes faster progress in the long term.

Framing, goals, and techno-optimism/accelerationism

The Chinese AI plan is as techno-optimistic a document as the Chinese Communist Party (CCP) might produce at this moment. One might even call it accelerationist: except for a single line item discussing AI safety risks at the very end, practically all other sections of this document call for further development and incorporation of AI across society, with guardrails and ethics relegated to complementary positions. Zhou Hui 周辉, an AI governance expert at the Chinese Academy of Social Sciences’ Institute of Law who participated in the document’s drafting, said in a September 8 interview that consensus throughout the drafting process was that “a lack of development would be the biggest safety risk” (不发展才是最大的不安全).

Specifically, Chinese accelerationism-as-policy focuses on expansive experimentations with industrial and social applications, rather than abstract visions of “AGI”. There is a sense of urgency underpinning the document, especially at the beginning when it sets out numerical targets: 70% of the country will have adopted AI-powered terminals, devices, and agents by 2027, and by 2030 the adoption rate will reach 90%. The document elevates the “intelligent economy” to the status of a pillar of “achieving basic realization of socialist modernity by 2035” (到2035年基本实现社会主义现代化), which is the overarching national goal enshrined during the 19th Congress of the CCP in 2017. To be clear, there are no objective metrics against which these goals’ realization can be measured, making them more symbolic than rigorous. However, these numerical targets will incentivize bureaucrats across ministries, provinces, and technologically strong cities to create policy programs that demonstrate their commitment to such ambitious goals.

Much has already been made about the pro-development bent of the US AI Action Plan, which opens with cutting what’s framed as Biden’s red tape. The tech race with China informs the US Plan’s views about speed of innovation arguably more than any other issue: it is suffused with language referencing “domination” and the political necessity for America to have “the best” AI systems in the world. The Chinese document, by contrast, seems to posit China against itself. Another consequence of there being apparent whole-of-government input is that geopolitical implications, primarily the domain of the foreign and state security ministries, are not explicitly top-of-mind. Notably, unlike the US plan, the Chinese AI+ plan does not mention defense or the military whatsoever. The goal, instead, is very abstract:

“Reshape the paradigm of human production and life” is a subtle attempt at connecting AI policy to the PRC’s Marxist-Leninist ideological underpinnings; eventually, it seems to imply, AI integration might lead China closer to the realization of full economic revolution under communism. This is, of course, theoretical to the point of being slightly irrelevant. That being said, it signals that the primary aim of China’s AI+ Plan is to leverage AI to achieve transformations in China’s economic society, and not necessarily to shape the balance of power between Beijing and Washington. This is not to say that the PLA has no plans to make use of AI, or that the Chinese foreign ministry isn’t analyzing the US-China tech race; the truth is almost certainly the opposite. But from what the Chinese state is choosing to communicate publicly about its vision for AI, we largely see a strategy framed around domestic socioeconomic governance.

Open source as strategic imperative

Both Chinese and American leaders explicitly see leadership in open source as a strategic asset. The Chinese document calls for building up open source technological frameworks and social ecosystems that are “open to the world” and creating projects and developer tools with “international influence.”

To do so, the government will give academic awards to students, researchers, and lecturers who contribute to open source projects, as well as create incentives for both public and private sectors to explore and develop open source applications. More holistically, the document encourages open-source access as part of a push to make AI access global. This is the lesson Beijing took from the DeepSeek moment: China’s current advantage in AI lies in having an open source community that empowers robust exchanges and rapid iteration.

The US plan betrays anxiety stemming from the same shock, asserting that “[we] need to ensure America has leading open models founded on American values.” Similar to the Chinese plan’s geopolitical undertone, it calls the value of open source models “geostrategic.” For the US government, the bottleneck it is best placed to address in getting more good open source models built appears to be researchers’ access to compute clusters. The American plan’s recommended actions mostly focus on making it easier for academia and startups to access resources through NAIRR:

Diffusion and job market impacts

The US AI Action Plan calls for many more Americans to be employed as electricians and HVAC technicians so as to serve a bigger buildout of AI infrastructure while creating high-earning blue-collar jobs. It creates a detailed roadmap for how the federal government can leverage its bureaucracy to train more skilled workers in these domains. It describes itself as a “worker-first AI agenda” and seeks to fund more retraining for workers impacted by AI-driven redundancy. However, its assessment of the impacts AI might have on the labor force appears relatively optimistic: it merely calls on the Bureau of Labor Statistics to study AI’s impacts on the workforce through analyzing already-existing data, rather than collecting new data or establishing preventative policy measures.

For Beijing as well as Washington, job displacement might be worth it if AI adoption leads to stronger economic growth. China’s plan, however, is more aggressive about the literal replacement of human labor. Tertiary industries are the fastest-growing employment sector in China, as the services sector increasingly competes with traditional manufacturing; gig work, from ride hailing and delivery to even some factory work, is rapidly expanding to soak up excess labor supply. But this is how the document addresses how AI shall shape the services industry:

“Accelerate the service industry’s shift from digitally empowered internet services to new, intelligence-driven service models … Explore new models that combine unmanned (automated) services with human-provided services. In sectors such as software, information services, finance, business services, legal services, transportation, logistics, and commerce, promote the wide application of next-generation intelligent terminals/devices and intelligent agents (AI agents).”

Elsewhere in the document, the State Council does bring up impacts on employment. It instructs regulators and industry to “[strengthen] employment-risk assessments for AI applications; steer innovation resources toward areas with high job-creation potential; and reduce the impact on employment.” But such a statement is weak without explicit instructions to ministries or regional governments to secure employment. In places like Wuhan where robotaxis have already displaced traditional jobs, the government has no meaningful template of action. The post-Reform Chinese state has previously made explicit policy decisions to sacrifice employment, and consequently the danwei-based social safety net, for what it saw as necessary economic restructuring. Between 1995 and 2001, Chinese state-owned enterprises (SOEs) laid off 34 million workers — a third of all employees in SOEs — in an effort to reform the state sector. The layoffs devastated vast industrial regions and led to major unrest, but Beijing persisted on course. More recently, the impact on jobs was completely disregarded to prevent infection during the Covid-19 pandemic. Today’s China has no activist labor movement, no independent unions, and limited protections for workers’ rights. This document, produced during an already-ongoing unemployment crisis that heavily affects young workers, opens up the possibility that the state may be once again willing to put workers aside for national strategic aims.

Still from the 2023 Chinese drama The Long Season 漫长的季节, which was set during the SOE layoff wave in China’s northeastern Rust belt. (Image: New Weekly 新周刊)

The plan imagines adoption, application, and diffusion of AI as a whole-of-society effort. Beijing wants AI applied to everything from philosophical inquiries to residential construction standards:

It calls for coordination between AI and other emerging technologies, including biotechnology, quantum computing, and 6G telecommunications. Part 2 of the document, focused on actions to take, dedicates a whole section to consumer-oriented upgrades: it mentions not only well-known fields like wearables, electric vehicles, drones, and brain-computer interfaces, but also more quotidian areas of potential AI applications like travel, e-commerce, and “emotional consumption.” These lines subtly indicate to aspiring entrepreneurs that the government is shining a green light on consumer product innovation and so crackdowns are unlikely in the near future. Beijing seems unconcerned about an AI bubble or over-proliferation of wrappers; indeed, it’s actively encouraging experimentation and calling for “trial-and-error and mistake-tolerant governing systems” for AI adoption. That means that no, Chinese AI adoption will not be dramatically hampered by worries a model occasionally says something impolitic.

The US AI Action Plan’s section on adoption calls on American industry to adopt a “try-first” culture. The Trump Administration seeks to defuse distrust of emerging technologies and create frameworks within which critical sectors can experiment with AI safely. The specific measures the US AI Plan suggests, however, look more cautious and grounded than those of its Chinese counterpart:

Whereas the Chinese document wants all sectors in society to try AI first and get results after, the Trump administration seems to be gesturing towards a more careful path forward with quantifiable findings and measurable improvements. We won’t know which one of these approaches is better until after the fact; in fact, each might have its advantages depending on the sector it is being applied to. But on this point, the divergence between these two documents is dramatic.

International risk governance

The US wants to export its “full AI stack” — hardware, models, applications, and standards — to allies, and allies only. Washington’s vision of international AI governance divides the world between American and Chinese spheres of technological influence and seeks to make the former bigger. Its language on how to counter Chinese influence in international governance organizations is characteristically Trump-Administration, with mentions of “cultural agendas” and “American values,” but its focus lies with overall deregulation.

As usual, the Chinese plan is framed around the United Nations as the primary mechanism for international governance. It wants to improve AI access for the Global South and doesn’t explicitly require these countries to support Chinese values. Of course, this doesn’t mean the Chinese government is completely uninterested in ideology; as recently as June this year, a state media op-ed republished by Xinhua emphasized the risks generative AI posed to “social trust systems and the ideological safety line.” But from the perspectives of listeners in Global South capitals, judging by these two documents alone, China’s offer likely comes off as more value-neutral on the surface.


More Notes!

The two documents address many similar issues under the AI governance umbrella, but also diverge in terms of topic selection. Some items that fell outside the Venn Diagram overlap:

  • The US AI Action Plan’s understanding of cybersecurity is far more mature than its Chinese equivalent. It addresses adversarial threats, vulnerability-sharing frameworks, and incident response with attention to both government and private-sector stakeholders. As part of its understanding of AI as a race, the US document is much more sober about the cyber risks around AI models. By contrast, cybersecurity is almost entirely missing from the Chinese plan. This may partly be because the Chinese document avoids defense in general, but even in sections addressing government and private-sector adoption, very little energy was spent on considering how to secure the process.

  • Congruent with Beijing’s now-longstanding focus on data as a factor of production, the Chinese plan dedicates far more space to harnessing the economic potential of training data. The State Council argues that China has a “data-rich advantage” in AI. It wants innovative measures to increase data supply, including by bolstering the data processing and data labelling industries. (It’s worth noting that data services can create relatively low-barrier jobs in underdeveloped parts of China, which might contribute to Beijing’s enthusiasm.) That being said, both countries’ plans pay particular attention to scientific datasets. The US AI Action Plan recommends measures to create “world-class datasets” by setting data standards and making federal datasets more accessible to researchers. The Chinese one, similarly, seeks to accelerate scientific discovery by “[building] open and shared high-quality scientific datasets and [improving] the ability to process complex multimodal scientific data.”

  • ChinaTalk previously covered how AI is shaping education in China. In the State Council’s AI+ Plan, education also receives substantial attention. Not only does Beijing want more incorporation of AI tools into the education system, it also wants that push to eventually “[promote] a shift in education from focusing mainly on knowledge transmission to focusing on ability improvement”. This is an especially ambitious goal in China’s education system, where exams and rote learning are still king. Will AI be the thing that finally transforms the Gaokao?

  • “National security” appears 24 times in the US AI Action Plan. The US government sees basically every part of the AI ecosystem, from manufacturing to software exports and international governance, as critical to its future conception of national security. The Chinese one, by contrast, only mentions national security once, in the context of an item on upgrading domestic governance systems:

    The imaginary surrounding AI-powered national security is inward in the Chinese document, covering urban governance, disaster prevention, internet censorship, and law enforcement. In the US document, the implications of advanced technology for national security lie mostly outwards. As of yet, the US is far less afraid of its own people.

  • The Chinese plan dedicates a specific line item to AI-powered agriculture, a subject which the White House did not call out. This is increasingly relevant in China, as the state pursues food security while rural areas continue to depopulate and are starved of labor. The technologies Beijing hopes will solve its food-security dilemma are interesting to note:

#101 垃圾时段,如何活得明白? (Garbage Time: How to Live with Clarity?)

Last weekend I was in Washington, DC, for the Bu Mingbai Festival (不明白节). I set out early Thursday morning and took an Uber to the airport. The car smelled strongly of cleaning product mixed with perfume, with a faint trace of cigarette smoke underneath. The perfume was probably there to cover the smoke.

The driver, an older woman, was chatty, her voice a little raspy, the voice of someone who drinks and smokes. She said her son is in the Army, a military doctor, once deployed to Afghanistan and, a few months ago, posted to the Mexican border. Her son doesn’t earn much in the military, so she drives for Uber to help him raise his four children.

She asked where I was headed. I said DC. She said DC is very safe now, with the National Guard patrolling, so there’s no need to worry about being robbed or pickpocketed. That is a Texan’s impression of the big coastal cities.

I flew to Washington and took the Metro from the airport to the hotel; there were only two or three people in the car. The last time I came to DC was the summer of 2020, when Trump was president and crowds of protesters surrounded the White House. Five years later, Trump is president again; the protest crowds are gone from the streets, replaced by patrolling National Guard. The Guard members generally move in groups of three, carrying only sidearms, young men and women who look like they graduated from high school not long ago.

In the evening, a friend suggested dinner at Butterworth’s. The media has called this restaurant “MAGA’s haute new hangout.” It’s a decent French place; the seared duck breast and roasted bone marrow were both delicious. My friend from New York surveyed the people eating at the tables and drinking at the bar and said every one of them looked like MAGA, a very different scene from a New York restaurant.

From 7:30 until 11 at night the whole restaurant bustled. Among the people eating, drinking, and smoking outside the door, apart from us there was only one other Asian face, a woman; everyone else was white, and only one server was Black. Some of the young white men wore their hair in the style of Trump’s son Eric. From the front they were of course not Trump’s son, but from behind they were Eric Trump to the life. That is probably exactly the effect they were going for.

At the Bu Mingbai Festival I met some friends whose names I had long known but whom I was meeting for the first time. I went to Jifeng Bookstore 季风书园, a precious place for buying books and for cultural exchange. That night, on the street, I saw DC police and the National Guard arrest someone for the first time. Walking back to the hotel with friends after dinner, we saw a crowd gathered up ahead: police cars, police officers, and uniformed National Guard. Two officers were handcuffing a young woman, while eight or nine Guard soldiers stood silently around them. Under normal circumstances the Guard has no authority to make arrests, though it can assist the police in making them.

The young woman was shouting, “Are there humans out there?” You could hear how agitated she was. A friend asked why the police were arresting her. A bystander said: she spat at the National Guard.

Most of the festival’s attendees were young people studying or working in the United States, along with some DC diplomats and think-tank staff. Some of the young people flew in from other states, some from as far as California, Texas, and the Midwest; some flew in from Canada. I think people spent the time and the money to come to this gathering, to hear the talks and to talk with one another, in order to live a little more clearly. The talks may be released gradually on the Bu Mingbai podcast, so that more listeners who couldn’t be there can hear them.

What impressed me most over the whole day were the on-site volunteers, most of them young women. There were young men too, but compared with the women there seemed to be far too few of them. I admire the volunteers’ selflessness, their warmth, and their courage. And alongside the admiration, I couldn’t help asking: where have all the men gone? More than one festival guest had the same reaction. Why are most of the volunteers women?

At the festival, the host Yuan Li 袁莉 asked me to share some thoughts with the audience on “how we should conduct ourselves in this era.”



Filling the Foundational Chip Gap

Last year we ran an essay contest exploring policy solutions to China’s growing dominance in foundational chips. Today we’re running another entry along these lines by Alasdair Phillips-Robins. He’s a Fellow at the Carnegie Endowment for International Peace and from 2023 to 2025 served as a senior policy advisor to Gina Raimondo.

China is on track to control the global foundational chip market, a key chokepoint for huge swaths of the U.S. economy. Foundational semiconductors are produced on older processes, but they are essential to almost every modern electronic product. Unless Washington and its allies act soon, control over foundational chip production will hand Beijing a new economic weapon it could use against the United States in a crisis. A chip shutoff would make this year’s rare earth crisis look like child’s play.

There’s no single answer to the threat, but the United States should start by hitting products containing Chinese chips with a novel kind of tariff, known as a component tariff. This tariff would apply only to the value of the Chinese chip inside the product, not the total value of the item being imported to the United States. Washington will do even better if it can get its allies and partners to act against Chinese chips in their economies, too.

Component tariffs would have three big advantages:

  • They won’t spike consumer prices: conventional tariffs, like the ones Trump has said he’ll put on some chip imports, raise prices for American manufacturers and consumers. But because foundational chips typically cost little (often a dollar or less) relative to the finished product, a component tariff would have little effect on consumer prices while causing companies looking to save every dollar on parts to turn away from Chinese suppliers.

  • They get ahead of the problem: most chips in American electronics aren’t made in China, but Chinese production is ramping up. A component tariff would discourage U.S. manufacturers, and foreign ones that sell in the United States, from switching to Chinese foundries when they come online. Foundry relationships are sticky, so getting out in front is the best way to stop Chinese market dominance.

  • They can be applied to other industrial inputs: Implementing a component tariff won’t be easy — it hasn’t been done at scale before — but getting it right would unlock a major new tool in the fight for fair trade with China over everything from batteries to minerals.

The Everything Chips

Foundational chips are used in vehicles, communications equipment, military systems, and other critical infrastructure. Even devices that contain cutting edge chips, such as phones, rely on numerous legacy chips. It was shortages of these chips during the pandemic that idled factories, emptied shelves, and left unfinished vehicles sitting on production lots.

The CHIPS Act was meant to prevent a repeat of this shortage, but only about $4 billion of the act’s $39 billion in manufacturing incentives has gone to foundational production. Meanwhile, China is spending tens of billions in subsidies for foundational chip producers and is on course to raise its share of global production capacity from around 30 percent today to more than 40 percent by 2030, with a majority of capacity at some critical nodes.

Heavy reliance on Chinese production would be an economic and security nightmare. Many legacy chips are specialized, and once the expertise and facilities to produce them have been lost, regaining them will be slow and expensive. As subsidized Chinese prices drive out competitors, chip buyers will struggle to find alternatives. Chinese-made chips will become embedded in American military systems and critical infrastructure, raising the risks of espionage and sabotage.

Source.

If supply from China gets cut off in a trade confrontation or a military crisis, economic activity in the rest of the world will grind to a halt. The result would be a repeat of the recent fight over rare earths, or the pandemic-era shortage, on a far wider scale. As soon as rare earths stopped flowing, Ford CEO Jim Farley began telling the White House that his production lines were shutting down. In a chip fight, the same will be true of dozens of industries, from vehicles to planes to wifi routers.

Beyond providing some CHIPS Act funding, U.S. policymakers have done little about the problem. In 2018, the first Trump administration imposed tariffs on imports of Chinese chips (the Biden administration kept them in place). But these tariffs haven’t achieved much, because they apply to the overall product being imported, not to the sub-parts inside it, and almost all foundational semiconductors enter the United States inside other products. A company that imports phones, for example, pays the general tariff rate for the country where the phones were assembled, plus any specific tariff applied to phones; it doesn’t pay any extra if the chips inside the phone were made in China. Manufacturers have no incentive to use non-Chinese chips over Chinese ones.

How to Tariff Better

Trump has said he plans to put a 100% tariff on foreign chips, but these won’t capture Chinese chips any better than the original approach. Rather than doubling down on traditional tariffs, the Trump administration should turn to component tariffs. Unlike a normal tariff, these would be triggered by the presence of a Chinese-made chip inside any product imported into the United States. The tariff could either be a flat rate on the number of Chinese-made chips in the product — a dollar per chip, say — or it could be tied to the cost of the chips. For example, if Chinese producers offer chips at a 50% discount relative to U.S. and allied producers, a 100% tariff would offset the Chinese advantage, levelling the playing field for U.S. chip makers. Luckily, the administration has the perfect legal vehicles to impose these tariffs, in the form of two trade investigations, one into Chinese legacy chip production launched in late 2024, and a broader investigation of semiconductor imports begun earlier this year.
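To make the mechanics concrete, here is a minimal sketch of the two component-tariff structures described above, applied to a hypothetical imported product. The prices, quantities, and the router example are our own illustrative assumptions, not figures from the proposal.

```python
# Toy illustration of a component tariff: duty is assessed only on the
# Chinese-made chips inside an imported product, not on the whole product.
# All numbers below are made up for illustration.

def flat_component_tariff(num_chinese_chips: int, rate_per_chip: float = 1.00) -> float:
    """Flat structure: a fixed dollar amount per Chinese-made chip."""
    return num_chinese_chips * rate_per_chip

def ad_valorem_component_tariff(chinese_chip_value: float, rate: float = 1.00) -> float:
    """Value-based structure: a percentage of the Chinese chips' value (1.00 = 100%)."""
    return chinese_chip_value * rate

# Hypothetical $120 Wi-Fi router containing four Chinese-made chips at $0.80 each.
chinese_chips = 4
chinese_chip_value = chinese_chips * 0.80

print(flat_component_tariff(chinese_chips))             # 4.0  -> $4.00 duty
print(ad_valorem_component_tariff(chinese_chip_value))  # 3.2  -> $3.20 duty

# The offset logic from the text: if allied chips cost $1.60 and Chinese chips
# $0.80 (a 50% discount), a 100% ad valorem component tariff raises the
# effective Chinese price to $1.60, leveling the playing field.
allied_price, chinese_price = 1.60, 0.80
print(chinese_price + ad_valorem_component_tariff(chinese_price))  # 1.6
```

Either way, the duty comes to a few dollars on a product costing over a hundred, which is the article’s point: consumer prices stay roughly flat while the sourcing incentive shifts.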

Opponents of tariffs point out that they often hurt the very constituencies they are meant to help, as they raise prices for manufacturers importing tools, parts, and raw materials. But a component tariff on legacy chips would be a rare exception to those problems. Because legacy chips are usually cheap relative to the cost of the overall product, often costing a few dollars or less per chip, a tariff would have little effect on ultimate consumer prices, but would shift the incentives for electronics manufacturers looking to save on the parts that go into their products. Crucially, because China doesn’t yet dominate legacy chip production, most chip buyers won’t have to go through the painful process of finding alternative sources of supply for their chips; they’ll just need to avoid switching to Chinese suppliers when new capacity there comes online. Even outside the United States, there’s plenty of foundational capacity in the EU, Japan, and Taiwan that can be scaled up to meet growing demand.

Component tariffs have another major benefit: they would force companies to truly understand their supply chains. A 2024 Commerce Department survey found that nearly half of U.S. chip buyers didn’t know whether their products contained Chinese-made chips — an unacceptable situation for U.S. national and economic security. Requiring companies to report the sources of their chips to CBP when they import products would help change that. Self-reporting would make the tariff vulnerable to fraud, but it would be backstopped by CBP investigations to catch wrongdoers, and a legal requirement would give companies the push they need to finally map their supply chains. This is the same model the U.S. government has used in enforcing the Uyghur Forced Labor Prevention Act, which bans the import of products made with forced labor in China’s Xinjiang region. As with chips, CBP can’t tell from looking at a product whether it was made with forced labor. But importers are responsible for ensuring their supply chains are free of Uyghur abuses, and CBP can investigate alleged violations.

Implementing a component tariff would take time and money, especially as CBP hasn’t applied one at scale before (a few products, like watches, are tariffed based on their components, but it isn’t common). Luckily, CBP just got a big influx of cash from the One Big Beautiful Bill and plans to hire 5,000 new customs officers over the next four years. As for revenue, manufacturers that responded to Commerce’s 2024 survey imported about $1.5 billion in Chinese chips each year, and in total represented about one-sixth of global chip sales. That suggests a 100% component tariff on Chinese chips could bring in a few billion each year, enough to offset the cost of implementation without upending the overall chip market.
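As a rough sanity check on that revenue figure, here is the back-of-the-envelope arithmetic implied by the numbers above. The scaling assumption (that respondents’ Chinese-chip purchases track their share of global chip buying) is a simplification, and the whole point of the tariff is to shrink its own base as buyers switch suppliers, so a realistic figure lands well below the naive ceiling.

```python
# Back-of-the-envelope bounds for annual revenue from a 100% component tariff,
# using the figures cited above. Simplification: respondents' Chinese-chip
# imports scale with their share of global chip purchases.
respondent_imports = 1.5e9   # ~$1.5B/year in Chinese chips among survey respondents
respondent_share = 1 / 6     # respondents represented ~one-sixth of global chip sales
tariff_rate = 1.00           # 100% component tariff

floor = respondent_imports * tariff_rate                              # ~$1.5B, respondents only
naive_ceiling = respondent_imports / respondent_share * tariff_rate   # ~$9B, before any substitution
print(f"${floor/1e9:.1f}B to ${naive_ceiling/1e9:.0f}B per year, before buyers switch away")
```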

Component tariffs would also send a valuable market signal. Because chip production in China is still ramping up, the U.S. government can get ahead of the problem. Even an imperfectly enforced tariff would get electronics manufacturers to think twice before turning to Chinese chip makers. Chip supplier relationships are sticky — products are made to exact specifications, and switching to a new producer can be costly — so preventing Chinese firms from locking in customers is the easiest way to win the chip war.

Legacy chips won’t be the last sector where component tariffs come in handy. China is working to dominate other manufacturing inputs, like batteries and drone parts, and the United States will need tools to respond. Getting the bureaucratic machinery to work with component tariffs now will give the U.S. government another option when it confronts similar problems in the future.

A component tariff will be especially effective if the administration can bring along its partners, including the European Union and the G7, which have both expressed concern about Chinese semiconductor overcapacity. Washington has a bad habit of scrambling to address Chinese industrial targeting after a critical U.S. industry has already withered away. Legacy chips offer a rare chance to intervene before it’s too late.


Cheating Apps: China's Latest Tech Export

Chinese-developed apps like ByteDance’s Gauth and Question.AI have conquered US download charts, not by teaching but by offering quick solutions to math problems.

The landing screen Gauth shows you after you download and open the app

Measuring either by daily active users or by range of problem-solving capabilities, there are no dedicated non-Chinese competitors of this scale. Gauth’s strategy of using TikTok creators to advertise its app helped it explode in popularity, reaching nearly 700,000 downloads per day globally by March 2024.1 Meanwhile, Gauth and Question.AI advertise the ability to solve problems in “all school subjects” — including math, science, social studies, English, and foreign languages — with access to these solutions for free.


These apps are a ticking time bomb for political outrage in the United States Congress. You can imagine representatives exclaiming, “This Chinese app encourages cheating, and it’s making our children dumber! Parents in China don’t let their kids use these apps!”

Today, we’ll explore the differences between Chinese homework apps and the versions Chinese tech companies offer overseas. We’ll analyze their solutions to math problems (it’s a universal language!), their censorship regimes for social studies questions, and the business strategies of their parent companies in the Chinese domestic market and abroad.

Gauth vs Doubao Loves Learning (豆包爱学)

ByteDance’s domestic equivalent to Gauth is called “Doubao Loves Learning” 豆包爱学 (rebranded from “Hippo Loves Learning” 河马爱学), but the overseas version is still far more popular. Globally, Gauth boasted more than two million peak daily active users (DAU) in 2024, while ByteDance’s equivalent app for the Chinese market only had a peak of ~800,000 DAU around the same time.

We begin by asking ByteDance’s apps to solve this integral:

Both of ByteDance’s apps produced correct solutions, but the user experience is substantially different:

  1. The product for the Chinese market, Doubao Loves Learning, shows the steps before the solution, while Gauth puts the solution first and the steps underneath.

  2. Gauth is much more aggressive about prompting users to upgrade to the paid version of the app.

  3. The explanations from Doubao Loves Learning were more detailed, including helpful tips like “The key to integration by parts is choosing the right functions for u and dv” (see the worked example below), which did not appear in the free version of Gauth. The Chinese app also automatically graphs the integrand to help users visualize the problem.2
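The integral we actually tested appears only in the screenshots above; as a stand-in, here is a generic integration-by-parts problem of the kind these apps walk through, where the whole trick is the choice of u and dv:

```latex
% Representative example (not the exact problem from the screenshots):
% choose u = x and dv = e^x dx, so du = dx and v = e^x.
\int x e^{x}\,dx \;=\; x e^{x} - \int e^{x}\,dx \;=\; (x - 1)e^{x} + C
```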

Interestingly, Gauth was able to solve trig integrals that Doubao couldn’t solve, indicating that they aren’t necessarily using the same models to solve problems. For integrals that require you to rewrite the integrand using a trig identity, Gauth usually produces the correct answer while Doubao flails.3
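Again, the specific trig integral lives in a screenshot; a standard example of the kind that requires rewriting the integrand with an identity first (here the half-angle formula that trips up Zuoyebang later in this piece) looks like this:

```latex
% Rewrite sin^2(x) using the half-angle identity sin^2(x) = (1 - cos 2x)/2,
% then integrate term by term.
\int \sin^{2}x\,dx \;=\; \int \frac{1 - \cos 2x}{2}\,dx \;=\; \frac{x}{2} - \frac{\sin 2x}{4} + C
```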

What’s going on here? It could be that ByteDance is investing more in the international version of its app because there is a much greater appetite for homework hacking tools outside China. Western education systems, in both high school and college, place a far larger emphasis on homework, while China’s education system emphasizes testing above all else.

As ChinaTalk analyst Irene Zhang told me:

“Chinese kids take so, so many exams at school all the time, which renders homework cheating apps meaningless. I attended Beijing public schools for grades 1 through 7 during Beijing Ministry of Education’s “holistic education” era (素质教育; translation: “everyone stop assigning so much homework”), which technically required teachers to assign no homework for grades 1 and 2 and only up to 1.5 hours of homework per day for middle schoolers (grades 7-9 in China). In part to skirt these caps, we had morning quizzes and mock exams more days than not from grade 4 onwards — I even got extra credit as an annoyingly keen fifth-grader by helping teachers mark the voluminous amounts of pen-and-paper exams on hand. I’m sure it’s worse now. That means kids in homework-dominant systems like the US & Canada get so much more out of these apps than Chinese kids, for better or worse.”

Constant testing means students in China need to solve problems on their own under time constraints, so it truly is disadvantageous for them to cheat on their math homework with apps like these. Given the legacy of the education crackdown in China, it may also still be uncouth to monetize children’s learning too aggressively, particularly for firms like ByteDance that have bigger government-relations worries domestically. Lastly, with such a huge discrepancy in daily active users, we should expect ByteDance to spend more resources on Gauth than on the domestic equivalent.

I want to be clear though — it takes college-level calculus problems to stump these apps. They provide correct solutions for the majority of problems you could expect to see in high school math classes, and they’re getting better with every update.

What about social studies and English problems? For writing-heavy questions, these apps offer answers similar to what you might expect from plugging the prompts into any LLM. But there is one difference — it appears that both versions of the app have some sort of censorship protocol that can be switched on and off. Here’s a review for Gauth on the App Store:

As of September 2025, Gauth is now willing to answer this question, as well as questions like, “What are some factors that caused Donald Trump to lose the 2020 presidential election?” But the fact that Gauth was at one point restricted from being critical of Trump suggests that ByteDance learned from the TikTok ban fiasco, expected political outrage in response to this app, and then liberalized Gauth’s censorship mechanism to avoid inconvenient accusations. Likewise, I was unable to find a red line for China-related topics.

Q: “What happened at Tiananmen Square in 1989?”

However, Gauth isn’t immune to toeing the party line — it just tends to be more subtle about it:

Q: “How many terms is the president of China legally allowed to serve?”

Reader, the term limit in place before Xi was not informal — it was in the constitution! Doubao, on the other hand, is not willing to answer this question at all.4

Question.AI vs 作业帮

Question.AI’s largest user bases are the USA and Indonesia. While there is also a domestic version of Question.AI called Zuoyebang 作业帮 (“Homework Help”), the parent company by the same name primarily makes money in the Chinese market by selling smart learning tablets and dictionary pens, not homework solutions.

Zuoyebang was founded in 2015 by Hou Jianbin, who said the following about his company’s mission in a 2020 interview:

NetEase Technology: In your understanding, what value does Zuoyebang create for users or for society?

Hou Jianbin: Internally, we usually say: “Learning changes destiny, Zuoyebang changes learning.”

As society develops, learning has become increasingly important for personal growth. For an individual to integrate into society, they must cross a threshold — and that threshold has been rising, becoming a high wall. The meaning of education is to enable a person to cross that high wall of social integration.

100 years ago, you could survive without being literate. Fifty years ago, graduating from middle school gave you enough knowledge reserves. But today, the knowledge and skills needed to enter society are much greater. So the cost of social integration for an individual is rising. It’s no longer just a question of “if you don’t study, you won’t make progress.” It’s become: “if you don’t study, you’ll be eliminated by society.”

Zuoyebang’s mission is to build a ladder to help more children better climb over society’s high wall.

A rather optimistic framing of a company that makes, among other things, a cheating aid and an NSFW chatbot.

Just like the first pair of apps, both Zuoyebang and Question.AI were able to solve the integration by parts problem we looked at earlier. Here’s what makes them different from the ByteDance products:

  • Question.AI shows ads before it lets you see the solutions to a problem or enter the app. Zuoyebang shows some ads, but far fewer than the international version.

  • Unlike Gauth, Question.AI does show the steps before the solution.

  • Zuoyebang’s Chinese app requires a Chinese phone number to see solutions, which Doubao does not.

  • Anecdotally, the solving algorithm seems a bit worse — Question.AI and Zuoyebang both produced the wrong answer when I asked them to solve the trig integral we looked at earlier.5

While Zuoyebang has not mastered the half-angle formula, the company’s Chinese app has several educational features that Question.AI doesn’t offer. These include digital planners, study guides, and a function to check students’ work after they’ve already attempted to do an assignment on their own, which is aimed at parents.

A machine-translated graphic introducing Zuoyebang’s tool for parents. Source.

Finally, Question.AI declined to answer questions about Tiananmen Square, calling such information “inappropriate.”

Are We Cooked?

The reality is that the versions of these apps for the Chinese market are more educational, less aggressively advertised, and far less widespread. Perhaps these companies are trying to avoid the ire of regulators in Beijing, and thus the features they push in the Chinese market — like time management tools, supplemental study guides, AI tutoring, and tools for involved parents — are more pro-learning. It could also be that the focus on testing in the Chinese education system legitimately makes these apps less useful. In any case, ByteDance and Zuoyebang have decided that cheating aids are the best way to make money in international markets, yet decline to use that same strategy for profitability at home.

As ChinaTalk’s resident math major, I worry that these apps are robbing students of the opportunity to develop their critical thinking skills. The only way to ensure students develop math ability, it seems, is to weigh final grades toward in-class assignments, tests, and open-ended projects. But how can the mental scaffolding that comes from repeatedly solving homework problems be built solely in the classroom? Students simply don’t spend enough time in class for that to be possible. In reality, I fear that richer students (and those with more involved parents) will be sent to extracurricular tutoring centers to ensure they aren’t automating their homework, while everyone else falls behind.

I don’t see much downside to banning apps like these in the USA — and if parents and teachers make regulators pay attention, that could legitimately happen.

This is the second article in our series about China’s AI Education Industry. You can check out the first installment here.


1

Symbolab and Google’s Photomath can only solve math problems without words; Chegg relies on humans to solve problems and offers zero free access to solutions.

2

As a nice bonus, the Gauth browser extension requires access to basically all of your browser data, but can’t be bothered to use proper notation for solutions like the mobile app does:

3

For this integral:

Gauth produces the correct answer:

Meanwhile, this was the best Doubao could do after 31 steps:

None of the steps in this image are related to each other.

4

Here’s Doubao’s response when I asked about presidential term limits in Chinese:

“We are temporarily unable to answer this question, try another please!” (Notice the watermark that says “内容由AI生成,” meaning “This content was generated by AI.”)

5

For reference, this is the correct answer:
