YouTube catalog
What Apple’s Neural Engine Tells You About the Next Decade
🚀 Career
en

Apple bets on local AI: a deep dive into the Neural Engine and its future

TiffinTech · 11 days ago · Apr 3, 2026 · Impact 6/10
AI Analysis

Apple keeps growing the Neural Engine's compute with every chip generation, betting on local AI. This brings advantages in speed and privacy, but limits scalability compared to cloud AI services.

Key points

  • Apple has consistently increased the Neural Engine's compute capabilities in every new chip.
  • Local processing of AI tasks gives users greater data privacy and lower latency.
  • Apple is integrating neural accelerators directly into its GPU cores to boost AI performance.
Opportunities

Apple's local AI offers free inference after purchase, unlike cloud APIs, which charge for every call.
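To make that cost difference concrete, here is a minimal break-even sketch in Swift. Every number in it (the per-call API price, the hardware premium) is a hypothetical placeholder, not real pricing.

```swift
// Hypothetical break-even sketch: after how many inferences does
// on-device AI "pay for" extra hardware cost versus a metered cloud API?
// All numbers below are made-up placeholders, not real pricing.

let cloudCostPerCall = 0.002   // assumed USD per cloud inference
let hardwarePremium  = 200.0   // assumed extra USD paid for beefier local silicon

// Local inference has ~zero marginal cost once you own the chip,
// so break-even is simply the premium divided by the per-call price.
let breakEvenCalls = hardwarePremium / cloudCostPerCall
print("Local wins after ~\(Int(breakEvenCalls)) inferences") // ~100000
```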

Nuances

Most developers don't yet use the Neural Engine's capabilities to the full. The potential of local AI will only be unlocked once apps appear that can effectively exploit its advantages.

Video description

Okay, I just got the new MacBook Pro and I want to talk about it, but not in the way we typically hear people talk about it. We're not going to go through the general specs, the things you can easily look up online; we all know about those. I want to talk about a bigger story. If you sit down and read the reviews, you hear about the CPU and the spec-sheet numbers, because those are the numbers people know how to compare: Geekbench scores, render times, frame rates. But there's also a 16-core Neural Engine inside this machine that barely gets mentioned, and Apple has been making it bigger every single generation since 2017. So have other companies: Qualcomm, Intel, really every major chip company on Earth is quietly dedicating more and more transistors to the same type of component. And when companies that aren't aligned on much else all start making the same investment at the same time, that's not a product feature. That's a signal. If you know how to read it, you can see exactly what they think is coming. For this video, we're going to focus on Apple, because I'm very inspired by my new MacBook.

But first, a big shout-out to Surfshark, who sponsored this video. I travel constantly for work: GTC, CES, you name it. Every single time, I'm sitting in an airport or hotel lobby connected to public Wi-Fi, sending files, doing things I probably shouldn't be doing on an open network, but I need to. That's where Surfshark comes in. It encrypts everything between your device and the internet, so nobody on the network can see what you're doing. The other thing I use it for all the time is changing my virtual location: Surfshark has over 4,500 servers in 140 locations, so if I'm traveling and something is geo-restricted, I can just connect to a server back home and I'm good. One account works on unlimited devices, too, which is really unheard of. If you want to sign up, go to surfshark.com/tiffintech or use the code tiffintech at checkout to get four extra months free. The link is in my description, and there's a 30-day money-back guarantee, so there's really no risk, just less risk when you're using public Wi-Fi.

All right, first we have to step back a second. Apple left Intel back in 2020. That's when they shipped the M1, their first custom silicon for the Mac. At the time, a lot of people treated it like a gamble: could Apple really design a laptop chip as good as, say, Intel's? Well, six years later, the answer is pretty clear. Absolutely. But the more interesting thing is how they did it, because Apple's approach to chip design is fundamentally different from everyone else's in the industry.

Let's go through an example, using Intel. Intel designs a processor. Then Dell puts it in a laptop, HP puts it in a different laptop, another company puts it in another one. One company controls the chip; the manufacturers control everything else. Many companies work the same way: they design the silicon, license it out, and other companies build products around it. And it works; it's a perfectly good way to do things. But Apple designs the chip. Apple designs the memory architecture. Apple writes the operating system. Apple builds the developer frameworks that software is written against.
It's one company, one entire stack from the transistor all the way up to the API. And that has consequences you can actually see. Take unified memory. On a traditional PC, the CPU has its own memory and the GPU has its own memory, and when data needs to move between them, it gets copied. That takes time and power. On the M5, the CPU and GPU share one pool of memory, so there's no copying; the data is just there. Apple could do that because the team designing the chips is the same team designing the operating system's memory scheduler. They were built together.

Now let's go back to the Neural Engine, which in my opinion is deeply underrated. It benefits from the same integration. It sits on the same die, accessing the same memory pool, and the software that runs on it, a framework called Core ML, was built by people who sit down the hall from the people designing the transistor layout. I don't know if it's literally down the hall, but you get what I'm saying. When you control the full vertical like that, every layer can be optimized for the layers above and below it.

And this, in my opinion, is where it gets really interesting. The M5 introduced something new: for the first time in an M-series chip, Apple embedded neural accelerators directly inside the GPU cores. So the Neural Engine is doing its thing on one part of the chip, and now the GPU can also run AI workloads natively. Apple claims over four times the peak GPU compute for AI compared to the M4. That's the kind of design decision that only happens when one company owns the entire pipeline: you can coordinate what the Neural Engine handles versus what the GPU handles, because you control the software routing those workloads. Nobody else in the PC space can move like that. And the M5 is generation five of this approach, with the same architecture scaling from a $599 Mac mini all the way up to the Mac Studio. Five years ago this silicon program didn't exist; now it's powering every Mac Apple sells.

Okay, so when I was doing research on this, this is where, in my opinion, the story gets pretty fun. I want to show you something. In 2017, Apple put the first Neural Engine in the A11 chip, the one inside the iPhone X. It had two cores and could do 600 billion operations per second. That sounds like a lot, but all it really did was power Face ID and Animoji, and developers couldn't even access it; Apple kept it locked behind their own software. One year later, the A12 jumped to eight cores and 5 trillion operations per second, nine times faster, using a tenth of the power. And this time, Apple opened it up to everyone: they released a framework called Core ML that essentially let any developer run machine learning models on the Neural Engine. Fast forward to 2020: the A14 doubled the core count to 16 and hit 11 trillion operations per second. That same year, Apple shipped the M1 with an identical Neural Engine, bringing it to the Mac for the very first time. Then the M4 pushed to 38 trillion operations per second, and now the M5 takes that further with a faster 16-core Neural Engine, plus those new neural accelerators in the GPU.

Stay with me for a second and follow that arc: two cores to 16, 600 billion operations per second to 38 trillion, eight years, and every single generation Apple chose to dedicate more transistors to this component. Transistors are expensive. Chip real estate is finite.
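As a concrete illustration of that routing, here is a minimal Core ML sketch: you load a compiled model and ask for `MLComputeUnits.all`, and Core ML decides per operation whether it runs on the Neural Engine, the GPU, or the CPU. The model file name is a placeholder, not a real model.

```swift
import CoreML

// Minimal sketch: load a compiled Core ML model and let the framework
// decide where each operation runs. "MyModel.mlmodelc" is a placeholder
// for any compiled model bundled with your app.
let url = URL(fileURLWithPath: "MyModel.mlmodelc")

let config = MLModelConfiguration()
// .all = Core ML may dispatch work to the Neural Engine, the GPU,
// or the CPU, whichever it judges fastest for each layer.
config.computeUnits = .all

do {
    let model = try MLModel(contentsOf: url, configuration: config)
    print("Loaded, inputs: \(model.modelDescription.inputDescriptionsByName.keys)")
} catch {
    print("Failed to load model: \(error)")
}
```

Switching `computeUnits` to `.cpuAndNeuralEngine` is a quick way to see how much of a given model actually maps onto the Neural Engine.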
Every square millimeter you give to the Neural Engine is a square millimeter you don't give to, say, the GPU or the CPU. And Apple kept making that trade-off anyway. They're not alone: a few other companies have done the same thing or something similar. Qualcomm, for example: their new X2 Elite, coming this year, pushes that to 80 trillion operations per second. When you zoom out and see that pattern across different companies, you're watching something form in real time.

So everyone agrees that AI compute matters. The question they disagree on is where it should live. On one side, there's a school of thought that says the real AI work happens in data centers. You write a prompt, it goes to a server farm full of GPUs, the model runs, and the answer comes back. That's the architecture behind every API call you make when you're talking to ChatGPT, Claude, or Gemini. The models are massive, and the compute required to run them is equally massive. The argument is that your laptop will never have enough power to match what a rack of GPUs in a data center can do.

There's another school of thought, and this is where Apple is planting its flag. It says meaningful AI should run locally, on the device in front of you, private by default: no round trip to a server, no API cost per call, no dependency on someone else's infrastructure. When Apple Intelligence summarizes your email, transcribes a voice memo, or cleans up a photo, that's running on your Neural Engine; your data never leaves the machine.

If you write software, you already understand this trade-off. Cloud inference scales, but it costs money every time someone calls it, and the user's data has to leave their device. Local inference is private and essentially free after purchase, but you're limited by whatever silicon the user owns. Here's what I find really compelling about where we are right now: neither side is going to win outright. The future is almost certainly somewhere in between. Some workloads will run locally and some will run in the cloud, and the developer is the person deciding which is which.

In my opinion, Apple is building the strongest case anyone has ever built for the local side of that equation. The M5 gives you a dedicated Neural Engine, neural accelerators in the GPU, unified memory so your models don't waste time copying data around, and a mature developer framework in Core ML that routes workloads across all of it automatically. That's a complete system for on-device AI, and they've been integrating it for eight years. Think about what that means for the apps coming next. A developer building a photo-editing tool can run their model on the Neural Engine, and it costs them, and their users, nothing per inference. A health app can analyze sensitive data without it ever leaving the phone. A coding assistant can run suggestions locally with zero latency. These are real product decisions that this silicon makes possible, and every generation the ceiling gets higher.

So here's the thing I keep coming back to. You can learn a lot from marketing, and you can learn more from earnings calls, but the most honest thing any company publishes is probably the silicon itself, because you can't fake transistors. Marketing can say anything; the chip is the actual commitment. And Apple has been investing in the Neural Engine for eight years.
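Here is a hedged sketch of what that developer decision can look like in code: try the on-device model first, and fall back to a metered cloud endpoint only if the local path isn't available. The model path and endpoint URL are hypothetical placeholders, and the local branch is stubbed rather than a full inference pipeline.

```swift
import CoreML
import Foundation

// Sketch of the hybrid pattern: prefer free, private on-device inference,
// fall back to a metered cloud API. Both the model path and the endpoint
// below are hypothetical placeholders.
func summarize(_ text: String) async throws -> String {
    let modelURL = URL(fileURLWithPath: "Summarizer.mlmodelc") // placeholder

    if let model = try? MLModel(contentsOf: modelURL) {
        // Local path: zero marginal cost, data never leaves the device.
        // (Real code would build an MLFeatureProvider for this model's inputs.)
        _ = model
        return "local summary (stub)"
    }

    // Cloud fallback: costs money per call and ships the text off-device.
    var request = URLRequest(url: URL(string: "https://api.example.com/summarize")!)
    request.httpMethod = "POST"
    request.httpBody = Data(text.utf8)
    let (data, _) = try await URLSession.shared.data(for: request)
    return String(decoding: data, as: UTF8.self)
}
```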
And other companies are moving just as fast: Qualcomm just nearly doubled its NPU performance in a single generation. All of this silicon, every chip I just mentioned, runs through one company's factories: TSMC in Taiwan. TSMC manufactures the M5, and they manufacture, I think, almost every major company's chips. It's incredible. Qualcomm, Intel, the entire AI chip race really runs through there.

And right now the software is starting to catch up. Apple shipped Apple Intelligence, Qualcomm is pushing Copilot+ PCs, Intel is marketing Lunar Lake as the AI PC. There are so many different companies, but most of the Neural Engine's 16 cores still spend most of their time waiting for something to do. Most developers haven't shipped their first Core ML model yet, and most of the applications that will justify this investment haven't been built yet. That gap between what the hardware can do and what the software does with it is where the opportunity lives. And every generation, as Apple keeps making these AI engines bigger, the runway gets longer.

So, I've been spending some time with this machine, the MacBook Pro. The CPU is fast, the battery lasts all day, the display is gorgeous. You can read those benchmarks everywhere; other people are going to cover that. The story I really wanted to tell you today is the part that doesn't get shown a lot, the part that won't show up in a Geekbench score: the chip. Apple has been building the Neural Engine since 2017, from two cores that could barely handle Face ID to 16 cores with accelerators embedded in the GPU, running a complete framework for on-device machine learning. It's incredible. And they're doing this because they believe the compute should stay with you, on your device, under your control. I see a lot of benefit to that when it comes to security and so much more.

The M5 is a really good laptop chip, but I think there's also something bigger at play here. It's year five of a silicon thesis, year eight of a Neural Engine thesis, and one piece of an industry-wide bet that AI belongs on the device, not just in the cloud. That's the story the chip is telling. You just have to know where to look. It sounds kind of cheesy, but it's really true: look at the hardware, look at the chips. Those don't lie.

All right, now I've got to go build on this. I am obsessed. I've had so many versions of this idea throughout the year, and I felt inspired to do some research and dig into the history. I hope you enjoyed this video. Leave a comment with other topics you want me to deep dive into. I hope you're still enjoying these deep dives into tech, asking and answering big questions. All right, I'll see you in the next one. Thanks, everyone.