
📹 YouTube AI | Today · 17 videos
Full catalog of new AI videos from 4 channels. Pick something to watch.


A few weeks ago, Google announced new research called Turbo Quan, claiming that their compression algorithm can reduce an LLM's KV cache memory usage by up to six times and achieve up to an eight-times speedup. The announcement itself blew up, reaching 38,000 likes. Because if Turbo Quan is applied, at least for Google, they get to free up around 83% of the memory the KV cache was occupying on existing hardware, which some would say led to the crash of AI chip stocks with pric…
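A quick back-of-envelope sketch of where that "around 83%" figure comes from: a 6x compression leaves 1/6 of the KV cache resident, freeing roughly five sixths of it. The model dimensions below are illustrative assumptions, not numbers from the video.

```python
# Illustrative KV-cache sizing (hypothetical model dimensions, fp16 storage).
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Size of the K and V tensors across all layers."""
    # factor of 2: one tensor for keys, one for values
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128,
                          seq_len=32_768, batch=1)
compressed = baseline / 6                   # the claimed 6x reduction
freed_fraction = 1 - compressed / baseline  # 1 - 1/6 ≈ 0.833, i.e. "around 83%"
print(f"baseline: {baseline / 2**30:.2f} GiB, freed: {freed_fraction:.1%}")
```

With these assumed dimensions the baseline cache is 4 GiB, and a 6x compression frees about 83.3% of it, matching the figure quoted above.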








If you ask any reasoning LLM the question "what's the most clever trick to speed up LLMs?", they would all mention a concept called speculative decoding. This trick definitely feels a bit magical. It's not doing any mathematical tricks or hardware optimization to make the LLM generate faster. Yet it is pretty much a lossless technique that guarantees a two-to-three-times speedup. How is this legal? Well, just like every magic trick, once you understand what happens behind the scenes, you would…


Game AI, the practice of implementing artificial intelligence to craft the experience of players in video games, is facing an existential crisis. A crisis of confidence in technology. A crisis that at its heart is about the perils of scale and complexity. A crisis of handling risk in a land of the risk-averse. A crisis of lost knowledge and of forgotten values. A crisis born of a niche that exists for a single purpose in a world that seldom acknowledges its existence. The process and study of how we…

One of the big focuses for LLMs in 2026 is solving long context. And just in the past few months, I have already covered many cool new approaches different labs are coming up with to address this bottleneck: from linear attention and sparse attention to RNN-based methods. But they all share a major trade-off: the need to drop full attention over all context in some way. Linear attention compresses it, sparse attention ignores some parts, and RNN-based methods decay it all eventu…
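The trade-off described above can be made concrete with attention masks. A minimal sketch, using a sliding window as the sparse-attention example (illustrative only; real implementations fuse the mask into the attention kernel rather than materializing it):

```python
# Causal attention masks over a sequence of length n: True means
# "query position q may attend to key position k".
n = 8

# full attention: every query sees every earlier position
full = [[q >= k for k in range(n)] for q in range(n)]

# sparse (sliding-window) attention: each query sees only the last w
# positions; older context is simply ignored
w = 3
sparse = [[q >= k and q - k < w for k in range(n)] for q in range(n)]

visible_full = sum(map(sum, full))      # n*(n+1)/2 = 36 query-key pairs
visible_sparse = sum(map(sum, sparse))  # bounded by n*w = 21 pairs here
print(visible_full, visible_sparse)
```

Full attention grows quadratically in sequence length, while the windowed mask grows linearly, which is exactly why these methods scale to long context and exactly why they lose information outside the window.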






[Video: DeepSeek's Insane Architecture Breakthrough [Engram Explained]]
When I thought I was finally used to how good DeepSeek is at publishing research, my mind has yet again been blown away by how insanely well-crafted this new paper is. Like, if you want to do good LLM architectural research or just good ablation studies, you should skip my video and straight up read the paper, because there's just no way for me to do the paper complete justice. They wasted no time proving their idea works, with methods so rigorous that it makes any other architectural rel…






[Video: DOOM 1993 in 2024 - [PART II]]
































