How I Save Over 50% of My Claude Code Context (12 Rules)
💼 Business
en

12 Rules for Saving Context in Claude: How to Cut Costs by 50%

Jono Catliff · 8 days ago · 6 Apr 2026 · Impact 6/10
AI Analysis

Jono Catliff, the author of the YouTube channel, walks through 12 ways to optimize context in Claude. They help users avoid context limits, work faster, and reduce the cost of processing large amounts of information.

Key points

  • Trimming the CLAUDE.md file to reduce context
  • Using skills and reference files for reuse
  • Managing large files efficiently and clearing Claude's memory

Opportunities

  • Reduce Claude usage costs by 30-50% by optimizing context
  • Process requests faster thanks to a smaller context
  • Work on larger projects without exceeding context limits

Caveats

Most of the tips are universal and can be applied to other LLMs, but some commands (for example, `/context`) are specific to Claude.

Video description

The longer you use Claude, the worse it gets. Not because the model is bad, but because your context fills up with junk. Every file it reads, every message you send, and every tool it calls eats away at that context window. And when that fills up, Claude starts cutting corners. Here are 12 ways to keep your context sharp.

The first one is shortening your CLAUDE.md file. It's natural to want to throw in as much context as possible in the hope that Claude does a better job understanding your project, but in this case, less is more. I have two CLAUDE.md files in front of me: one that's bloated and comes in at 910 lines, and another that's 33 lines. Let's take a look at how they perform. I asked the exact same question: explain the project structure and suggest improvements. With the clean CLAUDE.md file, producing the answer required 41% of the context window, which still seems pretty high. With the bloated CLAUDE.md file, it came in at 45%. So right away, just by shortening your CLAUDE.md file, you can save 4% of your context window. And by the way, that's not a one-off thing.

Let's head over to rule number two, which is adding one additional step to your CLAUDE.md file. This is something I had to learn the hard way, because I kept hitting context limits and spent half my time trying to compact every single message. It was pretty frustrating. So I added this one line to my CLAUDE.md file: when the context exceeds 50%, suggest starting a new conversation or using subagents for independent tasks, and essentially provide me with ways to reduce the context when I'm building out projects.
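The first two rules can be sketched as a small maintenance script. This is a minimal Python illustration, not something shown in the video: the function name `audit_claude_md` and the 50-line budget are assumptions made up for the example, and the rule text is a paraphrase of the line described above.

```python
from pathlib import Path

# Paraphrase of the rule described in the video; adjust the wording to taste.
CONTEXT_RULE = (
    "When the context exceeds 50%, suggest starting a new conversation "
    "or using subagents for independent tasks."
)
MAX_LINES = 50  # hypothetical line budget, not a figure from the video


def audit_claude_md(path: Path, max_lines: int = MAX_LINES) -> dict:
    """Report the file's line count and append the context rule if it is missing."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    rule_present = CONTEXT_RULE in text
    if not rule_present:
        # Make sure the rule starts on its own line before appending it.
        if text and not text.endswith("\n"):
            text += "\n"
        text += CONTEXT_RULE + "\n"
        path.write_text(text, encoding="utf-8")
    line_count = len(text.splitlines())
    return {
        "lines": line_count,
        "over_budget": line_count > max_lines,
        "rule_added": not rule_present,
    }
```

Calling `audit_claude_md(Path("CLAUDE.md"))` in a project root would flag a file like the 910-line one as over budget and add the rule once; running it again leaves the file unchanged.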
I want Claude to actually tell me and actively work towards reducing the context, because as your project expands, like you can see over here, it gets really bloated. There's a ton of stuff going on, a lot of it is BS, and you don't need it. A great way to keep that from happening is to get Claude to work on it for you. You can see here that this conversation is at 75% context capacity, and at that point it starts recommending solutions. So this is a great way to get Claude to do the heavy lifting for you.

All right, let's move on to rule number three, which is all about using skills. When you get started with Claude Code, most people by default throw everything into a CLAUDE.md file. That takes up a lot of context: every single time you message Claude, it adds to the context. What you want to do is break every workflow down into a skill. I use "skills" and "workflows" synonymously; to me they're pretty much the same thing. Essentially, you have one skill for creating LinkedIn posts, another skill for replying to emails, and a third skill for creating proposals. The beautiful thing is that you're not cramming everything into the CLAUDE.md file. You only use context from the LinkedIn skill if you actually call it to create LinkedIn posts, you only use the email skill if you're writing emails, and so on and so forth. I have a sample here, this analyze bank CSV skill. It goes through a CSV file that we have and answers these 10 questions for us. When we come down here, it's using approximately 27% context, because we've gone through and perfected the skill. It's asking all the right questions.
Now, on the flip side, if you didn't build out a skill and just used a standard CLAUDE.md file, you'd ask, say, five questions at once, and then you're like, "Oh crap, I have another question," and then another, and another. By the time all is said and done, you're at 45% of the context window compared to 27%. So this is a great way to reduce the context you're using inside Claude Code.

Now, to take it a step further, next we have reference files. A great example of this would be tone. Let's say you type a specific way and you want Claude Code to memorize how you type. These are reusable templates that you use across all your skills. When you write a LinkedIn post, you want it to reference your tone. When you create emails, you also want it to reference your tone. And when you create proposals, you also want it to reference your tone. The first reference over here clearly has a typo that I can't make out, so I'm just going to skip over it [laughter], but reference two is banned phrases. Perhaps we only use banned phrases for proposals. The point is that you can create reusable templates that you use over and over again. Instead of baking every single one of these reference files into every individual skill, like before, you only reference the individual components when you actually need them, which also reduces the amount of context you're using. If we take a look at this, I said, "Hey, I want you to write an email using a particular skill," and then, "Here are all of the additional files you could reference if you need to pull them up. If not, just skip over them."
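A prompt along those lines could be assembled as follows. This is only a sketch in Python: the helper `build_prompt` and the file names `references/tone.md` and `references/banned-phrases.md` are hypothetical, chosen to mirror the tone and banned-phrases examples mentioned above. The key point is that the prompt names the files instead of pasting their contents.

```python
def build_prompt(task: str, skill: str, reference_files: list[str]) -> str:
    """Compose a prompt that lists reference files by path without inlining them."""
    lines = [f"{task} Use the {skill} skill.", ""]
    if reference_files:
        # Only the paths are sent; the contents stay on disk until needed.
        lines.append("Reference files (read only if needed, otherwise skip):")
        lines.extend(f"- {name}" for name in reference_files)
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_prompt(
            "Write an email to a client.",
            "email",
            ["references/tone.md", "references/banned-phrases.md"],
        )
    )
```

The resulting prompt stays a handful of lines long no matter how large the reference files grow, which is the effect the comparison in the video is getting at.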
And what happens is that when you prompt it like this, you use way less context than you would if you baked all of these additional files, which could be very long, by the way, into the skill itself. Here's another example where we baked every single one of those reference files into a skill, and naturally this can explode in context: we have 457 lines, as opposed to this one, which has only 31 lines and references other files if they're needed. So in this case, when we reference other files as reusable templates, we're using only 25% context, whereas if we bake everything in, we're using 31% context on every single message. So it does make a big difference.

Now, the next thing is all about dealing with large files. In this instance, we have this transcript, which is 30,01 lines. The problem is that if you just throw this into a message, you're going to eat up a ton of context. So the play here is, when you're asking questions about very large files, you don't want to paste them in as a message. What you want to do is save the content as a file and then reference that file. The reason is that if we were to fire this off right off the bat, it would consume 71% of our context. If we instead ask it to read that exact file, where we have this large transcript, it only consumes 38% of our context. So that cuts it down substantially, by almost half, just by referencing a file in our file system, which you can see right there, as opposed to dumping it into the chat window like we did above.

Now, probably one of my favorites is that if you literally just change the model you're using, it will substantially impact your context.
In this example, I just said "Hey" using the Haiku model, and literally just by saying "Hey," over a third of my context is already gone. But the inverse is also true: if we switch over to Opus and run `/context`, it's 9% just for saying "Hey." So if you switch to a better model, you get substantially more context.

Another great thing: if you're trying to figure out, "I'm using so much context, what's going on? How can I cut this down?", you can literally just type `/context`. This is a command inside Claude Code, and it tells you exactly how you're using your context. We can see down here the categories, the tokens, and the percentages that we're using for this particular chat so far. We're going to cover all these things later on, but we can see there's a ton of space being used for MCP tools, and a ton of space being used for memory files and skills as well. Keep in mind this is a very basic conversation where all I did was say "Hey," so naturally, as you have more complicated conversations, it's going to list way more usage here.

One other thing you can do if you ever want to reset your context window is type the `/clear` command, which literally just clears the conversation and starts fresh. This is the exact same as hitting this plus button right over here to create a new task. So if at any point your context window is through the roof, and I've had times where it compacts the conversation and then I'm at 2% or 3% left, so it's recompacting on every message I send, you can always start a new conversation in a new tab. And when I say compact there, let me back up and explain.
Let's say I have this very long conversation where I keep asking questions and Claude Code keeps giving me responses. When the context window hits 90% or 100%, or whatever the case may be, you can run `/compact`. This takes the history of the conversation and condenses it into a much shorter summary, so Claude keeps a bit of context and you can continue the conversation pretty much brand new. You can also add what you want to keep: if there's any important information that you want to make sure persists after the compact, write it in here, and that helps keep all the points that are important to you.

All right, up next we have something called memory. When I say memory, I mean that when you create a Claude Code account, Claude will remember certain things about your personal life and your work life. And when I say memory as a file, it's not something in your file structure here that you create, delete, or edit; by default, somewhere inside Claude Code, there's a memory file. You can ask what's in there by sending a message like "Please check all the memories that you have about me in Claude Code," and it's going to tell you everything it knows about you. For me it's saying: this is your name, you have 17 memories, some are personal, some are workflows, some are content projects, some are technical rules, and all that kind of stuff. An interesting thing it had in here was this hier project, which was just a demo build I built on YouTube one time. This is certainly not something I want it to remember about me. So what I can do is say, "Please go ahead and delete everything about Hierbby. That was just a sample project."
So the point here is that if you're not careful, you can have a lot of information stored in memory that you don't actually want there and that is slowing your project down substantially.

Up next we have MCP connectors. In a terminal tab, you can type `claude mcp list`, and it lists every one of the connections you have to external applications. If you want to delete these from your account, all you have to do is head over to claude.ai/settings/connectors, where you have a list of all your connections, and remove anything you're not using. That, of course, reduces the context in your chat. As a brief demo, if we pull up MCP tools, you can see all the tokens I'm using on things like Slack or Airtable. It's not a trivial amount; it's quite substantial. And I'm using just three MCP servers. I honestly don't have that many, but if you had 20 or 30 or 40, this would take up a tremendous amount of context. So the point is that if you're not actually using these MCP servers, you should probably remove them so they don't bog down your entire system.

Now, the last example of how you can reduce context in this window is spinning up subagents inside Claude Code. I have this demo here, but essentially you might ask Claude to complete a very big task. In my example, I have all of these junk files, and keep in mind these are very... well, maybe not this one, but for the most... oh yeah, these are all very large files. You can see there's tons of text. This one is a binary file; it could be a picture or a video or whatever the case may be. So I said: hey, I have a huge folder over here called junk files, and what I want you to do is use subagents to break this down. I want you to first of all extract questions.
Second of all, deal with action items, and then the decision-making separately. And then I linked to that folder. So what we're doing here is spinning up something called a subagent. Instead of one main thread or conversation handling this task, we silo it into three different subagents that each take on a third of the responsibility. One deals with question extraction, so 33% of the work goes there. Then 33% of the work goes to the action item subagent, and 33% goes to the decision extraction subagent. We can see that over here: it says "I'll help you break this down using subagents," and then it spawns three of them. By default, Claude Code should do this on very large projects, but most of the time you can also get it to do it by just saying, "Hey, I want you to use subagents." When I asked it to spawn these subagents, it went ahead and did that for me. This is a great way to use multiple subagents to accomplish one task, rather than one main thread that might get overloaded with context.

So that is it for this video, guys. Thank you so much for watching. If you found value in it, make sure to hit that like button and that subscribe button down below. It tells me I'm doing a really good job. If you want free blueprints for all of the material I offer on YouTube, I'd highly recommend taking a look at my free Skool community, where I give away resources like proper CLAUDE.md files and all that kind of stuff. And lastly, if you want a transformation in the AI automation niche, you can also take a look at my paid community, where I offer two transformations. Number one is how you can find and close deals within 30 days or less and essentially rinse and repeat the process.
And transformation number two is how you can automate up to 80% of your existing business using the exact blueprints that allowed me to scale to seven figures. All you have to do is copy and paste. And if you don't want to deal with any of that, you can reach out to my agency over here, where we will automate everything on your behalf. So, thanks for watching, guys, and I'll see you in the next one. Bye-bye.
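For a quick sense of scale, the before-and-after figures quoted in the transcript can be put side by side. The Python sketch below only does arithmetic on those quoted percentages; the labels are informal summaries, and each pair comes from a single on-screen run rather than a benchmark, so treat the results as rough.

```python
# (rule, context % without it, context % with it), as quoted in the video
MEASUREMENTS = [
    ("Short CLAUDE.md", 45, 41),
    ("Skill instead of ad hoc questions", 45, 27),
    ("Reference files instead of baked-in text", 31, 25),
    ("Reading a large file instead of pasting it", 71, 38),
]


def savings(before: int, after: int) -> tuple[int, float]:
    """Return (percentage points saved, relative reduction in percent)."""
    points = before - after
    return points, round(100 * points / before, 1)


def report() -> list[str]:
    """Format one summary line per measurement."""
    rows = []
    for label, before, after in MEASUREMENTS:
        points, relative = savings(before, after)
        rows.append(
            f"{label}: {before}% -> {after}% "
            f"({points} points, {relative}% relative)"
        )
    return rows


if __name__ == "__main__":
    print("\n".join(report()))
```

The distinction between percentage points and relative reduction matters here: the large-file case saves 33 points, which is a 46.5% relative reduction and matches the "almost half" in the transcript, while the CLAUDE.md case saves 4 points.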