YouTube catalog
Hermes Agent Full Setup Tutorial: How to Setup Your First AI Agent (Gemma 4)
💼 Business
en

Setting Up Hermes Agent with Gemma 4: A Complete Guide to a Local AI Agent

Bart Slodyczka · 6 days ago · Apr 8, 2026 · Impact 5/10
AI Analysis

The video tutorial shows how to set up Hermes Agent with a local Gemma 4 model for private AI interaction. It covers installation, configuration, and integration with Telegram and local web search, with an emphasis on privacy and control.

Key Points

  • Step-by-step instructions for installing and configuring Hermes Agent.
  • Integration with local AI models (Gemma 4) via Ollama.
  • Private AI interaction through Telegram and local web search.
Benefits

Full control over your data and privacy • Lower spending on paid AI services • Customization and integration with local systems

Caveats

Using local AI models avoids dependence on cloud services, but limited compute resources can constrain what the models are capable of.

Video Description

Hello legends. In this video, I'm going to show you how to set up Hermes Agent for the very first time. We'll start by going across to GitHub and downloading and setting up Hermes Agent step by step. We'll plug it into Telegram, into a free local AI model running on your computer, and into a free local web search that's also running on your computer. Both of those need some extra tools downloaded, and I'll cover all of that in this video, so that by the end you'll be able to chat with a Hermes agent that is 100% private. The first thing we need to do is go across to the Hermes Agent GitHub and copy the install command. Now, before we set this up, I don't recommend running this on your main computer. The best thing to do is get a separate device and install Hermes Agent on that. The reason is that your main computer most likely holds very private files and passwords, and you're probably logged into things like banking applications. Depending on the permissions you give Hermes Agent, if it ever gets hacked, it could potentially access all of that. So separate it out: give it its own device and run it there. Assuming you have that separate device, open up a terminal and paste in the install command. Hermes Agent will now look at all the packages and libraries on your computer, determine which ones are missing, and install them automatically. Once it's done installing, we have two options: a quick setup and a full setup. We'll be going through the full setup step by step together. For models, I don't want to use any paid models, so I'm going to go to "more providers" and scroll all the way down to "custom endpoint". At this stage, we want to figure out how to get a local model onto our computer and plug it into Hermes Agent.
To do that, we're going to download and use something called Ollama. Ollama gives us the ability to download and manage local AI models on our device and plug them into tools like Hermes Agent. So let's copy its install command, open up a brand new terminal, paste it in, and hit enter. This command installs Ollama on your computer. Once it's done, let's go back to the Ollama site and start browsing some local models. Click on "Models", and here we can see a bunch of free, open-source models we can download and install onto our computer. I'm going to go with Gemma 4. I've actually been doing a lot of testing with Gemma 4 recently, mostly using OpenClaw, and these models are super small and work very well with OpenClaw. Unfortunately, when I tried the E2B, which is a 7 GB model, with Hermes Agent, it wasn't able to do basic tasks. I asked it to do some web searching and it couldn't follow those instructions, which was a bit disappointing; in OpenClaw it worked perfectly fine. I then upgraded to the E4B, which is a 9.6 GB model, and it was able to handle those basic requests. So what I recommend is browsing the different models on Ollama and finding one that fits on your device and actually works for your use case. You might need to go away and do some testing. Now, if the device you're using can't actually handle a local model, like if you're on a VPS or a not-very-powerful computer, you can still use Ollama via its cloud version. They've got a $20-a-month tier with really good API limits, so you can use that with Hermes Agent too. But for this video, we're going to download the local model. So I'm just going to copy this model name and start downloading the model to my computer.
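The install-and-pull steps above look roughly like this in a terminal. The install script URL is Ollama's official Linux one; the model tag below is only an example, so browse the Ollama models page and substitute the exact name shown for the variant that fits your hardware.

```shell
# Install Ollama via its official install script (Linux; on macOS or
# Windows you can use the installer from ollama.com/download instead)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a local model. This tag is an example -- replace it with the
# exact name copied from the model's page on ollama.com
ollama pull gemma3n:e4b

# Confirm which models are installed on this device
ollama list
```

Pulling a multi-gigabyte model can take a while on slow connections; the download resumes if interrupted.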
So I'm going to type in ollama pull, paste in the model name, and hit enter, and Ollama will download and install that model onto the computer. If you ever want to see all the models installed on your device, just run ollama list, hit enter, and you'll see every model you have. Back in the Hermes Agent settings, we're asked to enter an API base URL. This base URL just needs to point across to Ollama, and then whenever we download a new model, we can select it using this setting. The base URL for Ollama is always the same, so you can copy it down as-is: http://localhost:11434/v1. We hit enter, and for the API key, I'm not going to set one now, so I just hit enter. And here we can see all the models available to us; these are the exact same models we saw with the ollama list command. The cool thing is that whenever you go off and download a new model, it will always appear in this list. I'm going to use the Gemma 4 E4B, so I press 1 and hit enter. For the context length, I just hit enter and it automatically detects what it needs to set. Next we have the option for text-to-speech. I don't want to use any cloud versions here; when I'm running these AI agents, I prefer everything locally hosted. So I'm going with this "neuts" option, which runs locally on the device. I hit enter, and it asks if I want to download and install all the dependencies; I type a capital Y and hit enter for yes. For the terminal backend, we're keeping the recommended local option, so we hit enter. And at the bottom here, we have some specialized agent questions.
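For reference, the base URL the setup asks for is just Ollama's OpenAI-compatible endpoint on its default port. A quick sketch (the curl check is optional and assumes Ollama is already running on the same machine):

```shell
# Ollama's OpenAI-compatible base URL -- the default port is 11434
OLLAMA_BASE_URL="http://localhost:11434/v1"
echo "$OLLAMA_BASE_URL"

# With Ollama running, this lists the same models `ollama list` shows
# (commented out so the snippet works without a live server):
# curl "$OLLAMA_BASE_URL/models"
```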
The first question is: what is the maximum number of tool-calling iterations you want per conversation? If your agent ever has to do complex tool calling for you, executing a tool, getting a response, executing the next tool, over and over, things like deep research, or creating documents and sending them to different places, or managing files, all of that takes many, many iterations. Here you can set your limit. Honestly, this comes down to trial and error depending on what model you're using and what tasks you're performing, and you can change it later on. I'm going with the default maximum of 60 iterations, so I hit enter. Same thing for the tool progress display: every time we have a conversation with the Hermes agent, we can choose whether to display all the micro-steps it performs in the background before it comes back with a response. I'm choosing "all", because when I'm doing web research I like to see which tools it's actually executing, and if I'm changing files or doing other things, I want to see those micro-steps in case I can make some optimizations later. So I hit enter and keep the default. Same with context compression: as the context gets really long, your agent gets a little dumber, because there are just so many tokens in the balance. By compressing the context, your agent retains some of its smartness across a really long conversation, and you also manage the context window. The smaller models we're using have something like a 128,000-token context window, so you might want to set a lower compression threshold. We're going with the default of 0.5.
Next is session reset mode. Another important thing to note: if you never reset your session, you're constantly chatting with your agent through Telegram, and you're accumulating, say, four messages per day, then after 7 days you've got 7 × 4 = 28 messages, and you may not realize that all of those stay in the agent's context. For these smaller models, it's very important to regularly flush that context as you move to a new task. Here the recommended option is to reset either by inactivity or by daily reset, meaning the conversation resets every 24 hours, or after a period of inactivity, say if you don't message it for a few hours. Let's go with the recommended setting. The inactivity timeout defaults to 1440 minutes, so I'll keep that. Again, all of this can be changed later: use whatever you set here for a few days or a week, and if it's not meeting your needs, come back into the settings and raise or lower it as you need. For the daily reset hour, using 24-hour time, the default is 4, which is just 4:00 a.m. local time. Now we have the option of setting up some different channels. To set up Telegram, hit the space bar to select it, then hit enter. We're asked for a bot token, so let's go across to Telegram, open BotFather, and type /newbot, then hit enter. We're asked to choose a name for our bot; I'm going with "Hermes demo". For the username, it has to end in "bot", so I'm going with "Hermes demo bot". If you run this process and the name you choose isn't available, BotFather will just tell you to pick a different one.
That's why I put the 001 here, to try to make it a bit more unique. Now that the bot is created, let's copy its bot token, paste it in here, and hit enter. Next we're asked to input our Telegram user ID. Whoever you put in here will be allowed to message your Hermes agent. If you leave it blank and just hit enter, then anyone who finds your Hermes agent on Telegram can have conversations with it, so you really want to be very restrictive here. I recommend starting off with just your own Telegram user ID. To find it, pop open the sidebar and search for the raw data bot (@RawDataBot), which is this first contact here. Click it, press start, and you'll see your Telegram ID. Copy it, paste it in here, and hit enter. Then let's set our Telegram channel as the home channel: type a capital Y and hit enter. We choose yes, we want to install the gateway as a launch service, hit enter, and start the service now, yes. Now we've got a list of tools we can give to our Hermes agent. You can go through these one by one; press the space bar to select or deselect each. What you choose is up to you, but let's say for now I don't want any vision or image analysis, or image generation, and the rest looks good to me, so I hit enter and it takes me to the next page. This next setting is about giving our agent access to a browser; for now I'm going to skip it and keep the defaults. For our text-to-speech provider, even though we set it up before, we're just going to skip and keep the default; I'm not really sure why that setting appears twice. And for our web search provider, we're not going to use Firecrawl cloud.
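If you want to sanity-check the token BotFather gave you before pasting it into the setup, the standard Telegram Bot API getMe method works from a terminal. The token below is a made-up placeholder, not a real one:

```shell
# Placeholder token -- replace with the token BotFather sent you
BOT_TOKEN="1234567890:AAF-example-token"

# getMe returns {"ok":true,...} with your bot's username when the
# token is valid (needs network access, so it's commented out here):
# curl -s "https://api.telegram.org/bot${BOT_TOKEN}/getMe"
echo "https://api.telegram.org/bot${BOT_TOKEN}/getMe"
```

A 401 Unauthorized response from that endpoint means the token was copied incorrectly.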
We're going to go down to Firecrawl self-hosted and hit enter. At this stage, we have to set up Firecrawl. Firecrawl has a paid cloud service where you pay per use, or you can run their self-hosted version. Go across to the SELF_HOST.md file in their repo and read through it. It was actually a little tricky getting this set up the first time, so I've got a bunch of commands for you to run, and I'll leave them in the description of this video. Before we can install Firecrawl on our computer, we first have to download and install Docker. Docker runs containers, and we'll be installing Firecrawl inside one of those containers. So go across to docker.com and choose the correct version for your machine. After you download and install Docker, open a brand new terminal and copy in the commands. Actually, before you do, paste them into ChatGPT or Claude and explain what you're doing, that you're setting up Hermes Agent with locally hosted Firecrawl in the default configuration; maybe it can give you some tips on what to change for your personal setup. I'm just going to paste the commands in here. Once again, you have to have Docker installed and running for this to work, so make sure Docker is open before you run these commands. It looks like Firecrawl is installed and started up in Docker, and you can confirm that by opening the Docker application, where you can see a Firecrawl container running. Back in Hermes Agent, the default URL it lists here is the actual URL we need for the Firecrawl instance, so I'm just going to copy what was there, paste it in, and hit enter. And it looks like we've jumped back a couple of settings.
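The video leaves the exact Firecrawl commands to its description, so the sketch below is not taken from the video; it is a rough outline of what Firecrawl's self-host guide boils down to at the time of writing: clone the repo and bring it up with Docker Compose. Repo paths, env-file locations, and the default port can change between versions, so check the current SELF_HOST.md before running any of it.

```shell
# Rough sketch of a default self-hosted Firecrawl setup (assumes git
# and Docker are installed and the Docker daemon is running)
git clone https://github.com/mendableai/firecrawl.git
cd firecrawl

# Create a .env from the example file (the example file's location
# may differ between versions -- see SELF_HOST.md)
cp apps/api/.env.example .env

# Build and start the containers in the background
docker compose up -d

# The self-hosted API has defaulted to port 3002, which would make
# the endpoint to give Hermes Agent: http://localhost:3002
```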
So I'm just going to hit enter again and skip the browser setting. This is a little strange: I've done the Hermes setup process three or four times, and each time something bugs out here and there, but it's never bugged out quite like this, dropping me into the image generation setup. If this happens to you, just play nice with it. I hit enter to leave the API key empty there, and I skip text-to-speech, because we already did that before. Firecrawl we have self-hosted and already active, so I skip that too and keep the defaults for now. And it looks like we're ready to launch the Hermes chat, so I press Y and hit enter. What actually happens here is that the system crashes. If you see this, it also happened to me, and it's completely normal. I just type in hermes and hit enter, and now we're opening up our chat with the Hermes agent using our locally hosted model. I'll test it out and say "hi there". It might take about 10 or 20 seconds the first time, because the model has to load into RAM before we get a response, and since I'm recording a video while this is happening, my computer is lagging a little. So it took, yeah, about 30 seconds there. Now let's see if we can use the web search tool: "Research the latest AI news." What we see happening here is the agent printing out all the tool calls it's making as it works toward our response, which is one of the settings we chose earlier in the video. And nice, that was actually pretty quick. Even though my computer is under some load from recording the screen, that was a fast response, and we've got a successful web search, which means our Firecrawl is working locally.
It's completely private to us, and we're using the Gemma 4 E4B model, which, once again, is local and private. The final thing I want to do is test that my Telegram bot works. I'll click the Telegram bot name, open up a new session, click start, and see if I get a reply when I send "hi there". And there we go, we have the response from our Gemma 4 E4B model directly in Telegram as well. So at this stage, we've done our initial configuration of Hermes. If you ever want to go back and make changes to your Hermes agent, there's a bunch of terminal commands on the Hermes GitHub that you can type into your terminal to change the model you're using, add new models you've downloaded through Ollama, give your agent access to different tools, or change configuration settings. For example, to finish off the video, I'm going to run hermes uninstall and hit enter. This is one of the commands to remove Hermes from the computer. I get the option to uninstall but keep the data, or option two for a full uninstall. I press 2, hit enter, and at the very end I confirm yes, and that removes Hermes Agent completely from my device. All right guys, thank you very much for watching. If you want to see more Hermes Agent content, please subscribe to my channel and drop a comment below. See you in the next one.