Minimax M2.7: A Self-Evolving Agent Model for Business
Minimax has released M2.7, an open agent model that can autonomously improve its own architecture and carry out complex agentic tasks. The model scores 50 on the Artificial Analysis Intelligence Index and is aggressively priced at $0.50 per million tokens, making it significantly cheaper than competitors. Demonstrations show that it can generate complete applications, optimize code, and improve its own performance by 30% through iterative self-improvement.
Key Points
- The first open agent model capable of recursively improving its own architecture.
- Scores 50 on the Artificial Analysis Intelligence Index, an aggregate across multiple benchmarks.
- Aggressive pricing: $0.50 per million tokens, roughly 10-20x cheaper than frontier models.
- Notable ability to generate complete applications and optimize code autonomously.
- An internal workflow provides continuous self-verification and performance improvements of up to 30% on internal tests.
🟢 Opportunities: integrate M2.7 into CI/CD pipelines for automatic refactoring and boilerplate code generation, which could cut time-to-market by 30%. 🔴 Threats: dependence on an external API and possible changes to Minimax's pricing policy, which could raise costs at scale.
Although the model is positioned as "self-evolving," its improvements consist mainly of fine-tuning existing prompts and artifacts rather than radically changing the architecture. This means the self-improvement is limited to optimizing prompts and tooling, not to fundamental changes to the weights. The term "self-evolving" is therefore more marketing than technical.
Video description
Minimax has just released M2.7, one of their latest open-source models. This is the first model that's openly talking about early echoes of self-evolution. What they mention within the blog post is that M2.7 is the first model that has deeply participated in its own evolution. M2.7 is capable of building complex agent harnesses and completing highly elaborate, productive tasks, leveraging capabilities such as agent teams, complex skills, and dynamic tool search. When we take a look at the model on the Artificial Analysis Intelligence Index, this model scores a 50. This is an aggregate score across a ton of different benchmarks, such as Humanity's Last Exam and GPQA. Beyond the intelligence of this model, one of the big stories around M2.7 is the price when compared to some of the latest models that are out there. This is a blended rate of 50 cents per million tokens of input and output, and when we compare this pricing to some of the other models that are out there, it can be on the order of 10 or 20 times cheaper. So you're going to be able to get much more performance per dollar out of this model than you would with other options. Now, let's take a closer look at the benchmarks. One thing that I do like with the benchmarks Minimax provides is that they don't only show the benchmarks where the model outperforms. They are honest in their representation, showing some areas where it isn't quite at the performance of some of the frontier models, like we see on the bottom row here. But on some popular benchmarks like SWE-bench Pro, for instance, we can see that this model outperforms even the latest from Google and their Gemini 3.1 Pro model.
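To put the blended rate in perspective, here is a quick back-of-the-envelope cost calculation in Python. The workload size is hypothetical, and the comparison rate is just the rough 10-20x multiple mentioned above, not a quoted competitor price:

```python
def cost_usd(tokens, rate_per_million):
    """Blended cost in dollars for a given number of input+output tokens."""
    return tokens / 1_000_000 * rate_per_million

M27_RATE = 0.50  # M2.7 blended rate: $0.50 per million tokens

# A hypothetical agentic workload of 200 million tokens per month:
tokens = 200_000_000
print(f"M2.7:              ${cost_usd(tokens, M27_RATE):,.2f}")       # $100.00
print(f"20x pricier model: ${cost_usd(tokens, M27_RATE * 20):,.2f}")  # $2,000.00
```

At agentic volumes, where a single task can burn millions of tokens, that multiple is the difference between a rounding error and a real line item.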
Now, one of the really interesting things with this model is that this is the first lab I've seen openly talk about building an agent through model self-evolution. This is the first time that they share an internal workflow that enables these M2-series models to self-evolve, and this is something that arguably we're going to see much more of in 2026. What they describe is that this workflow serves as an exploration of the boundaries of the model's agentic capabilities. And that's one thing all of these models are really focused on right now: the agentic capabilities, what the model can actually perform. It's one thing to plot well on different benchmarks; it's another thing to perform productive real-world tasks. What's interesting is that you could potentially use this same type of system at a higher level in how you actually use the model. One of the interesting things they described is that during the iteration process, they realized the model's ability to recursively evolve its own harness is also critical: "Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this, it continuously iterates on its own architecture." They call out in particular skills, MCP implementations, and memory mechanisms to complete tasks better and more efficiently. And that's nice to see: they're actually leveraging some of the new architectures we've seen over the past year, such as skills as well as MCP. Additionally, they describe some examples of how it optimized its performance, which I do find quite interesting. They mention that, for example, they had M2.7 optimize a model's programming performance on an internal scaffold.
M2.7 ran entirely autonomously, executing an iterative loop to analyze failure trajectories, plan changes, make modifications, compare results, and decide to keep or revert the changes, for over 100 rounds. And what they found is that M2 discovered effective optimizations throughout the process, systematically searching for optimal combinations. One of the most interesting aspects of this announcement is the process they describe: the model would run through this iterative loop. It would analyze different failure trajectories, plan the changes, modify the scaffold code, and run evaluations. It would compare the results and decide to keep or revert the changes. And ultimately, through this iterative process, they found that this achieved a 30% improvement on internal evaluation sets. So, in this video, what I wanted to do is show you a number of different examples of what the model's actually capable of doing. In terms of some of the front-end capabilities, here's an example of one of the generations from the model: if you ask it to generate a website, here is an example of that. In terms of where you can leverage the model, it is particularly strong, as you might imagine, within software engineering. So if you do want to leverage this for real-world programming, I'll show you a couple of different demonstrations of how you can do that. The one thing that I do want to highlight is how they structured their plans; it is quite a bit different from what we typically see. You actually are able to get 1,500 model requests per 5 hours for $10 a month. I'll leave a link to the blog post in the description of the video. Now, I actually want to show you some demonstrations of the model itself.
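The keep-or-revert loop described above can be sketched in a few lines of Python. This is a toy illustration, not Minimax's actual harness code: the scoring function and the candidate "modifications" are stand-ins invented for the example, and the real workflow would run full evaluation sets instead.

```python
import random

def evaluate(config):
    # Stand-in for "run evaluations": score a scaffold configuration.
    # The real harness would run the model against an internal eval set.
    return sum(config.values())

def self_improve(config, rounds=100, seed=0):
    """Iterative loop: plan a change, apply it, evaluate,
    then keep or revert based on the measured result."""
    rng = random.Random(seed)
    best = evaluate(config)
    for _ in range(rounds):
        key = rng.choice(list(config))          # "plan" a modification
        old = config[key]
        config[key] = old + rng.uniform(-1, 1)  # "modify the scaffold"
        score = evaluate(config)                # "run evaluations"
        if score > best:                        # "compare results"
            best = score                        # keep the change
        else:
            config[key] = old                   # revert the change
    return config, best

cfg, score = self_improve({"prompt": 1.0, "tools": 1.0, "memory": 1.0})
print(round(score, 2))
```

The key property, and the reason the 100-round process can only improve, is that every change is gated on a measured result: a regression is always rolled back, so the evaluation score is monotonically non-decreasing.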
Within the blog post, they have a handful of examples of how you can leverage the model, but I wanted to go through and show you some demonstrations. You can use this for anything from building spreadsheets to actually building applications. The next thing that I want to touch on, which is actually quite interesting, is how they've decided to price the model on what they call their token plan. For as little as $10 a month, you're going to be able to get 1,500 requests per 5 hours. And if you go to the $20-per-month tier, you can get 4,500 model requests every 5 hours. This isn't across the month; this is every 5 hours. Additionally, if you want the model at a higher throughput of 100 tokens per second, you can pay a higher premium for that. But in terms of the value that you get for $10, this is actually pretty amazing. There's a whole host of options for how you can get started with the token plan. Once you've signed up for the plan that you want, you can head on over and get your API key for the token plan. And then you have a few different options. They do have compatibility with the Anthropic SDK: if you want to set this up within an application that you're building, you can plug in the model string for Minimax M2.7, swap in the API key for your token plan, set the base URL to Minimax, and then route all of your requests to it. Additionally, if you want to leverage the model within a whole host of coding tools, whether it's OpenClaw, opencode, Claude Code, or a whole host of other harnesses, they have guides for all of those. Within here, what I'm going to do is show you how you can get started with opencode in particular. If I run opencode auth login, what I can do within here is search for the different providers.
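The Anthropic-SDK-compatible setup described above boils down to swapping the base URL and API key. Here is a minimal sketch using only the Python standard library that builds such a request without sending it; the base URL and the model string "MiniMax-M2.7" are assumptions for illustration, so take the exact values from Minimax's own token-plan guide.

```python
import json
import urllib.request

# Assumed values for illustration; use the real base URL and
# model string from Minimax's token-plan documentation.
BASE_URL = "https://api.minimax.io/anthropic"
MODEL = "MiniMax-M2.7"

def build_request(api_key, prompt, max_tokens=256):
    """Build an Anthropic-style /v1/messages request (not yet sent)."""
    payload = {
        "model": MODEL,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )

req = build_request("YOUR_TOKEN_PLAN_KEY", "hello world")
print(req.full_url)
```

If you use the official anthropic SDK instead, the same idea applies: pass your token-plan key and the Minimax base URL when constructing the client, and the rest of your application code stays unchanged.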
And within here, I can search for Minimax. What I can do is select minimax.io. Once I've selected the provider, I can go ahead and put in my API key: I copy my API key and paste it in. Once you've done that, you can go ahead and run opencode. Then, once you're within opencode and you select models, I can search for Minimax again, and I have all of the different models in here. For instance, I can select MiniMax M2.7, and within here I can say "hello world" just to make sure that it's working. And there we go: I have the response, with the thinking trace as well. Additionally, if you are interested in seeing how many tokens you have left on your current plan, you can see I have 1,500 requests per 5 hours for this model, at 50 tokens per second normally as well as 100 tokens per second off-peak. Now, what I'm going to do is go to my desktop, create a new directory called "Minimax M2.7 demo," go into that directory, and fire up opencode again. The nice thing with opencode is that once you have set the model, it will be the default model every time you go in; you don't need to configure the model each time. What I'm going to do within here is say: create me a Next.js application. I want it to be a site on Developers Digest, I want it to have a blog, and I want it to be in a neo-brutalist style. Let's also seed the blog with a few different blog posts on TypeScript. The one thing to know with this model is that it is very good at agentic tasks. Within here, you can see it's listing out the directories, coming up with a plan for what it's going to build for us, and running the commands to create our Next.js project. And you'll see it begin to go through all of these tasks agentically.
Within here, I can see it's going through: it's writing some stylesheets for us, creating the PostCSS setup for the Tailwind configuration, and building out all of the various components. So, here is what it has generated for us. Within here, I can click through, see the different blog posts, and see all of the different code blocks. It's a starting point for what I want to continue to develop. Just to give you an idea of the number of requests it took within that agentic process in opencode to actually generate the Next.js application: the thing with Next.js is that this isn't just something within a single HTML file. I'm not just asking for one request to have all of this generated; it's going through and trying things. And the thing to know, with the 1,500 model requests per five hours, is that you can see with this one request I'm just at 1%, and this is going to reset in an hour. It seems hard to actually run up against the limits even on their cheapest tier; you are going to be able to push this model quite a bit. So what I'm going to do within here is say: convert the homepage to be a SaaS landing page. I want it to be a very rich page with all of the different aspects that you typically see, everything from a beautiful hero section in neo-brutalist style through to a pricing section, FAQs, social validation, and so on. Also add in some animations throughout the entire design. So here is what it has generated for us. We have this beautiful hero section that it's created: "Stay ahead of the TypeScript game." We have the code sample, we have "start trial," and so on. I have the social validation with all of the different hypothetical companies that you could include in a section like this.
I have "Everything you need to know to master TypeScript," just like the early context of what I had asked of the model: content in and around TypeScript. In terms of the overall design sensibilities and the neo-brutalist style that I had asked for, I've asked this type of question of a lot of different models, and this is definitely a very impressive generation. If I take a look here, I can see we have these functional FAQ sections as well. And now, if I refresh the token usage, I am still only at 2%. So it was able to go through and write all of this code, all of these different components, for what it has on this main page here. That's just to give you a glimpse of some of its capabilities, but the model is able to do much, much more. You're going to be able to create things like spreadsheets, PowerPoints, and documents, a bunch of real-world work you're going to be able to actually do with this model because of its agentic capability. Because it's actually been trained on these agentic capabilities, you're going to be able to use it within a whole host of ecosystems. Another interesting use case for this token plan is that you could potentially also leverage it within your OpenClaw orchestrator: you can set it as the main model that delegates all of the different tasks. For anyone who does use OpenClaw, you can potentially leverage this as an option as well. So, all in all, there are a ton of different applications for how you can leverage the model. Kudos to the team at Minimax for what they've created, from the early echoes of self-evolution they described within the blog post through to the quality as well as the speed and price of the model. It's all a very compelling option. So, otherwise, that's it for this video. If you found this video useful, please like, comment, share, and subscribe. Until the next one.




