YouTube catalog
Is OpenAI About to KILL the Banana?
🎨 Creative
en


Theoretically Media · 7 days ago · Apr 7, 2026
Video description

Is OpenAI's new Image 2 a banana killer, or are they going to slip on a peel? We'll take a look at the latest. Plus, has Seedance 2.0 already been dethroned? On the leaderboards, yes, but I remain skeptical, although I do have some thoughts on who it might be. And apparently Pixverse just dropped a new cinematic video model, so we'll pop in there. All that, plus a new world critic for AI video. Not quite what you're thinking, but I do wish it was.

Kicking off a couple of days ago, we had some new image models pop up on the arenas, named masking tape alpha, gaffer tape alpha, and packing tape alpha. The outputs demonstrated that the models were very good at text rendering, scene coherence, and prompt following. Timing-wise, this makes sense, as OpenAI has very much indicated that the new update, codenamed Spud, is coming along. So this image model is likely following in line with GPT Image 1, in that it's not a separate image model but rather the image generation capabilities of a new big multimodal model. We'll talk about all that in a minute, but in the meantime, let's look at some spuddy images.

One image that was floating around a bit was essentially the YouTube homepage. I don't know if that's OpenAI flexing on Google or what, but I can say, as someone who spends a lot of time looking at the YouTube homepage, this does look pretty legit. We obviously have a Mr. Beast-styled thumbnail here, ColdFusion down here, and Andrew Huberman over here; he's the guy who tells you to jump into an ice bath within 30 seconds of waking up and stare directly at the sun for 7 minutes. That's not actually true, but Huberman is where it all started. Even over here on the sidebar under subscriptions, all of the channels are spelled correctly. To be honest, I'm probably subscribed to most of them.
Now, the one dead giveaway that this is actually AI-generated is that we have some garbled text on the Last of Us title over here, and Ali Abdaal's thumbnail is definitely not Ali Abdaal. But other than that, it's pretty good. You have to figure those are the final tier of thumbnails, and that's the only place we're getting any kind of decoherence, text rendering problems, or in this case thumbnail rendering problems. So overall, that's a lot of context.

I don't know why you'd create this image, but there is also an image of a Trader Joe's receipt from Honolulu, Hawaii. It's got a phone number. We should call it. >> We're sorry. You have reached a number that has been disconnected or is no longer in service. >> Speaking of Hawaii, here is just an image of a surfer catching a wave. Overall, everything looks pretty good; nothing calls out to me immediately as AI-generated. There might be something going on with the back foot there, and the angle of the board, possibly, but you'd really have to be looking for that.

Ricardo Wolf ran a comparison: an elderly man and woman hugging in a subway station. This one is actually Nano Banana 2; here's the same thing in Image 2. It is important to note that the prompt actually calls for a "completely raw quality, unprocessed, unedited image with full iPhone camera quality." Between the two of them, I do think the GPT image model might have nailed the assignment a little better. The Nano Banana 2 image, while it looks more appealing to me, does look a bit more post-processed. Again, the model does seem to do a very good job with text rendering. Here is a Bath & Body Works that I presume is next door to the Trader Joe's in Hawaii.
We can see the signage actually looks completely and totally legit. We don't have the resolution to really punch in and see what's written in the frame here, nor on any of the various lotions, although I'm sure that if I spent half an hour in there, I'd get a headache from all the smells.

The model does seem to do a pretty good job with maps, but once again, when you get into this much detail, some issues become apparent. "United Kingdom" is one I noticed over here, where it reads "United Kinger." Though I will give it some leeway, considering that a lot of the text context probably gets eaten up down here in these info boxes. All of the information here I presume to be correct, at least. At the end of the day, though, you should probably not navigate with an AI-generated map; that's a quick way of finding your way into the backrooms.

Heading back over to AI-generated screenshots of YouTube videos: I really like this one a lot. "I time traveled to the Middle Ages. This changes everything." This is really good. If I were just to see this, I would probably presume it's a screenshot of an AI-generated video on YouTube. Jake Explores here, my guy with 1.21 million subscribers, gets 3.8 million views. Good on you, fake AI guy. And I do like the fact that medieval fan 88 down here comments "either this is the greatest documentary ever or you're the biggest idiot alive." So clearly, Image 2 was trained on real YouTube comments.

And finally, of course, GTA 6 screenshots, because we will get AGI before we get GTA 6. This one was actually prompted as being displayed on a 4K monitor with a photo taken of it. Overall, it reads GTA for sure. We have our classic font down here; the map is probably not correct in terms of UI.
Now, running this in Nano Banana 2, we end up with, well, kind of a mess. We do have our character, who actually looks strikingly similar to the character we've seen in Grand Theft Auto 6, although I don't know why she's wearing a backpack with the logo of the game while she's in the game. On top of that, the wanted meter up here, I'm pretty sure, is from Red Dead 2; it's been a while since I've played that. And none of the UI over here looks necessarily accurate. Again, I think this is Nano Banana 2 using its autoregressive features: going out and looking for details about the game, and then creating the image based on that. But again, the GTA 6 backpack? Well, I don't know; leave it up to Rockstar to have swag for their game that you can purchase in the game.

Now, the larger point is that when this does go live, it won't just be an image model. Like Nano Banana 2, this isn't a diffusion model, but rather an autoregressive multimodal model, or, as we have been seeing more and more, a thinking model. So it will have the ability not only to think about the image you want to generate, but also to do web research on the images you're looking to create. That part does remain to be seen, although I would say it is highly likely.

So, at the end of the day, is this a banana killer? That remains to be seen. At the very least, it definitely looks like you're now able to generate in aspect ratios other than 3:2. So thank you for that, OpenAI. And if OpenAI continues with its usual release pattern, we will get a week of truly unhinged prompt generation, so make sure you have all of your Stormtrooper, Pikachu, and Super Mario prompts ready to go.
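Since the diffusion-versus-autoregressive distinction keeps coming up, here's a toy sketch of the two generation loops. This is purely illustrative: the function names are mine, the "image" is just a list of floats, and neither loop resembles what OpenAI or Google actually ship; it only shows the structural difference (refine everything in parallel vs. emit one token at a time conditioned on the prefix).

```python
import random

def diffusion_sample(size=8, steps=4, seed=0):
    """Toy 'diffusion': start from pure noise and refine EVERY value in
    parallel over a fixed number of denoising steps."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(size)]          # pure noise
    target = [i / (size - 1) for i in range(size)]   # stand-in for the 'clean' image
    for step in range(steps):
        # move every value a fraction of the way toward the target
        x = [xi + (ti - xi) / (steps - step) for xi, ti in zip(x, target)]
    return x

def autoregressive_sample(size=8):
    """Toy 'autoregressive' generation: emit one token at a time, each
    conditioned on everything generated so far."""
    tokens = []
    for _ in range(size):
        prev = tokens[-1] if tokens else 0.0
        tokens.append(prev + 1.0 / size)  # next token depends on the prefix
    return tokens
```

The autoregressive loop is what makes "thinking" and tool use (like web research mid-generation) a natural fit: each step can condition on anything produced before it, whereas the diffusion loop commits to the whole canvas at once.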
Quick aside: did you guys see that Milla Jovovich, who you might remember as Leeloo from The Fifth Element, or from the Resident Evil movies, or even Dazed and Confused, just open-sourced an AI memory system on GitHub? It's called Memplace, and it's legit. It works with Claude, ChatGPT, and Cursor, and does what it says on the tin: it basically creates memory rooms that live offline and locally. So yes, we have multipassed our way into a pretty weird timeline. Memplace is linked down below; I'm going to give it a shot and will let you guys know how it works out.

Speaking of leaderboards, we recently had a new video model top the leaderboards, ousting Seedance 2.0. This one is, of course, another one of the cool stealth names: it's called Happy Horse. Now, is this a Seedance killer? From what I'm seeing, no, but it does look very good. If I had to call it, and this is pure speculation, I think this might be Hailuo's (MiniMax's) long-awaited update. The other contender would have been Wan 2.7, which was just recently released; I talked about that in Friday's newsletter. Overall, I find that one okay. I'm still trying to dig in to find out where it really flies, but the literal NSDR (not subscribed, didn't read) is that it feels very much like a 0.1 update.

Now, that said, Pixverse, literally just as I started recording this video, dropped a new model. This one is called their C1 model, which is aiming for more cinematic and VFX-focused outputs. I've got a few things to show you, but I'll definitely have to circle back for a more comprehensive review. One immediate strength is that I was actually able to use this image as an image input; Seedance 2.0 kicks it back because of, well, the Robotech and Macross-ness of it all.
So, running this, the prompt was simple: the pilot gets in the cockpit and the Valkyrie takes off into the sky. Overall, perfect? No, but it was also a very trashy prompt, and there's actually a lot to like in here. We do get multi-shot. The transformation part is a little on the wonky side, but again, I didn't give it much detail in the prompt; it definitely understood enough that this was a plane robot and that it should transform. So yeah, I'll definitely swing back to this.

Friend of the channel Brent Lynch ran this. It's a storyboard sequence created in Nano Banana 2, I believe, that we ran a few videos back, starring, of course, Flamethrower Girl. The results here are pretty interesting. Pixverse apparently can understand these multiframe inputs, and, I don't think this was intentional, but it does something kind of cool in that it ends up creating an animated version of the output. It has a slightly Borderlands kind of feel to it. Again, I don't think this was intentional, but at the same time, I think it's a really fascinating and interesting look.

Rounding out, we have a research preview of Galileo Zero, which I do hope will be followed up by Figaro and then, of course, end with Beelzebub, who has a devil put aside for me. This is a world critic for AI-generated video: not a critic in the sense of a scathing writer on Rotten Tomatoes. What it does is take an AI-generated input and analyze it for decoherence, morphing, and any structural issues that come up. Obviously, it's very good at finding morphed text and whatnot, which, you know, is going to keep it working overtime. And of course, everyone's favorite: decohering fingers and hands. After all these years, man, we still end up with the Play-Doh hands.
But yeah, this is pretty cool. I did want to point this out because it was all put together by a very small team, apparently in about 5 days; they noted that they did not shower very much. So kudos to them for cranking out a project, and it once again goes to show that we live in an age where, if you have an idea and a couple of pals around you, you can crank out some pretty impressive stuff.

So, where does Galileo go from here? Well, probably to jail: he put a gun against that guy's head, pulled the trigger, and now he's dead. I'll stop, I'll stop, I promise. The obvious place is as a kind of plugin for a video model that would go find discrepancies and then inpaint them. We haven't yet hit the point where we have QC processes for AI-generated video, but I can see this being plugged into the larger studio projects to find inconsistencies and weird morphy issues in AI-generated outputs, again as quality control. If you're interested in checking out Galileo, there is a waitlist; I'll have that linked down below.

Closing up, this does look like it's going to be a pretty busy week, so I do not doubt that I will see you again very soon. As always, I thank you for watching. My name is T.
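To make the QC idea concrete: Galileo Zero's internals aren't public as far as I know, but the crudest possible version of a temporal-consistency check looks something like this. Frames are flat lists of pixel values, the threshold is made up, and a real critic would use learned detectors rather than raw pixel deltas; this only illustrates the flag-then-fix shape of the pipeline.

```python
def flag_temporal_jumps(frames, threshold=0.25):
    """Naive decoherence detector: flag frame indices whose mean absolute
    pixel difference from the previous frame spikes past a threshold."""
    flagged = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            flagged.append(i)
    return flagged
```

A studio pipeline could then route the flagged frame indices to an inpainting or regeneration pass, which is roughly the plugin workflow sketched in the video.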