GPT Image 1.5 (openai.com)
Submitted by charlierguo 11 hours ago
  • vunderba 7 hours ago

    Okay results are in for GenAI Showdown with the new gpt-image 1.5 model for the editing portions of the site!

    https://genai-showdown.specr.net/image-editing

    Conclusions

    - OpenAI has always had some of the strongest prompt understanding alongside the weakest image fidelity. This update goes some way towards addressing this weakness.

    - It's leagues better than gpt-image-1 at making localized edits without altering the entire image's aesthetic, doubling the previous score from 4/12 to 8/12, and it's the only model that legitimately passed the Giraffe prompt.

    - It's one of the most steerable models, with a 90% compliance rate.

    Updates to GenAI Showdown

    - Added outtakes sections to each model's detailed report in the Text-to-Image category, showcasing notable failures and unexpected behaviors.

    - New models have been added including REVE and Flux.2 Dev (a new locally hostable model).

    - Finally got around to implementing a weighted scoring mechanism which considers pass/fail, quality, and compliance for a more holistic model evaluation (click pass/fail icon to toggle between scoring methods).
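    A weighted score along those lines might combine the three signals something like this (a hypothetical Python sketch - the site's actual weights and formula aren't published here):

```python
def weighted_score(passed: bool, quality: float, compliance: float,
                   w_pass: float = 0.5, w_quality: float = 0.25,
                   w_compliance: float = 0.25) -> float:
    """Fold a binary pass/fail plus 0-1 quality and compliance ratings
    into a single 0-1 score. The weights here are purely illustrative."""
    return (w_pass * (1.0 if passed else 0.0)
            + w_quality * quality
            + w_compliance * compliance)
```

    A model that barely passes with low quality then scores well below one that passes cleanly, which is the point of moving beyond a raw pass/fail count.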

    If you just want to compare gpt-image-1, gpt-image-1.5, and NB Pro at the same time:

    https://genai-showdown.specr.net/image-editing?models=o4,nbp...

    • pierrec 5 hours ago

      This showdown benchmark was and still is great, but an enormous grain of salt should be added to any model that was released after the showdown benchmark itself.

      Maybe everyone has a different dose of skepticism. Personally I'm not even looking at results for models that were released after the benchmark, for all this tells us, they might as well be one-trick ponies that only do well in the benchmark.

      It might be too much work, but one possible "correct" approach for this kind of benchmark would be to periodically release new benchmarks with new tests (broadly in the same categories) and only include models that predate each benchmark.

      • vunderba 5 hours ago

        Yeah that’s a classic problem, and it's why good tests are such closely guarded secrets: to keep them from becoming training fodder for the next generation of models. Regarding the "model date" vs "benchmark date" - that's an interesting point... I'll definitely look into it!

        I don't have any captcha systems in place, but I wonder if it might be worth putting up at least a few nominal roadblocks (such as Anubis [1]) to at least slow down the scrapers.

        A few weeks ago I actually added some new, more challenging tests to the GenAI Text-to-Image section of the site (the “angelic forge” and “overcrowded flat earth”) just to keep pace with the latest SOTA models.

        In the next few weeks, I’ll be adding some new benchmarks to the Image Editing section as well~~

        [1] - https://anubis.techaro.lol

        • echelon 42 minutes ago

          The Blender previz reskin task could be automated!

          Generate a novel previz scene programmatically in Blender or some 3D engine, then task the image model with rendering it in a style (or to style transfer to a given image, e.g. something novel and unseen from Midjourney).

          Throw in a 250 object asset pack and some skeletal meshes that can conform to novel poses.

          Furthermore, anything that succeeds from that task can then be fed into another company's model and given an editing task.
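          The scene-generation half of that idea is easy to sketch. Here it is in plain Python rather than Blender's bpy API, with a made-up asset list - the JSON it emits is what would drive both the 3D render and the image model's task:

```python
import json
import random

def generate_previz_scene(asset_pack, n_objects=5, seed=0):
    """Emit a reproducible previz scene description: n_objects assets
    drawn from the pack, each with a random position and rotation.
    A real pipeline would feed this JSON to Blender for placement."""
    rng = random.Random(seed)
    scene = {
        "objects": [
            {
                "asset": rng.choice(asset_pack),
                "position": [round(rng.uniform(-10, 10), 2) for _ in range(3)],
                "rotation_deg": round(rng.uniform(0, 360), 1),
            }
            for _ in range(n_objects)
        ]
    }
    return json.dumps(scene, indent=2)
```

          Because the seed makes each scene reproducible, the same layout can be handed to several models and the outputs compared against the ground-truth render.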

        • somenameforme an hour ago

          You don't need skepticism, because even if you're acting in 100% good faith and building a new model, what's the first thing you're going to do? You're going to go look up as many benchmarks as you can find and see how it does on them. It gives you some easy feedback relative to your peers. The fact that your own model may end up being put up against these exact tests is just icing.

          So I don't think there's even a question of whether or not newer models are going to be maximizing for benchmarks - they 100% are. The skepticism would be in how it's done. If something's not being run locally, then there's an endless array of ways to cheat - like dynamically loading certain LoRAs in response to certain queries, with some LoRAs trained precisely to maximize benchmark performance. Basically taking a page out of the car company playbook in response to emissions testing.

          But I think maximizing the general model itself to perform well on benchmarks isn't really unethical or cheating at all. All you're really doing there is 'outsourcing' part of your quality control tests. But it simultaneously greatly devalues any benchmark, because that benchmark is now the goal.

          • smusamashah 5 hours ago

            I think training image models to pass these very specific tests correctly will be very difficult for any of these companies. How would they even do that?

            • 8n4vidtmkvmk 3 hours ago

              Hire a professional Photoshop artist to manually create the "correct" images and then put the before and after photos into the training data. Or however they've been training these models thus far, I don't know.

              And if that still doesn't get you there, hash the image inputs to detect if it's one of these test photos and then run your special test-passer algo.
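              That hash-matching trick is trivially easy to implement, which is part of why held-out benchmarks matter. A sketch (the registry of known test images is hypothetical, and a real cheat would need perceptual hashing to survive resizing or re-encoding):

```python
import hashlib

# Hypothetical registry of SHA-256 digests of known benchmark inputs.
KNOWN_TEST_HASHES = {
    hashlib.sha256(b"giraffe-test-image-bytes").hexdigest(),
}

def is_benchmark_input(image_bytes: bytes) -> bool:
    """True if the input matches a registered benchmark image, letting a
    dishonest serving stack route it to a benchmark-tuned model or LoRA."""
    return hashlib.sha256(image_bytes).hexdigest() in KNOWN_TEST_HASHES
```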

          • singhkays 5 hours ago

            GPT Image 1.5 is the first model that gets close to replicating the intricate detail mosaic of bullets in the "Lord of War" movie poster for me. Following the prompt instructions more closely also seems better compared to Nano Banana Pro.

            I edited the original "Lord of War" poster with a reference image of Jensen and replaced bullets with GPU dies, silicon wafers and electronic components.

            https://x.com/singhkays/status/2001080165435113791

            • smusamashah 5 hours ago

              Z-Image was released recently, and it's all /r/StableDiffusion talks about these days. Consider adding it too. It's very good quality for its size (it requires only 6 or 8 GB of RAM).

              • vunderba 4 hours ago

                I've actually done a bit of preliminary testing with ZiT. I'm holding off on adding it to the official GenAI site until the base and edit models have been released since the Turbo model is pretty heavily distilled.

                https://mordenstar.com/other/z-image-turbo

              • lobochrome 33 minutes ago

                Stupid Cisco Umbrella is blocking you

                • heystefan 4 hours ago

                  So when you say "X attempts" what does that mean? You just start a new chat with the same exact prompt and hope for a different result?

                  • vunderba 4 hours ago

                    All images are generated using independent, separate API calls. See the FAQ at the bottom under “Why is the number of attempts seemingly arbitrary?” and “How are the prompts written?” for more detail, but to quickly summarize:

                    In addition to giving models multiple attempts to generate an image, we also write several variations of each prompt. This helps prevent models from getting stuck on particular keywords or phrases, which can happen depending on their training data. For example, while “hippity hop” is a relatively common name for the ball-riding toy, it’s also known as a “space hopper.” In some cases, we may even elaborate and provide the model with a dictionary-style definition of more esoteric terms.

                    This is why providing an "X Attempts" metric is so important. It serves as a rough measure of how "steerable" a given model is - or, put another way, how much we had to fight with the model to get it to consistently follow the prompt's directives.
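                    The attempt-counting loop might look something like this (a sketch with injectable generate/judge callables - the site's actual harness isn't shown here):

```python
from typing import Callable, List, Optional

def attempts_to_pass(prompt_variants: List[str],
                     generate: Callable[[str], bytes],
                     judge: Callable[[bytes], bool],
                     max_attempts: int = 8) -> Optional[int]:
    """Cycle through prompt variants, issuing one independent generation
    per attempt, and return the attempt number that first satisfied the
    judge (or None). A lower number suggests a more steerable model."""
    for attempt in range(1, max_attempts + 1):
        prompt = prompt_variants[(attempt - 1) % len(prompt_variants)]
        if judge(generate(prompt)):
            return attempt
    return None
```

                    In a real harness, `generate` would be an independent API call to the model under test and `judge` a human review rather than an automated check.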

                  • mvkel 4 hours ago

                    This leaderboard feels incredibly accurate given my own experience.

                    • irishcoffee 6 hours ago

                      > the only model that legitimately passed the Giraffe prompt.

                      10 years ago I would have considered that sentence satire. Now it allegedly means something.

                      Somehow it feels like we’re moving backwards.

                      • echelon 6 hours ago

                        > Somehow it feels like we’re moving backwards.

                        I don't understand why everyone isn't in awe of this. This is legitimately magical technology.

                        We've had 60+ years of being able to express our ideas with keyboards. Steve Jobs' "bicycle of the mind". But in all this time we've had a really tough time of visually expressing ourselves. Only highly trained people can use Blender, Photoshop, Illustrator, etc. whereas almost everyone on earth can use a keyboard.

                        Now we're turning the tide and letting everyone visually articulate themselves. This genuinely feels like computing all over again for the first time. I'm so unbelievably happy. And it only gets better from here.

                        Every human should have the ability to visually articulate themselves. And it's finally happening. This is a major win for the world.

                        I'm not the biggest fan of LLMs, but image and video models are a creator's dream come true.

                        In the near future, the exact visions in our head will be shareable. We'll be able to iterate on concepts visually, collaboratively. And that's going to be magical.

                        We're going to look back at pre-AI times as primitive. How did people ever express themselves?

                        • SchemaLoad 4 hours ago

                          I'm struggling to see the benefits. All I see people using this for is generating slop for work presentations, and misleading people on social media. Misleading might be understating it too. It's being used to create straight up propaganda and destruction of the sense of reality.

                          • irishcoffee 4 hours ago

                            You basically described magic mushrooms, where the description came from you while high on magic mushrooms.

                            It’s just a tool. It’s not a world-changing tech. It’s a tool.

                            • Rodeoclash 5 hours ago

                              Where is all this wonderful visual self expression that people are now free to do? As far as I can tell it's mostly being used on LinkedIn posts.

                              • scrollaway 5 hours ago

                                It’s a classic issue that you give access to superpowers to the general population and most will use them in the most boring ways.

                                The internet is an amazing technology, yet its biggest consumption is a mix of ads, porn and brain rot.

                                We all have cameras in our pockets yet most people use them for selfies.

                                But if you look closely enough, the incredible value that comes from these examples more than makes up for all the people using them in a “boring” way.

                                And anyway who’s the arbiter of boring?

                          • BoredPositron 6 hours ago

                            Nano Banana still has the best VAE we've seen, especially if you're doing high-res production work. Flux.2 comes close, but gpt-image is still miles away.

                            • echelon 7 hours ago

                              I really love everything you're doing!

                              Personal request: could you also advocate for "image previz rendering", which I feel is an extremely compelling use case for these companies to develop. Basically any 2d/3d compositor that allows you to visually block out a scene, then rely on the model to precisely position the set, set pieces, and character poses.

                              If we got this task onto benchmarks, the companies would absolutely start training their models to perform well at it.

                              Here are some examples:

                              gpt-image-1 absolutely excels at this, though you don't have much control over the style and aesthetic:

                              https://imgur.com/gallery/previz-to-image-gpt-image-1-x8t1ij...

                              Nano Banana (Pro) fails at this task:

                              https://imgur.com/a/previz-to-image-nano-banana-pro-Q2B8psd

                              Flux Kontext, Qwen, etc. have mixed results.

                              I'm going to re-run these under gpt-image-1.5 and report back.

                              Edit:

                              gpt-image-1.5 :

                              https://imgur.com/a/previz-to-image-gpt-image-1-5-3fq042U

                              And just as I finish this, Imgur deletes my original gpt-image-1 post.

                              Old link (broken): https://imgur.com/a/previz-to-image-gpt-image-1-Jq5M2Mh

                              Hopefully imgur doesn't break these. I'll have to start blogging and keep these somewhere I control.

                              • vunderba 6 hours ago

                                Thanks! A highly configurable Previz2Image model would be a fantastic addition. I was literally just thinking about this the other day (but more in the context of ControlNets and posable kinematic models). I’m even considering adding an early CG Poser blocked‑out scene test to see how far the various editor models can take it.

                                With additions like structured prompts (introduced in BFL Flux 2), maybe we'll see something like this in the near future.

                            • minimaxir 11 hours ago

                               I have a Nano Banana Pro blog post in the works expanding on my experiments with Nano Banana (https://news.ycombinator.com/item?id=45917875). Running a few of my test cases from that post and the upcoming blog post through this new model: it's better than Nano Banana but MUCH worse than Nano Banana Pro, which now nails the test cases that previously showed issues. The pricing is unclear, but gpt-image-1.5 appears to be 20% cheaper than the current gpt-image-1 model, which would put a `high`-quality generation in the same price range as Nano Banana Pro.

                               One curious case demoed in the docs is the grid use case. Nano Banana Pro can also generate grids, but its adherence to the prompt collapses beyond 4x4 (there's only a finite number of output tokens to correspond to each subimage), so I'm curious that OpenAI led with a 6x6 case, albeit with a test prompt that's not that nuanced.

                              • vunderba 10 hours ago

                                I'll be running gpt-image-1.5 through my GenAI Showdown later today, but in the meantime if you want to see some legitimately impressive NB Pro outputs, check out:

                                https://mordenstar.com/blog/edits-with-nanobanana

                                In particular, NB Pro successfully assembled a jigsaw puzzle it had never seen before, generated semi-accurate 3D topographical extrapolations, and even swapped a window out for a mirror.

                                • jngiam1 8 hours ago

                                  The mirror test is cool!

                                  • IgorPartola 7 hours ago

                                    Subtle detail but the little table casts a shadow because of the light in the window and the shadow remains unchanged after the mirror replaces the window.

                                    • dash2 6 hours ago

                                      More obviously, the objects in the mirror aren't actually reversed!

                                      • vunderba 6 hours ago

                                        That one's on me! It was still using the old NB image.

                                        Updated the mirror test to use the NB Pro version.

                                  • niklassheth 8 hours ago

                                    Nice! Your comparison site is probably the best one out there for image models

                                  • qingcharles 8 hours ago

                                    I just tested GPT1.5. I would say the image quality is on par with NBP in my tests (which is surprising as the images in their trailer video are bad), but the prompt adherence is way worse, and its "world model" if you want to call it that is worse. For instance, I asked it for two people in a row boat and it had two people, but the boat was more like a coracle and they would barely fit inside it.

                                    Also: SUPER ANNOYING. It seems every time you give it a modification prompt it erases the whole conversation leading up to the new pic? Like.. all the old edits vanish??

                                    I added "shaky amateur badly composed crappy smartphone photo of ____" to the start of my prompts to make them look more natural.

                                    Counterpoint from someone on the Musk site: https://x.com/flowersslop/status/2001007971292332520

                                    • vunderba 3 hours ago

                                      I actually just finished running the Text-to-Image benchmark a few minutes ago. This matches my own testing as well. GPT-Image 1.5 is clearly a step up as an editing model, but it performed worse in purely generative tasks compared to its predecessor - dropping from 11 (out of 14) to 9.

                                      Comparing NB Pro, GPT Image 1, and GPT Image 1.5

                                      https://genai-showdown.specr.net/?models=o4,nbp,g15

                                    • abadar 9 hours ago

                                      I really enjoyed your experiments. Thank you for sharing your experiences. They've improved my prompting and have tempered my expectations.

                                      • echelon 8 hours ago

                                        I've been a filmmaker for 10+ years. I really want more visual tools that let you precisely lay out consistent scenes without prompting. This is important for crafting the keyframes in an image-to-video style workflow, and is especially important for long form narrative content.

                                        One thing that gpt-image-1 does exceptionally well that Nano Banana (Pro) can't is previz-to-render. This is actually an incredibly useful capability.

                                        The Nano Banana models take the low-fidelity previz elements/stand-ins and unfortunately keep the elements in place without attempting to "upscale" them. The model tries to preserve every mistake and detail verbatim.

                                        Gpt-image-1, on the other hand, understands the layout and blocking of the scene, the pose of human characters, and will literally repair and upscale everything.

                                        Here's a few examples:

                                        - 3D + Posing + Blocking: https://youtu.be/QYVgNNJP6Vc

                                        - Again, but with more set re-use: https://youtu.be/QMyueowqfhg

                                        - Gaussian splats: https://youtu.be/iD999naQq9A

                                        - Gaussians again: https://youtu.be/IxmjzRm1xHI

                                        We need models that can do what gpt-image-1 does above, but that have higher quality, better stylistic control, faster speed, and that can take style references (eg. glossy Midjourney images).

                                        Nano Banana team: please grow these capabilities.

                                        Adobe is testing and building some really cool capabilities:

                                        - Relighting scenes: https://youtu.be/YqAAFX1XXY8?si=DG6ODYZXInb0Ckvc&t=211

                                        - Image -> 3D editing: https://youtu.be/BLxFn_BFB5c?si=GJg12gU5gFU9ZpVc&t=185 (payoff is at 3:54)

                                        - Image -> Gaussian -> Gaussian editing: https://youtu.be/z3lHAahgpRk?si=XwSouqEJUFhC44TP&t=285

                                        - 3D -> image with semantic tags: https://youtu.be/z275i_6jDPc?si=2HaatjXOEk3lHeW-&t=443

                                        I'm trying to build the exact same things that they are, except as open source / source available local desktop tools that we can own. Gives me an outlet to write Rust, too.

                                        • pablonaj 7 hours ago

                                          Love the samples of the app you are making, will be testing it!

                                          • echelon 7 hours ago

                                            Images make this even easier to see (though predictable and precise video is what drives the demand) :

                                            gpt-image-1: https://imgur.com/gallery/previz-to-image-gpt-image-1-x8t1ij... (fixed link - imgur deleted the last post for some reason)

                                            gpt-image-1.5: https://imgur.com/a/previz-to-image-gpt-image-1-5-3fq042U

                                            nano banana / pro: https://imgur.com/a/previz-to-image-nano-banana-pro-Q2B8psd

                                            gpt-image-1 excels in these cases, despite being stylistically monotone.

                                            I hope that Google, OpenAI, and the various Chinese teams lean in on this visual editing and blocking use case. It's much better than text prompting for a lot of workflows, especially if you need to move the camera and maintain a consistent scene.

                                            While some image editing will be in the form of "remove the object"-style prompts, a lot will be molding images like clay. Grabbing arms and legs and moving them into new poses. Picking up objects and replacing them. Rotating scenes around.

                                            When this gets fast, it's going to be magical. We're already getting close.

                                        • oxag3n 9 hours ago

                                          If this were a farm of sweatshop Photoshoppers in 2010, downloading every image from the internet and offering a service of combining them on request, things would escalate pretty quickly.

                                          Question: with copyright and authorship dead wrt AI, how do I make (at least) new content protected?

                                          Anecdotal: I had a hobby of shooting photos in a quite rare style, and lived in a place that people take quite a few pictures of. When I asked GPT to generate a picture of that area in that style, it returned a highly modified but recognizable copy of a photo I'd published years ago.

                                          • mortenjorck 7 hours ago

                                            > how do I make (at least) new content protected?

                                            Air gap. If you don’t want content to be used without your permission, it never leaves your computer. This is the only protection that works.

                                            If you want others to see your content, however, you have to accept some degree of trade off with it being misappropriated. Blatant cases can be addressed the same as they always were, but a model overfitting to your original work poses an interesting question for which I’m not aware of any legal precedents having been set yet.

                                            • echelon 5 hours ago

                                              Horror scenario:

                                              Big IP holders will go nuclear on IP licensing to an extent we've never seen before.

                                              Right now, there are thousands of images and videos of Star Wars, Pokemon, Superman, Sonic, etc. being posted across social media. All it takes is for the biggest IP conglomerates to turn into linear tv and sports networks of the past and treat social media like cable.

                                              Disney: "Gee {Google,Meta,Reddit,TikTok}, we see you have a lot of Star Wars and Marvel content. We think that's a violation of our rights. If you want your users to continue to be able to post our media, you need to pay us $5B/yr."

                                              I would not be surprised if this happens now that every user on the internet can soon create high-fidelity content.

                                              This could be a new $20-30B/yr business for Disney. Nintendo, WBD, and lots of other giant IP holders could easily follow suit.

                                              • empressplay 5 hours ago

                                                Disney invests $1 billion in OpenAI, licenses 200 characters for AI video app Sora

                                                https://arstechnica.com/ai/2025/12/disney-invests-1-billion-...

                                                • echelon 5 hours ago

                                                  One day later, "Google pulls AI-generated videos of Disney characters from YouTube in response to cease and desist":

                                                  https://www.engadget.com/ai/google-pulls-ai-generated-videos...

                                                  The next step is to take this beyond AI generations and to license rights to characters and IP on social media directly.

                                                  The next salvo will be where YouTube has to take down all major IP-related content if they don't pay a licensing fee. Regardless of how it was created. Movie reviews, fan animations, video game let's plays.

                                                  I've got a strong feeling that day is coming soon.

                                            • margorczynski 8 hours ago

                                              We are probably entering the post-copyright era. The law will follow sooner or later.

                                              • rafram 7 hours ago

                                                That seems unlikely to me. One side is made up of lots and lots of entrenched interests with sympathetic figures like authors and artists on their side, and the other is “big tech,” dominated by the rather unsympathetic OpenAI and Google.

                                              • LudwigNagasena 7 hours ago

                                                Using references is a standard industry practice for digital art and VFX. The main difference is that a human is unlikely to accidentally copy a reference too closely, while with AI it's possible.

                                                • 999900000999 7 hours ago

                                                  A middle ground would be Chat GPT at least providing attribution.

                                                  Back in reality, you can get in line to sue. Since they have more money than you, you can't really win though.

                                                  So it goes.

                                                  • ur-whale 7 hours ago

                                                    > Question: with copyright and authorship dead wrt AI, how do I make (at least) new content protected?

                                                    Question: Now that the steamboats have been invented, how do I keep my clipper business afloat ?

                                                    Answer: Good riddance to the broken idea of IP, Schumpeter's Gale is around the corner, time for a new business model.

                                                    • nobody_r_knows 8 hours ago

                                                      my question to your anecdotal: who cares? not being fecicious, but who cares if someone reproduced your stuff and millions of people see your stuff? is the money that you want? is it the fame? because fame you will get, maybe not money... but couldn't there be another way?

                                                      • swatcoder 8 hours ago

                                                        People have values that go beyond wealth and fame. Some people care about things like personal agency, respect and deference, etc.

                                                        If someone were on vacation and came home to learn that their neighbor had allowed some friends to stay in the empty house, we would often expect some kind of outrage regardless of whether there had been specific damage or wear to the home.

                                                        Culturally, people have deeply set ideas about what's theirs, and feel they deserve some say over how their things are used and by whom. Even those who are very generous and want their things widely shared usually want to have some voice in making that come to be.

                                                        • visarga 9 minutes ago

                                                          If I were a creative I would avoid seeing any work I am not legally allowed to get inspired by, why install furniture into my brain I can't sit on? I see this kind of IP protection as poisoned grounds, can't do anything on top of it.

                                                        • netule 8 hours ago

                                                          Suddenly, copyright doesn't matter anymore when it's no longer useful to the narrative.

                                                          • ragequittah 8 hours ago

                                                            Copyright has overstepped its initial purpose by leaps and bounds because corporations make the law. If you're not cynical about how Copyright currently works you probably haven't been paying attention. And it doesn't take much to go from cynical to nihilist in this case.

                                                            • netule 8 hours ago

                                                              There's definitely a case of miscommunication at play if you didn't read cynicism into my original post. I broadly agree with you, but I'll leave it at that to prevent further fruitless arguing about specifics.

                                                            • BoorishBears 8 hours ago

                                                              OpenAI does care about copyright, thankfully China does not: https://imgur.com/a/RKxYIyi

                                                              (to clarify, OpenAI stops refining the image if a classifier detects your image as potentially violating certain copyrights. Although the gulf in resolution is not caused by that.)

                                                              • CamperBob2 7 hours ago

                                                                (Shrug) This is more important. Sorry.

                                                              • oxag3n 8 hours ago

                                                                To clarify my question - I do not want anything I create to be fed into their training data. That photo is just an example that I caught, and it became personal. But in general, I no longer want to open-source my code, write articles, or put any effort into improving the training data set.

                                                                • illwrks 8 hours ago

                                                                  The issue is ownership, not promotion or visibility.

                                                                  • jibal 8 hours ago

                                                                    facetious

                                                                    [I won't bother responding to the rest of your appalling comment]

                                                                    • Forgeties79 8 hours ago

                                                                      As a professional cinematographer/photographer I am incredibly uncomfortable with people using my art without my permission for unknown ends. Doubly so when it’s venture backed private companies stealing from millions of people like me as they make vague promises about the capabilities of their software trained on my work. It doesn’t take much to understand why that makes me uncomfortable and why I feel I am entitled to saying “no.” Legally I am entitled to that in so many cases, yet for some reason Altman et al get to skip that hurdle. Why?

                                                                      How do you feel about entities taking your face off of your personal website and plastering it on billboards smiling happily next to their product? What if it’s for a gun? Or condoms? Or a candidate for a party you don’t support? Pick your own example if none of those bother you. I’m sure there are things you do not want to be associated with/don’t want to contribute to.

                                                                      At the end of the day it’s very gross when we are exploited without our knowledge or permission so rich groups can get richer. I don’t care if my visual work is only partially contributing to some mashed up final image. I don’t want to be a part of it.

                                                                      • smileson2 3 hours ago

                                                                        You should be proud your work will now be distilled eternally and that an aspect of it will forever influence the world

                                                                        • CamperBob2 7 hours ago

                                                                          The day after I first heard about the Internet, back in 1990-whatever, it occurred to me that I probably shouldn't upload anything to the Internet that I didn't want to see on the front page of tomorrow's newspaper.

                                                                          Apart from the 'newspaper' anachronism, that's pretty much still my take.

                                                                          Sorry, but you'll just have to deal with it and get over it.

                                                                          • Forgeties79 6 hours ago

                                                                            > Sorry, but you'll just have to deal with it and get over it.

                                                                            You were fine until this bit.

                                                                            • onraglanroad 6 hours ago

                                                                              They're still fine because they're right.

                                                                              You got to play the copyright game when the big corps were on your side.

                                                                              Now they're on the other side. Deal with it and get over it.

                                                                    • agentifysh 8 hours ago

                                                                      I am very impressed. A benchmark I like to run is having it create sprite maps and UV texture maps for an imagined 3D model.

                                                                      Noticed it captured a megaman legends vibe ....

                                                                      https://x.com/AgentifySH/status/2001037332770615302

                                                                      and here it generated a texture map from a 3d character

                                                                      https://x.com/AgentifySH/status/2001038516067672390/photo/1

                                                                      however im not sure if these are true uv maps that is accurate as i dont have the 3d models itself

                                                                      but ive tried this in nano banana when it first came out and it couldn't do it

                                                                      • gs17 7 hours ago

                                                                        > however im not sure if these are true uv maps

                                                                        I can tell you with 100% certainty they are not. For example, Crash doesn't have a backside for his torso. You could definitely make a model that uses these as textures, but you'd really have to force it and a lot of it would be stretched or look weird. If you want to go this approach, it would make a lot more sense to make a model, unwrap it, and use the wireframe UV map as input.

                                                                        Here's the original Crash model: https://models.spriters-resource.com/pc_computer/crashbandic... , its actual texture is nothing like the generated one, because the real one was designed for efficiency.

                                                                        • Nition 4 hours ago

                                                                          That's a remake model in a modern game. The original Crash was even simpler than that one.

                                                                          Most of Crash in the first game was not textured; just vertex colours. Only the fur on his back and his shoelaces were textures at all.

                                                                          • gs17 3 hours ago

                                                                            "Original" as in the original of the one they used in their tweet.

                                                                          • agentifysh 7 hours ago

                                                                            yeah definitely impressive compared to what nano banana outputted

                                                                            tried your suggested approach using the unwrapped wireframe UV as input and I'm impressed

                                                                            https://x.com/AgentifySH/status/2001057153235222867

                                                                            obviously its not going to be accurate 1:1 but with more 3d spatial awareness i think it could definitely improve

                                                                          • 101008 7 hours ago

                                                                            > however im not sure if these are true uv maps that is accurate as i dont have the 3d models itself

                                                                            also in the tweet

                                                                            > GPT Image 1.5 is **ing crazy

                                                                            and

                                                                            > holy shit lol

                                                                            What's impressive about it if you don't know whether it's right or not? (As the other comment pointed out, it is not right.)

                                                                          • sharkjacobs 10 hours ago

                                                                            Was it ever explained or understood why ChatGPT Images always has (had?) that yellow cast?

                                                                            • minimaxir 9 hours ago

                                                                              My pet theory is that OpenAI screwed up the image normalization calculation and was stuck with the mistake since that's something that can't be worked around.

                                                                              At the least, it's not present in these new images.

                                                                              • BoorishBears 9 hours ago

                                                                                There's still something off in the grading, and I suspect they worked around it

                                                                                (although I get what you mean, not easily since you already trained)

                                                                                I'm guessing when they get a clean slate we'll have Image 2 instead of 1.5. In LMArena it was immediately apparent it was an OpenAI model based on visuals.

                                                                              • KaiserPro 8 hours ago

                                                                                Meta's codec avatars all have a green cast because they spent millions on the rig to capture whole bodies and even more on rolling it out to get loads of real data.

                                                                                They forgot to calibrate the cameras, so everything had a green tint.

                                                                                Meanwhile, all the other teams had a billion Macbeth charts lying around just in case.

                                                                                • jiggawatts 8 hours ago

                                                                                  Also, you'd be shocked at how few developers know anything at all about sRGB (or any other gamut/encoding), other than perhaps the name. Even people working in graphics, writing 3D game engines, working on colorist or graphics artist tools and libraries.
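
                                                                                  For reference, the sRGB transfer function the comment alludes to is just a small piecewise curve (constants per the sRGB standard; a minimal sketch, not a full color-management implementation):

```python
def srgb_to_linear(c: float) -> float:
    """Decode one sRGB-encoded channel value in [0, 1] to linear light,
    using the standard piecewise sRGB transfer function."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c: float) -> float:
    """Encode linear light back to sRGB (inverse of the above)."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

# sRGB 0.5 is not half of the physical light intensity:
mid_gray_linear = srgb_to_linear(0.5)   # roughly 0.214, not 0.5
```

The point being: averaging, blending, or resizing pixels in sRGB space without first decoding to linear light is exactly the kind of mistake the comment is describing.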

                                                                                • ACCount37 9 hours ago

                                                                                  Not really, but there's a number of theories. The simplest one is that they "style tuned" the AI on human preference data, and this introduced a subtle bias for yellow.

                                                                                  And I say "subtle" - but because that model would always "regenerate" an image when editing, it would introduce more and more of this yellow tint with each tweak or edit. Which has a way of making a "subtle" bias anything but.

                                                                                  • amoursy 9 hours ago

                                                                                    There was also the theory that it was because they scanned a bunch of actual real books, and book paper has a slight yellow hue.

                                                                                    • danielbln 9 hours ago

                                                                                      That seems unlikely, as we didn't see anything like that with Dall-E, unless the auto regressive nature of gpt-image somehow was more influenced by it.

                                                                                  • vunderba 9 hours ago

                                                                                    I never heard anything concrete offered. At least it's relatively easy to work around with a tone mapping / LUTs.
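
                                                                                      One crude way to do that kind of correction, as a pure-Python sketch (a gray-world white balance; the sample pixel values are illustrative, not measured from real outputs):

```python
def gray_world_balance(pixels):
    """Remove a global color cast under the gray-world assumption:
    scale each channel so its mean matches the overall mean.
    pixels: list of (r, g, b) floats in [0, 1]."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    target = sum(means) / 3
    gains = [target / m for m in means]          # per-channel correction gains
    return [tuple(min(1.0, max(0.0, p[c] * gains[c])) for c in range(3))
            for p in pixels]

# Toy "yellow cast": red and green lifted, blue suppressed.
cast = [(0.9, 0.85, 0.5), (0.6, 0.55, 0.3), (0.3, 0.28, 0.12)]
fixed = gray_world_balance(cast)
```

A real LUT-based fix would be more targeted than this global rescale, but the idea is the same: measure the cast, apply an inverse mapping per channel.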

                                                                                    • viraptor 9 hours ago

                                                                                      My pet theory is that this is the "Mexico filter" from movies leaking through the training data.

                                                                                      • onoesworkacct 5 hours ago

                                                                                        There's definitely an analysis on the net somewhere, can't remember the details though.

                                                                                        • dvngnt_ 8 hours ago

                                                                                          maybe their version of synth-id? it at least helps me spot gpt images vs gemini's

                                                                                          • efilife 7 hours ago

                                                                                            I'm guessing that it was intentional all along, as no other models exhibit this behavior. It was so it could be instantly recognized as ChatGPT

                                                                                            • kingkawn 9 hours ago

                                                                                              Colloquially called the urine filter

                                                                                              • jebronie 5 hours ago

                                                                                                lets not mince words, its called the "piss filter"

                                                                                              • varjag 7 hours ago

                                                                                                Not always, it started at a very specific point. Studio Ghibli craze + reinforcement learning on the likes.

                                                                                                • minimaxir 4 hours ago

                                                                                                  The Studio Ghibli craze started with the initial release of images in ChatGPT, and the yellow filter already existed at that time. They did not change the model as a result of RL (until potentially today, with a new model)

                                                                                                  • weird-eye-issue 3 hours ago

                                                                                                    That's not how it works; the model doesn't just update in real time based on likes. Besides, it was already yellow upon release.

                                                                                                • blurbleblurble 9 hours ago

                                                                                                  It's really weird to see "make images from memories that aren't real" as a product pitch

                                                                                                  • kingstnap 8 hours ago

                                                                                                    It's strange to me too, but they must have done the market research for what people do with image gen.

                                                                                                    My own main use cases are entirely textual: Programming, Wiki, and Mathematics.

                                                                                                    I almost never use image generation for anything. However, it's objectively extremely popular.

                                                                                                    This has strong parallels for me to when snapchat filters became super popular. I know lots of people loved editing and filtering pictures but I always left everything as auto mode, in fact I'd turn off a lot of the default beauty filters. It just never appealed to me.

                                                                                                    • nurettin 8 hours ago

                                                                                                      It would creep me out if the model produced origami animals for that prompt.

                                                                                                      • 999900000999 8 hours ago

                                                                                                        I can actually imagine actors selling the rights to make fake images with them.

                                                                                                        In late-stage capitalism you'll pay for fake photos with someone. You'll have ChatGPT write about how you dated for a summer, and have it end with them leaving for grad school to explain why you aren't together.

                                                                                                        Eventually we'll all just pay to live in the matrix. When your credit card is declined you'll be logged out, to awaken in a shared studio apartment. To eat your rations.

                                                                                                        • ares623 8 hours ago

                                                                                                          I can see them getting paid like residuals from TV re-runs.

                                                                                                          But at some point it'll hit saturation. The novelty will wear off since everyone has access to it. Who cares if you have a fake photo with a celebrity if everyone knows it's fake?

                                                                                                      • password-app 5 hours ago

                                                                                                        Impressive image quality improvements. Meanwhile, AI agents just crossed a milestone: Simular's Agent S hit 72.6% on OSWorld (human-level is 72.36%).

                                                                                                        We're seeing AI get better at both creative tasks (images) and operational tasks (clicking through websites).

                                                                                                        For anyone building AI agents: the security model is still the hard part. Prompt injection remains unsolved even with dedicated security LLMs.

                                                                                                        • mingabunga 7 hours ago

                                                                                                          Did an experiment to give a software product a dark theme. Gave Both (GPT and Gemini/Nano) a screenshot of the product and an example theme I found on Dribbble.

                                                                                                          - Gemini/Nano did a pretty average job, only applying some grey to some of the panels. I tried a few different examples and got similar output.

                                                                                                          - GPT did a great job and themed the whole app and made it look great. I think I'd still need a designer to finesse some things though.

                                                                                                          • abbycurtis33 11 hours ago

                                                                                                            I still use Midjourney, because all of these major players are so bad at stylistic and creative work. They're singularly focused on photorealism.

                                                                                                            • ianbicking 8 hours ago

                                                                                                              I haven't really kept up with what Midjourney has been doing the past year or two. While I liked the stylistic aspects of Midjourney, being able to use image examples to maintain stylistic consistency and character consistency is SO useful for creating any meaningful output. Have they done anything in that respect?

                                                                                                              That is, it's nice to make a pretty stand-alone image, but without tools to maintain consistency and place them in context you can't make a project that is more than just one image, or one video, or a scattered and disconnected sequence of pieces.

                                                                                                              • xnx 9 hours ago

                                                                                                                This is surprising. Is there a gallery of images that illustrates this?

                                                                                                              • FergusArgyll 9 hours ago

                                                                                                                That's the opinionated vs user choice dynamic. When the opinions are good, they have a leg up

                                                                                                                • empressplay 5 hours ago

                                                                                                                  That's because it's a two-way street: a multimodal model that is highly proficient at real-life image generation is also highly proficient at interpreting real-life image input, which is sorely needed for robotics.

                                                                                                                  • kingkawn 9 hours ago

                                                                                                                    This is a cultural flaw that predates image generation. Even PG has made statements on HN in the past equating “rendering skill” with the quality of art works. It’s a stand-in for the much more difficult task of understanding the work and value of culture making within the context of the society producing it.

                                                                                                                    • doctorpangloss 8 hours ago

                                                                                                                      Suppose the deck for Midjourney hit Paul Graham's desk, and the CEO was just an average Y Combinator CEO - so no previous success story. He would have never invested in Midjourney at seed stage (meaning before launch / before there were users) even if he were given the opportunity.

                                                                                                                      Better to read that particular story in the context of, "It would be very difficult to make a seed fund that is an index of all avant garde culture making because [whatever]."

                                                                                                                  • GaryBluto 2 hours ago

                                                                                                                    God OpenAI are so far behind. Their own example shows that trying to only change specific parts of the image doesn't work without affecting the background.

                                                                                                                    • encroach 3 hours ago

                                                                                                                      This outperforms Gemini 3 pro image (nano banana pro) on Text-to-Image Arena and Image Edit Arena. I'm surprised they didn't mention this leaderboard in the blog post.

                                                                                                                      I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).

                                                                                                                      https://lmarena.ai/leaderboard/text-to-image

                                                                                                                      https://lmarena.ai/leaderboard/image-edit

                                                                                                                      • nycdatasci 3 hours ago

                                                                                                                        The arena concept doesn’t work for image models due to watermarks.

                                                                                                                        • encroach 2 hours ago

                                                                                                                          There are no watermarks in the arena.

                                                                                                                      • anonfunction 8 hours ago

                                                                                                                        The announcement said the API works with the new model, so I updated my Golang SDK grail (https://github.com/montanaflynn/grail) to use it, but the API returns a 500 server error when you try. And if you pass a completely unknown model name, the new model isn't listed among the supported values:

                                                                                                                          POST "https://api.openai.com/v1/responses": 500 Internal Server Error {
                                                                                                                            "message": "An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_******************* in your message.",
                                                                                                                            "type": "server_error",
                                                                                                                            "param": null,
                                                                                                                            "code": "server_error"
                                                                                                                          }
                                                                                                                        
                                                                                                                          POST "https://api.openai.com/v1/responses": 400 Bad Request {
                                                                                                                            "message": "Invalid value: 'blah'. Supported values are: 'gpt-image-1' and 'gpt-image-1-mini'.",
                                                                                                                            "type": "invalid_request_error",
                                                                                                                            "param": "tools[0].model",
                                                                                                                            "code": "invalid_value"
                                                                                                                          }
                                                                                                                        • zkmon 8 hours ago

                                                                                                                          AI-generated images would remove all the trust in and admiration for human talent in art, similar to how text generation removes trust in and admiration for human talent in writing. The same goes for coding.

                                                                                                                          So, let's simulate that future. Since no one trusts your talent in coding, art, or writing, you wouldn't care to do any of these. But the economy is built on products and services that get their value from how much human talent and effort is required to produce them.

                                                                                                                          So, the value of these services and products goes down as demand and trust go down. No one knows or cares who is a good programmer on the team, who is a great thinker and writer, and who is a modern Picasso.

                                                                                                                          So, the motivation disappears for humans. There are no achievements to target, and there is no way to impress others with your talent. This should lead to a uniform workforce without much difference in talent. Pretty much a robot army.

                                                                                                                          • arnz-arnz an hour ago

                                                                                                                            All I can hope for is that a new industry or reliable ecosystem of vetters of real human talent will emerge. Are you really as good a writer as you claim to be? Show us the badge. That, or AI firms have to be forced to 'watermark' all their creative outputs, and anyone misleading the public/audience should be punishable by law.

                                                                                                                          • aziis98 8 hours ago

                                                                                                                            I know this is a bit out of scope for these image editing models but I always try this experiment [1] of drawing a "random" triangle and then doing some geometric construction and they mess up in very funny ways. These models can't "see" very well. I think [2] is still very relevant.

                                                                                                                            [1]: https://chatgpt.com/share/6941c96c-c160-8005-bea6-c809e58591...

                                                                                                                            [2]: https://vlmsareblind.github.io/

                                                                                                                            • KaiserPro 8 hours ago

                                                                                                                              Is there watermarking, or some other way for normal people to tell if it's fake?

                                                                                                                              • mmh0000 8 hours ago

                                                                                                                                I know OpenAI watermarks their stuff. But I wish they wouldn't. It's a "false" trust.

                                                                                                                                Now it means whoever has access to uncensored/non-watermarking models can pass off their faked images as real and claim, "Look! There's no watermark, of course, it's not fake!"

                                                                                                                                Whereas, if none of the image models did watermarking, then people (should) inherently know nothing can be trusted by default.

                                                                                                                                • laurent123456 7 hours ago

                                                                                                                                  There are ways to tell if an image is real, if it's been signed cryptographically by the camera for example, but increasingly it probably won't be possible to tell if something is fake. Even if there's some kind of hidden watermark embedded in the pixels, you can process it with img2img in another tool and get rid of the watermark. Exif data, etc is irrelevant, you can get rid of it easily or fake it.
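
                                                                                                                                  The camera-signing idea can be sketched in miniature. This is an illustration only, using a shared-secret HMAC from Python's standard library; real provenance systems (e.g. C2PA Content Credentials) sign with a per-device private key and an X.509 certificate chain, and the key and function names below are made up:

```python
import hashlib
import hmac

# Illustration only: a real camera would hold a per-device private key
# and sign with public-key crypto (as in C2PA), not a shared secret.
DEVICE_KEY = b"per-device-secret"

def sign_image(image_bytes: bytes) -> bytes:
    """Signature over the raw image bytes, produced at capture time."""
    return hmac.new(DEVICE_KEY, image_bytes, hashlib.sha256).digest()

def verify_image(image_bytes: bytes, signature: bytes) -> bool:
    """Any edit to the bytes invalidates the signature."""
    expected = hmac.new(DEVICE_KEY, image_bytes, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

original = b"\x89PNG...raw capture bytes..."
sig = sign_image(original)
assert verify_image(original, sig)             # untouched image verifies
assert not verify_image(original + b"x", sig)  # any modification fails
```

                                                                                                                                  The asymmetry the comment describes falls out of the sketch: any pixel-level processing (img2img, filters, re-encoding) changes the bytes, so a valid signature is evidence of an untouched capture, while its absence proves nothing.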

                                                                                                                                  • ewoodrich an hour ago

                                                                                                                                    Sure, you can always remove it, but an average person posting AI images on Facebook or whatever probably won't bother. I was skeptical of Google's SynthID when I first heard about it, but I've been seeing it used to identify suspected AI images on Reddit recently (the example I saw today was cropped and lightly edited with a filter but still got flagged correctly), and it's cool to have a hard data point when present. It won't help with bad/manipulative actors, but it's a decent mitigation for the low-effort slop scenario, since it survives the kind of basic editing a regular person knows how to do on their phone and the typical compression when uploading/serving.

                                                                                                                                  • wavemode 5 hours ago

                                                                                                                                    I think society is going to need the opposite - cameras that can embed cryptographic information in the pixels of a video indicating the image is real.

                                                                                                                                    • PhilippGille 8 hours ago

                                                                                                                                      https://help.openai.com/en/articles/8912793-c2pa-in-chatgpt-...

                                                                                                                                      It doesn't mention the new model, but it's likely the same or similar.

                                                                                                                                      • adrian17 8 hours ago

                                                                                                                                        I just checked several of the files uploaded to the news post, the "previous" and "new" versions, both the png and webp (&fm=webp in url) - none had the content metadata. So either the internal version they used to generate the images skipped it, or they just stripped the metadata when uploading.
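
                                                                                                                                        For anyone who wants to repeat the check without exiftool: C2PA embeds its JUMBF manifest in a dedicated PNG chunk (named `caBX`, going by the C2PA spec; that chunk name is the one assumption here), so a quick chunk walk shows whether the metadata survived the upload. A rough sketch:

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_chunk_types(data: bytes):
    """Yield the 4-byte type of every chunk in a PNG byte stream."""
    assert data.startswith(PNG_SIG), "not a PNG"
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        yield data[pos + 4:pos + 8]
        pos += 12 + length  # 4-byte length + 4-byte type + data + 4-byte CRC

def has_c2pa(data: bytes) -> bool:
    # C2PA stores its JUMBF manifest in a 'caBX' chunk in PNG files.
    return b"caBX" in png_chunk_types(data)
```

                                                                                                                                        WebP carries the manifest differently (in a RIFF chunk), so this particular walk only covers the PNG case.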

                                                                                                                                      • mnorris 7 hours ago

                                                                                                                                        I ran exiftool on an image I just generated:

                                                                                                                                          $ exiftool chatgpt_image.png
                                                                                                                                          ...
                                                                                                                                          Actions Software Agent Name : GPT-4o
                                                                                                                                          Actions Digital Source Type : http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgori...
                                                                                                                                          Name                        : jumbf manifest
                                                                                                                                          Alg                         : sha256
                                                                                                                                          Hash                        : (Binary data 32 bytes, use -b option to extract)
                                                                                                                                          Pad                         : (Binary data 8 bytes, use -b option to extract)
                                                                                                                                          Claim Generator Info Name   : ChatGPT
                                                                                                                                          ...

                                                                                                                                        • KaiserPro 7 hours ago

                                                                                                                                          Exif isn't all that robust though.

                                                                                                                                          I suppose I'm going to have to bite the bullet and actually train an AI detector that works roughly in real time.

                                                                                                                                      • ge96 7 hours ago

                                                                                                                                        I get the tech implementation is amazing, I wonder if it takes away from genuineness of events, like the Astronaut photo, I get it's just a joke/funny too but it's like a photo of you in a supercar vs. actually buying one. Or fake AI companions vs. real people. Beauty filters/skinny filters vs. actually being healthy.

                                                                                                                                        • onoesworkacct 5 hours ago

                                                                                                                                          The next generation of humans growing up will not even care whether media is real or not any more. The saturation of AI content and FUD around real content is going to blur the lines to the extent that there's no point even caring about it. And it's an intractable problem.

                                                                                                                                          Hopefully this leads to greater importance being placed on seeing things with your own wetware.

                                                                                                                                        • alasano 9 hours ago

                                                                                                                                          It's still not available in the API despite them announcing the availability.

                                                                                                                                          They even linked to their Image Playground, where it's also not available.

                                                                                                                                          I updated my local playground to support it, and I'm just handling the 404 on the model gracefully:

                                                                                                                                          https://github.com/alasano/gpt-image-1-playground
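
                                                                                                                                          Handling the staggered rollout client-side amounts to a fallback chain. A minimal sketch with a stubbed-out generate call (in the real SDK you'd catch the client's not-found error around the actual API request; the model names and stub are illustrative):

```python
class ModelNotFound(Exception):
    """Stand-in for the 404 the API returns before a model is rolled out."""

def generate(model: str, prompt: str, available: set) -> str:
    """Stub for an image-generation call; a real client would hit the API
    and catch the SDK's not-found error instead of this stub exception."""
    if model not in available:
        raise ModelNotFound(model)
    return f"image generated by {model}"

def generate_with_fallback(prompt: str, preferred: list, available: set) -> str:
    """Try models in preference order, degrading gracefully on 404s."""
    for model in preferred:
        try:
            return generate(model, prompt, available)
        except ModelNotFound:
            continue  # not rolled out to this account yet; try the next one
    raise RuntimeError("no usable image model")

# gpt-image-1.5 isn't visible to this account yet, so we fall back:
result = generate_with_fallback(
    "a cat",
    preferred=["gpt-image-1.5", "gpt-image-1"],
    available={"gpt-image-1", "gpt-image-1-mini"},
)
```

                                                                                                                                          The same shape works for any staggered rollout: prefer the new model, degrade to the previous one until the 404s stop.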

                                                                                                                                          • anonfunction 8 hours ago

                                                                                                                                            Yeah I just tried it and got a 500 server error with no details as to why:

                                                                                                                                              POST "https://api.openai.com/v1/responses": 500 Internal Server Error {
                                                                                                                                                "message": "An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_******************* in your message.",
                                                                                                                                                "type": "server_error",
                                                                                                                                                "param": null,
                                                                                                                                                "code": "server_error"
                                                                                                                                              }
                                                                                                                                            
                                                                                                                                            Interestingly if you change to request the model foobar you get an error showing this:

                                                                                                                                              POST "https://api.openai.com/v1/responses": 400 Bad Request {
                                                                                                                                                "message": "Invalid value: 'blah'. Supported values are: 'gpt-image-1' and 'gpt-image-1-mini'.",
                                                                                                                                                "type": "invalid_request_error",
                                                                                                                                                "param": "tools[0].model",
                                                                                                                                                "code": "invalid_value"
                                                                                                                                              }
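
                                                                                                                                            A 500 with `"type": "server_error"` is the retryable case, per the message itself ("You can retry your request"). A minimal backoff wrapper, sketched with a stand-in exception rather than the SDK's real one:

```python
import time

def retry_on_server_error(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry a flaky callable with exponential backoff.

    RuntimeError stands in for the SDK's transient server-error
    exception; a real client would catch that type instead.
    """
    for attempt in range(attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulate an endpoint that 500s twice, then succeeds:
calls = {"n": 0}

def flaky_generate():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("server_error")
    return "image bytes"

result = retry_on_server_error(flaky_generate, attempts=5, base_delay=0)
```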
                                                                                                                                            • weird-eye-issue 3 hours ago

                                                                                                                                              My Enterprise account got an email 1.5 hours ago that it is available in API but my other accounts haven't gotten any email yet

                                                                                                                                              • minimaxir 9 hours ago

                                                                                                                                                It's a staggered rollout but I am not seeing it on the backend either.

                                                                                                                                                • joshstrange 7 hours ago

                                                                                                                                                  > staggered rollout

                                                                                                                                                  It's too bad no OpenAI Engineers (or Marketers?) know that term exists. /s

                                                                                                                                                  I do not understand why it's so hard for them to just tell the truth. So many announcements "Available today for Plus/Pro/etc" really means "Sometime this week at best, maybe multiple weeks". I'm not asking for them to roll out faster, just communicate better.

                                                                                                                                              • xnx 9 hours ago

                                                                                                                                                Great to have continued competition in the different model types.

                                                                                                                                                What angle is there for second tier models? Could the future for OpenAI be providing a cheaper option when you don't need the best? It seems like that segment would also be dominated by the leading models.

                                                                                                                                                I would imagine the future shakes out as: first class hosted models, hosted uncensored models, local models.

                                                                                                                                                • gs17 8 hours ago

                                                                                                                                                  > Still some scientific inaccuracies, but ~70% correct

                                                                                                                                                  That's still dangerously bad for the use-case they're proposing. We don't need better looking but completely wrong infographics.

                                                                                                                                                  • astrange 7 hours ago

                                                                                                                                                    It's pretty common for infographics to be wrong. The people making them aren't the same people who know the facts.

                                                                                                                                                    I'd especially say like 100% of amateur political infographics/memes are wrong. ("climate change is caused by 100 companies" for instance)

                                                                                                                                                    • rcarmo 7 hours ago

                                                                                                                                                      We don’t, but most Marketing departments salivate for them.

                                                                                                                                                    • raw_anon_1111 4 hours ago

                                                                                                                                                      I still can’t get it to draw a “13 hour clock” correctly

                                                                                                                                                      • neom 10 hours ago

                                                                                                                                                        Anyone else have issues verifying with openai? I always get a "congrats you're done" screen with a green checkmark from Persona, nothing to click, and my account stays unverified. (Edit, mystically, it's fixed..!)

                                                                                                                                                        • mohsen1 9 hours ago

                                                                                                                                                          Unlike Nano Banana it allows generating photos of children. Always fun to ask AI to imagine children of a couple but it's also kinda concerning that there might be terrible use cases.

                                                                                                                                                          • hexage1814 9 hours ago

                                                                                                                                                            If memory serves me, Nano Banana allows generating/editing photos of children. But anything that could be misinterpreted, gets blocked, even absolutely benign and innocent things (especially if you are asking to modify a photo that you upload there). So they allow, but they turn on the guardrails to a point that might not be useful in many situations.

                                                                                                                                                            • r053bud 9 hours ago

                                                                                                                                                              I was able to generate photos of my imagined children via Nano Banana

                                                                                                                                                              • BoorishBears 9 hours ago

                                                                                                                                                                I haven't seen that, meanwhile gpt-image-1.5 still has zero-tolerance policing copyright (even via the API) so it's pretty much useless in production once exposed to consumers.

                                                                                                                                                                I'm honestly surprised they're still on this post-Sora 2: let the consumer of the API determine their risk appetite. If a copyright holder comes knocking, "the API did it" isn't going to be a defense either way.

                                                                                                                                                              • dzonga 8 hours ago

                                                                                                                                                we seriously can't be burning gigawatts of power just to have sama in a GPT-shirt ad generated by AI

                                                                                                                                                                impressive stuff though - as you can give it a base image + prompt.

                                                                                                                                                                • drawnwren 8 hours ago

                                                                                                                                                                  counterpoint: we should make energy abundant enough that it really doesn't matter if sama wants to generate gpt-shirt ads or not.

                                                                                                                                                                  we have the capability, we just stopped making power more abundant.

                                                                                                                                                                  • iknowstuff 8 hours ago

                                                                                                                                                    I think we can say the pause we took was reasonable once we realized the environmental impact of dumping greenhouse gases into the atmosphere, but if we can now ensure further growth won't do that, let's make sure we restart, just clean this time.

                                                                                                                                                                  • astrange 7 hours ago

                                                                                                                                                                    It's a joke about one of his old fits.

                                                                                                                                                                    https://x.com/coldhealing/status/1747270233306644560

                                                                                                                                                                  • eterm 5 hours ago

                                                                                                                                                                    I have a "go to" prompt for images:

                                                                                                                                                                    > In the style of a 1970s book sci-fi novel cover: A spacer walks towards the frame. In the background his spaceship crashed on an icy remote planet. The sky behind is dark and full of stars.

                                                                                                                                                                    Nano banana pro via gemini did really well, although still way too detailed, and it then made a mess of different decades when I asked it to follow up: https://gemini.google.com/share/1902c11fd755

                                                                                                                                                                    It's therefore really disappointing that GPT-image 1.5 did this:

                                                                                                                                                                    https://chatgpt.com/share/6941ed28-ed80-8000-b817-b174daa922...

                                                                                                                                                                    Completely generic, not at all like a book cover, it completely ignored that part of the prompt while it focused on the other elements.

                                                                                                                                                                    Did it get the other details right? Sure, maybe even better, but the important part it just ignored completely.

                                                                                                                                                                    And it's doing even worse when I try to get it to correct the mistake. It's just repeating the same thing with more "weathering".

                                                                                                                                                                    • bongodongobob 4 hours ago

                                                                                                                                                      You're just not describing what you want properly. It looks fine to me, so clearly you have something else in mind that you're not conveying. My tip would be to use actual illustration language. Do you want a wide-angle shot? What should the depth of field be? Oil painting print? Ink illustration? What kind of printing style? Do you want a photo of the book or a pre-print proof? What kind of color scheme?

                                                                                                                                                      A professional artist wouldn't know what you want.

                                                                                                                                                      You didn't even specify an art style. "1970s sci-fi novel cover" isn't a style; you'll find vastly different art styles from the 70s. If you're disappointed, it's because you're doing a shitty job describing what's in your head. If your prompt isn't at least a paragraph, you're going to get random generic results.

                                                                                                                                                                      • eterm 4 hours ago

                                                                                                                                                        The killer feature of LLMs is being able to extrapolate what's really wanted from short descriptions.

                                                                                                                                                        Look again at Gemini's output: it looks like an actual book cover, an illustration that could be found on a book.

                                                                                                                                                        It takes corrections on board (albeit hilariously literally).

                                                                                                                                                        Look at GPT Image's output: it doesn't look anything like a book cover, and when told it got it wrong, it just doubles down on what it was doing.

                                                                                                                                                                    • smlavine 7 hours ago

                                                                                                                                                                      This is terrifying. Truth is dead.

                                                                                                                                                                      • WhyOhWhyQ 6 hours ago

                                                                                                                                                                        Makes you wonder what's really meant when we talk about progress.

                                                                                                                                                                      • ezero 10 hours ago

                                                                                                                                                        Even from their own curated examples, this looks quite a bit worse than Nano Banana in terms of preserving consistency on image edits.

                                                                                                                                                                        • almosthere 9 hours ago

                                                                                                                                                                          I didn't have a good experience with NB. I am half Indian. Immediately changes my face to a prototypical Indian man every time I use it.

                                                                                                                                                                          This tool is keeping my look the same.

                                                                                                                                                                          • gundmc 9 hours ago

                                                                                                                                                                            I find including "don't change anything else" in the NBP prompt goes a long way.

                                                                                                                                                                            • almosthere 7 hours ago

                                                                                                                                                                              I tried all of those types of prompts

                                                                                                                                                                          • mortenjorck 7 hours ago

                                                                                                                                                                            Nano Banana became useless for image edits once the safety training started rejecting anything as “I can’t edit some public figures.”

                                                                                                                                                                            My own profile picture? Can’t edit some public figures. A famous Norman Rockwell painting from 80 years ago? Can’t edit some public figures.

                                                                                                                                                                            Safety’d into oblivion.

                                                                                                                                                                          • surrTurr 9 hours ago

                                                                                                                                                                            not super impressed. feels like 70% as good as nano banana pro.

                                                                                                                                                                            • sfmike 8 hours ago

                                                                                                                                                                              Hope to see more "red alert" status from the AI wars putting companies into all-hands-on-deck mode. This only helps token costs and efficacy. As always, competition only helps the end users.

                                                                                                                                                                              • thumbsup-_- an hour ago

                                                                                                                                                                                now you can create good memories with your family without meeting them

                                                                                                                                                                                • nightshift1 5 hours ago

                                                                                                                                                                                  What is the endgame? Why is OpenAI throwing that much money on image/video generation? Is there a profitable market for AI-generated image slop? Do people choose ChatGPT instead of Gemini/Grok/Claude because of the image generation capabilities? To me, it looks like a huge fiery money pit.

                                                                                                                                                                                  • BrokenCogs 5 hours ago

                                                                                                                                                                                    The endgame is to make money during the hype and then cash out before it crashes.

                                                                                                                                                                                    • bdangubic 5 hours ago

                                                                                                                                                                                      if that is the endgame openai is doing everything but working towards that goal :)

                                                                                                                                                                                      • BrokenCogs 5 hours ago

                                                                                                                                                                                        Yeah they fumbled big time

                                                                                                                                                                                  • pdevr 9 hours ago

                                                                                                                                                                                    >Now remove the two men, just keep the dog, and put them in an OpenAI livestream that looks like the attached image.

                                                                                                                                                                                    Where is the image given along with the prompt? Unless I missed it, it would have been nice to show the attached image.

                                                                                                                                                                                    • taytus 9 hours ago

                                                                                                                                                                                      On top of the prompt. It has a weird layout; I had to scroll up to see it.

                                                                                                                                                                                    • 0dayman 9 hours ago

                                                                                                                                                                                      nah Nano Banana Pro is much better

                                                                                                                                                                                      • gostsamo 8 hours ago

                                                                                                                                                                                        Alt text is one of the nicest uses for AI, and still OpenAI didn't bother using it for something so basic. The dogfooding is not strong with their marketing team.

                                                                                                                                                                                        • celeryd 8 hours ago

                                                                                                                                                                                          If it can't generate non-sexual content of a woman in a bikini, I am not interested.

                                                                                                                                                                                          • randall 4 hours ago

                                                                                                                                                                                            double popped collar ftw

                                                                                                                                                                                            • StarterPro 9 hours ago

                                                                                                                                                                                              In the image they showed for the new one, the mechanic was checking a dipstick...that was still in the vehicle.

                                                                                                                                                                                              I really hope everyone is starting to get disillusioned with OpenAI. They're just charging you more and more for what? Shitty images that are easy to sniff out?

                                                                                                                                                                                              In that case, I have a startup for you to invest in. It's a bridge-selling app.

                                                                                                                                                                                              • czhu12 9 hours ago

                                                                                                                                                                                                Haven’t their prices stayed at $20/m for a while now?

                                                                                                                                                                                                • wahnfrieden 8 hours ago

                                                                                                                                                                                                  They've published anticipated price increases over coming years. Prices will rise dramatically and steadily to meet revenue targets.

                                                                                                                                                                                                  • cheema33 7 hours ago

                                                                                                                                                                                                    AI doesn’t have much of a moat. People can and will easily switch providers.

                                                                                                                                                                                                    • wahnfrieden 2 hours ago

                                                                                                                                                                                                      Sure but there are only a couple leading providers worth considering for coding at least, and there will be consolidation once investment pulls back. They may find a way to collude on raising prices.

                                                                                                                                                                                                      Where switching will be easier is with casual chat users plus API consumers that are already using substandard models for cost efficiency. But there will also always be a market for state of art quality.

                                                                                                                                                                                              • ChrisArchitect 11 hours ago
                                                                                                                                                                                                • dang 9 hours ago

                                                                                                                                                                                                  We'll merge that thread hither to give some other submitters a chance.

                                                                                                                                                                                                • enigma101 6 hours ago

                                                                                                                                                                                                  Really can't stand the image slop suffocating the internet.

                                                                                                                                                                                                  • catigula 9 hours ago

                                                                                                                                                                                                    Nano Banana Pro is so good that any other attempt feels 1-2 generations behind.

                                                                                                                                                                                                    • Jonovono 9 hours ago

                                                                                                                                                                                                      Nano banana pro is almost as good as seedream 4.5!

                                                                                                                                                                                                      • BoorishBears 9 hours ago

                                                                                                                                                                                                        Seedream 4.5 is almost as good as Seedream 4!

                                                                                                                                                                                                        (Realistically, Seedream 4 is the best at aesthetically pleasing generation, Nano Banana Pro is the best at realism and editing, and Seedream 4.5 is a very strong middleground between the two with great pricing)

                                                                                                                                                                                                        gpt-image-1.5 feels like OpenAI doing the bare minimum to keep people from switching to Gemini every time they want an image.

                                                                                                                                                                                                    • ares623 7 hours ago

                                                                                                                                                                                                      My copium is that analog photography makes a comeback as a way to recover some level of trust and authenticity.

                                                                                                                                                                                                      • famahar an hour ago

                                                                                                                                                                                                        I was reading a trend report on art and it seems like collage, squiggly hand drawn text, and lots of intentional imperfections are becoming popular. I'm not sure how hard it is for AI to recreate those, but it is nice to see people trying to do more of what AI struggles with.

                                                                                                                                                                                                        • Forgeties79 7 hours ago

                                                                                                                                                                                                          Good luck getting it developed, unfortunately. I have to ship it off now; there isn't a single local spot in my city that will develop film anymore.

                                                                                                                                                                                                          • ares623 6 hours ago

                                                                                                                                                                                                            When the demand is back, the labs should start coming back. There are a few in my relatively small city, which is pretty surprising. But the costs are still too high to cover the low volume, I guess.

                                                                                                                                                                                                        • brador 8 hours ago

                                                                                                                                                                                                          Every person in every picture in their examples is white except for 1 Asian dude. Like a 46:1 ratio for the page (I counted). Not one Middle Eastern or Black or Jewish or Indian or South American person.

                                                                                                                                                                                                          Not even one. And no one on the team said anything?

                                                                                                                                                                                                          Come on Sam, do better.

                                                                                                                                                                                                          • rvz 9 hours ago

                                                                                                                                                                                                            Another bunch of "startups" have been eliminated.

                                                                                                                                                                                                            • moralestapia 9 hours ago

                                                                                                                                                                                                              Among those, Photoshop.

                                                                                                                                                                                                              • koakuma-chan 9 hours ago

                                                                                                                                                                                                                I wish. Even Nano Banana Pro still sucks at basic operations.

                                                                                                                                                                                                            • adammarples 6 hours ago

                                                                                                                                                                                                              Still can't pass my image test:

                                                                                                                                                                                                              "Two women walking in single file"

                                                                                                                                                                                                              Although it tried very hard and had them staggered slightly.