ChatGPT geolocation: On April 16, OpenAI introduced o3 and o4-mini, the newest additions to its o-series models, which are designed to engage in deeper reasoning before generating responses.
These models are the most advanced OpenAI has released so far, marking a significant leap in ChatGPT's capabilities for a wide range of users, from curious explorers to expert researchers.
The newest visual reasoning models in ChatGPT are now capable of autonomously and intelligently using various tools, such as web search, Python for analysing data and files, image generation, and interpretation of visual inputs.
What sets these models apart is their ability to determine when and how to use these tools to deliver clear, accurate, and well-formatted responses, typically in under a minute. This enables them to tackle more layered and difficult problems with far greater ease.
By combining advanced reasoning with full tool access, these models deliver significantly better results on both academic benchmarks and real-world applications, raising the bar for what’s possible in terms of intelligence and practical value.
OpenAI states that o3 is their most sophisticated reasoning model to date, setting new standards in areas such as coding, mathematics, scientific analysis, and visual comprehension.
The newly released visual reasoning models, o3 and o4-mini, are capable of incorporating images directly into their chain-of-thought, enabling them to “think” with visuals as part of the problem-solving process.
With significantly improved visual intelligence, ChatGPT can now analyse images with greater depth, accuracy, and reliability. It effortlessly combines this capability with tools like web search and image editing, automatically zooming, cropping, flipping, or enhancing images as needed.
Depending on the task, the model intelligently decides which tools to use to generate the most helpful response.
On social media platform X, users have been sharing impressive examples of ChatGPT using o3 to interpret street-view photos, decipher restaurant menus, and even guess specific locations from user-uploaded images or screenshots, simply by asking, "Where is this?"
The geoguessing power of o3 is a really good sample of its agentic abilities. Between its smart guessing and its ability to zoom into images, to do web searches, and read text, the results can be very freaky.
I stripped location info from the photo & prompted “geoguess this” pic.twitter.com/KaQiXHUvYL
— Ethan Mollick (@emollick) April 17, 2025
alright pic.twitter.com/59DA0p3AE0
— henry (@arithmoquine) April 17, 2025
Watch gpt o3 play GeoGuessr for me
It’s incredible to me how it examines distinct objects and qualities of the image for clues and ties them together like a real player
(video is sped up)
inspired by @josh_bickett pic.twitter.com/NS4cc9ce4i
— Michael Milstead (@michael_milst) April 17, 2025
Users raise 'geo spying' concerns
First embraced as a playful tool by 'GeoGuessr' enthusiasts, ChatGPT's new visual capabilities have sparked growing privacy concerns.
Online, users have raised alarms about potential "geo spying," as the model can analyse incredibly fine-grained visual cues, such as shop logos, street signs, and architectural styles, to accurately pinpoint where a photo was taken.
And if this level of detail is possible today, it’s easy to imagine the implications when AI becomes ten times more powerful.