I knew it would come to this. . .
When photography was first invented, there were people who claimed that it was able to capture images of ghosts, the dead, etc.
I’ve low-key been waiting for something like that to happen with LLMs. . . and today it finally did.
I needed to create a table from information that was in a graph written in Vietnamese, so I took a screenshot of the graph, threw it into ChatGPT and said “translate to English.”
ChatGPT responded by saying “I don’t see any text in the image to translate. It appears to be a black-and-white photograph of a baby sitting in a chair, wearing a patterned headscarf tied under the chin and a light-colored ruffled outfit.”
Full disclosure: Prior to this interaction, I had consumed 3 cups of coffee and eaten a pretty sweet cookie, so I rubbed my eyes and pinched myself to make sure that what I was seeing was not a caffeine- and sugar-induced hallucination.
It was not. So, I said, “try again.”
Once again, ChatGPT responded by saying, “I checked again more closely. There is no visible text in the image to translate into English. The image itself shows a baby sitting on a chair, wearing a headscarf tied under the chin and a light ruffled outfit.”
Mind you, this conversation took place in a new chat that I opened for this purpose. Further, at no point in my history of interactions with ChatGPT have headscarf-wearing babies in chairs ever been part of any conversation. So, there is no way that this was in ChatGPT’s “memory” of any previous interaction with me.
Morbidly curious, I then prompted, “Show me the image you see.”
And this is what it revealed:
Ok, that is officially CREEPY!!!
I’ve been thinking recently of switching to Claude. I’m not sure if I should interpret this as a confirmation of that idea or a warning from ChatGPT. . .
As someone with some background in AI, I’m very much intrigued by this apparent hallucination. To explore it further, could you share the graph screenshot and the original prompt, Prof.? In the meantime, you could try cropping the screenshot into overlapping portions (each one should be meaningful on its own) and asking the LLM to extract the information from each portion. The LLM should then be able to consolidate all of that information into tables.
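For what it's worth, the cropping step could be sketched roughly like this. It is only a minimal illustration: the function name `overlapping_boxes` and the `tile` and `overlap` parameters are my own assumptions, and it only computes the crop rectangles rather than touching any image or calling any LLM.

```python
def overlapping_boxes(width, height, tile, overlap):
    """Return (left, upper, right, lower) crop boxes that cover the image.

    Hypothetical helper: neighbouring boxes share at least `overlap` pixels,
    so a label cut off in one crop appears whole in the next.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap

    def starts(length):
        # An image smaller than one tile needs a single crop on this axis.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    # Example: a 1000x600 screenshot, 500-pixel tiles, 100-pixel overlap.
    for box in overlapping_boxes(1000, 600, 500, 100):
        print(box)
```

Each tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the boxes could be used to cut the screenshot into pieces before uploading them one at a time. The tile size and overlap would need tuning so that each piece still contains a readable chunk of the graph.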