Out of curiosity, I just tried with ChatGPT 4o... I gave it a screenshot of a legit banking website and asked it to describe the page, to give me the exact URL in the screenshot and to tell me if it's legit or not.
It described the whole page to me, explaining it was a login page for bank X in country Y. It compared the URL with the bank's name, etc.
I know everybody's doing it because they don't know better, but it's a terrible idea to make the inductive leap from one successful sample to some abstract sense of what an ML model is suited for. Especially for anything important.
As a sibling comment noted, performance will almost certainly be sensitive to temperature (randomness), exact prompt phrasing, exact sequence of messages in a dialog, and the training-data frequency of both the site being analyzed and the phishing approach used.
One could conceivably train a specialized ML model, perhaps with an LLM component, to detect sophisticated phishing attempts and I would assume this has even been done.
But relying on a generic "helpful chatbot" to do that reliably and sufficiently is a really bad idea. That's not what it's for, not what it's good at, and not something its vendor promises it will remain good at even if it happens to be today.
That's called a hallucination. AI models are simply guessing what to say with differing sizes of word banks
At its best, it may even "recognize" the top 90% of sites. Still, it's not a bulletproof solution, and it shouldn't be trusted to avoid either false positives or false negatives
My best operational security advice is not to click shit in your inbox and navigate directly to the hostname you trust to do sensitive actions
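If you want to turn "only the hostname you trust" into a deterministic check rather than a judgment call, here is a minimal sketch in Python. The allowlist contents and the `is_trusted` name are made up for illustration (the hostname is just the example one used in this thread), so treat it as a sketch under those assumptions, not a complete defense.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: hostnames you have verified yourself.
TRUSTED_HOSTS = {"online.banking.com"}


def is_trusted(url: str) -> bool:
    """Return True only for an https URL whose hostname is exactly on the allowlist."""
    parts = urlsplit(url)
    # .hostname is already lowercased; strip a trailing dot from fully qualified names.
    host = (parts.hostname or "").rstrip(".")
    return parts.scheme == "https" and host in TRUSTED_HOSTS
```

Exact matching is the point: a single changed letter, an extra suffix after the real name, or a `user@host` trick all fail the check instead of "looking close enough".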
It described the whole page to me, explaining it was a login page for bank X in country Y. It compared the URL with the bank's name, etc.
Then I modified one letter in the URL, changing "https://online.banking.com" (just an example) to "https://online.banklng.com" and asked ChatGPT 4o again.
It said it was a phishing attempt.
So, basically, you can, today, already have a screenshot automatically analyzed and have a model tell you if it's seemingly legit or not.
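For what it's worth, the one-letter swap in that experiment can also be caught without a model at all. Below is a rough sketch that compares a URL's hostname against a list of known hostnames using plain Levenshtein edit distance. `KNOWN_HOSTS`, `classify_host` and the `max_distance` threshold are hypothetical choices for illustration, and this does not handle homoglyphs or punycode, so "unknown" here does not mean "safe".

```python
from urllib.parse import urlsplit

# Hypothetical list of hostnames the user actually does business with.
KNOWN_HOSTS = ["online.banking.com"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def classify_host(url: str, max_distance: int = 2) -> str:
    """Return 'known', 'lookalike' (near miss of a known host), or 'unknown'."""
    host = urlsplit(url).hostname or ""
    if host in KNOWN_HOSTS:
        return "known"
    if any(edit_distance(host, known) <= max_distance for known in KNOWN_HOSTS):
        return "lookalike"
    return "unknown"
```

With the example from this thread, `classify_host("https://online.banklng.com")` returns `"lookalike"` because the hostname is one substitution away from a known one, and it gives the same answer every time you ask.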