All the Llama models, including the 70B one, can run on consumer hardware. You might be able to fit GPT-3 (175B) at Q4 or Q3 on a Mac Studio, but that's probably the limit for consumer hardware. At 4-bit, a 7B model requires about 4 GB of RAM, so it should be possible to run on a phone, just not very fast.
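The arithmetic behind those numbers is just parameters times bits per weight; a rough sketch (weights only, ignoring KV cache, activations, and runtime overhead, which add more on top):

```python
def quantized_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Back-of-envelope size of the quantized weights alone,
    ignoring KV cache, activations, and runtime overhead."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(f"7B   @ Q4: {quantized_size_gb(7, 4):.1f} GB")    # ~3.5 GB, close to the ~4 GB figure above
print(f"70B  @ Q4: {quantized_size_gb(70, 4):.1f} GB")   # ~35 GB
print(f"175B @ Q3: {quantized_size_gb(175, 3):.1f} GB")  # ~66 GB
```

Real quantized files run a bit larger than this because some layers (embeddings, attention) are often kept at higher precision.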
> Contains inappropriately sourced conjecture of OpenAI's ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion
(The noted URL is just a Forbes blogger with no special qualifications that would make his claim particularly credible.)