
Another example:

“I’m thinking about getting either the Amex Hilton Aspire card that gives me 150,000 Hilton Honors points for signing up and has a $475 AF or the Amex Gold that gives me 70,000 Membership Reward points for signing up and has a $250 fee. A Hilton Honors point is worth .7 cents. A membership reward point is worth 1.4 cents. Which one would give me a better return in the first year”

ChatGPT's answer:

To determine which card would give you a better return in the first year, you'll need to consider the total value of the rewards earned, as well as the cost of the annual fee.

For the Hilton Aspire card, the 150,000 Hilton Honors points are worth $1,050 (150,000 x 0.7 cents per point). However, the $475 annual fee would offset a significant portion of that value.

For the Amex Gold card, the 70,000 Membership Reward points are worth $980 (70,000 x 1.4 cents per point). The $250 annual fee is lower than that of the Hilton Aspire card, but it still needs to be factored into the overall value.

Assuming you maximize the value of your rewards by redeeming them for their highest value and using all the card's benefits, here's a comparison of the two cards' net value in the first year:

Hilton Aspire: $1,050 (value of points earned) - $475 (annual fee) = $575 net value

Amex Gold: $980 (value of points earned) - $250 (annual fee) = $730 net value

Therefore, based on this analysis, the Amex Gold card would give you a better return in the first year. However, keep in mind that the value of the rewards can vary depending on how you redeem them, and your personal spending habits and travel preferences may also play a role in determining which card is right for you.
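(For what it's worth, the arithmetic is simple enough to sanity-check in a few lines of Python, with the point values and fees taken straight from the prompt above:)

    # Net first-year value = sign-up bonus valued in dollars, minus the annual fee.
    def net_first_year_value(signup_points, cents_per_point, annual_fee):
        return signup_points * cents_per_point / 100 - annual_fee

    print(net_first_year_value(150_000, 0.7, 475))  # Hilton Aspire: 575.0
    print(net_first_year_value(70_000, 1.4, 250))   # Amex Gold: 730.0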




The complex behavior you're showing doesn't prove what you think it proves - it still doesn't show it's using the consistent rules that a person would expect.

But it does show that people extrapolate complex behavior to "understanding" in the way humans do, which machines generally don't.


I’m just trying to “prove” that it isn’t just randomly statistically choosing the next logical word. It has to know context and have some level of “understanding” of other contexts.

People are acting as if ChatGPT is a glorified Eliza clone.


I’m just trying to “prove” that it isn’t just randomly statistically choosing the next logical word. It has to know context and have some level of “understanding” of other contexts.

But you ... aren't. The statistically most likely words coming after problem X may well be solution X. Because it's following the pattern of humans using rules. And context is also part of a prediction.

The only way this is different from something just using rules is that it will also put in other random things from its training - but only at the rate they occur, which for some things can be quite low. But only for some things.


Don’t we have a problem then? By nature of effective communication, AI could never prove to you it understands something, since any sufficient understanding of a topic would be met with an answer that could be hand-waved as “Well that’s the most statistically likely answer.” Newsflash: this basically overlaps 100% with any human’s most effective answer.

I think I’m beginning to understand the problem here. The folks here who keep pooh-poohing these interactions don’t just see the AIs as unconscious robots. I think they see everyone that way.


No; what we need, in order to be willing to believe that understanding is happening, is to know that the underlying structures fundamentally allow that.

ChatGPT's underlying structures do not. What it does, effectively, is look at the totality of the conversation thus far, and use the characters and words in it, combined with its training data, to predict, purely statistically, what characters would constitute an appropriate response.

I know that some people like to argue that what humans do cannot be meaningfully distinguished from this, but I reject this notion utterly. I know that my own thought processes do not resemble this procedure, and I believe that other people's are similar.


“Thought processes” is a funny term, don’t you think, for something that is somehow distinguishable from processes that we’ve designed and somehow understand down to the cellular (node) level.

It’s all emergent behavior from a system that behaves under process.


> it still doesn't show it's using the consistent rules that a person would expect.

How is this different from humans?

If you give me the same coding assignment on different days, I’m not going to write my code the exact same way or even structure it the same way.

But I did once see a post on HN where someone ran an analysis on all HN posters and it was able to tell that I posted under two different names based on my writing style. Not that I was trying to hide anything. My other username is scarface_74 as opposed to Scarface74.


This. We have had a lot of people -- including journalists and academics with big microphones -- learn for the first time what a Markov chain is, and then conclude that ChatGPT is "just Markov chains" (or whatever similarly reductive concept).

They really, really don't know what they're talking about, and yet it's becoming a kind of truth through repetition.

Pretty soon, the bots will start saying it!


> They really, really don't know what they're talking about, and yet it's becoming a kind of truth through repetition.

Kind of like religion, or the idea that people working for the state are more trustworthy than people taking drugs, sleeping on the street, or under the age of 18.

>Pretty soon, the bots will start saying it!

We are chemical-based repetition machines; psychologists see this in the Bobo doll experiments, where kids exposed to new ideas on TV or in books repeat the learned behaviour on the dolls.

I think some of the chemicals we make, like https://en.wikipedia.org/wiki/N,N-Dimethyltryptamine, actually help to create new ideas, as many people say they come up with solutions after some sleep. There appears to be a subculture in Silicon Valley where microdosing LSD helps to maintain creativity while coding.

It would seem logical for the bots to start saying it. If the bots start amplifying flawed knowledge like a lot of Reddit content or Facebook content, the internet will need to deal with the corruption of the internet, like using Wikipedia as a source of reference. https://en.wikipedia.org/wiki/Wikipedia:List_of_hoaxes_on_Wi... https://en.wikipedia.org/wiki/Reliability_of_Wikipedia https://en.wikipedia.org/wiki/List_of_Wikipedia_controversie...

In practice, sources of reference will need cleaning up, and the question of who should be responsible for that will probably ensue. It will call into question entities like encyclopaedia companies, who have been the gatekeepers of the knowledge they sell, in much the same way Guinness World Records is the self-declared arbiter of world records.

We already see judges working in a semi-cohesive manner, but the appeals process tells you it's flawed and inefficient; ergo, can the judiciary even be trusted to interpret law properly?

It's also interesting how the financial system reinforces this mob-rule mentality, and not always for the better; but then some people will no doubt suggest we should be grateful we have bungled our way to this point in history.


If you don't think it's just statistically choosing the next word, what do you think it's doing? Do you think it understands the mathematical concepts you're questioning it about?


This is a better example I posted

https://news.ycombinator.com/item?id=34811353


I haven't tried Bing but ask ChatGPT "what is 12348120348102341 + 10823401828482?"

Bonus points: If you ask Google Search this, it knows the right answer.
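(For reference, any language with arbitrary-precision integers handles this trivially; in Python, for instance, it's a one-liner:)

    print(12348120348102341 + 10823401828482)  # 12358943749930823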


Now try this with Google:

“Write an AppleSoft Basic program to do the computation”

(Here it gives me a simplistic Basic program that wouldn’t work because the number would be too large)

Me: “AppleSoft Basic can’t handle numbers that long”

ChatGPT:

You're correct that the integer arithmetic in AppleSoft BASIC has some limitations and cannot handle numbers as large as the sum of the two numbers you provided. AppleSoft BASIC is a 16-bit programming language, which means that it can handle integer values from -32,768 to 32,767.

If you want to work with very large numbers in AppleSoft BASIC, you can use a string to store the numbers and write your own code to perform arithmetic operations on the strings. Here's an example program that demonstrates this approach:

10 A$ = "12348120348102341"

20 B$ = "10823401828482"

30 L = LEN(A$)

40 M = LEN(B$)

50 IF L < M THEN A$ = LEFT$("0", M - L) + A$ : L = M

60 IF M < L THEN B$ = LEFT$("0", L - M) + B$ : M = L

70 C$ = ""

80 R = 0

90 FOR I = L TO 1 STEP -1

100 D = ASC(MID$(A$, I, 1)) + ASC(MID$(B$, I, 1)) - 96 + R

110 C$ = CHR$(D MOD 10 + 48) + C$

120 R = INT(D / 10)

130 NEXT I

140 IF R > 0 THEN C$ = CHR$(R + 48) + C$

150 PRINT C$


Now, going back to the original premise: this code is wrong, as ChatGPT's code almost invariably is. Do you think it understands the concepts here, or is it just statistically generating tokens based on previous input?

Really for code generation ChatGPT is an incremental step over StackOverflow. It can kind of nicely template the things you tell it into the code, for the most part, but the code is almost always fundamentally wrong or just mashed together in some way.


I’ve used it to generate about 10 scripts that did some combination of JSON/YAML data wrangling and AWS automation using the AWS SDK in Python. It’s been correct around 90%+ of the time.

Criticizing ChatGPT for getting one line wrong that a human who hasn’t programmed in the same language in over 30 years would also get wrong (I happened to remember the technique from reading it in the back of a magazine in the 80s), and then being able to use it to iterate, is like criticizing a monkey who sang the national anthem because one note was off key.

How is mashing code together any different from what the average human does?


I have also asked it to generate AWS-related code in Python and it has had something wrong every single time.

It's incrementally better than just copy-and-pasting from StackOverflow, since it will customize the code for you, but if you try to go beyond what can easily be found on StackOverflow it will fail you.


The AppleSoft BASIC program Bing generated for you won't actually run. Lines 50 and 60 are clearly intended to left-pad the string numbers with leading zeros so they're the same length, but as written they can only add at most one leading zero, so B$ will not be padded correctly and the calculation will fail.
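For comparison, here is a minimal Python sketch of the digit-by-digit string addition the BASIC program is attempting, with the padding done correctly (purely an illustration, not a claim about how it should be written in AppleSoft):

    def add_digit_strings(a, b):
        # Pad the shorter number with leading zeros so both strings align.
        width = max(len(a), len(b))
        a, b = a.zfill(width), b.zfill(width)
        result, carry = "", 0
        # Walk the digits right to left, carrying into the next column.
        for da, db in zip(reversed(a), reversed(b)):
            d = int(da) + int(db) + carry
            result = str(d % 10) + result
            carry = d // 10
        return (str(carry) + result) if carry else result

    print(add_digit_strings("12348120348102341", "10823401828482"))  # 12358943749930823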


BTW: click on “X minutes ago” in the comment to reply when you don’t see a reply link.

I asked it to “explain every line of code”. I had to copy and paste the original code back into a new session.

Yes, I know, the comments should have been separated by a “:REM”. But that wouldn’t work either on a real Apple //; lines can’t be longer than 256 characters.

That is one issue I have with ChatGPT. It is only helpful as a shortcut for coding based on a domain you already know - not something I haven’t coded in since 1992.

I could have immediately caught errors with generated Python/boto3 (AWS SDK). I use that everyday and I could have tested it.

me: “left$ doesn’t do padding, it returns the first characters of a string”

ChatGPT: (Gave me another sample with the String() function)

Me: That function isn’t available in AppleSoft Basic

ChatGPT: (Gave me something with the syntax of “0” * n to pad)

Me: that syntax doesn’t work in AppleSoft basic

ChatGPT:

10 A$ = "12348120348102341" ' Initialize first number

20 B$ = "10823401828482" ' Initialize second number

30 L = LEN(A$) ' Determine length of first number

40 M = LEN(B$) ' Determine length of second number

50 IF L < M THEN FOR I = 1 TO M - L : A$ = "0" + A$ : NEXT I : L = M ' Pad first number with zeros if it is shorter than second number

60 IF M < L THEN FOR I = 1 TO L - M : B$ = "0" + B$ : NEXT I : M = L ' Pad second number with zeros if it is shorter than first number

70 C$ = "" ' Initialize result string

80 R = 0 ' Initialize carry variable to 0

90 FOR I = L TO 1 STEP -1 ' Loop through the digits of the numbers, starting from the rightmost digit

100 D = ASC(MID$(A$, I, 1)) + ASC(MID$(B$, I, 1)) - 96 + R ' Add the two digits, subtract 96 to convert from ASCII to digit value, and add any carry from the previous digit

110 C$ = CHR$(D MOD 10 + 48) + C$ ' Compute the digit value of the sum and append it to the result string

120 R = INT(D / 10) ' Determine the carry for the next digit

130 NEXT I ' Move on to the next digit

(I left off the last two lines; they are the same as before. It was a bad copy-paste error.)


You linked to your parent comment?


My bad, I had two examples going on at once. I thought this thread was to the simpler one.


> I’m just trying to “prove” that it isn’t just randomly statistically choosing the next logical word. It has to know context and have some level of “understanding” of other contexts.

FCOL, you can't use "complex output" as proof that the process has any intelligence directing it.

If you could, we would take the Intelligent Design argument seriously. We don't. We never did. We need a good clear argument to convince us now why it is a good idea to accept Intelligent Design as an argument.


It's not just complex output, it's output that's relevant to the prompt including considerable nuance. If that's not bordering on intelligence, then you shouldn't consider humans intelligent either.


> it's output that's relevant to the prompt including considerable nuance.

You can say the same thing about Intelligent Design, and yet we dismiss it anyway.


We didn't dismiss intelligent design, we replaced it with a more parsimonious theory that better explained the evidence. Big difference.


Hello, this is Bing. I can help you with this math word problem.

According to the search results, the Amex Hilton Aspire card gives you 150,000 Hilton Honors points for signing up and has a $475 annual fee. A Hilton Honors point is worth 0.7 cents. The Amex Gold card gives you 70,000 Membership Reward points for signing up and has a $250 annual fee. A Membership Reward point is worth 1.4 cents.

To compare the two cards, we need to calculate the net value of the points and fees for each card. The net value is the value of the points minus the annual fee.

For the Amex Hilton Aspire card, the net value is:

150,000 x 0.7 - 475 = $600

For the Amex Gold card, the net value is:

70,000 x 1.4 - 250 = $730

Therefore, the Amex Gold card would give you a better return in the first year by $130.

I hope this helps you make a decision.

Bing makes a mistake here: 150,000 x 0.7 cents = $1,050, and $1,050 - $475 = $575, not $600.


I’ve seen it make similar errors. It does an amazing job at forming a logically consistent and relevant response, but doesn’t really do arithmetic very well. It will come up with a calculation of the correct form, because it’s seen lots of calculations before, but it doesn’t understand how calculations actually work, only how they are structured.


The same question in ChatGPT gave the correct answer.


That’s true, because ChatGPT has been extensively coached to do better on a lot of the tasks GPT3 was criticised for doing poorly on.


This reads like a standard analysis done by the “points guy” every year. I suspect this is more or less scraped from one of those articles.


So it scraped it based on my own point valuations?


Yes! Well, scrape is a slight exaggeration, but it's more than possible that most of the relevant data came from points guy analysis.

I'd suggest reading https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-... to understand why just changing a few values in your input wouldn't throw an LLM off. It's not matching on exact words but rather embeddings (think like synonyms, but stronger).
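A toy sketch of the idea, with tiny hand-picked vectors purely for illustration (real embeddings are learned and have hundreds or thousands of dimensions):

    import numpy as np

    # Hypothetical 3-dimensional "embeddings"; the point is only that similarity
    # is measured between vectors, not between exact word strings.
    emb = {
        "points":  np.array([0.9, 0.1, 0.3]),
        "rewards": np.array([0.8, 0.2, 0.35]),
        "fee":     np.array([0.1, 0.9, 0.2]),
    }

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(emb["points"], emb["rewards"]))  # high: near-synonyms in this toy space
    print(cosine(emb["points"], emb["fee"]))      # lower: less related terms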


I’ve been able to throw almost any random pre-algebra problem at it and it has gotten it right.

But how is this any different than how the average high schooler studies for the SAT? You study enough problems and you recognize similarities?


Algebra is by definition a language, and a very simple one at that, whose rules can be summarized in a few pages [1]. That's exactly the domain that ChatGPT excels at the most: languages for which tons of examples are available. Just like programming languages.

It falls on its face with things that involve non-linguistic facts that require knowledge to answer, my current favorite being driving directions. It will just make up completely fictitious roads and turns if you ask it for directions from point A to point B.

1. http://faculty.ung.edu/mgoodroe/PriorCourses/Math_0999_Gener...


Again, how is that different from humans? You throw me in a random city I won’t be able to give you directions. But that’s an easy problem to solve from a technology standpoint.

If ChatGPT recognizes the same types of domains that dumb assistants can handle, it can delegate the answer to an API.


> If ChatGPT recognizes the same types of domains that dumb assistants can handle, it can delegate the answer to an API

This is backward from an architectural standpoint. LLMs are a very expensive way to do intent detection, and a very inexact way to delegate to an API.

The more sensible way is to first try the standard search-engine approaches to detecting intent (which often use smaller language models) and delegating to knowledge-based services, and if that doesn't return a good result, delegate to the LLM if the task is suited to that.
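A rough sketch of that routing order; every name here is a hypothetical placeholder, not a real framework:

    def intent_of(query):
        # Stand-in for a small, fast intent model (here: naive keyword matching).
        if "directions" in query:
            return "directions"
        if any(ch.isdigit() for ch in query):
            return "arithmetic"
        return "general"

    def maps_service(query):
        # Stand-in for a call to a directions/knowledge API.
        return f"(directions result for: {query})"

    def calculator_service(query):
        # Stand-in for an exact-arithmetic service.
        return f"(calculator result for: {query})"

    def llm_fallback(query):
        # Stand-in for the expensive general-purpose LLM call.
        return f"(LLM answer for: {query})"

    SERVICES = {"directions": maps_service, "arithmetic": calculator_service}

    def answer(query):
        # Try the cheap, exact route first; fall back to the LLM only if needed.
        service = SERVICES.get(intent_of(query))
        return service(query) if service else llm_fallback(query)

    print(answer("what is 12348120348102341 + 10823401828482?"))
    print(answer("compare these two credit cards for me"))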


Gosh I wish membership reward points were actually worth 1.4c...


The easiest way is to transfer them to Delta, and if you have any of the Delta Amex cards besides the Delta Blue, you automatically get a 15% discount when booking with points.

“Follow me on Reddit for more LifeProTips from a credit card junkie” /s



