It’s been a bit over a year, and with all the talk about how ChatGPT and other Large Language Models (LLMs) have improved, I decided to see whether someone asking about True Dungeon would get better answers than before.
In short, ChatGPT-4o is a definite improvement, but the fundamental flaws remain.
When I asked general questions – the sort where the Internet provides plenty of text to mine – it responded well. For example, when I gave it the prompt “I have a question about role-playing games. Please describe for me the game called ‘True Dungeon’” it came back with an easy-to-read, accurate description that I would be happy to give to anyone trying to decide if they want to try TD at a con.
However, when I asked questions of fact that required actual knowledge, it confidently gave very inaccurate answers – what are technically called “hallucinations” – mixed in with correct ones. For example, when I gave it the prompt “In the game ‘True Dungeon’, what are the best tokens for a wizard to use?”, it gave a list of “Budget-friendly wizard token suggestions” that included “Ioun Stone Silver Nugget (affordable spell damage boost)”.
This is the same gap I found a year ago, despite all the advancements, and the reason is simple: under the hood, Large Language Models fundamentally don’t think. They mathematically calculate which words most often appear next to other words in a very large dataset (the larger, the better) and select those. So if there are multiple websites and posts describing what True Dungeon is, it can assemble an accurate answer from that text. By the same mechanism, though, it recommended “Redoubt Helm” for the wizard, presumably because that token name often appeared next to the text “head slot”.
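To make the “words that appear next to other words” idea concrete, here is a toy sketch (nothing like a real LLM’s scale or architecture, and the tiny sample text is made up for illustration): a bigram model that “predicts” the next word purely from co-occurrence counts, with no understanding of whether the result is true.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return whichever word most often followed `word` – pure statistics, no facts."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical training snippet, echoing the example above.
corpus = (
    "the redoubt helm fills the head slot "
    "the fighter fills the head slot with a helm"
)
model = train_bigrams(corpus)
print(predict_next(model, "head"))  # "slot" – because that pairing dominates the data
```

A real LLM predicts over far longer contexts with far more sophistication, but the core failure mode is the same: frequent co-occurrence gets treated as if it were knowledge.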
Just something to know in case you, or someone you meet, asks ChatGPT their True Dungeon questions.