Update on what ChatGPT is bad at doing.
OMG ChatGPT o1 is terrible. And 4.5 is not great. I keep going back to 4o. At least it knows how to code.
I've experimented for about a year now with giving ChatGPT tasks. The criticisms below apply to ChatGPT 4o. I recently got access to ChatGPT 4.5, which is better at coding and following instructions (it at least takes the instruction to give steps one at a time seriously, whereas ChatGPT 4o almost always ignores it). Anyway, this is what 4o is bad at:
- Making, interpreting, processing, and editing spreadsheets. You have to convert the spreadsheet to CSV first (losing all formulas and charts). It is OK at interpreting small data sets, e.g. 20-30 simple values such as the amount of money spent per month, with each month listed. It's no good at, for example, creating a Net Present Value spreadsheet/analysis.
- Editing long documents, or even just searching documents for specific errors. Basic requests like checking for double spaces do not work. In one case it saw a poem and decided it was grammatically incorrect and badly capitalised, because it did not recognise that it was a poem. It is OK at suggesting alternative wording, but seems to prefer active-voice, American-style writing. It gets confused by academic passive-voice style and wants to change it.
- Generating table grids with correct contents. I tried asking it to create a Word file with a bingo sheet. Nope. It generates a generic table (in Word), and some of the bingo sheets have duplicate clues (they're supposed to be random and unique). Even though I told it the squares must be square, it ignored that. The only way to achieve this was to ask it to do it in HTML and embed a JavaScript randomiser. That worked well. You can see the result at https://www.ostrowick.co.za/bingo/.
- Interpreting PDFs. It really cannot do it properly, in the end. You have to convert the PDF to plain text and clean it up first by making chapter breaks clear with, say, underscores, and by deleting running headers, footers, and page numbers. Even then it doesn't like long documents. I tried giving it a book to summarise in this manner. Nope: I had to do it chapter by chapter. It doesn't understand pagination or chapter headings. If you ask it for the page numbers of errors, it struggles to give accurate answers. If there's a table or graphic in the PDF, good luck with that.
- Code line numbers. Even if you give it a flat text file with code in it, it cannot tell you exactly what line number a bug is on. It gets within about 10 lines, e.g. if the bug is on line 63 it might tell you the bug is on line 77. I suspect it is expecting UNIX line endings (LF) and gets confused by DOS line endings (CRLF). I'll have to test this theory.
- Generating images. It passes prompts to DALL-E, which is a pretty lousy image generator. It does not generate vectors, so forget vector diagrams with labels, such as flowcharts and corporate graphics for PowerPoints. It is useless at that. It is also heavily censored and won't generate images of politicians. I asked it to generate a satirical cartoon and it flatly refused. Pretty sure satire is fair use.
- Generating PowerPoints. It can generate PowerPoints, but they are very low-quality copy/paste jobs, with bullets mindlessly divided between slides for no strong reason. No branding or background graphics, not even an attempt. No contents page, etc.
- Advanced mathematics. It still makes mathematical errors. My test is to ask for the volume of revolution of a sine wave over 180 degrees. It does not generally give the right answer; it messes around for ages and you have to argue with it.
- Following instructions, specifically the instruction to give steps one at a time. I have to tell it approximately every half hour to give me step-by-step instructions for debugging code or installing complex software. It generally runs ahead and gives 3-20 steps without a pause. The trouble is, it assumes everything will work as planned, announcing at the end of the instructions, "WHY THIS WORKS". So if you want to report a failure at, say, step 2 out of 20, you have to scroll up through screeds of instructions to find step 2 and click on it to reply: "hey, no, it failed at this step with such-and-such error". I also find the way it says "why this works" when it gives an IT solution quite arrogant. It almost never works. I often have to go back and reprimand it for its arrogance.
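For reference, my sine-wave test above has a clean closed-form answer, which is why it makes a good benchmark. Rotating $y = \sin x$ about the $x$-axis over $0 \le x \le \pi$ (180 degrees of the wave) and using the disc method:

```latex
V = \pi \int_{0}^{\pi} \sin^{2} x \, dx
  = \pi \int_{0}^{\pi} \frac{1 - \cos 2x}{2} \, dx
  = \pi \left[ \frac{x}{2} - \frac{\sin 2x}{4} \right]_{0}^{\pi}
  = \pi \cdot \frac{\pi}{2}
  = \frac{\pi^{2}}{2} \approx 4.935
```

A first-year calculus exercise, and 4o still messes around for ages before (sometimes) landing on it.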
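For the record, the Net Present Value analysis I mention above is not hard to do yourself. Here's a minimal Python sketch of the calculation I wanted in the spreadsheet; the cash flows and discount rate are made-up example numbers, not real data:

```python
# Net Present Value: NPV = sum of cashflow_t / (1 + rate)**t, t starting at 0.
# The period-0 entry is the (negative) initial outlay.

def npv(rate, cashflows):
    """Discount each cash flow back to period 0 and sum them."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Example: 1000 initial outlay, then five periods of 300 at 8% per period.
flows = [-1000, 300, 300, 300, 300, 300]
result = npv(0.08, flows)  # roughly 197.8, i.e. the project adds value
```

This is exactly the sort of one-formula-per-row table I wanted 4o to build into a spreadsheet, and it couldn't.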
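The double-space check I mention above is a one-liner with a regex, which is what makes 4o's failure at it so odd. A minimal sketch (the function name is mine):

```python
import re

def find_double_spaces(text):
    """Return (line_number, column) pairs, both 1-based, for every run
    of two or more consecutive spaces in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"  +", line):
            hits.append((lineno, match.start() + 1))
    return hits
```

Exactly the kind of mechanical search-the-document task I'd expect a language model to delegate to code rather than attempt by eye.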
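The logic behind the bingo randomiser above is trivial, which is why the duplicate clues annoyed me so much. A sketch of the core idea in Python (my HTML version does the same thing in JavaScript; the function name and grid size here are just illustrative):

```python
import random

def bingo_card(clues, size=4, seed=None):
    """Pick size*size UNIQUE clues at random and lay them out as a grid.

    random.sample draws without replacement, which is what guarantees
    no duplicates -- the property 4o kept violating."""
    if len(clues) < size * size:
        raise ValueError("need at least size*size distinct clues")
    rng = random.Random(seed)
    picked = rng.sample(clues, size * size)
    return [picked[row * size:(row + 1) * size] for row in range(size)]
```

Sampling without replacement is the whole trick; 4o instead behaved as if it were drawing each square independently, hence the repeats.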
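The PDF clean-up routine I describe above (stripping running headers, footers, and page numbers before feeding the text in) can itself be partly automated. A rough sketch, assuming you already have one text string per page from some PDF-to-text converter; the function name and the idea of passing the known header string are mine:

```python
import re

def clean_page_text(pages, header=None):
    """Join per-page text into one document, dropping a known running
    header and any line that is nothing but a bare page number."""
    kept = []
    for page in pages:
        for line in page.splitlines():
            stripped = line.strip()
            if header and stripped == header:
                continue          # running header repeated on every page
            if re.fullmatch(r"\d+", stripped):
                continue          # bare page number
            kept.append(line)
    return "\n".join(kept)
```

Even with this done, as I say, I still had to feed the book in chapter by chapter.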
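On the line-number theory above: the way I'd test it is to normalise the file's line endings to UNIX before handing it over, and see whether the reported line numbers improve. A small sketch of the normalisation step (function names are mine):

```python
def detect_newline_style(data: bytes) -> str:
    """Guess the line-ending convention: DOS (CRLF), old Mac (CR), or UNIX (LF)."""
    if b"\r\n" in data:
        return "dos"
    if b"\r" in data:
        return "mac"
    return "unix"

def normalize_newlines(data: bytes) -> bytes:
    """Convert DOS and old-Mac line endings to UNIX LF.
    Order matters: CRLF must be replaced before lone CR."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

If 4o's answers get accurate on the normalised file but not the original, that would support the theory; if they stay off by ten-odd lines either way, the problem is elsewhere.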