I think you're misunderstanding the point this paper is trying to make. They're interested in distinguishing whether AI is capable of solving new math problems or only capable of identifying existing solutions in the literature. Distinguishing these two is difficult, because self-contained math problems that are easy enough for LLMs to address (e.g. minor Erdős problems) may already have been solved as subcomponents of other work, without this being widely known. So when an AI makes progress on such an Erdős problem, we don't know whether it had a new idea or correctly identified an existing but obscure answer. This issue has been dogging the claims of AI solving Erdős problems.
Instead, here you get questions that extremely famous mathematicians (Hairer, Spielman) are telling you (a) are solvable in <5 pages (b) do not have known solutions in the literature. This means that solutions from AI to these problems would perhaps give a clearer signal on what AI is doing, when it works on research math.
I find it unbelievable that the authors couldn't settle this question for themselves, without posting this, simply by asking the AI enough novel questions. I myself have little doubt that they can solve at least some novel questions (of course, similarity of proofs is a spectrum, so it's hard to draw the line at how original they are).
I settle this question for myself every month: I try asking ChatGPT and Gemini for help, but in my domains they fail miserably at anything that looks new. But YMMV; that's just the experience of one professional mathematician.