Comments Page - AlphaProof's Greatest Hits: the coolest bits from the IMO 2024 solutions

« Back AlphaProof's Greatest Hits: the coolest bits from the IMO 2024 solutionsrishimehta.xyzSubmitted by rishicomplex 5 hours ago

sbierwagen 4 hours ago
More information about the language used in the proofs: https://en.wikipedia.org/wiki/Lean_(proof_assistant)
sincerely 4 hours ago
in the first question, why do they even specify ⌊n⌋ (and ⌊2n⌋ and so on) when n is an integer?
- rishicomplex 4 hours ago
  Alpha need not be an integer, we have to prove that it is
  sincerely 3 hours ago
  Should have read more carefully, thank you!
Robotenomics 3 hours ago
“Only 5/509 participants solved P6”
- nybsjytm 2 hours ago
  This has to come with an asterisk, which is that participants had approximately 90 minutes to work on each problem while AlphaProof computed for three days for each of the ones it solved. Looking at this problem specifically, I think that many participants could have solved P6 without the time limit.
  (I think you should be very skeptical of anyone who hypes AlphaProof without mentioning this - which is not to suggest that there's nothing there to hype)
  auggierose 2 hours ago
  Certainly an interesting information that AlphaProof needed three days. But does it matter for evaluating the importance of this result? No.
  nybsjytm an hour ago
  I agree that the result is important regardless. But the tradeoff of computing time/cost with problem complexity is hugely important to think about. Finding a proof in a formal language is trivially solvable in theory since you just have to search through possible proofs until you find one ending with the desired statement. The whole practical question is how much time it takes.
  Three days per problem is, by many standards, a 'reasonable' amount of time. However there are still unanswered questions, notably that 'three days' is not really meaningful in and of itself. How parallelized was the computation; what was the hardware capacity? And how optimized is AlphaProof for IMO-type problems (problems which, among other things, all have short solutions using elementary tools)? These are standard kinds of critical questions to ask.
  dash2 an hour ago
  Though, if you start solving problems that humans can't or haven't solved, then questions of capacity won't matter much. A speedup in the movement of the maths frontier would be worth many power stations.
  nybsjytm an hour ago
  For some time a 'superhuman math AI' could be useful for company advertising and getting the attention of VCs. But eventually it would be pretty clear that innovative math research, with vanishingly few exceptions, isn't very useful for making revenue. (I am a mathematician and this is meant with nothing but respect for math research.)
wslh 4 hours ago
If you were to bet on solving problems like "P versus NP" using these technologies combined with human augmentation (or vice versa), what would be the provable time horizon for achieving such a solution? I think we should assume that the solution is also expressible in the current language of math/logic.
- uptownfunk 3 hours ago
  The hard part is in the creation of new math to solve these problems not in the use of existing mathematics. So new objects (groups rings fields) etc have to be theorized, their properties understood, and then that new machinery used to crack the existing problems. I think we will get to a place (around 5 years) where AI will be able to solve these problems and create these new objects. I don’t think it’s one of technology I think it’s more financial. Meaning, there isn’t much money to be made doing this (try and justify it for yourself) and so the lack of focus here. I think this is a red herring and there is a gold mine in there some where but it will likely take someone with a lot of cash to fund it out of passion (Vlad Tenev / Harmonic, or Zuck and Meta AI, or the Google / AlphaProof guys) but in the big tech world, they are just a minnow project in a sea of competing initiatives. And so that leaves us at the mercy of open research, which if it is a compute bound problem, is one that may take 10-20 years to crack. I hope I see a solution to RH in my lifetime (and in language that I can understand)
  wslh 3 hours ago
  I understand that a group of motivated individuals, even without significant financial resources, could attempt to tackle these challenges, much like the way free and open-source software (FOSS) is developed. The key ingredients would be motivation and intelligence, as well as a shared passion for advancing mathematics and solving foundational problems.
  uptownfunk an hour ago
  Ok but how do you get around needing a 10k or 100k h100 cluster
  wslh 38 minutes ago
  It is well known that cloud services like Google Cloud subsidizes some projects and we don't even know if in a few years improvements will arise.
- zarzavat 3 hours ago
  Probably a bad example, P vs NP is the most likely of the millennium problems to be unsolvable, so the answer may be "never".
  I'll bet the most technical open problems will be the ones to fall first. What AIs lack in creativity they make up for in ability to absorb a large quantity of technical concepts.
  wslh 3 hours ago
  Thank you for the response. I have a follow-up question: Could these AIs contribute to advancements in resolving the P vs NP problem? I recall that the solution to Fermat’s Last Theorem relied on significant progress in elliptic curves. Could we now say that these AI systems might play a similar role in advancing our understanding of P vs NP?
  ccppurcell 14 minutes ago
  Just my guess as a mathematician. But if LLMs are good for anything it will be for finding surprising connections and applying our existing tools in ways beyond human search. There's a huge space of tools and problems, and human intuition and brute force searching can only go so far. I can imagine that LLMs might start to find combinatorial proofs of topological theorems, maybe even novel theorems. Or vice versa. But I find it difficult to imagine them inventing new tools and objects that are really useful.
- hiddencost 4 hours ago
  No one is focused on those. They're much more focused on more rote problems.
  You might find them used to accelerate research math by helping them with lemmas and checking for errors, and formalizing proofs. That seems realistic in the next couple of years.
  nybsjytm 2 hours ago
  There are some AI guys like Christian Szegedy who predict that AI will be a "superhuman mathematician," solving problems like the Riemann hypothesis, by the end of 2026. I don't take it very seriously, but that kind of prognostication is definitely out there.
nybsjytm 2 hours ago
Why have they still not released a paper aside from a press release? I have to admit I still don't know how auspicious it is that running google hardware for three days apiece was able to find half-page long solutions, given that the promise has always been to solve the Riemann hypothesis with the click of a button. But of course I do recognize that it's a big achievement relative to previous work in automatic theorem proving.
- whatshisface an hour ago
  I don't know why so few people realize this, but by solving any of the problems their performance is superhuman for most reasonable definitions of human.
  Talking about things like solving the Reimman hypothesis in so many years assumes a little too much about the difficulty of problems that we can't even begin to conceive of a solution for. A better question is what can happen when everybody has access to above average reasoning. Our society is structured around avoiding confronting people with difficult questions, except when they are intended to get the answer wrong.
  zeroonetwothree an hour ago
  Well, multiply two large numbers instantly is a superhuman feat a calculator can do. I would hope we are going for a higher bar, like “useful”. Let’s see if this can provide proofs of novel results.
  rowanG077 5 minutes ago
  That superhuman capability of "multiplying two large numbers instantly" has transformed human society like not even plumbing has. I really can't see how this you could not consider this useful.
  whatshisface 27 minutes ago
  The ability to multiply two numbers superhumanly has already transformed society.
  nybsjytm an hour ago
  It's worth emphasizing that it's been possible for years to use an automatic theorem prover to prove novel results. The whole problem is to get novel interesting results.
  GregarianChild an hour ago
  We know that any theorem that is provable at all (in the chosen foundation of mathematics) can be found by patiently enumerating all possible proofs. So, in order to evaluate AlphaProof's achievements, we'd need to know how much of a shortcut AlphaProof achieved. A good proxy for that would be the total energy usage for training and running AlphaProof. A moderate proxy for that would be the number of GPUs / TPUs that were run for 3 days. If it's somebody's laptop, it would be super impressive. If it's 1000s of TPUs, then less so.
  Onavo an hour ago
  > We know that any theorem that is provable at all (in the chosen foundation of mathematics) can be found by patiently enumerating all possible proofs.
  Which computer science theorem is this from?
  kadoban an hour ago
  It's a direct consequence of the format of a proof. They're finite-length sequences of a finite alphabet of symbols, so of course they're enumerable (the same algorithm you use to count works to enumerate them).
  If you want a computer to be able to tell that it found a correct proof once it enumerates it, you need a bit more than that (really just the existence of automated proof checkers is enough for that).
  E_Bfx 16 minutes ago
  I guess it is tautological from the definition of "provable". A theorem is provable by definition if there is a finite well-formulated formula that has the theorem as consequence (https://en.wikipedia.org/wiki/Theorem paragraph theorem in logic)
  zeroonetwothree an hour ago
  It’s just an obvious statement. If a proof exists, you will eventually get to it.
  nybsjytm an hour ago
  > A better question is what can happen when everybody has access to above average reasoning. Our society is structured around avoiding confronting people with difficult questions, except when they are intended to get the answer wrong.
  What does this have to do with a hypothetical automatic theorem prover?
  whatshisface 25 minutes ago
  Logic is pretty much absent from our culture and daily life, but that could be due to its limited supply.
  nybsjytm 18 minutes ago
  Being logical in social life is pretty much completely different from being logical in a mathematical argument, especially in a formal theorem proving environment. (Just try to write any kind of cultural proposition in a formal language!)
- GregarianChild an hour ago
  Google stopped publishing interesting AI work since they had their AI lead taken away by OpenAI, and mostly with tech that was pioneered, but not monetised by Google like transformers.
  I imagine they are under pressure not to make this mistake again.