Terence Tao
@tao@mathstodon.xyz  ·  activity timestamp last week

This experiment (authored by several well-known mathematicians) revives an archaic practice (last seen in the era of Gauss) of posting encrypted proofs before revealing them: https://arxiv.org/abs/2602.05192 . Here, the challenge is to see whether 10 research-level problems (that arose in the course of the authors' research) are amenable to modern AI tools within a fixed time period (until Feb 13).

The problems appear to be out of reach of current "one-shot" AI prompts, but were solved by human domain experts, and presumably a fair fraction would also be solvable by other domain experts equipped with AI tools. They are technical enough that a non-domain-expert would struggle to verify any AI-generated output on these problems, so it seems quite challenging to me to have such a non-expert solve any of these problems, but one could always be surprised. It will be interesting to see if there are any notable outcomes to this experiment by the expiration of the time limit.

arXiv.org

First Proof

To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time.