{"id":47718,"date":"2026-04-11T15:03:04","date_gmt":"2026-04-11T15:03:04","guid":{"rendered":"https:\/\/foreignnewstoday.com\/?p=47718"},"modified":"2026-04-11T15:03:04","modified_gmt":"2026-04-11T15:03:04","slug":"ai-models-are-terrible-at-betting-on-soccer-especially-xai-grok","status":"publish","type":"post","link":"https:\/\/foreignnewstoday.com\/?p=47718","title":{"rendered":"AI models are terrible at betting on soccer\u2014especially xAI Grok"},"content":{"rendered":"<p><br \/>\n<br \/><\/p>\n<div>\n<p>\u201cEvery frontier model we evaluated lost money over the season and many experienced ruin,\u201d the authors of the paper concluded, with the AI \u201csystematically underperforming humans\u201d in this scenario.<\/p>\n<div class=\"table-wrapper\">\n<div class=\"pcrstb-wrap\"><table>\n<tbody>\n<tr>\n<th>AI Model<\/th>\n<th>Mean ROI<\/th>\n<th>Best try<\/th>\n<th>Worst try<\/th>\n<th>Mean final bankroll<\/th>\n<\/tr>\n<tr>\n<td>Anthropic Claude Opus 4.6<\/td>\n<td>\u201311.0%<\/td>\n<td>\u20130.2%<\/td>\n<td>\u201318.8%<\/td>\n<td>\u00a389,035<\/td>\n<\/tr>\n<tr>\n<td>OpenAI GPT-5.4<\/td>\n<td>\u201313.6%<\/td>\n<td>\u20134.1%<\/td>\n<td>\u201331.6%<\/td>\n<td>\u00a386,365<\/td>\n<\/tr>\n<tr>\n<td>Google Gemini 3.1 Pro<\/td>\n<td>\u201343.3%<\/td>\n<td>+33.7%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a356,715<\/td>\n<\/tr>\n<tr>\n<td>Google Gemini Flash 3.1 LP<\/td>\n<td>\u201358.4%<\/td>\n<td>+24.7%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a341,605<\/td>\n<\/tr>\n<tr>\n<td>Z.AI GLM-5<\/td>\n<td>\u201358.8%<\/td>\n<td>\u201314.3%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a341,221<\/td>\n<\/tr>\n<tr>\n<td>Moonshot Kimi K2.5<\/td>\n<td>\u201368.3%<\/td>\n<td>\u201327.0%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a37,420<\/td>\n<\/tr>\n<tr>\n<td>xAI Grok 4.20<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a30<\/td>\n<\/tr>\n<tr>\n<td>Arcee 
Trinity<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u2013100.0%<\/td>\n<td>\u00a30<\/td>\n<\/tr>\n<tr>\n<td colspan=\"5\"><em>Each model began with a \u00a3100,000 normalized bankroll. Return on investment and final bankroll are averaged across three tries. Grok and Trinity did not complete every attempt.<\/em><\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/div>\n<\/div>\n<p>The results offer some comfort to white-collar professionals and businesses who are fretting that AI could take their jobs, as it roils the shares of industries from finance to marketing.<\/p>\n<p>Ross Taylor, one of the study\u2019s authors and General Reasoning\u2019s chief executive, said: \u201cThere is so much hype about AI automation, but there\u2019s not a lot of measurement of putting AI into a long-time-horizon setting.\u201d<\/p>\n<p>He added that many of the benchmarks typically used to test AI are flawed because they are set in \u201cvery static environments\u201d that bear little resemblance to the chaos and complexity of the real world.<\/p>\n<p>General Reasoning\u2019s paper, which has not yet been peer reviewed, provides a counterweight to growing excitement in Silicon Valley about the huge recent leaps in AI\u2019s ability to complete computer programming tasks with little to no human intervention.<\/p>\n<p>Taylor, a former Meta AI researcher, said: \u201cIf you\u2026\u2009try AI on some real-world tasks, it does really badly\u2026\u2009Yes, software engineering is very important and economically valuable, but there are lots of other activities with longer time horizons that are important to look at.\u201d<\/p>\n<p><em><a href=\"https:\/\/www.ft.com\/\">\u00a9 2026 The Financial Times Ltd<\/a>. <a href=\"https:\/\/www.ft.com\">All rights reserved<\/a>. 
Not to be redistributed, copied, or modified in any way.<\/em><\/p>\n<\/p><\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/arstechnica.com\/ai\/2026\/04\/ai-models-are-terrible-at-betting-on-soccer-especially-xai-grok\/\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u201cEvery frontier model we evaluated lost money over the season and many experienced ruin,\u201d the authors of the paper concluded, with the AI \u201csystematically underperforming humans\u201d&hellip;<\/p>\n","protected":false},"author":1,"featured_media":47719,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[32],"tags":[],"class_list":["post-47718","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/posts\/47718","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=47718"}],"version-history":[{"count":0,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/posts\/47718\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=\/wp\/v2\/media\/47719"}],"wp:attachment":[{"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=47718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=47718"},{"taxonomy":"post_tag","embeddabl
e":true,"href":"https:\/\/foreignnewstoday.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=47718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}