Josh Wolfe @wolfejosh Co-Founder @Lux_Capital Trustee @SfiScience Santa Fe Inst Chair @CiPrep Coney Island Prep (Brooklyn) Co-Founder of Carson, Quinn & Bodhi w/ @ltwolfe Jul. 13, 2019

1/ The virtue of PLAY—for humans + machines

~10yrs ago psychologists cheered PLAY for learning, exploring, social bonding, stress management

Humans have cheered PLAY to train machines + AI:
checkers, chess, Go, Jeopardy!, video games

—and now poker...

2/ Checkers, chess, Go etc. are two-player zero-sum games of perfect information with Nash equilibrium strategies. Many multi-player video games + multi-player poker add complexity: more than two players and HIDDEN INFORMATION.

CMU researchers + FB published in Science on training an AI to beat elite human pros at six-player no-limit Texas hold'em

3/ Here is a description of the goal and technique of “Pluribus”, trained on self-play to beat past versions of itself, which is notable because...

4/ The (wise) technique Jeff Bezos advocates in human decision making is “regret minimization”.

Most poker AIs also use an algorithm called “counterfactual regret minimization” (CFR). This one uses a variation that ran on CPUs instead of GPUs and cost an estimated $144 of compute.
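For a feel of what "regret minimization" means in code, here is a toy sketch of regret matching, the simple update rule that counterfactual regret minimization builds on. It is NOT Pluribus's algorithm and not taken from the paper: the game (rock-paper-scissors), the function names, and the iteration count are all assumptions chosen for illustration. Two copies of the same learner play each other, each tracks how much better every other move would have done, and each then plays moves in proportion to their accumulated positive regret.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors (toy game, an assumption)


def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(a - b) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def train(iterations, seed=0):
    """Self-play between two regret-matching learners; returns average strategies."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            me, opp = moves[p], moves[1 - p]
            actual = payoff(me, opp)
            for a in range(ACTIONS):
                # Regret = what action a would have earned minus what we earned.
                regrets[p][a] += payoff(a, opp) - actual
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(train(50_000)):
        print(player, [round(x, 3) for x in avg])
```

In this toy game the average strategies should drift toward playing each move about a third of the time, which is the Nash equilibrium of rock-paper-scissors. Poker AIs apply the same idea at every decision point of a vastly larger game with hidden cards, which is where the "counterfactual" part comes in.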

5/ Full paper here:
 https://science.sciencemag.org/content/early/2019/07/10/science.aay2400/tab-pdf 

