Josh Wolfe @wolfejosh
VC, entrepreneur, fund manager @Lux_Capital; Chair @CiPrep Coney Island Prep; Trustee @SfiScience Santa Fe Inst; CoFounder of Carson, Quinn & Bodhi w/ @ltwolfe
Jul. 13, 2019

1/ The virtue of PLAY—for humans + machines

~10yrs ago psychologists cheered PLAY for learning, exploring, social bonding, stress management

Humans have cheered PLAY to train machines + AI:
checkers, chess, go, jeopardy, video games

—and now poker...

2/ Checkers, chess, go etc are two-player zero-sum games with perfect information and Nash equilibrium strategies. Many multi-player video games + multi-player poker add complexity with HIDDEN INFORMATION.

CMU researchers + FB published in Science on training an AI to beat top human pros at six-player no-limit Texas hold'em

3/ Here is a description of the goal and technique of “Pluribus”, which was trained by self-play against past versions of itself. That is notable because...

4/ The (wise) technique Jeff Bezos advocates in human decision making is “regret minimization”

Most poker AIs also use an algorithm called “counterfactual regret minimization”. This one uses a variant that ran on CPUs instead of GPUs and cost an estimated $144 of compute.
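
To make "regret minimization" concrete, here is a minimal sketch of plain regret matching, the building block that counterfactual regret minimization applies at every decision point. It is not the Pluribus algorithm (the paper uses a Monte Carlo variant on poker); the toy game is rock-paper-scissors, and every name here (`train`, `strategy_from_regrets`, the biased starting regrets) is an assumption made for illustration. Two copies of the learner play each other, each tracking how much it regrets not having played each action, and shifting probability toward the actions it regrets most.

```python
ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[i][j]: payoff to the player choosing action i against action j.
PAYOFF = [
    [0, -1, 1],   # rock:     ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper:    beats rock, ties paper, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper, ties scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def action_values(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent."""
    return [
        sum(PAYOFF[a][b] * q for b, q in enumerate(opponent_strategy))
        for a in range(len(ACTIONS))
    ]


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Self-play with regret matching; returns each player's average strategy.

    Player 0 starts biased toward rock (an illustrative assumption) so the
    run does not trivially sit at the uniform strategy from step one.
    """
    regrets = [list(initial_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for _ in range(iterations):
        # Both players commit to a strategy before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(len(ACTIONS)):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(20000)):
        print(player, [round(p, 3) for p in average])
```

The current strategies keep cycling (rock invites paper, paper invites scissors), but the average strategy of each player drifts toward one third per action, which is the Nash equilibrium of rock-paper-scissors.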

5/ Full paper here:
 https://science.sciencemag.org/content/early/2019/07/10/science.aay2400/tab-pdf 

