After AlphaZero comes MuZero. What comes after that?

bory.io 2021. 2. 28. 08:50

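AlphaGo	Neural networks + tree search. Knowledge: human data + domain knowledge + known rules
AlphaGo Zero	Learns entirely through self-play, without human knowledge. Knowledge: known rules
AlphaZero	One general algorithm applied to Go, chess, and shogi. Knowledge: known rules
MuZero	Learns a model of the environment that predicts value, policy, and reward. Knowledge: none (the rules themselves are learned)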
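
To make the MuZero row concrete, here is a minimal sketch of a model that predicts value, policy, and reward from a learned hidden state rather than from known rules. It is a toy under stated assumptions, not DeepMind's implementation: the class `ToyMuZeroModel`, the helper `unroll`, the layer sizes, and the random untrained linear weights are all hypothetical, and the naming only loosely mirrors the representation / dynamics / prediction split described in the MuZero paper.

```python
import math
import random
from typing import List, NamedTuple


class NetworkOutput(NamedTuple):
    value: float               # predicted return from this state
    reward: float              # predicted reward for the last action
    policy: List[float]        # predicted action probabilities
    hidden_state: List[float]  # learned internal state (not the real game state)


def _linear(weights, x):
    # Plain matrix-vector product: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def _softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMuZeroModel:
    """Hypothetical toy model with random, untrained weights."""

    def __init__(self, obs_size, hidden_size, num_actions, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.num_actions = num_actions
        self.w_repr = init(hidden_size, obs_size)                    # representation
        self.w_dyn = init(hidden_size, hidden_size + num_actions)    # dynamics
        self.w_reward = init(1, hidden_size + num_actions)           # reward head
        self.w_policy = init(num_actions, hidden_size)               # policy head
        self.w_value = init(1, hidden_size)                          # value head

    def _predict(self, state):
        policy = _softmax(_linear(self.w_policy, state))
        value = math.tanh(_linear(self.w_value, state)[0])
        return policy, value

    def initial_inference(self, observation):
        # Representation: observation -> hidden state, then predict policy and value.
        state = [math.tanh(v) for v in _linear(self.w_repr, observation)]
        policy, value = self._predict(state)
        return NetworkOutput(value, 0.0, policy, state)

    def recurrent_inference(self, hidden_state, action):
        # Dynamics: (hidden state, action) -> next hidden state and reward.
        one_hot = [1.0 if i == action else 0.0 for i in range(self.num_actions)]
        x = list(hidden_state) + one_hot
        next_state = [math.tanh(v) for v in _linear(self.w_dyn, x)]
        reward = _linear(self.w_reward, x)[0]
        policy, value = self._predict(next_state)
        return NetworkOutput(value, reward, policy, next_state)


def unroll(model, observation, actions):
    # Imagine a sequence of actions without ever querying the real environment.
    outputs = [model.initial_inference(observation)]
    for action in actions:
        outputs.append(model.recurrent_inference(outputs[-1].hidden_state, action))
    return outputs


model = ToyMuZeroModel(obs_size=4, hidden_size=8, num_actions=3)
rollout = unroll(model, observation=[0.1, -0.2, 0.3, 0.0], actions=[0, 2, 1])
```

The point of the sketch is that `unroll` never calls a rules engine: after the first observation, every step is produced by the model's own dynamics function. In a real system these functions would be trained neural networks and the rollout would be driven by tree search.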

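Conclusion: Now that the MuZero algorithm has been published, we should prepare for a world in which rules no longer have to be written by hand. Fundamental changes can be expected across all software and hardware.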

Related code: The Arcade Learning Environment is available open source at https://github.com/mgbellemare/Arcade-Learning-Environment. The Go and chess environments are available open source in OpenSpiel at https://github.com/deepmind/open_spiel. The pseudocode for the MuZero algorithm can be found in the file pseudocode.py in the Supplementary Information. All the neural architecture details and hyperparameters are described in Methods.
