Questions about multi-GPU training & reinforcement learning #11
Comments
Thanks for the interest 🙏
For example, you could keep a buffer list in the dataset class: each `__getitem__` call pops one sample from the buffer, and when the buffer is empty you parse another game log and push its samples into the buffer?
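A minimal sketch of that buffer idea, assuming a PyTorch map-style Dataset; the names `log_paths` and `parse_log_to_samples` are placeholders, not this repo's API:

```python
import random
from torch.utils.data import Dataset

class BufferedLogDataset(Dataset):
    """Sketch of the buffer idea: one xml log expands into many training
    samples, which are cached in a buffer and returned one at a time."""

    def __init__(self, log_paths, parse_log_to_samples, samples_per_epoch):
        self.log_paths = log_paths                  # list of xml replay files
        self.parse_log = parse_log_to_samples       # assumed parser: path -> list of samples
        self.samples_per_epoch = samples_per_epoch  # nominal epoch length
        self.buffer = []

    def __len__(self):
        return self.samples_per_epoch

    def __getitem__(self, idx):
        # Refill the buffer from a randomly chosen log when it runs dry;
        # the index argument is intentionally ignored.
        while not self.buffer:
            path = random.choice(self.log_paths)
            self.buffer.extend(self.parse_log(path))
        return self.buffer.pop()
```

Note that because `__getitem__` ignores the index and picks logs at random, the epoch length is just a counter, and with `num_workers > 0` each DataLoader worker keeps its own buffer.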
Got it, thank you~
You could try supervised learning first~ I used 80,000 hanchan games and the results are already fairly good. If you have multiple GPUs, you could try increasing the amount of data 😇
Will do. Actually, the discard model and the riichi model are already trained. I downloaded the logs from all past years, but parsing many of the xml files still hits bugs and some records look odd, so I added a few conditional checks to filter those out. Each epoch used roughly 100,000 randomly sampled hanchan games, trained for 10 epochs; I haven't evaluated the results yet.
Since you've fixed it, give it a try later; if there are no problems, you could open a PR →_→
If reinforcement learning is only for training, you naturally don't need the frontend code, and in the backend you can also ignore the socket communication and such (I really can't write games; what I wrote is a pile of spaghetti 💩).
I want to train with DDP, which seems to require changing the Dataset class, but taking the discard model as an example, one xml file corresponds to many data samples. What is a good way to modify this?
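One way to wire this up, as a sketch only (the names `FlatLogDataset`, `parse_log_to_samples`, and `train_ddp` are assumptions, not this repo's code): pre-expand every xml file into a flat (file, sample) index so the dataset has a fixed length, then let `DistributedSampler` shard those indices across ranks. This is an alternative to the buffer approach sketched above.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import Dataset, DataLoader, DistributedSampler

class FlatLogDataset(Dataset):
    """Expand each xml log into (path, local_idx) pairs up front so that
    __len__ is fixed and DistributedSampler can shard samples across ranks."""

    def __init__(self, log_paths, parse_log_to_samples):
        self.parse_log = parse_log_to_samples     # assumed parser: path -> list of samples
        self.index = []                           # flat (path, position-in-log) index
        self._cache = {}                          # parsed logs, cached per worker
        for path in log_paths:
            n = len(parse_log_to_samples(path))   # one counting pass; could be made cheaper
            self.index.extend((path, i) for i in range(n))

    def __len__(self):
        return len(self.index)

    def __getitem__(self, idx):
        path, local_idx = self.index[idx]
        if path not in self._cache:               # parse each log at most once per worker
            self._cache[path] = self.parse_log(path)
        return self._cache[path][local_idx]

def train_ddp(model, dataset, epochs, batch_size):
    dist.init_process_group("nccl")               # launched via torchrun
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])
    sampler = DistributedSampler(dataset)         # shards sample indices per rank
    loader = DataLoader(dataset, batch_size=batch_size,
                        sampler=sampler, num_workers=4)
    for epoch in range(epochs):
        sampler.set_epoch(epoch)                  # reshuffle the shards each epoch
        for batch in loader:
            ...                                   # forward / loss / backward / step as usual
```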
Also, the repo doesn't currently provide self-play reinforcement learning training code, right? If I were to add that part, what logic would need to be added?
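As the thread notes, the repo does not ship self-play RL code, so the following is only a hedged skeleton of what that logic might look like: four copies of the policy play out a hanchan, per-seat log-probabilities are collected, and a REINFORCE-style update is applied to the final placement reward. The environment API (`reset`, `step`, `current_seat`, `final_rewards`) is entirely hypothetical.

```python
import torch

def self_play_update(policy, optimizer, env, games_per_batch=16):
    """Hypothetical self-play policy-gradient (REINFORCE) step; the env
    interface here is an assumption, not part of this repo."""
    log_probs, returns = [], []
    for _ in range(games_per_batch):
        seat_log_probs = [[] for _ in range(4)]    # one trajectory per seat
        obs = env.reset()
        done = False
        while not done:
            seat = env.current_seat()
            logits = policy(obs[seat])
            action_dist = torch.distributions.Categorical(logits=logits)
            action = action_dist.sample()
            seat_log_probs[seat].append(action_dist.log_prob(action))
            obs, done = env.step(action.item())
        rewards = env.final_rewards()              # e.g. placement-based score per seat
        for seat in range(4):
            for lp in seat_log_probs[seat]:
                log_probs.append(lp)
                returns.append(rewards[seat])      # undiscounted return for every step
    returns = torch.tensor(returns, dtype=torch.float32)
    loss = -(torch.stack(log_probs) * returns).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Since RL is training-only, the frontend and socket layers mentioned above can be bypassed and the policy can drive the game state directly.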