Self Play Collection

Repositories tagged with "self-play"

TCG-style cards with ATK/DEF/SPD stats

UNCOMMON

⭐4.5kHP

◆

📦Normal

★★

alpha-zero-general

suragnair

Jupyter Notebook · alpha-zero · alphago

“A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more”

★ 4.5k · 1.2k forks

GitPedia #446

2/5

UNCOMMON

⭐3.6kHP

◆

🔮Psychic

★★

DI-engine

opendilab

Python · atari · distributed-reinforcement-learning

“OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.”

★ 3.6k · 436 forks

GitPedia #716

2/5

UNCOMMON

⭐1.6kHP

◆

🔮Psychic

★★

LightZero

opendilab

Python · alpha-beta-pruning · alphazero

“[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)”

★ 1.6k · 193 forks

GitPedia #650

2/5

UNCOMMON

⭐1.4kHP

◆

🔮Psychic

★★

DI-star

opendilab

Python · artificial-intelligence · deep-learning

“An artificial intelligence platform for the StarCraft II with large-scale distributed training and grand-master agents.”

★ 1.4k · 125 forks

GitPedia #572

2/5

UNCOMMON

⭐1.2kHP

◆

🔮Psychic

★★

SPIN

uclaml

Python · deep-learning · fine-tuning

“The official implementation of Self-Play Fine-Tuning (SPIN)”

★ 1.2k · 105 forks

GitPedia #704

2/5

UNCOMMON

⭐588HP

◆

🔮Psychic

★★

SPPO

uclaml

Python · deep-learning · fine-tuning

“The official implementation of Self-Play Preference Optimization (SPPO)”

★ 588 · 48 forks

GitPedia #728

2/5

COMMON

⭐364HP

◆

🔮Psychic

★

TimeChamber

inspirai

Python · deep-reinforcement-learning · isaac-gym

“A Massively Parallel Large Scale Self-Play Framework”

★ 364 · 38 forks

GitPedia #067

1/5

COMMON

⭐196HP

◆

🔮Psychic

★

spiral

spiral-rl

Python · large-language-models · multi-agent-reinforcement-learning

“SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning”

★ 196 · 22 forks

GitPedia #203

1/5

COMMON

⭐167HP

◆

⚡Thunder

★

osrs-pvp-reinforcement-learning

Naton1

Java · artificial-intelligence · deep-learning

“Train a neural network to PvP in Old School RuneScape using reinforcement learning.”

★ 167 · 63 forks

GitPedia #056

1/5

COMMON

⭐154HP

◆

📦Normal

★

gym-continuousDoubleAuction

ChuaCheowHuan

Jupyter Notebook · double-auction · financial-engineering

“A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.”

★ 154 · 31 forks

GitPedia #389

1/5

COMMON

⭐103HP

◆

🔮Psychic

★

SSP

Alibaba-Quark

Python · agent · alibaba

“Search Self-Play: Pushing the Frontier of Agent Capability without Supervision”

★ 103 · 8 forks

GitPedia #631

1/5

COMMON

⭐91HP

◆

🔮Psychic

★

alpha-zero

blanyal

Python · alpha-zero · alphago-zero

“AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm" by DeepMind.”

28 forks

GitPedia #422

1/5

COMMON

⭐65HP

◆

🔮Psychic

★

alpha-zero-general

cestpasphoto

Python · alphago · alphago-zero

“A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available”

22 forks

GitPedia #927

1/5

COMMON

⭐58HP

◆

🔮Psychic

★

gym-backgammon

dellalibera

Python · artificial-intelligence · backgammon

“Backgammon OpenAI Gym”

16 forks

GitPedia #567

1/5

COMMON

⭐57HP

◆

🔮Psychic

★

football-paris

seungeunrho

Python · gfootball · kaggle

“The exact codes used by the team "liveinparis" at the kaggle football competition ranked 6th/1141”

12 forks

GitPedia #816

1/5

COMMON

⭐53HP

◆

🔮Psychic

★

MARSHAL

thu-nics

Python · agent · llm

“[ICLR'26] MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs”

3 forks

GitPedia #139

1/5

COMMON

⭐52HP

◆

🔮Psychic

★

td-gammon

dellalibera

Python · artificial-intelligence · backgammon

“TD-Gammon implementation”

12 forks

GitPedia #696

1/5
