Is Chess a Language ? Lets Test.
In my last post (https://firstname.lastname@example.org/what-do-you-speak-chess-8620905dd18a), I proposed the idea of considering chess moves of a game as a natural language. This time, I’ll describe my process of testing the hypothesis. To test, I solve a problem; stated as follows,
Given a list of chess moves of a game, can the result of the game be predicted ?
Input: Nf3 d5 d4 Nf6 c4 c6 Nc3 dxc4 a4 Bf5 e3 e6 Bxc4 Bb4 O-O…
Output: 1 or 0 (1: Win, 0:Loss)
The form of this problem is similar to sentiment analysis, so I use the same code here.
The first step is to mine the data. I used the data from http://ccrl.chessdom.com/ccrl/
The first step was the mine the data. PGN files are well structured, so extracting sequences wasn’t very difficult. Lines either start with ‘[’, ‘\n’or a number. Using this idea.. I wrote the following code
Then I separate the game and its outcomes and remove all unnecessary details from moves of a game. The game string is converted into a list of move(similar to words in Natural language). I’ve only considered games that results in either a win or a loss. This makes the outcome binary.
Now we have the game sequence and it’s corresponding outcome.
If you notice games ‘0’ ,‘1’ and ‘3’, ‘4’ looks similar. It’s because many chess generally starts with the same sequence of steps. You can consider it as a ‘greeting’ part of a conversation.
The next step is pre processing.
First, convert the string tokens to numeral form. The next step is to convert all the chess sequences of constant length which allows parallel operations in a batch making the training faster. This raised many red flags for me. But I tested with different game length and with both post and pre padding to test if the outcomes are specific to a game state(open, end). While the outcome is not so much dependent on the post/pre padding but (obviously) dependent on the length of game that is taken into consideration.
The first operation converts the moves into numerals, and the second converts the all games to a fixed length.
The next step to partition data into training, test and validation set.
The validation set will help to detect any over fitting between epochs. However while training, I saw that the the data would over fit very easily. Progressing just to the 2nd epoch results in a spike in validation set loss value (I’m not sure of the exact cause, but I think chess rules are much more simpler to learn than rules of natural language).
This takes care of all the pre processing, we now have everything to train a model.
Since the data is sequential, in order to learn the temporal dependencies, I use recurrent LSTM networks to model the data.
As for the depth of the network and width of each layer, I simply determined by playing around with different configurations until the model started learning properly.
Time to fit the model onto the data.
The best accuracy that I got so far was 87.53 % after training on a data set of 392430 games for 1 epoch. I chipped the games at 100 moves. The network consisted of 1 embedding layer(32), and 4 hidden layers with LSTM nodes. (512, 512, 256, 256) and 1 node at output layer (sigmoid)
Observations & Conclusion:
- The result suggests that the language models can be applied to chess sequences. Next up: Generative language model on chess.
- The length of game considered is directly related to the accuracy of the model. This model can be used to give real time prediction of a game using the probability generated in the output layer.
My end goal is to generate valid moves but the model itself can be used in a chess engine.
In a standard chess engine, the game tree is generated and the leaf nodes are evaluated using a board evaluation function. This function is a linear combination of heuristic values for different attributes of a board.
Instead of employing the this function, the model can be used to score the leaf nodes, by using the sequence of steps till the leaf node in the game tree and continue on with Alpha-Beta. Here the scores represent the probability of winning.
In my next post, I will try to generate/predict moves.
Until next time. If you have any questions, comments or suggestions, leave them below.