Deep Q-networks have proven to be an easy-to-implement method for solving control problems in both continuous and large discrete state spaces. The action-specific deep recurrent Q-network (ADRQN) adds an intermediate LSTM layer that summarizes the history of action-observation pairs, giving the agent the memory it needs to cope with partial observability. This post explores a compact PyTorch implementation of the ADRQN, including small-scale experiments on classical control tasks.
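To make the core idea concrete, here is a minimal sketch of an ADRQN-style module in PyTorch. The names (`ADRQN`, `obs_dim`, `n_actions`) and the layer sizes are illustrative assumptions, not the exact architecture from the implementation discussed later; the point is just that the previous action is embedded, concatenated with the observation feature, and fed through an LSTM that carries hidden state across the episode.

```python
# A minimal ADRQN-style sketch (illustrative names and sizes, not the
# exact architecture from this post's implementation).
import torch
import torch.nn as nn

class ADRQN(nn.Module):
    def __init__(self, obs_dim, n_actions, action_embed_dim=16, hidden_dim=128):
        super().__init__()
        # Embed the previous action so it can be paired with the observation.
        self.action_embed = nn.Linear(n_actions, action_embed_dim)
        self.obs_layer = nn.Linear(obs_dim, hidden_dim)
        # The LSTM consumes (observation feature, previous-action embedding)
        # pairs and carries hidden state across time steps, which is what
        # provides memory under partial observability.
        self.lstm = nn.LSTM(hidden_dim + action_embed_dim, hidden_dim,
                            batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs, prev_action_onehot, hidden=None):
        # obs: (batch, seq_len, obs_dim)
        # prev_action_onehot: (batch, seq_len, n_actions)
        a = torch.relu(self.action_embed(prev_action_onehot))
        o = torch.relu(self.obs_layer(obs))
        x = torch.cat([o, a], dim=-1)
        out, hidden = self.lstm(x, hidden)
        q_values = self.q_head(out)  # (batch, seq_len, n_actions)
        return q_values, hidden
```

At decision time the agent calls `forward` one step at a time, threading `hidden` from step to step, so the Q-values depend on the whole action-observation history rather than on the current observation alone.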