I wonder how he collected that corpus. Did he collect it in one batch over a short period of time, or in small batches over a long period?
Did he follow the tweets of all English-speaking users or just selected ones? And which users/topics did he follow?