Someone's doing it in Spanish:
I'm working on a 400 million word corpus of English tweets from Twitter, as well as 100-200 million from Spanish and Portuguese.
Mark Davies: Corpus Linguistics, BYU