From a Smoking Gun to Spent Fuel: Principled Subsampling Methods for Building Big Language Data Corpora from Monitor Corpora

Peer-Reviewed Articles
by jacquehetteltidwell, April 5, 2019