FineWeb2: Adapting Pre-Training Data Processing to Every Language
arxiv.org
Submitted by hynky 2 days ago