Data from the arXiv can be obtained from Kaggle and from AWS S3.

We have extracted a corpus that allows us to mine the structural and textual elements, as well as the metadata, of mathematical publications deposited on the arXiv.
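
To illustrate how mathematical publications might be selected from arXiv metadata, the following is a minimal sketch rather than the extraction pipeline actually used. It assumes the metadata is available as JSON Lines (one JSON record per line) and that each record carries an `id` field and a space-separated `categories` string; these field names, the helper `iter_math_records`, and the sample records are illustrative assumptions.

```python
import json
from typing import Iterable, Iterator


def iter_math_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield metadata records that list at least one math.* category."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: "categories" is a space-separated string, e.g. "math.AG cs.LG".
        categories = record.get("categories", "").split()
        if any(cat.startswith("math.") for cat in categories):
            yield record


# Tiny in-memory sample standing in for a real metadata file (made-up records).
sample = [
    '{"id": "0001", "title": "On schemes", "categories": "math.AG math.NT"}',
    '{"id": "0002", "title": "Deep nets", "categories": "cs.LG"}',
    '{"id": "0003", "title": "Spectral gaps", "categories": "math-ph math.SP"}',
]
math_ids = [rec["id"] for rec in iter_math_records(sample)]
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of `sample`, so a large metadata file is streamed one record at a time instead of being loaded into memory. Whether cross-listed papers or neighbouring categories such as `math-ph` should count as mathematical is a design choice; this sketch keeps any record with at least one `math.` category.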

