We encourage friendly collaboration with colleagues, partners from abroad, and students.

  • Henrik Kragh Sørensen, PI, professor, DSE
  • Mikkel Willum Johansen, associate professor, DSE
  • Josefine Pallavicini, MA, research assistant, DSE
  • Anton Kristian Suhr, MSc, research assistant, DSE
  • Mikkel Tvorup Moseholm, MA, DSE
  • Chris Søndergaard Gassner Nielsen, MSc student, MATH/DSE
  • Stefan Gottlieb Kramer, BSc student, DIKU/DSE
  • Cæcilie Bøje Pedersen, BSc student, MATH/DSE
  • Kristoffer Rank Rasmussen, MSc student, MATH/DSE
  • Sophie Kjeldbjerg Mathiasen, MSc
  • Laura Søvsø Thomasen, PhD, Royal Danish Library


We collaborate on international, interdisciplinary projects.

  • Hester Breman (Maastricht) and Renee Hoekzema (Oxford) on visual thinking in mathematics
  • Vincent Coumans (Nijmegen) and Ned Wontner (Amsterdam) on evaluations of definitions in mathematics
  • Irida Altman (Zürich) on evolution of Heegaard diagrams

If you are interested in collaborating, see our Contributor and Authorship guidelines and Contact Us.

Data sources

We can integrate data from a large number of sources:

  • Metadata, PDF files and LaTeX sources from arXiv
  • PDF files of research publications
  • Threads from lists on StackExchange
  • Threads in mailing lists such as FOM
  • Reviews from Mathematical Reviews (MathSciNet)
  • Public Twitter feeds
  • Publication networks from Clarivate Web of Science


We work with a number of corpora with associated pipelines:

  • A pipeline for detecting and measuring mathematical diagrams in PDF files
  • A pipeline for extracting context-structured text from arXiv LaTeX sources
  • A pipeline for accessing metadata and reviews from the Mathematical Reviews
  • A pipeline for analysing 'threaded corpora' such as MathOverflow, mailing lists, etc.
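
As a rough illustration of what a 'threaded corpus' looks like as data, the sketch below rebuilds reply structure from a flat list of messages and measures how deep each reply sits in its thread. It is not our actual pipeline: the `Message` class, its field names, and the sample messages are all hypothetical, and it assumes each message records the id of the message it replies to.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical record for one post in a thread (all field names are assumptions)."""
    msg_id: str
    parent_id: Optional[str]  # None marks the message that starts a thread
    author: str
    text: str


def build_children_index(messages):
    """Map each parent id (None for thread starters) to its direct replies."""
    children = defaultdict(list)
    for message in messages:
        children[message.parent_id].append(message)
    return children


def thread_depths(messages):
    """Return {msg_id: depth}; thread-starting messages have depth 0.

    Messages whose parent is missing from the input are never reached
    and are therefore left out of the result.
    """
    children = build_children_index(messages)
    depths = {}
    # Iterative depth-first walk, starting from every thread starter.
    stack = [(message, 0) for message in children[None]]
    while stack:
        message, depth = stack.pop()
        depths[message.msg_id] = depth
        stack.extend((reply, depth + 1) for reply in children[message.msg_id])
    return depths


# Made-up sample thread: one question, two answers, one comment on an answer.
messages = [
    Message("q1", None, "alice", "Does this diagram count as a proof?"),
    Message("a1", "q1", "bob", "Only with an accompanying argument."),
    Message("c1", "a1", "carol", "Could you give an example?"),
    Message("a2", "q1", "dave", "It depends on the audience."),
]
depths = thread_depths(messages)
```

With reply depth per message in hand, thread-level measures such as maximum depth or replies per participant follow from simple aggregation.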


We deploy a variety of big-data and ML tools, including:

  • Object detection
  • POS-tagging and linguistic features
  • Dimension reduction (UMAP, PCA)
  • Topic modeling
  • Sentiment analysis


Most of our code is written in Python, and we rely on a set of key libraries for LaTeX and XML parsing, NLP, neural networks, statistical analysis, documentation, etc.


Recent publications

Pence, Charles, and Henrik Kragh Sørensen. 2022. “Extending Ourselves? On the Concept and Future of Digital Humanities.” SPSP Newsletter 17 (June).

Johansen, Mikkel Willum, and Josefine Lomholt Pallavicini. 2022. “Entering the Valley of Formalism: Trends and Changes in Mathematicians’ Publication Practice — 1885 to 2015.” Synthese 200 (3).

If you are interested in our work, please do not hesitate to contact us.