We encourage friendly collaboration with colleagues, partners from abroad and students.

  • Henrik Kragh Sørensen, PI, professor, DSE
  • Mikkel Willum Johansen, associate professor, DSE
  • Josefine Pallavicini, MA, research assistant, DSE
  • Anton Kristian Suhr, MSc, research assistant, DSE
  • Mikkel Tvorup Moseholm, MA, DSE
  • Chris Søndergaard Gassner Nielsen, MSc student, MATH/DSE
  • Kristoffer Rank Rasmussen, MSc student, MATH/DSE
  • Sophie Kjeldbjerg Mathiasen, MSc
  • Laura Søvsø Thomasen, PhD, Royal Danish Library


We collaborate on international, interdisciplinary projects.

  • Hester Breman (Maastricht) and Renee Hoekzema (Oxford) on visual thinking in mathematics
  • Vincent Coumans (Nijmegen) and Ned Wontner (Amsterdam) on evaluations of definitions in mathematics
  • Irida Altman (Zürich) on evolution of Heegaard diagrams

Data sources

We can integrate data from a large number of sources:

  • Metadata, PDF files and LaTeX sources from arXiv
  • PDF files of research publications
  • Threads from lists on StackExchange
  • Threads in mailing lists such as FOM
  • Reviews from Mathematical Reviews (MathSciNet)
  • Public Twitter feeds
  • Publication networks from Clarivate Web of Science


We work with a number of corpora with associated pipelines:

  • A pipeline for detecting and measuring mathematical diagrams in PDF files
  • A pipeline for extracting context-structured text from arXiv LaTeX sources
  • A pipeline for accessing metadata and reviews from the Mathematical Reviews
  • A pipeline for analysing 'threaded corpora' such as MathOverflow, mailing lists, etc.
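
To illustrate what a 'threaded corpus' involves, here is a minimal sketch, not our actual pipeline. It assumes each message carries an identifier and the identifier of the message it replies to; the record layout and the function names (build_threads, thread_size, thread_depth) are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical message records: (message_id, parent_id, author).
# parent_id is None for a message that starts a new thread.
messages = [
    ("m1", None, "alice"),
    ("m2", "m1", "bob"),
    ("m3", "m1", "carol"),
    ("m4", "m2", "alice"),
    ("m5", None, "dave"),
]


def build_threads(records):
    """Return the thread-starting messages and a parent -> replies map."""
    children = defaultdict(list)
    roots = []
    for message_id, parent_id, _author in records:
        if parent_id is None:
            roots.append(message_id)
        else:
            children[parent_id].append(message_id)
    return roots, children


def thread_size(message_id, children):
    """Number of messages in the subtree rooted at message_id."""
    replies = children.get(message_id, [])
    return 1 + sum(thread_size(reply, children) for reply in replies)


def thread_depth(message_id, children):
    """Length of the longest reply chain starting at message_id."""
    replies = children.get(message_id, [])
    if not replies:
        return 1
    return 1 + max(thread_depth(reply, children) for reply in replies)


roots, children = build_threads(messages)
summary = {
    root: (thread_size(root, children), thread_depth(root, children))
    for root in roots
}
print(summary)  # {'m1': (4, 3), 'm5': (1, 1)}
```

Size and depth per thread are only two of many possible measures; the same parent-to-replies map could also support questions about who replies to whom.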


We deploy a variety of big-data and ML tools, including:

  • Object detection
  • POS-tagging
  • Dimension reduction (UMAP, PCA)
  • Topic modeling
  • Sentiment analysis
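
As an illustration of dimension reduction, the following sketch computes the first principal component of a tiny data set using only the Python standard library. It is a toy example under stated assumptions, not the code we run: the data points and the function names (mean_center, covariance, first_component) are invented here, and in practice an established library implementation would normally be used instead of hand-written power iteration.

```python
import math


def mean_center(rows):
    """Subtract the column means so every feature has mean zero."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n = len(centered)
    dims = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_component(cov, iterations=100):
    """Approximate the leading eigenvector of cov by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    dims = len(cov)
    vector = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        product = [
            sum(cov[i][j] * vector[j] for j in range(dims))
            for i in range(dims)
        ]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # zero covariance: no direction of variance to find
        vector = [x / norm for x in product]
    return vector


# Hypothetical toy data: three points lying on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
centered = mean_center(points)
component = first_component(covariance(centered))

# Project each point onto the first component: 2 dimensions become 1.
scores = [sum(a * b for a, b in zip(row, component)) for row in centered]
print(component, scores)
```

Because the toy points lie exactly on a line, a single component captures all of their variance; real feature vectors typically need several components, or a non-linear method such as UMAP.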


Most of our code is written in Python, and we rely on a set of key libraries for LaTeX and XML parsing, NLP, neural networks, statistical analysis, documentation, etc.


Upcoming events


Recent publications

Sørensen, Henrik Kragh, and Mikkel Willum Johansen. 2020. “Counting Mathematical Diagrams with Machine Learning.” In Diagrammatic Representation and Inference: 11th International Conference, Diagrams 2020, Tallinn, Estonia, August 24–28, 2020, edited by Ahti-Veikko Pietarinen, Peter Chapman, Leonie Bosveld-de Smet, Valeria Giardino, James Corter, and Sven Linker, 26–33. Lecture Notes in Computer Science (LNAI) 12169. Springer.

If you are interested in our work, please do not hesitate to contact us.