The Martin Heidegger Corpora

Texts for Natural Language Processing

Just the text, reduced to sentences. Stripped of footnotes, page headers and footers, translators' and editors' forewords and afterwords, glossaries, indexes, tables of contents, section titles, and other extraneous text.

I wrote some Python tools for creating the Heidegger corpora. The tools create the corpora files below from the latest pages on Removing footnotes remains a manual process. The Python code is on GitHub. The code may be of interest to anyone using the Natural Language Toolkit (NLTK) with Heidegger texts. Let me know.
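For readers curious what "reduced to sentences" might look like in practice, here is a minimal sketch using only the Python standard library. It is not the code from the GitHub repository: the function name, the page-number rule, and the sample text are all assumptions made for illustration, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical helpers for illustration; not taken from the author's tools.
# A line that contains nothing but digits is treated as a page number.
PAGE_NUMBER = re.compile(r"^\s*\d+\s*$")
# Naive boundary: whitespace after . ! or ? that is followed by a capital letter.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def text_to_sentences(raw_text):
    """Drop blank and page-number lines, rejoin the rest, split into sentences."""
    kept = [
        line.strip()
        for line in raw_text.splitlines()
        if line.strip() and not PAGE_NUMBER.match(line)
    ]
    flat = " ".join(kept)
    return [s for s in SENTENCE_END.split(flat) if s]


# Made-up sample page: a sentence broken across lines and a stray page number.
sample = """This is the first sentence.
12
Here is a second one. And a third
that wraps onto the next line?"""

for sentence in text_to_sentences(sample):
    print(sentence)
```

A regular expression like this will mis-split on abbreviations such as "cf." or "i.e.", so for real texts a trained tokenizer such as NLTK's `sent_tokenize` is likely the better choice.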

The texts below are hosted on the Voyant text analysis web site.


Send comments to info at


Created 2021/12/20
Last updated 2022/2/12