How to Generate a Table of Contents in Docx with Python

Tables of contents (TOCs) are an essential part of long documents, providing a quick overview of the document's structure and enabling easy navigation. While most Office software packages offer built-in TOC generation mechanisms, they can be difficult to call from code.

This blog post will show you how to generate a TOC in Docx with Python. We will use the following steps:

  1. Create a TOC placeholder in the Docx document.
  2. Refresh the TOC.

1. Create a TOC placeholder in the Docx document

The first step is to create a TOC placeholder in the Docx document. We will use the python-docx package, which can be installed with pip:

pip install python-docx

The following Python code inserts the placeholder:

from docx.oxml import OxmlElement
from docx.oxml.ns import qn


def insert_toc(d, levels="1-3"):
    """
    Insert a "Table of Contents" placeholder into a Document.

    d: Document object
    levels: string, default "1-3"
        Range of heading levels to include.

    The entries are generated from the headings added with add_heading
    once the field is refreshed.
    """
    # Structured document tag that marks the block as a table of contents
    sdt = OxmlElement('w:sdt')
    sdtpr = OxmlElement('w:sdtPr')
    docpartobj = OxmlElement('w:docPartObj')
    docpartgallery = OxmlElement('w:docPartGallery')
    docpartgallery.set(qn('w:val'), 'Table of Contents')
    docpartunique = OxmlElement('w:docPartUnique')
    docpartunique.set(qn('w:val'), 'true')
    docpartobj.append(docpartgallery)
    docpartobj.append(docpartunique)
    sdtpr.append(docpartobj)
    sdt.append(sdtpr)

    sdtcontent = OxmlElement('w:sdtContent')

    # Title paragraph shown above the TOC
    p = OxmlElement('w:p')
    r = OxmlElement('w:r')
    t = OxmlElement('w:t')
    t.text = 'Contents'
    r.append(t)
    p.append(r)
    sdtcontent.append(p)

    # The TOC field itself: begin, instruction, separate, end
    fldChar = OxmlElement('w:fldChar')
    fldChar.set(qn('w:fldCharType'), 'begin')
    instrText = OxmlElement('w:instrText')
    instrText.set(qn('xml:space'), 'preserve')
    instrText.text = f'TOC \\o "{levels}" \\h \\z \\u'  # levels selects the heading range

    fldChar2 = OxmlElement('w:fldChar')
    fldChar2.set(qn('w:fldCharType'), 'separate')

    fldChar4 = OxmlElement('w:fldChar')
    fldChar4.set(qn('w:fldCharType'), 'end')

    p2 = OxmlElement('w:p')
    r2 = OxmlElement('w:r')
    for child in (fldChar, instrText, fldChar2, fldChar4):
        r2.append(child)
    p2.append(r2)
    sdtcontent.append(p2)
    sdt.append(sdtcontent)

    # w:updateFields is a document-level setting, so it goes into settings.xml;
    # it asks Word to refresh fields when the file is opened
    fldChar3 = OxmlElement('w:updateFields')
    fldChar3.set(qn('w:val'), 'true')
    d.settings.element.append(fldChar3)

    # Insert the TOC at the end of the body, before the section properties
    d._element.body.insert_element_before(sdt, 'w:sectPr')

    return d
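
The instruction string written to instrText in insert_toc decides what the TOC contains. As a rough guide to Word's TOC field switches: \o "1-3" builds entries from headings in that level range, \h turns entries into hyperlinks, \z hides tab leaders and page numbers in Web Layout view, and \u uses the outline level applied to paragraphs. The sketch below assembles the same string from keyword arguments. The name build_toc_instruction and its parameters are hypothetical and not part of python-docx; the function only formats a string.

```python
def build_toc_instruction(levels="1-3", hyperlinks=True,
                          hide_in_web_view=True, use_outline_levels=True):
    """Return a TOC field instruction such as 'TOC \\o "1-3" \\h \\z \\u'.

    Hypothetical helper for illustration; it only formats a string.
    """
    parts = ["TOC", f'\\o "{levels}"']  # \o: heading level range
    if hyperlinks:
        parts.append("\\h")             # \h: entries become hyperlinks
    if hide_in_web_view:
        parts.append("\\z")             # \z: hide page numbers in Web Layout view
    if use_outline_levels:
        parts.append("\\u")             # \u: honor paragraph outline levels
    return " ".join(parts)


if __name__ == "__main__":
    print(build_toc_instruction())                       # default switches
    print(build_toc_instruction("1-2", hyperlinks=False))
```

With the defaults, the result is the same string that insert_toc writes for levels="1-3".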


2. Refresh the TOC

We now have a TOC section in our document, but it's empty. How do we refresh the TOC? There are two ways to do this:

2.1. Method 1: Using a LibreOffice macro

  1. Create Macro Module
REM  *****  BASIC  *****

 Option Explicit

 Sub UpdateIndexes(path As String)
     '''Update indexes, such as for the table of contents'''
     Dim doc As Object
     Dim args()

     doc = StarDesktop.loadComponentFromUrl(convertToUrl(path), "_default", 0, args())

     Dim i As Integer

     With doc ' Only process Writer documents
         If .supportsService("com.sun.star.text.GenericTextDocument") Then
             For i = 0 To .getDocumentIndexes().count - 1
                 ' Rebuild each index (the TOC is one of them)
                 .getDocumentIndexes().getByIndex(i).update()
             Next i
         End If
     End With ' doc

     ' Save the refreshed document in place and close it
     doc.store()
     doc.close(True)

 End Sub ' UpdateIndexes


  2. Import the Macro Module
$ mv ~/.config/libreoffice/4/user/basic ~/basic_backup
$ cp -r basic ~/.config/libreoffice/4/user/
  3. Run the Command
$ soffice --headless "macro:///Standard.YourModuleName.UpdateIndexes(/path/to/file.docx)"
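
Since the rest of the pipeline is in Python, the macro can also be launched from a script. The following is a minimal sketch, assuming soffice is on the PATH and the macro lives in the Standard library under a module named YourModuleName, as in the command above. The names build_macro_command and refresh_toc are hypothetical.

```python
import subprocess


def build_macro_command(doc_path, module="YourModuleName",
                        macro="UpdateIndexes", soffice="soffice"):
    """Build the argv list that runs the Basic macro headlessly."""
    url = f"macro:///Standard.{module}.{macro}({doc_path})"
    return [soffice, "--headless", url]


def refresh_toc(doc_path, **kwargs):
    """Run the macro; raises CalledProcessError if soffice exits non-zero."""
    subprocess.run(build_macro_command(doc_path, **kwargs), check=True)


if __name__ == "__main__":
    # Only prints the command; call refresh_toc(...) to actually run it.
    print(build_macro_command("/path/to/file.docx"))
```

Passing the arguments as a list avoids shell quoting problems. Paths that contain commas or parentheses may still confuse the macro URL, so plain file names are the safer choice.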

2.2. Method 2: Using Unoserver

  1. Pull the Unoserver Docker Image
docker pull chanmo/unoserver
  2. Run the Unoserver container
docker run -p 5000:5000 chanmo/unoserver
  3. Update the TOC Using HTTPie
http -f POST :5000/convert/docx file@/path/to/demo.docx -o demo.docx
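
The same request can be sent from Python without HTTPie. This is a rough sketch using only the standard library. The endpoint /convert/docx and the form field name file are taken from the HTTPie command above, and localhost:5000 from the port mapping of the docker run command. The function names are hypothetical, and the sketch assumes the container returns the converted document as the response body.

```python
import os
import urllib.request
import uuid


def encode_multipart(field, filename, data, boundary=None):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def refresh_via_unoserver(src, dst, url="http://localhost:5000/convert/docx"):
    """POST src to the container and write the response body to dst."""
    with open(src, "rb") as fh:
        body, content_type = encode_multipart(
            "file", os.path.basename(src), fh.read())
    req = urllib.request.Request(url, data=body, method="POST",
                                 headers={"Content-Type": content_type})
    # dst is opened only after the request succeeds, so src == dst is safe
    with urllib.request.urlopen(req) as resp, open(dst, "wb") as out:
        out.write(resp.read())
```

Calling refresh_via_unoserver("/path/to/demo.docx", "demo.docx") mirrors the HTTPie command.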

The disadvantage of the LibreOffice macro method is that LibreOffice must be installed on the server. If you don't want to install it there, use the Unoserver method instead, which runs in a Docker container.
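
Whichever method you choose, it can help to confirm that a .docx actually contains a TOC field before sending it off for a refresh. The sketch below uses only the standard library: it opens the file as a zip archive and looks for field instructions that start with TOC. The name find_toc_instructions is hypothetical, and the check assumes the instruction sits in a single w:instrText element, as it does in the placeholder created in step 1.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_toc_instructions(docx_file):
    """Return the TOC field instructions found in word/document.xml.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    found = []
    for node in root.iter(f"{{{W_NS}}}instrText"):
        text = (node.text or "").strip()
        if text.upper().startswith("TOC"):
            found.append(text)
    return found
```

An empty list means no TOC field was found, so there is nothing to refresh.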