:title: Fixing Sphinx Search
:date: 2025-11-24
:tags: sphinx
:identifier: 20251124T230046
:signature: 1=3

Fixing Sphinx Search
====================

When transitioning over to using denote-style filenames for this site, I
:ref:`changed ` the output filenames generated by Sphinx so that the
corresponding URLs were shorter and more concise.
This unfortunately had the side effect of breaking search, since the search
results were still using the true docnames!

Finding the Problem
-------------------

After playing with the search results page and reading some of the Sphinx code,
I eventually found out the following.

- The search index is stored in a JavaScript file, ``searchindex.js``.
- The search index is mostly incomprehensible to me... *but* there is a field
  called ``docnames`` which contains the true docnames.
- The search index is built using the ``IndexBuilder`` in the ``sphinx.search``
  module.

Implementing a Fix
------------------

.. warning::

   This involves monkey-patching the ``IndexBuilder`` object, which is
   definitely **not** part of Sphinx's public API. You should expect this to
   break eventually!

To fix the search results we "just" have to update the docnames so that they
are consistent with the ones we generated as part of the build.
It took me a few attempts, but I eventually came up with the following:

.. code-block:: python

   from functools import partial

   from sphinx.builders.dirhtml import DirectoryHTMLBuilder


   class DenoteHTMLBuilder(DirectoryHTMLBuilder):

       ...

       def dump_search_index(self):
           # Wrap the indexer's freeze() so the docnames are rewritten
           # before the index is written out.
           if (builder := self.indexer) is not None:
               builder.freeze = partial(rewrite_indexed_docnames, freeze=builder.freeze)

           super().dump_search_index()

Here I'm just overwriting ``freeze``, the method responsible for generating the
data written to ``searchindex.js``, where ``rewrite_indexed_docnames`` is
defined as

.. code-block:: python

   from collections.abc import Callable
   from typing import Any


   def rewrite_indexed_docnames(*, freeze: Callable[[], dict[str, Any]]) -> dict[str, Any]:
       """Rewrite the docnames in the search index so that they align with
       the URLs generated by the DenoteHTMLBuilder."""
       index = freeze()

       docnames = []
       for docname in index["docnames"]:
           # Only documents under content/ have rewritten URLs.
           if not docname.startswith("content/"):
               docnames.append(docname)
               continue

           # Strip only the leading prefix; keep the docname if it does not parse.
           if (record := Record.parse(docname.removeprefix("content/"))) is None:
               docnames.append(docname)
               continue

           docnames.append(record.url)

       index["docnames"] = tuple(docnames)
       return index
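
To see the wrapping trick in isolation, here is a minimal sketch that runs
without Sphinx. ``FakeIndexBuilder`` and ``hypothetical_url`` are stand-ins I
made up for illustration: the first only mimics an object whose ``dump`` calls
``self.freeze()``, and the second is an invented rewrite rule, not the real
``Record.parse`` logic.

.. code-block:: python

   from functools import partial
   from typing import Any, Callable


   class FakeIndexBuilder:
       """Stand-in for the index builder; models only freeze() and dump()."""

       def freeze(self) -> dict[str, Any]:
           return {
               "docnames": (
                   "index",
                   "content/20251124T230046--fixing-sphinx-search__sphinx",
               )
           }

       def dump(self) -> dict[str, Any]:
           # freeze is looked up on the instance, so an instance attribute wins.
           return self.freeze()


   def hypothetical_url(docname: str) -> str:
       """Made-up rule: drop 'content/' and keep only the denote identifier."""
       if not docname.startswith("content/"):
           return docname
       return docname.removeprefix("content/").split("--", 1)[0]


   def rewrite(*, freeze: Callable[[], dict[str, Any]]) -> dict[str, Any]:
       index = freeze()  # call the original, still-bound method
       index["docnames"] = tuple(hypothetical_url(d) for d in index["docnames"])
       return index


   indexer = FakeIndexBuilder()
   # The right-hand side captures the bound method before it is shadowed.
   indexer.freeze = partial(rewrite, freeze=indexer.freeze)

   result = indexer.dump()
   print(result["docnames"])  # ('index', '20251124T230046')

Because the original bound method is captured before the attribute is
replaced, the wrapper can call it without recursing into itself, and nothing
else on the object needs to know that ``freeze`` was swapped out.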