Threads Taxonomy

Threads Taxonomy: Sub-Tag Treemap & Audit

Page: /glyphary/threads/taxonomy
Source file: src/pages/glyphary/threads/taxonomy.astro
Data source: post-tags.json (tag_distribution, sub_tag_distribution)
Corpus: 37,912 posts classified into 20 primary tags and 35 sub-tags


Overview

The Taxonomy page provides a spatial and tabular decomposition of the 20-tag discourse classification system. Where the main Threads page shows tags in time series, this page freezes time and examines the taxonomic structure itself: how large each category is, how sub-tags nest within parents, and what percentage of posts in each category have been sub-classified.

The page title is "The Taxonomy of Discourse."


A three-tab inline navigation bar is shared across all three sub-pages:

  • Taxonomy (active): this page
  • Network: knowledge graph (/glyphary/threads/network)
  • Discourse: 9-category deep-dive (/glyphary/threads/discourse)

Section 1: Territorial Map

Visualization: Squarified treemap (SVG, 720x480)

The primary visualization. Each of the 20 discourse tags is rendered as a colored rectangle with area proportional to post count. The squarified layout algorithm minimizes aspect ratios so no rectangle is excessively elongated.
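
The layout described above can be sketched as follows. This is a minimal illustration of the squarified approach, not the page's actual frontmatter code: the function names `squarify` and `worstRatio`, the `{ tag, count }` input shape, and the returned rectangle fields are assumptions. It sorts tags descending by count, then greedily grows each row along the shorter side of the remaining space for as long as the worst aspect ratio in that row does not get worse.

```javascript
// Worst aspect ratio among cells if `areas` were laid out in one row
// along a side of length `side` (formula from the squarified treemap paper).
function worstRatio(areas, side) {
  const sum = areas.reduce((a, b) => a + b, 0);
  const max = Math.max(...areas);
  const min = Math.min(...areas);
  const s2 = side * side;
  const sum2 = sum * sum;
  return Math.max((s2 * max) / sum2, sum2 / (s2 * min));
}

// items: [{ tag, count }] (hypothetical shape). Returns [{ tag, x, y, w, h }].
function squarify(items, width, height) {
  const positive = items.filter((it) => it.count > 0);
  const total = positive.reduce((a, it) => a + it.count, 0);
  if (total === 0) return [];
  const scale = (width * height) / total; // count -> pixel area
  const nodes = positive
    .slice()
    .sort((a, b) => b.count - a.count) // descending by count
    .map((it) => ({ tag: it.tag, area: it.count * scale }));

  const rects = [];
  let x = 0, y = 0, w = width, h = height;
  let i = 0;
  while (i < nodes.length) {
    const side = Math.min(w, h); // rows are laid along the shorter side
    const row = [nodes[i].area];
    let j = i + 1;
    // Keep adding cells while the worst aspect ratio does not get worse.
    while (j < nodes.length) {
      const withNext = row.concat(nodes[j].area);
      if (worstRatio(withNext, side) > worstRatio(row, side)) break;
      row.push(nodes[j].area);
      j++;
    }
    const rowArea = row.reduce((a, b) => a + b, 0);
    const thickness = rowArea / side; // depth of the strip
    let offset = 0;
    for (let k = 0; k < row.length; k++) {
      const len = row[k] / thickness; // extent along the short side
      if (w >= h) {
        // Vertical strip on the left edge of the free space.
        rects.push({ tag: nodes[i + k].tag, x, y: y + offset, w: thickness, h: len });
      } else {
        // Horizontal strip on the top edge of the free space.
        rects.push({ tag: nodes[i + k].tag, x: x + offset, y, w: len, h: thickness });
      }
      offset += len;
    }
    // Shrink the free space by the strip just placed.
    if (w >= h) { x += thickness; w -= thickness; }
    else { y += thickness; h -= thickness; }
    i = j;
  }
  return rects;
}
```

Under these assumptions, `squarify(tags, 720, 480)` would return one rectangle per tag, and the rectangle areas sum to the full 720x480 canvas.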

Nested sub-tags: Within each parent rectangle, sub-tag bands are drawn as smaller nested rectangles with higher opacity. Labels show sub-tag names and counts where space permits. Tags without sub-classifications show only the parent block.

Technical details:

  • Custom squarified treemap implementation in the frontmatter (not a library)
  • Tags sorted descending by count before layout
  • Color mapping from the 20-tag TAG_COLORS palette
  • Sub-tag nesting uses a stacked vertical layout within each parent cell
  • Labels truncated with ellipsis when cell width is too narrow
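
The stacked sub-tag layout and label truncation from the list above might look roughly like this. It is a sketch under stated assumptions: the helper names `layoutSubTagBands` and `truncateLabel` are hypothetical, band height is assumed to be each sub-tag's share of the parent's post count, and the 7px average character width is a guess, since the source does not state how text width is estimated.

```javascript
// Stack sub-tag bands top to bottom inside a parent cell.
// cell: { x, y, w, h }; subTags: [{ label, count }] (hypothetical shapes).
// Assumption: band height = share of the parent's post count, so any
// posts without a sub-tag leave the parent color visible at the bottom.
function layoutSubTagBands(cell, parentCount, subTags) {
  let offset = 0;
  return subTags.map((s) => {
    const bandH = parentCount > 0 ? (s.count / parentCount) * cell.h : 0;
    const band = { label: s.label, x: cell.x, y: cell.y + offset, w: cell.w, h: bandH };
    offset += bandH;
    return band;
  });
}

// Truncate a label with an ellipsis when the cell is too narrow.
// charWidth is an assumed average glyph width in pixels.
function truncateLabel(text, cellWidth, charWidth = 7) {
  const maxChars = Math.floor(cellWidth / charWidth);
  if (text.length <= maxChars) return text;
  if (maxChars <= 1) return ""; // no room for even one character plus ellipsis
  return text.slice(0, maxChars - 1) + "…";
}
```

Because a post may receive more than one sub-tag, the bands could in principle exceed the parent height; a real implementation would need to clamp or normalize in that case.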

Section 2: Sub-Classification Distribution

Visualization: Horizontal bar chart (SVG, 700px wide)

All 35 sub-tags rendered as horizontal bars, grouped by parent tag. Each group is separated by a header line showing the parent tag name. Bars show:

  • Sub-tag label (left)
  • Absolute post count (right of bar)
  • Percentage relative to parent tag total (right)

This chart answers the question: within each parent category, how are posts distributed across sub-classifications?

Technical details:

  • Row height: 22px per sub-tag
  • Left padding: 180px for labels
  • Bar width proportional to count, scaled to the maximum across all sub-tags
  • Groups separated by parent tag headers with colored dot indicators
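
The geometry in the list above can be illustrated with a short sketch. The 700px width, 22px row height, and 180px left padding come from this page; the right-hand padding reserved for the count and percentage text, the header row height, and the `layoutBars` name and input shape are assumptions.

```javascript
const CHART_WIDTH = 700; // from the spec above
const ROW_HEIGHT = 22;   // from the spec above
const LABEL_PAD = 180;   // from the spec above
const RIGHT_PAD = 120;   // assumption: room for count and percentage text

// groups: [{ parent, total, subTags: [{ label, count }] }] (hypothetical shape)
function layoutBars(groups) {
  // Bars are scaled to the largest sub-tag count across ALL groups.
  const maxCount = Math.max(1, ...groups.flatMap((g) => g.subTags.map((s) => s.count)));
  const barArea = CHART_WIDTH - LABEL_PAD - RIGHT_PAD;
  const rows = [];
  let y = 0;
  for (const g of groups) {
    // Assumption: a header row uses the same height as a bar row.
    rows.push({ kind: "header", label: g.parent, y });
    y += ROW_HEIGHT;
    for (const s of g.subTags) {
      rows.push({
        kind: "bar",
        label: s.label,
        x: LABEL_PAD,
        y,
        width: (s.count / maxCount) * barArea,
        count: s.count,
        pctOfParent: g.total > 0 ? (s.count / g.total) * 100 : 0,
      });
      y += ROW_HEIGHT;
    }
  }
  return { rows, height: y };
}
```

Scaling to a single global maximum keeps bar lengths comparable across groups, while the percentage column stays relative to each parent.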

Section 3: Classification Audit

Visualization: HTML table with inline bar chart

A coverage audit showing how completely each parent tag has been sub-classified. Columns:

Column | Description
------ | -----------
Category | Parent tag name with colored dot
Total Posts | Total posts in this primary tag
Sub-Tagged | Number of posts that received at least one sub-tag
Sub-Tags | Count of distinct sub-tags defined for this category
Coverage | Percentage of posts sub-classified, with inline bar

Coverage percentage indicates taxonomic resolution: how much of the parent category is broken down into meaningful sub-classifications. Categories with 0% coverage have no sub-tags defined.
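
As a rough illustration of how one audit row could be derived, the sketch below computes coverage as sub-tagged posts divided by total posts. The `auditRow` name, its field names, and the rounding to one decimal place are assumptions, not taken from the page source.

```javascript
// Build one audit-table row (hypothetical field names).
function auditRow(category, totalPosts, subTagged, subTagCount) {
  // Coverage = share of the parent's posts that received at least one sub-tag.
  const raw = totalPosts > 0 ? (subTagged / totalPosts) * 100 : 0;
  return {
    category,
    totalPosts,
    subTagged,
    subTags: subTagCount,
    coverage: Math.round(raw * 10) / 10, // assumption: one decimal place
  };
}
```

For example, a parent with 200 posts of which 50 are sub-tagged would show 25% coverage, and a parent with no sub-tags defined would show 0%.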


Section 4: Sub-Tag Compendium

Visualization: Repeating detail cards (one per parent tag with sub-tags)

For each parent tag that has sub-classifications, a detail card displays:

  1. Header: Parent tag name with colored dot and total post count
  2. Mini bar chart: Percentage distribution of sub-tags within the parent, rendered as a horizontal stacked bar
  3. Sub-tag list: Each sub-tag with:
    • Label (italicized, left-aligned)
    • Absolute count (monospace, right-aligned)
    • Percentage of parent (monospace, right-aligned)

This section provides the detailed breakdown that the treemap and bar chart summarize visually.
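
The mini stacked bar in each card could be produced as a build-time SVG string along these lines. This is only a sketch: the `stackedBarSvg` name, the default 300x10 size, and the scheme of stepping down opacity per segment are assumptions, and the real page may draw segments differently.

```javascript
// Render a horizontal stacked bar as an SVG string.
// subTags: [{ label, count }]; color: the parent tag's color (e.g. from TAG_COLORS).
function stackedBarSvg(subTags, parentTotal, color, width = 300, height = 10) {
  let x = 0;
  const rects = subTags.map((s, i) => {
    // Segment width = sub-tag's share of the parent total.
    const w = parentTotal > 0 ? (s.count / parentTotal) * width : 0;
    // Assumption: distinguish segments by stepping opacity down per index.
    const opacity = Math.max(0.3, 1 - i * 0.15);
    const rect =
      `<rect x="${x.toFixed(1)}" y="0" width="${w.toFixed(1)}" ` +
      `height="${height}" fill="${color}" fill-opacity="${opacity.toFixed(2)}"/>`;
    x += w;
    return rect;
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" ` +
    `viewBox="0 0 ${width} ${height}">${rects.join("")}</svg>`
  );
}
```

Because the function returns plain markup, it can run during the build and needs no client-side script.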


Data Pipeline

The taxonomy data flows from:

threads/posts.json (raw)
  → information-theory.mjs (20-tag LLM classification)
  → sub-classifiers.mjs (9 parents → 35 sub-tags via keyword regex)
  → post-tags.json (tag_distribution + sub_tag_distribution)
  → taxonomy.astro (treemap + bars + audit table)

The 35 sub-tags are generated by sub-classifiers.mjs, which applies keyword regex patterns to 9 parent categories. Not all 20 tags have sub-classifications; only the 9 with defined regex classifiers do.
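
A keyword-regex sub-classifier of the kind described here might be structured as below. Everything specific in this sketch is hypothetical: the parent tag name `technology`, the sub-tag names, the patterns, the `{ tag, text }` post shape, and the `parent/sub` key format are invented for illustration and are not the contents of sub-classifiers.mjs.

```javascript
// Hypothetical rule table: parent tag -> list of { subTag, pattern }.
// Parents without an entry simply receive no sub-tags.
const SUB_CLASSIFIERS = {
  technology: [
    { subTag: "ai", pattern: /\b(ai|machine learning|neural net\w*)\b/i },
    { subTag: "hardware", pattern: /\b(gpu|chip|processor)s?\b/i },
  ],
};

// Return every sub-tag whose pattern matches the post text.
// Patterns have no /g flag, so RegExp.test keeps no state between calls.
function subClassify(primaryTag, text) {
  const rules = SUB_CLASSIFIERS[primaryTag];
  if (!rules) return [];
  return rules.filter((r) => r.pattern.test(text)).map((r) => r.subTag);
}

// Tally sub-tag counts over posts shaped like { tag, text }.
function subTagDistribution(posts) {
  const dist = {};
  for (const post of posts) {
    for (const sub of subClassify(post.tag, post.text)) {
      const key = `${post.tag}/${sub}`;
      dist[key] = (dist[key] || 0) + 1;
    }
  }
  return dist;
}
```

Under this design a single post can match several patterns and so count toward more than one sub-tag, which is consistent with the audit table counting posts that received "at least one" sub-tag.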


Technical Notes

  • All visualizations are pure SVG generated at build time
  • The squarified treemap is a custom implementation, not a D3 or library dependency
  • No client-side JavaScript β€” the page is fully static after build
  • The page reads only post-tags.json (no raw post data needed)