Interpreting Results

Interpreting the Animation

Understanding the Visualization

The value at each node is cumulative, so circles grow but never shrink. This design choice lets you track growth over time while maintaining visual stability.
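To see why a cumulative value can only hold steady or grow, here is a minimal sketch. It assumes each prefix starts from a per-year count of newly seen words; the function name and the sample numbers are illustrative, not taken from the project.

```python
from itertools import accumulate

def cumulative_counts(new_words_per_year):
    """Turn per-year counts of newly seen words into running totals."""
    years = sorted(new_words_per_year)
    totals = accumulate(new_words_per_year[year] for year in years)
    return dict(zip(years, totals))

# Hypothetical per-year counts for a single prefix.
per_year = {1800: 5, 1801: 0, 1802: 3}
print(cumulative_counts(per_year))  # {1800: 5, 1801: 5, 1802: 8}
```

A year with no new words leaves the total unchanged, which is why a circle can stay the same size from one frame to the next but never shrinks.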

What You’re Seeing

Each node represents a letter prefix (1-6 letters)
Node size reflects how many words with that prefix have appeared up to that year
Node color encodes depth in the prefix tree
Edges connect a prefix to its parent prefix
Layout is fixed over time, making it easy to track changes
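The node and edge structure described above can be sketched as follows. This assumes that a prefix's parent is the same prefix with its last letter removed; the helper names are hypothetical and only illustrate the idea.

```python
MAX_DEPTH = 6  # the text describes prefixes of 1 to 6 letters

def prefixes(word, max_depth=MAX_DEPTH):
    """Return the prefixes of `word`, from 1 letter up to `max_depth` letters."""
    return [word[:n] for n in range(1, min(len(word), max_depth) + 1)]

def parent_edges(words, max_depth=MAX_DEPTH):
    """Collect (child, parent) pairs; the parent is assumed to be one letter shorter."""
    edges = set()
    for word in words:
        chain = prefixes(word, max_depth)
        for parent, child in zip(chain, chain[1:]):
            edges.add((child, parent))
    return edges

print(prefixes("telephone"))  # ['t', 'te', 'tel', 'tele', 'telep', 'teleph']
print(sorted(parent_edges(["cat"])))  # [('ca', 'c'), ('cat', 'ca')]
```

Words that share a beginning share nodes, so "telephone" and "telegraph" both contribute to "t", "te", "tel", and "tele" before they branch.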

Reading the Frames

Early frames show foundational families: the core vocabulary that existed by 1800. These prefixes anchor the visualization.

Later frames reveal new growth in specific branches. Watch how different first letters expand at different rates, and how certain deeper prefixes (three to six letters) rise quickly in particular eras. That is where domain trends show up.

What Stands Out

What stands out depends on what you look for:

Early decades grow steadily as the foundational vocabulary expands
Growth accelerates in the twentieth century as vocabulary expands rapidly
Certain families expand quickly in specific eras as domain-specific terminology emerges
Stable positions let you pause on any year and compare it to earlier frames without re-orienting

Visual Elements

Node Size

Node radius follows the square root of the cumulative count. This prevents large values from overwhelming the visualization while still showing relative differences.
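A minimal sketch of square-root scaling, assuming a constant scale factor (the value 2.0 is an arbitrary choice for illustration, not the project's setting):

```python
import math

def node_radius(cumulative_count, scale=2.0):
    """Radius grows with the square root of the count; `scale` is a hypothetical constant."""
    return scale * math.sqrt(max(cumulative_count, 0))

print(node_radius(100))    # 20.0
print(node_radius(10000))  # 200.0
```

A count that is 100 times larger yields a radius only 10 times larger. With this scaling the circle's area, rather than its radius, is proportional to the count.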

Color Encoding

Colors encode depth in the prefix tree:

Base letters (depth 1) typically use warmer colors
Deeper prefixes use cooler colors
This helps distinguish hierarchy at a glance
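One way to get a warm-to-cool progression is to interpolate the hue by depth. The sketch below is an assumption about how such a mapping could look; the hue range, saturation, and brightness are illustrative and are not the palette the animation actually uses.

```python
import colorsys

def depth_color(depth, max_depth=6):
    """Map depth 1..max_depth to a hex color, from warm (orange-red) to cool (blue)."""
    depth = min(max(depth, 1), max_depth)
    t = (depth - 1) / (max_depth - 1)   # 0.0 at depth 1, 1.0 at max depth
    hue = 0.05 + t * (0.60 - 0.05)      # hypothetical hue range
    r, g, b = colorsys.hsv_to_rgb(hue, 0.75, 0.9)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

for depth in range(1, 7):
    print(depth, depth_color(depth))
```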

Edge Fading

Edges fade when neither a node nor its parent has a value that year. This keeps the visualization clean and focuses attention on active branches.
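The fading rule can be sketched as a simple opacity choice. This reads "no value" as a value of zero, and the two opacity levels are hypothetical; the real renderer may use different thresholds or a gradual fade.

```python
def edge_alpha(child_value, parent_value, active=0.8, faded=0.1):
    """Return a faded opacity only when both endpoints have no value (assumed to mean zero)."""
    if child_value == 0 and parent_value == 0:
        return faded
    return active

print(edge_alpha(0, 0))   # 0.1
print(edge_alpha(3, 10))  # 0.8
```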

Labels

Short labels appear for base letters where there is space. The year and simple totals are overlaid on each frame.

Patterns to Look For

Broad Shifts

One-letter nodes show broad shifts in vocabulary. Watch how different starting letters grow at different rates over time.

Fine Structure

Deeper nodes (3-6 letters) show fine structure without losing the big picture. These reveal:

Domain-specific terminology (e.g., “tele-” for telecommunications)
Cultural shifts (e.g., technology-related prefixes)
Scientific advances (e.g., “bio-” for biology terms)

Temporal Patterns

Because positions are stable, you can:

Compare the same prefix across different years
Track how specific families grow
Identify periods of rapid vocabulary expansion
See when certain prefixes become prominent
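The comparisons listed above can also be made numerically. The sketch below assumes you have one prefix's cumulative counts keyed by year; the function names and sample values are illustrative only.

```python
def yearly_growth(cumulative):
    """Difference between consecutive years of a cumulative series."""
    years = sorted(cumulative)
    return {
        later: cumulative[later] - cumulative[earlier]
        for earlier, later in zip(years, years[1:])
    }

def fastest_year(cumulative):
    """Year with the largest increase, or None if fewer than two years are given."""
    growth = yearly_growth(cumulative)
    return max(growth, key=growth.get) if growth else None

# Hypothetical cumulative counts for one prefix.
series = {1900: 10, 1901: 12, 1902: 30, 1903: 33}
print(yearly_growth(series))  # {1901: 2, 1902: 18, 1903: 3}
print(fastest_year(series))   # 1902
```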

Limitations and Considerations

Remember that:

Google Books reflects its corpus, not the whole world
OCR and metadata add noise to the underlying data
Frequency is a proxy metric, not a legal attestation
Smoothing and thresholds trade sensitivity for stability
The method aims for robustness, not perfection

Using the Data

The visualization is a starting point. You can:

Explore specific prefixes by focusing on particular branches
Compare time periods by examining frames from different eras
Load the data into Neo4j for graph-based queries and analysis
Export frames for detailed examination of specific years
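As a rough sketch of the Neo4j option, the code below turns a mapping of prefix counts into parameterized Cypher statements. The `Prefix` label, the `CHILD_OF` relationship, and the property names are assumptions made for this example and are not a schema defined by the project.

```python
# Hypothetical schema: (:Prefix {text, depth, count}) -[:CHILD_OF]-> (:Prefix)
NODE_QUERY = "MERGE (p:Prefix {text: $text}) SET p.depth = $depth, p.count = $count"
EDGE_QUERY = (
    "MATCH (c:Prefix {text: $child}), (p:Prefix {text: $parent}) "
    "MERGE (c)-[:CHILD_OF]->(p)"
)

def cypher_statements(counts):
    """Build (query, params) pairs: all nodes first, then all edges."""
    statements = []
    for text, count in sorted(counts.items()):
        params = {"text": text, "depth": len(text), "count": count}
        statements.append((NODE_QUERY, params))
    for text in sorted(counts):
        # Assumes the parent is the prefix with its last letter removed.
        if len(text) > 1 and text[:-1] in counts:
            statements.append((EDGE_QUERY, {"child": text, "parent": text[:-1]}))
    return statements

for query, params in cypher_statements({"t": 3, "te": 2}):
    print(query, params)
```

Each pair could then be passed to a database session, for example with the official Neo4j Python driver's `session.run(query, params)`. Creating nodes before edges ensures every `MATCH` finds both endpoints.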

Example Observations

Some things you might notice:

Steady early growth: Foundational vocabulary expands gradually
20th-century acceleration: Rapid vocabulary growth, especially in technology
Domain clusters: Certain prefixes cluster in specific time periods
Cultural markers: Vocabulary reflects historical events and cultural shifts

Next Steps

Load data into Neo4j for graph-based exploration
Customize visualization parameters to highlight specific aspects
Share your observations with the community