Advanced Tuning

This guide covers the parameters and customization options available for tuning your visualization.

First-Year Inference Parameters

When inferring first years from Google Books 1-grams, you can adjust several parameters:

--tau (Threshold)

Sets the minimum frequency a word must reach before it counts as having appeared. Higher values are stricter: a word needs more support before a year qualifies as its first year.

--guard (Persistence Requirement)

Sets how much non-zero support is required after the candidate year. Higher values ensure the word persists rather than appearing as a one-off.

--window (Smoothing Window)

Controls the moving average window size for smoothing frequency data.

--start-year and --end-year

Define the time range for analysis.
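
To make the interplay of these flags concrete, here is a minimal sketch of one way such an inference could work. The function name `infer_first_year`, the trailing moving average, the default values, and the exact semantics given to `tau`, `guard`, and `window` are assumptions for illustration, not the project's actual implementation.

```python
def infer_first_year(freq_by_year, tau=1e-7, guard=3, window=5,
                     start_year=1800, end_year=2019):
    """Return the first year whose smoothed frequency reaches tau and that is
    followed by at least `guard` later years with non-zero raw frequency.

    Hypothetical sketch: defaults and parameter meanings are assumed.
    """
    years = list(range(start_year, end_year + 1))
    raw = [freq_by_year.get(y, 0.0) for y in years]

    # Trailing moving average over up to `window` years.
    smoothed = []
    for i in range(len(raw)):
        lo = max(0, i - window + 1)
        chunk = raw[lo:i + 1]
        smoothed.append(sum(chunk) / len(chunk))

    for i, year in enumerate(years):
        if smoothed[i] < tau:
            continue
        # Persistence check: count non-zero years after the candidate.
        support = sum(1 for v in raw[i + 1:] if v > 0)
        if support >= guard:
            return year
    return None
```

In this sketch, a word that appears in a single year and never again is rejected for any `guard` of 1 or more, and a wider `window` spreads an isolated spike over more years, which can push the inferred year later.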

Rendering Parameters

Resolution

--width 1920 --height 1080

Adjust output resolution. Higher values produce sharper images but take longer to render.

Node Sizing

--min-radius 10 --max-radius 120

Control the size range of nodes. Node radius scales with the square root of cumulative count.
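
As an illustration of square-root scaling, the sketch below maps a cumulative count onto the configured radius range. The function name `node_radius` and the normalization by the largest count are assumptions; the renderer may normalize differently.

```python
import math


def node_radius(count, max_count, min_radius=10.0, max_radius=120.0):
    """Map a cumulative count to a radius using square-root scaling.

    Hypothetical helper: normalizing by max_count is an assumption.
    """
    if max_count <= 0 or count <= 0:
        return min_radius
    # Clamp to max_count so the result never exceeds max_radius.
    t = math.sqrt(min(count, max_count) / max_count)
    return min_radius + (max_radius - min_radius) * t
```

With square-root scaling, node area grows roughly in proportion to the count, so a node with a quarter of the maximum count gets half of the radius range rather than a quarter of it.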

Font Sizing

--title-font-size 112 --detail-font-size 42

Adjust text overlay sizes for year labels and statistics.

Edge Appearance

--base-edge-alpha 25 --edge-depth 6

Animation Timing

--start-progress 0.25 --end-progress 1.0

Control the eased scale animation:

Lower start values make early frames more readable but may hide early growth.
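
The sketch below shows one way a progress value could be eased between the start and end settings across a sequence of frames. The function name `eased_progress` and the choice of a smoothstep curve are assumptions; the project may use a different easing function.

```python
def eased_progress(frame_index, frame_count,
                   start_progress=0.25, end_progress=1.0):
    """Interpolate from start_progress to end_progress with smoothstep easing.

    Hypothetical helper: the easing curve is an assumption.
    """
    if frame_count <= 1:
        return end_progress
    t = frame_index / (frame_count - 1)   # linear position in [0, 1]
    eased = t * t * (3.0 - 2.0 * t)       # smoothstep: slow start, slow end
    return start_progress + (end_progress - start_progress) * eased
```

Under these assumptions, the first frame is drawn at the start value, the last frame at the end value, and the change is fastest in the middle of the sequence.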

Encoding Parameters

Frame Rate

--fps 7.333 --gif-fps 12

GIF Optimization

--gif-width 1280

Reduce GIF width to decrease file size while maintaining reasonable quality.
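
For a rough sense of the saving, the sketch below computes the scaled GIF dimensions and the fraction of pixels kept per frame. It assumes the aspect ratio is preserved when the width is reduced; the helper names are hypothetical, and actual file size also depends on palette and compression, so the pixel ratio is only an approximation.

```python
def scaled_gif_size(src_width, src_height, gif_width):
    """Return (width, height) for a GIF scaled to gif_width.

    Assumes the aspect ratio of the source frames is preserved.
    """
    height = round(src_height * gif_width / src_width)
    return gif_width, max(1, height)


def pixel_ratio(src_width, src_height, gif_width):
    """Fraction of the source pixel count that remains after scaling."""
    width, height = scaled_gif_size(src_width, src_height, gif_width)
    return (width * height) / (src_width * src_height)
```

For 1920x1080 frames and a GIF width of 1280, this gives 1280x720, or about 44% of the original pixels per frame.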

Prefix Depth

The default depth is 6 letters, which balances detail with performance. You can adjust this:

--depth 6
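
To see why depth affects both detail and performance, the sketch below counts how many words fall under each prefix up to a given depth. The function name `prefix_counts` is hypothetical and the project's trie builder may work differently; the point is only that each extra level of depth adds another layer of prefixes to track and render.

```python
from collections import Counter


def prefix_counts(words, depth=6):
    """Count how many words fall under each prefix of length 1..depth.

    Hypothetical helper, not the project's trie builder.
    """
    counts = Counter()
    for word in words:
        # Words shorter than `depth` contribute only the prefixes they have.
        for n in range(1, min(depth, len(word)) + 1):
            counts[word[:n]] += 1
    return counts
```

For example, `prefix_counts(["cat", "car"], depth=2)` tracks only `c` and `ca`, while `depth=3` also distinguishes `cat` from `car`.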

Layout Options

Radial Layout

The default radial layout provides a circular, hierarchical view:

python -m src.viz.render_frames \
  artifacts/trie/prefix_counts.jsonl \
  outputs/frames

Rectangular Layout

For deterministic positioning with labels:

python -m src.viz.layout \
  artifacts/trie/prefix_counts.jsonl \
  artifacts/layout/prefix_positions.json

python -m src.viz.render_frames_rectangular \
  artifacts/trie/prefix_counts.jsonl \
  artifacts/layout/prefix_positions.json \
  outputs/frames \
  --label-limit 8 --label-depth 4 --label-spacing 20

Rectangular-specific parameters:

Performance Optimization

Caching

The pipeline writes intermediate artifacts. If you work iteratively, you can skip stages that are already complete:
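
One simple way to decide which stages still need to run is to check whether their output artifacts already exist. The sketch below is an assumption about how you might do this by hand: the stage names and the `pending_stages` helper are hypothetical, while the paths are the artifacts named elsewhere in this guide.

```python
from pathlib import Path

# Stage names are hypothetical; the paths are artifacts named in this guide.
STAGES = [
    ("trie", "artifacts/trie/prefix_counts.jsonl"),
    ("layout", "artifacts/layout/prefix_positions.json"),
]


def pending_stages(stages, root="."):
    """Return the names of stages whose output file is missing or empty."""
    pending = []
    for name, output in stages:
        path = Path(root) / output
        if not path.is_file() or path.stat().st_size == 0:
            pending.append(name)
    return pending
```

An existence check does not detect stale artifacts, so delete or regenerate an artifact after changing the parameters of the stage that produced it.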

Memory Considerations

Network Optimization

Download time dominates on fresh runs. Consider:

Custom Extensions

To take this project further, you could:

See Also