Algorithmic Flow

PageRank Dynamics, Crawl Budget, and the Physics of Authority

By Selim Reggabi, SEO Engineer

The Pragmatism of Flow

"Indexation is a physical law. Semantics is just a language."

If the structure blocks the flow, break the structure.

Most SEO discussions focus on content: keywords, topics, semantic relevance. These matter. But they matter only after the physical prerequisites are met.

A perfect article that Googlebot can't reach doesn't rank. A mediocre article that receives strong authority flow often does. Understanding how search engines discover, crawl, and index content means looking past semantic theory at the infrastructure that lets a page enter the index in the first place: getting a page indexed demands both deliberate submission and a structure that makes it reachable.

This guide covers the physics of SEO: how authority actually flows, how crawlers actually behave, and how to engineer systems that work with these physical realities.

PageRank: The Fluid Dynamics of Authority

The Original Formula

PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Where:

  • PR(A): the PageRank of page A
  • T1 ... Tn: the pages linking to A
  • C(Ti): the number of outbound links on page Ti
  • d: the damping factor, typically ~0.85

The formula reveals three fundamental truths:

Principle 1: Authority Flows Through Links

No link = no authority transfer. Every internal link is a pipe carrying ranking potential. The absence of links is the absence of flow.

Principle 2: Division Dilutes

A page with PR=10 and 10 outbound links passes ~0.85 per link. The same page with 100 outbound links passes ~0.085 per link. More links = less authority each.

Principle 3: Distance Decays

The damping factor (d ≈ 0.85) means each hop loses ~15% of the authority being passed. A page 5 clicks from an authority source retains only 0.85^5 ≈ 44% of that source's authority, compared with ~85% for a page one click away.
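
As a concrete illustration, here is a minimal Python sketch of the iterative calculation on a toy internal-link graph. The pages, links, and iteration count are invented for illustration; real crawlers model far larger graphs, but the mechanics are the same.

# Minimal PageRank sketch using the formula above (illustrative toy graph).
# links[page] = pages that `page` links out to.
links = {
    "home":      ["hub", "about"],
    "hub":       ["article-a", "article-b"],
    "article-a": ["hub"],
    "article-b": ["hub", "article-a"],
    "about":     ["home"],
}

d = 0.85                      # damping factor
pages = list(links)
pr = {p: 1.0 for p in pages}  # start every page at PR = 1

for _ in range(50):           # simple power iteration until values settle
    new_pr = {}
    for p in pages:
        # Share of PageRank passed by every page that links to p.
        incoming = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
        new_pr[p] = (1 - d) + d * incoming
    pr = new_pr

for page, score in sorted(pr.items(), key=lambda kv: -kv[1]):
    print(f"{page:10s} {score:.3f}")

In this toy graph, the hub page, which receives the most internal links, ends up with the highest score: authority concentrates where links converge.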

The Water Metaphor

PageRank behaves like water in a plumbing system:

  • Links are pipes: authority can only travel where a pipe exists.
  • Authority is pressure: the more a page receives, the more it can push onward.
  • Every outbound link is a branch: each new branch splits the available flow.
  • Blocking a pipe doesn't remove the pressure: it reroutes or leaks elsewhere.

This metaphor reveals why blocking internal links (strict silos) creates problems: pressure builds up, seeks alternative paths, and eventually creates uncontrolled leaks. It is also why rigid semantic-cocoon structures tend to fail in practice: they try to dam the natural flow of PageRank instead of channeling it through a deliberately designed internal-link architecture.

Authority Distribution Patterns

Pattern 1: The Homepage Cascade

The homepage typically receives the most external links. How it distributes that authority determines the entire site's ranking potential.

Configuration        Links from Homepage   PR per Link (PR=100 homepage)
Minimal Navigation   10 links              ~8.5 PR each
Standard Navigation  50 links              ~1.7 PR each
Mega Menu            200 links             ~0.4 PR each

Implication: Every link you add to the homepage dilutes every other link. Add links strategically to your highest-value targets.
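
The dilution is easy to verify; a quick sketch, assuming a homepage PR of 100 and the same d ≈ 0.85 damping factor used above:

# Per-link equity passed by a PR=100 homepage at different link counts.
homepage_pr = 100
d = 0.85

for outbound_links in (10, 50, 200):
    per_link = d * homepage_pr / outbound_links
    print(f"{outbound_links:3d} links -> ~{per_link:.2f} PR per link")
# 10 links -> ~8.50, 50 links -> ~1.70, 200 links -> ~0.42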

Pattern 2: Hub and Spoke

Create intermediate hub pages that:

  • Receive a direct link from the homepage (and other high-authority pages)
  • Link out to every page in their topical cluster
  • Carry enough content to rank in their own right and give those links context

This creates concentrated authority flow to target clusters while keeping the homepage focused. Making the pattern work depends on strategic internal linking: hub pages must receive enough authority from high-value sources and pass it on to cluster members through contextually relevant, well-anchored links that serve both navigation and search-engine interpretation.

Pattern 3: The Pyramid

           Homepage (1 page)
              /    \
         Hubs (5-10 pages)
        /    |    \
   Clusters (50-100 pages each)

Each level down receives less authority but serves more specific intent. The structure naturally prioritizes higher-level pages for broader queries.
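
Assuming the same ~15% loss per hop, each level of the pyramid retains roughly 0.85^depth of the authority flowing from the top, before it is further divided among the pages at that level; a quick sketch:

# Authority retained at each pyramid level, assuming d = 0.85 per hop.
# This ignores the additional split across sibling pages at each level.
d = 0.85
levels = ["Homepage (depth 0)", "Hubs (depth 1)", "Clusters (depth 2)"]

for depth, label in enumerate(levels):
    print(f"{label:20s} retains ~{d**depth:.0%} of top-level authority")
# Homepage 100%, Hubs ~85%, Clusters ~72%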

Crawl Budget: The Resource Constraint

What is Crawl Budget?

Crawl Budget = the number of pages Googlebot will crawl on your site in a given timeframe. It's determined by:

  • Crawl Rate Limit: How fast Google can crawl without overloading your server
  • Crawl Demand: How much Google wants to crawl (based on popularity, freshness)

Who Needs to Worry?

Site Size                  Crawl Budget Concern
Under 10,000 pages         Rarely an issue (unless severe technical problems)
10,000 - 100,000 pages     Monitor and optimize for important sections
> 100,000 pages            Critical concern requiring active management
E-commerce with variants   High risk (faceted navigation, parameters)

Crawl Efficiency Ratio

Crawl Efficiency = Important Pages Crawled / Total Pages Crawled

If your ratio is below 0.5, more than half of your crawl budget is going to low-value pages.
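
A minimal way to compute the ratio, given the URLs Googlebot actually requested (extracted from access logs, for example) and the set of URLs you consider important; both inputs below are invented sample data:

# Crawl efficiency = important pages crawled / total pages crawled.
important = {"/", "/services/", "/services/audit/", "/blog/pagerank-guide/"}

# URLs Googlebot requested, e.g. extracted from access logs (sample data).
crawled = [
    "/", "/services/", "/blog/pagerank-guide/",
    "/search?q=seo", "/tag/misc/", "/products?color=red&sort=price",
]

useful = sum(1 for url in crawled if url in important)
print(f"Crawl efficiency: {useful / len(crawled):.2f}")   # 0.50 on this sample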

Optimization Strategies

  1. Block low-value URLs: robots.txt for crawl budget, noindex for index pollution (see the robots.txt verification sketch after this list)
    • Internal search results
    • Filter/sort variations
    • Thin tag/archive pages
    • Out-of-stock product variants
  2. Consolidate duplicates: use canonical tags properly and normalize parameter URLs at the source (Google has retired the Search Console URL Parameters tool)
  3. Improve server response: Target under 200ms. Slow servers = reduced crawl rate limit
  4. Surface important content: Link important pages from high-crawl-frequency pages
  5. Fresh content signals: Updated pages get prioritized for re-crawl
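
To confirm that the rules from strategy 1 actually cover your low-value URL patterns, you can test them with Python's standard urllib.robotparser. The domain and paths below are placeholders:

# Check which low-value URL patterns are blocked by robots.txt.
# Domain and paths are placeholders; point this at your own site.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://www.example.com/robots.txt")
rp.read()

candidates = [
    "https://www.example.com/search?q=shoes",             # internal search
    "https://www.example.com/category/shoes?sort=price",  # filter/sort variation
    "https://www.example.com/tag/misc/",                  # thin tag page
    "https://www.example.com/services/seo-audit/",        # should stay crawlable
]

for url in candidates:
    allowed = rp.can_fetch("Googlebot", url)
    print(f"{'ALLOWED' if allowed else 'blocked':8s} {url}")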

The Five Principles of Flow

Principle 1: Flow Follows Structure

Authority flows through links. No link = no flow. Your site architecture IS your authority distribution system. Before optimizing content, optimize structure.

Principle 2: Damping Creates Distance Cost

Each hop costs ~15% of the authority being passed. A page 5 clicks deep retains roughly 0.85^5 ≈ 44% of the source's authority, against ~85% one click away. For important pages, minimize distance from authority sources.

Principle 3: Division Dilutes

More outbound links = less authority per link. Strategic link reduction on key pages amplifies flow to targets. But don't sacrifice usability for marginal gains.

Principle 4: Crawlers Have Budgets

Googlebot visits limited pages per session. Every crawl spent on low-value pages is a crawl not spent on high-value pages. Structure to surface important content early.

Principle 5: Flow is Measurable

Theory must match observation. Crawl logs show what Googlebot actually does. Rankings show the result. If your "optimized" structure doesn't improve metrics, the theory was wrong.

Common Flow Problems

Problem: Orphan Pages

Symptom: Pages with zero or almost no incoming internal links

Cause: Poor site architecture, content not integrated into navigation

Solution: Audit internal links, ensure every important page has 5+ internal links from relevant pages

Problem: Authority Sinks

Symptom: Pages receiving many links but providing no ranking value

Cause: Excessive linking to non-ranking pages (legal, login, archives)

Solution: Reduce internal links to sinks, noindex where appropriate, consider consolidation. This matters most at scale: a site managing thousands of pages needs a systematic way to identify and minimize authority waste, so that necessary legal and functional pages remain available without consuming a disproportionate share of the crawl budget or link equity that should flow to revenue-generating content.

Problem: Deep Pages

Symptom: Important pages at depth 4+

Cause: Over-hierarchical structure, lack of shortcuts

Solution: Add hub pages, homepage links to deep content, breadcrumb shortcuts, HTML sitemaps
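
Click depth can be measured directly from the internal-link graph with a breadth-first search from the homepage; a minimal sketch over an invented link map:

# Compute click depth from the homepage with a breadth-first search.
from collections import deque

# links[page] = internal links found on that page (sample data).
links = {
    "/": ["/services/", "/blog/"],
    "/services/": ["/services/audit/"],
    "/blog/": ["/blog/pagerank-guide/"],
    "/blog/pagerank-guide/": ["/blog/case-study/"],
    "/blog/case-study/": ["/blog/appendix/"],
    "/blog/appendix/": [],
    "/services/audit/": [],
}

depth = {"/": 0}
queue = deque(["/"])
while queue:
    page = queue.popleft()
    for target in links.get(page, []):
        if target not in depth:          # first discovery = shortest click path
            depth[target] = depth[page] + 1
            queue.append(target)

for page, d in sorted(depth.items(), key=lambda kv: kv[1]):
    flag = "  <- deep, consider a shortcut" if d >= 4 else ""
    print(f"depth {d}: {page}{flag}")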

Problem: Link Dilution

Symptom: Key pages not ranking despite good content

Cause: Too many outbound links from linking pages

Solution: Audit high-PR pages, reduce unnecessary links, focus link equity on targets

Problem: Crawl Waste

Symptom: Important pages crawled infrequently, junk pages crawled often

Cause: Parameter URLs, duplicate content, crawlable low-value pages

Solution: Block with robots.txt, fix canonicals, and normalize or eliminate crawlable parameter URLs at the source

Measuring Flow

Crawl Log Analysis

Your server logs contain the ground truth about how Googlebot sees your site: which URLs it requests, how often, and with what response codes.
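
A minimal sketch of extracting Googlebot hits per URL from a combined-format access log. The log path and parsing assumptions are mine, and a production setup should verify Googlebot by reverse DNS rather than trusting the user-agent string:

# Count Googlebot requests per URL from a combined-format access log.
# Path and log format are assumptions; verify Googlebot via reverse DNS in production.
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"
line_re = re.compile(r'"(?:GET|HEAD) (?P<url>\S+) HTTP/[^"]*" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$')

hits, errors = Counter(), 0
with open(LOG_PATH, encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = line_re.search(line)
        if not m or "Googlebot" not in m.group("ua"):
            continue
        hits[m.group("url")] += 1
        if m.group("status")[0] in "45":     # 4xx / 5xx = crawl error
            errors += 1

total = sum(hits.values())
if total:
    print(f"Googlebot requests: {total}, crawl errors: {errors / total:.1%}")
    for url, count in hits.most_common(10):
        print(f"{count:5d}  {url}")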

Key Metrics

Metric                                     Good       Concerning
Avg crawl frequency for important pages    Daily      > 7 days
% of crawl to valuable content             > 70%      Under 50%
Crawl errors                               Under 1%   > 5%
Average crawl depth                        Under 3    > 5

Internal Link Analysis

  1. Crawl your site with Screaming Frog or similar
  2. Export internal link counts per page
  3. Compare with ranking goals: are target pages receiving adequate links?
  4. Identify orphans: pages with fewer than 3 internal links
  5. Identify sinks: non-ranking pages with many links
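
Steps 4 and 5 can be scripted against the crawl export; a minimal sketch, assuming a CSV with Address and Inlinks columns similar to what Screaming Frog's internal export provides (column names and the file name may differ in your setup):

# Flag orphan and sink candidates from a crawler export (CSV).
# Column and file names follow a typical Screaming Frog export; adjust to your tool.
import csv

ORPHAN_THRESHOLD = 3     # fewer inbound links than this = orphan candidate
SINK_THRESHOLD = 100     # more than this on a non-ranking page = sink candidate
NON_RANKING = ("/login", "/legal", "/cart", "/terms")

with open("internal_all.csv", newline="", encoding="utf-8") as fh:
    for row in csv.DictReader(fh):
        url = row["Address"]
        inlinks = int(row["Inlinks"] or 0)
        if inlinks < ORPHAN_THRESHOLD:
            print(f"ORPHAN? {inlinks:4d} inlinks  {url}")
        elif inlinks > SINK_THRESHOLD and any(p in url for p in NON_RANKING):
            print(f"SINK?   {inlinks:4d} inlinks  {url}")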

Flow Optimization Workflow

  1. Audit current state
    • Crawl site, map all internal links
    • Calculate crawl depth for each page
    • Estimate PageRank distribution
    • Identify orphans and sinks
  2. Define targets
    • List pages you want to rank
    • Current link count and depth for each
    • Gap analysis: what needs to improve
  3. Design improvements
    • Add links from high-PR pages to targets
    • Create hub pages to reduce depth
    • Reduce links to sinks
    • Block crawl-wasting URLs
  4. Implement incrementally
    • Change in batches to measure impact
    • Document each change
  5. Measure results
    • Crawl stats: frequency, coverage
    • Rankings: position changes
    • Indexation: page count changes
  6. Iterate
    • What worked? Do more.
    • What didn't? Revise theory.

Conclusion

SEO is often discussed in semantic terms: keywords, topics, user intent. These matter. But underneath the semantic layer is a physical layer: links, crawlers, authority flow.

The best content in the world doesn't rank if Googlebot can't reach it efficiently. The most semantic structure fails if authority pools in the wrong places. This reality underpins the Doctrine Mesh approach to site architecture: prioritize measurable flow optimization over theoretical semantic purity, because the physics of crawling and indexation must be respected before any higher-level strategy can deliver results.

Doctrine: "Treat algorithms as physical systems. Measure what they actually do, not what you think they should do. If the structure blocks the flow, break the structure."
— Selim Reggabi

Master the physics first. Then layer on the semantics.

Flow is measurable. Rankings are the result. The gap between theory and observation is the opportunity.

About the Author

Selim Reggabi is a technical SEO engineer managing an infrastructure of 1,000+ sites. His approach prioritizes measurable flow optimization over semantic theory. OWAG.fr demonstrates these principles with 1,600+ indexed pages and consistent crawl efficiency above 80%.

Learn more about Selim Reggabi