E - Evidence Score
What is it? Why did we make it? How to get the most out of it.
Last, but certainly not least, is the evidence score.
At its core, this is "pics or it didn't happen".
Prove it. Where is your evidence that what you wrote is true?
Not a fact check. We cannot do that.
But there is language people use when they talk faithfully about their experiences and how they did the things they did.
This is how the evidence score works.
It's broken into two parts.
External evidence - can you cite your sources?
Personal evidence - how you talk about the actions you took to learn what you're writing about.
External Evidence
What it is
Signals that you’re grounding claims in peer-reviewed research (citations, DOIs, named journals/conferences).
What it looks for
- Presence of formal citations (inline or references section).
- Recognizable venues (journal names, conferences).
- DOIs/PMIDs/ISSNs and outbound links.
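To make that concrete, here's a minimal sketch of what citation detection can look like. The regexes and the counted signals are illustrative assumptions, not the production rules:

```python
import re

# Illustrative patterns only; the real scorer's rules are broader.
DOI = re.compile(r"\b10\.\d{4,9}/[-._;()/:a-zA-Z0-9]+")
PMID = re.compile(r"\bPMID:?\s*\d{6,8}\b", re.IGNORECASE)
INLINE_CITE = re.compile(r"\(([A-Z][a-z]+(?: et al\.)?,? \d{4})\)")  # e.g. (Hovland, 1951)
LINK = re.compile(r"https?://\S+")

def external_evidence_signals(text: str) -> dict:
    """Count surface signals that claims are grounded in citable sources."""
    return {
        "dois": len(DOI.findall(text)),
        "pmids": len(PMID.findall(text)),
        "inline_citations": len(INLINE_CITE.findall(text)),
        "links": len(LINK.findall(text)),
    }

print(external_evidence_signals(
    "Source credibility amplifies acceptance (Hovland, 1951); "
    "see 10.1207/s15327663jcp1303_05 and https://example.org/meta."
))
# {'dois': 1, 'pmids': 0, 'inline_citations': 1, 'links': 1}
```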
Why it matters
Readers use external validation to judge accuracy. Classic persuasion work shows source credibility amplifies acceptance; meta-analyses confirm the effect across contexts.
Research anchors
- Hovland & Weiss: source credibility increases persuasion.
- Pornpitakpan (meta-analysis) on credibility effects.
Scoring gist:
Reward formal citations attached to claims; bonus for recognizable venues and DOIs/PMIDs/links.
How to improve (tactics)
- Cite one primary and one corroborating peer-reviewed source per key claim.
- Add DOIs/PMIDs and the journal/conference name.
- Put a short “Why this source?” clause the first time you cite it.
Statistics
What it is
Clear, checkable numbers (rates, base rates, denominators). We look for a very clear definition of a "statistic": a number, in a context, with an outcome.
It is not just adding a % to a sentence.
Quick Checklist
- Named measure (mean/median/rate/etc.)
- Numerator + denominator (or n)
- Timeframe/cohort (when applicable)
- Method note if not obvious (how it was measured)
- Uncertainty when appropriate (CI/SE/IQR)
What it looks for
We check in three passes: a number, over a timeframe, with an outcome, all within 30 tokens.
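As a sketch, assuming simple keyword lists (the real vocabulary is much larger), that windowed check might look like:

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?%?")
TIMEFRAME = {"day", "days", "week", "weeks", "month", "months", "quarter",
             "october", "september", "30-day"}  # illustrative vocab
OUTCOME = {"up", "down", "rose", "fell", "increased", "decreased",
           "churned", "converted", "improved", "dropped"}

def has_statistic(text: str, window: int = 30) -> bool:
    """A number, a timeframe, and an outcome co-occurring within `window` tokens."""
    tokens = [t.strip(".,") for t in text.lower().split()]
    for i, tok in enumerate(tokens):
        if NUMBER.search(tok):
            span = tokens[max(0, i - window): i + window]
            if TIMEFRAME & set(span) and OUTCOME & set(span):
                return True
    return False

print(has_statistic("In October, our open rate hit 42%, up from 38% in September."))  # True
print(has_statistic("Open rates are up."))  # False: no number
```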
Why it matters
Numbers, when well-framed, improve decisions—especially with visual aids and natural frequencies; poor numeracy/framing breaks comprehension.
Research anchors
- Garcia-Retamero et al: visual aids improve risk understanding.
- Scalia et al.: icon arrays vs bar charts—visuals can aid comprehension.
Scoring gist:
Reward presence + clarity: denominators, time window, uncertainty; light bonus for visualization.
How to improve
What doesn’t count (sounds stat-ish, but nope)
“Open rates are up.” (no number, no window, no outcome size)
“We have thousands of users.” (vague count, no time or outcome)
“Churn improved last quarter.” (no number, no baseline)
Looks like it could, but doesn’t (yet) → then fixed
“Open rate is 42%.” (missing timeframe + context)
→ “October open rate was 42%, up from 38% in September.”
“Churn is 12%.” (missing cohort/window)
→ “30-day churn was 12% for the July sign-up cohort.”
“Conversion increased 12%.” (relative but ambiguous; missing base + window)
→ “Signup conversion rose 12% (3.1% → 3.5%) Jan → Feb.”
“NPS 68.” (missing n + window)
→ “NPS 68 from 412 responses in March.”
Does count (clean, article-ready)
“In October, our newsletter open rate hit 42%, up from 38% in September.”
“Over the last 30 days, 21% of trial users upgraded to paid.”
“During a 2-week test, checkout completion increased to 58% (from 51%) with the new flow.”
“Across Q3, weekly active users averaged 6,420, up 12% vs Q2.”
“In the past 90 days, 113 of 540 new users (21%) churned within 30 days of signup.”
“Our median first-reply time was 14 minutes over the last 30 days (n=2,104 tickets).”
“October’s bug rate dropped to 1.7 per 1,000 sessions, from 2.4 in September.”
“Across 120 builds last week, a fresh compile averaged 18.4s; the new cache cut that to 11.2s.”
“Email opt-in on the pricing page was 3.9% over the last 14 days, up from 3.1%.”
“Over H1, average refund rate was 0.8% of orders (n=28,903), down from 1.3% in H2 last year.”
Rule of thumb for writers:
Write it as: [Number] over [time window] showing what happened (to/from or up/down).
Sources of Authority
What it is
Signals that who or where the evidence comes from is reputable.
What it looks for
Authorities relevant to your section topic:
- Named experts/institutions.
- Reputable venues (society journals, top conferences).
- Author role (“professor of X”, “clinical trialist”, etc.).
Why it matters
Authority is a robust persuasion cue (credibility → acceptance), observed across decades and domains. Are you referencing the right people in your space to back up the claims you make in the article?
Research anchors
- Hovland & Weiss; Pornpitakpan meta-review.
Scoring gist:
Reward explicit authority signals attached to claims; small bonus for diverse authorities (not one source repeated).
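A minimal sketch of that idea, with made-up cue lists (a real pass would be topic-aware):

```python
import re

NAMED = re.compile(r"\b(?:Dr|Prof)\.\s+([A-Z][a-z]+)")  # name+role attributions
ROLE_WORDS = ("professor", "epidemiologist", "clinical trialist", "university", "institute")

def authority_score(text: str) -> float:
    """Reward explicit authority signals; small bonus for distinct names, not repeats."""
    names = set(NAMED.findall(text))
    roles = sum(text.lower().count(w) for w in ROLE_WORDS)
    score = min(len(names), 3) + 0.5 * min(roles, 3)
    return score + (0.5 if len(names) >= 2 else 0.0)  # diversity bonus

print(authority_score("“It holds,” says Dr. Rivera, epidemiologist at Example University."))  # 2.0
```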
How to improve
Attribute by name + role (“…says Dr. ___, epidemiologist at ___”).
Prefer society/flagship venues when available.
Add one line on why that source is qualified.
Flow Consistency
What it is
Keeps cognitive load steady—no whiplash from very simple to very dense.
What it looks for
Similar readability across adjacent chunks; if complexity rises, it ramps gradually rather than jumping.
Why it matters
Consistent difficulty supports comprehension; readability metrics are a practical proxy.
Research anchors
- Flesch/F-K: word/sentence factors and perceived difficulty. Limitations noted, but still useful as a signal.
Scoring gist:
- Slice into 100-word chunks; compute readability per chunk.
- Penalize high variance; lower variance = higher score.
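Here's a toy version of that chunk-and-variance pass. The syllable counter is naive and the final squash (1 / (1 + variance)) is an assumption; only the shape of the computation matters:

```python
import re
from statistics import pvariance

def flesch(text: str) -> float:
    """Rough Flesch Reading Ease with a naive syllable count; fine as a signal."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def flow_consistency(text: str, chunk_words: int = 100) -> float:
    """Slice into ~100-word chunks, score each, and penalize variance."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    scores = [flesch(c) for c in chunks if c]
    if len(scores) < 2:
        return 1.0  # too short to measure whiplash
    return 1.0 / (1.0 + pvariance(scores))  # lower variance, higher score
```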
How to improve
If a section must be technical, add a plain-language setup sentence.
Replace bursty jargon clusters with one term + quick definition.
Keep sentence length roughly steady within a dense section.
Voice of Experience
What it is
First-hand, I/We-did-X evidence with concrete actions, outcomes, and time depth.
Can the reader tell that you learned what you're claiming by actually doing it?
What it looks for
- First-person past-tense action verbs (“we shipped…”, “I migrated…”).
- Outcome markers (what changed, metric moved).
- Chronology (before/after, iteration).
Why it matters
Personal narratives increase transportation (attention + persuasion), and vivid case evidence strongly sways readers (for better or worse).
Research anchors
- Green & Brock (narrative transportation).
- Hamill, Nisbett & Borgida (vivid case impact).
Scoring gist:
- Reward action + outcome + timeline; heavier weight when all three co-occur in a window.
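Sketched as code, with illustrative verb and marker lists:

```python
import re

ACTION = re.compile(r"\b(?:I|we)\s+(?:shipped|migrated|built|flipped|rewrote|tested)\b", re.I)
OUTCOME = re.compile(r"\b(?:reduced|increased|cut|grew|improved)\b.*?\d+%?", re.I)
TIMELINE = re.compile(r"\b(?:in \d{4}|after \d+ iterations?|before|after)\b", re.I)

def experience_score(sentence: str) -> int:
    """Action, outcome, and timeline score 1 each; all three together score extra."""
    hits = sum(bool(p.search(sentence)) for p in (ACTION, OUTCOME, TIMELINE))
    return hits + (2 if hits == 3 else 0)  # heavier weight for co-occurrence

print(experience_score("In 2018 we flipped the nav and reduced click-depth by 32%."))  # 5
```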
How to improve
- Use I/We + did + result (“We flipped nav → reduced click-depth by 32%”).
- Anchor with when (“In 2018… after 3 iterations…”).
- Prefer measurable outcomes (even ranges/estimates).
Advice Strength
What it is
The difference between a passive "you should do this", a more involved "you could do this", and the best: "I would do this, if I were doing it again".
What it looks for
- Directive phrasing (imperatives, “I would…”, “start with…”).
- Boosters that signal commitment (“clearly”, “definitely”).
- Hedges that undercut the action (“maybe”, “possibly”) in the same line.
Why it matters
Stance markers (hedges/boosters) shape perceived confidence and author commitment; decisive guidance is easier to follow.
Research anchors
- Hyland: stance markers (hedges/boosters) shape perceived author commitment.
Scoring gist:
- Reward imperatives + clarity phrases; soft-penalize if the exact action is undercut by “maybe/possibly” in the same line.
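A minimal sketch of that reward/penalty balance; the phrase lists are assumptions:

```python
HEDGES = ("maybe", "possibly", "perhaps", "might", "could try")
BOOSTERS = ("i would", "do this", "start by", "always", "never")

def advice_strength(line: str) -> float:
    """Reward decisive phrasing; soft-penalize hedges on the same line."""
    text = line.lower()
    score = float(sum(text.count(b) for b in BOOSTERS))
    score -= 0.5 * sum(text.count(h) for h in HEDGES)  # soft penalty, not a veto
    return score

print(advice_strength("I would ship the smaller migration first."))   # 1.0
print(advice_strength("Maybe you could try shipping it, possibly."))  # -1.5
```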
How to improve
- Prefer “I would…” over “you should…” (“I would ship the smaller migration first”).
- State the exact action plainly; don't undercut it with “maybe/possibly” in the same line.
- If you must qualify, put the caveat in its own sentence.
Method Transparency
What it is
Showing how you got your result—steps, criteria, and enough detail to replicate at a high level.
What it looks for
- Step/sequence language (first/then/finally; numbered steps).
- Process terms (criteria, sample, pipeline, checklist).
- Key parameters/assumptions (timeframe, inclusion/exclusion).
Why it matters
Transparent reporting → higher trust & reproducibility; it’s the core of CONSORT/PRISMA and broader reproducibility initiatives.
Research anchors
- CONSORT 2025 (transparent trial reporting).
- PRISMA 2020 (systematic review reporting).
- Munafò et al., “Manifesto for reproducible science.”
Scoring gist:
- Reward explicit steps + named criteria + inputs/parameters; small bonus for artifact links (template, sheet).
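As a sketch (patterns are illustrative; a real pass would parse structure, not just keywords):

```python
import re

STEPS = re.compile(r"\b(?:first|then|next|finally|step \d+)\b|^\s*\d+\.", re.I | re.M)
PROCESS = re.compile(r"\b(?:criteria|sample|pipeline|checklist|included|excluded)\b", re.I)
ARTIFACT = re.compile(r"\b(?:template|sheet|schema)\b|https?://\S+", re.I)

def transparency_score(text: str) -> float:
    """Explicit steps + named criteria/parameters; small bonus for a linked artifact."""
    score = min(len(STEPS.findall(text)), 5) + min(len(PROCESS.findall(text)), 3)
    if ARTIFACT.search(text):
        score += 0.5
    return float(score)

print(transparency_score(
    "First we pulled posts from 2022-2024 (excluded <500 words), "
    "then scored each against a checklist."
))  # 4.0
```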
How to improve
- Add a short “How we did this” box with 3–5 steps.
- Name criteria (“included posts from 2022–2024; excluded <500 words”).
- Link a lightweight artifact (checklist, schema, sheet).
Limitation Awareness
What it is
Owning the edge cases, constraints, and unknowns—without undermining your main advice.
What it looks for
- Explicit limits (“doesn’t generalize to ___”, “requires ___”).
- Uncertainty language when appropriate (preliminary, incomplete).
- Trade-offs (“faster but less flexible”).
Why it matters
Stating caveats increases perceived honesty and can maintain or increase trust when done clearly; it’s a formal expectation in research reporting.
Research anchors
- CONSORT & PRISMA require limitations/risks/uncertainty disclosure.
Scoring gist:
- Reward concrete constraints and pragmatic boundaries; micro-deduct if the limitation contradicts the same-sentence action (self-negation).
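A minimal sketch, with illustrative phrases:

```python
import re

LIMITS = re.compile(
    r"doesn'?t generalize|requires|not a fit if|works best when|trade-?off|preliminary",
    re.I,
)
SELF_NEGATION = re.compile(r"\b(?:do|use|ship)\b.*\bbut (?:don'?t|avoid|never)\b", re.I)

def limitation_score(sentence: str) -> float:
    """Reward concrete constraints; micro-deduct when the caveat cancels the action."""
    score = 1.0 if LIMITS.search(sentence) else 0.0
    if SELF_NEGATION.search(sentence):
        score -= 0.25  # the limitation contradicts the same-sentence advice
    return score

print(limitation_score("This works best when traffic exceeds 1k sessions/day."))  # 1.0
```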
How to improve
Add a “Works best when…” / “Not a fit if…” pair.
Name one trade-off per big recommendation.
Keep caveats adjacent to the claim they qualify (so readers don’t miss them).

