Our Methodology

How Exiqus analyses developers through their GitHub evidence, not arbitrary scores

Fundamental Approach: Evidence, Not Scores

Exiqus has adopted a purely evidence-based approach to developer assessment. We DO NOT assign numerical scores, ratings, or percentages to developers. Instead, we extract observable patterns from public repositories and present them as evidence for hiring decisions.

No Numerical Scoring

No "8.5/10 code quality" scores. Only observable evidence like "126 test files for 9 code files."

Observable Patterns Only

We report facts such as "27 bug fix commits (54% of total)", not subjective assessments.

Context-Aware

Same data, different insights based on your company context (Startup, Enterprise, Agency, Open Source) and seniority level (Junior, Mid-level, Senior).
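For illustration, here is a minimal sketch of how an observable pattern such as "27 bug fix commits (54% of total)" could be computed from a repository's commit history. The keyword heuristic below is an assumption made for this example, not our production classification logic.

```python
# Illustrative only: derive a bug-fix commit share from a local clone's history.
# The keyword list is an assumed heuristic, not Exiqus's actual classifier.
import subprocess

def bug_fix_commit_share(repo_path: str) -> tuple[int, int, float]:
    """Return (bug_fix_commits, total_commits, share_of_total)."""
    subjects = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    keywords = ("fix", "bug", "hotfix", "patch")  # assumed heuristic
    bug_fixes = sum(1 for s in subjects if any(k in s.lower() for k in keywords))
    total = len(subjects)
    return bug_fixes, total, (bug_fixes / total if total else 0.0)

if __name__ == "__main__":
    fixes, total, share = bug_fix_commit_share(".")
    print(f"{fixes} bug fix commits ({share:.0%} of {total} total)")
```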

Critical Data Limitations

We're transparent about what we cannot assess. Understanding these limitations is crucial for making fair hiring decisions.

Public Data Only

  • We analyse public repositories only. Private company work is invisible to us.
  • Gaps in public activity likely represent private work, not absence of skill.
  • Professional experience may be significantly greater than public evidence shows.

What We Cannot Measure

  • Actual job performance or productivity
  • Soft skills, communication, or teamwork quality
  • Problem-solving under pressure or in meetings
  • Cultural fit, learning speed, or innovation capacity

This is ONE data point for hiring decisions. Use it alongside interviews, references, and your professional judgment.

Filling the Academic & Market Gap

Exiqus is working to close a research gap that exists in both the academic literature and the commercial market: systematically linking GitHub evidence to hiring outcomes.

Academic Landscape

  • The paper "Improving Evidence-Based Tech Hiring with GitHub-Supported Resume Matching" (SANER 2025) is the first peer-reviewed work proposing GitHub analysis for hiring—and it only demonstrates feasibility, not outcome correlation
  • Prior studies (2022-2024) analysed developer reputation or productivity via GitHub metrics, not hiring outcomes
  • No peer-reviewed dataset connects GitHub activity → hiring decision → on-job success

Market Landscape

Platform              | Focus               | GitHub Integration      | Outcome Tracking
HackerRank / Codility | Puzzle scoring      | None                    | No
SeekOut / HireEZ      | Search / reach-out  | Metadata only           | No
Hirable / Turing      | Portfolio marketing | Minimal metrics         | No
Exiqus                | Evidence framework  | Full repo + PR analysis | In development (first in market)

We're the first platform systematically building a dataset that links GitHub evidence patterns to hiring outcomes.

Evidence Validation: The Science

Traditional hiring methods vary widely in predictive validity. Work-sample tests (standardized exercises) show a validity of r = 0.33. Exiqus, however, doesn't test candidates: we analyse years of real work. No research yet validates GitHub portfolio analysis for hiring, so we're building the first dataset to find out whether it works.

Hiring Method                      | Predictive Validity (r) | What This Means
Work-sample tests                  | 0.33                    | Standardized exercises like take-homes
Structured behavioral interview†   | 0.42                    | Strong when standardized
Coding puzzles / technical panels* | Unknown                 | No published validity research
Unstructured interview             | 0.22                    | Nearly random; prone to gut-feel bias
Résumé / reference review          | 0.18                    | Filtering tool only

Source: Schmidt & Oh 2016 meta-analysis; Roth et al. 2005; updated IO research 2020–2024

* No published research validates coding puzzle interviews (e.g., LeetCode, HackerRank challenges) for predicting software engineering job performance. Estimates of 0.30-0.40 are extrapolations from general cognitive ability tests (0.31) and job knowledge tests (0.40), but these are different assessment types.

† Schmidt & Oh 2016 consensus is 0.42; some studies show ranges up to 0.51-0.64 depending on structure and implementation.
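As a reading aid: squaring a validity coefficient gives the approximate share of variance in job performance the method explains. This is the standard statistical interpretation of r, not a claim from the studies cited above.

```latex
r^2 \approx \text{share of job-performance variance explained}
\qquad
\begin{aligned}
r = 0.42 &\;\Rightarrow\; r^2 \approx 0.18 \quad (\approx 18\%)\\
r = 0.33 &\;\Rightarrow\; r^2 \approx 0.11 \quad (\approx 11\%)\\
r = 0.18 &\;\Rightarrow\; r^2 \approx 0.03 \quad (\approx 3\%)
\end{aligned}
```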

Traditional Pipeline vs Evidence-First Hiring

Traditional 5-Stage Process
  • 1. Résumé screen (r ≤ 0.18)
  • 2. Phone screen (r ≈ 0.23)
  • 3. Coding challenge (r unknown)
  • 4. Technical panel (r = 0.23–0.42)
  • 5. Culture fit (r ≈ 0.23)
Signal quality: mixed (0.18–0.42)
Time: 28–34 hours
Cost: $4,700

Evidence-First with Exiqus
  • 1. ATS / basic filter (r ≤ 0.18)
  • 2. Exiqus evidence-driven interview (validity unknown—first to measure)
  • 3. Optional: final team-fit conversation
Hypothesis: r > 0.33 (better than work-sample tests)
Time: 6–10 hours
Cost: <$1,500

Our hypothesis: higher signal, lower cost, and a better candidate experience.

What We Analyze

We extract observable patterns from public GitHub data. These are factual observations, not judgments about code quality or developer ability.

Repository Data

Code Structure

File counts, language distribution, directory depth, and organization patterns

Commit History

Frequency patterns, timing, message conventions, and work distribution

Testing Evidence

Test file ratios, CI/CD configurations, and automated quality checks

Documentation

README presence, comment density, docs folders, and code explanation patterns

Repository Activity

Branch patterns, fork counts, stars, and community engagement signals
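For illustration, here is a minimal sketch of how two of these repository facts (test-file ratio and language mix by file count) could be derived from a local clone. The extension map and the "test" filename heuristic are assumptions made for this example, not our production rules.

```python
# Illustrative only: two observable repository facts from a local clone --
# test-file counts and language mix by file count.
import os
from collections import Counter

CODE_EXTENSIONS = {".py": "Python", ".ts": "TypeScript", ".js": "JavaScript",
                   ".go": "Go", ".rs": "Rust", ".java": "Java"}  # illustrative subset

def repository_facts(repo_path: str) -> dict:
    languages: Counter = Counter()
    code_files = test_files = 0
    for root, dirs, files in os.walk(repo_path):
        dirs[:] = [d for d in dirs if d != ".git"]  # skip git internals
        for name in files:
            ext = os.path.splitext(name)[1]
            if ext not in CODE_EXTENSIONS:
                continue
            code_files += 1
            languages[CODE_EXTENSIONS[ext]] += 1
            if "test" in name.lower() or "test" in os.path.basename(root).lower():
                test_files += 1
    return {"code_files": code_files,
            "test_files": test_files,
            "language_distribution": dict(languages)}

print(repository_facts("."))
```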

PR Contribution Data

PR Metadata

Counts, states, merge dates, and contribution frequency patterns

Code Changes

Additions, deletions, commit counts per PR, and change scope patterns

Quality Gates

Review decisions (APPROVED, CHANGES_REQUESTED) and approval patterns

Work Categorization

GitHub labels (feature, bug, docs) and PR type classification

Review Engagement

Comment volume, merge rates, and code review participation patterns

Collaboration Markers

Contributor counts, issue references, co-authored commits, and team coordination
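For illustration, the sketch below pulls the kind of public PR metadata described above using GitHub's documented REST endpoints for pull requests and reviews. It is a simplified example (no pagination or rate-limit handling), not our production pipeline.

```python
# Illustrative only: public PR metadata and review decisions via the GitHub REST API.
import requests

API = "https://api.github.com"

def pr_evidence(owner: str, repo: str, token: str | None = None) -> list[dict]:
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"

    pulls = requests.get(f"{API}/repos/{owner}/{repo}/pulls",
                         params={"state": "all", "per_page": 30},
                         headers=headers, timeout=30).json()

    evidence = []
    for pr in pulls:
        reviews = requests.get(f"{API}/repos/{owner}/{repo}/pulls/{pr['number']}/reviews",
                               headers=headers, timeout=30).json()
        evidence.append({
            "number": pr["number"],
            "state": pr["state"],                        # open / closed
            "merged_at": pr["merged_at"],                # None if never merged
            "labels": [label["name"] for label in pr["labels"]],
            "review_decisions": [r["state"] for r in reviews],  # APPROVED, CHANGES_REQUESTED, ...
        })
    return evidence
```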

What We DON'T Analyze

Transparency matters. Here's what our analysis explicitly does NOT cover:

Repository Limitations
  • Private Data: Private repositories (coming soon)
  • Runtime Performance: Actual execution speed or resource usage
  • Personal Metrics: Individual productivity or time tracking
  • Subjective Quality: "Good" vs "bad" code judgments
PR Analysis Limitations
  • Private PRs: Only public PRs are accessible
  • Comment Content: Review decision counts only, not text analysis
  • Code Quality: No static analysis or runtime testing
  • Behavioral Inferences: No personality traits or "cultural fit"

Context-Aware Analysis

We understand that different roles require different evaluation. A startup needs builders who can experiment and iterate. An enterprise needs architects who consider scale and maintainability. A junior developer shows different evidence patterns than a senior technical leader. We tailor our analysis accordingly.

Company Context

Startup Context

For experimental projects, we explore innovation, learning agility, and rapid prototyping skills. Perfect for evaluating builders and early-stage contributors.

Enterprise Context

For production-ready code, we assess architectural decisions, team collaboration, and maintainability practices.

Open Source Context

For community projects, we evaluate contribution quality, documentation, and collaborative development skills.

Agency Context

For client-ready developers, we assess versatility, professional practices, and ability to deliver under constraints.

Seniority Context

Junior (0-2 years)

We look for learning fundamentals, code comprehension, and growth trajectory. Evidence of experimentation, following established patterns, and building confidence through repetition.

Mid-Level (3-5 years)

We assess independent problem-solving, technical decision-making, and ownership of features. Evidence of navigating ambiguity, balancing tradeoffs, and contributing without heavy guidance.

Senior (5+ years)

We examine technical leadership, architectural thinking, and long-term impact. Evidence of mentorship, system design, cross-functional collaboration, and guiding team-level technical decisions.

Transparency Note: If a repository has limited patterns for a specific context, we'll tell you. An experimental notebook might generate fewer enterprise-focused questions - and that's honest feedback, not a limitation.
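For illustration, here is a hypothetical sketch of how a company context and seniority level could map to the evidence categories that get emphasised. The category names and mappings are illustrative assumptions, not our real configuration.

```python
# Hypothetical sketch: map (company context, seniority) to emphasised evidence
# categories. Names and mappings are illustrative assumptions only.
CONTEXT_EMPHASIS = {
    ("startup", "junior"):    ["learning trajectory", "experimentation", "prototyping speed"],
    ("startup", "senior"):    ["feature ownership", "iteration speed", "pragmatic tradeoffs"],
    ("enterprise", "mid"):    ["testing evidence", "review engagement", "maintainability signals"],
    ("enterprise", "senior"): ["architectural decisions", "collaboration markers", "long-term maintenance"],
    ("open_source", "mid"):   ["documentation", "community engagement", "contribution quality"],
    ("agency", "mid"):        ["versatility", "delivery cadence", "professional practices"],
}

def emphasised_evidence(company_context: str, seniority: str) -> list[str]:
    # Fall back to a neutral, evidence-only view when no tailored mapping exists.
    return CONTEXT_EMPHASIS.get((company_context, seniority), ["all observable patterns"])

print(emphasised_evidence("enterprise", "senior"))
```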

The Evidence Hierarchy

Important: Insights are Repository-Dependent, Not Fixed

The number of insights generated is deterministic based on repository content, not a fixed quota. A minimal repo might generate only 1-2 insights, while feature-rich repos like tinygrad or facebook/react can generate 25-30 insights. We never generate artificial insights just to hit a number - every insight is based on actual evidence found in the repository.

Direct Observations

Highest Confidence

File counts and sizes • Language percentages • PR counts and merge status • Review decisions (APPROVED, CHANGES_REQUESTED)

Derived Patterns

High Confidence

Test coverage ratios • Commit frequency trends • Merge success rates • Review cycle patterns

Development Patterns

Medium Confidence

Collaboration style (co-authored commits) • Code maintenance habits • Assignment patterns (PRs assigned vs authored) • Technical scope (surgical vs architectural)

Contextual Insights

Requires Human Interpretation

Domain expertise markers • Feature ownership patterns • Long-term commitment (sustained contributions) • Community engagement (external vs internal)
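For illustration, a minimal sketch of how an insight could carry its place in this hierarchy. Field names and level labels are assumptions made for the example.

```python
# Illustrative sketch: an insight tagged with its level in the evidence hierarchy.
from dataclasses import dataclass
from enum import Enum

class Confidence(Enum):
    DIRECT_OBSERVATION = "highest confidence"             # file counts, PR merge status
    DERIVED_PATTERN = "high confidence"                   # test coverage ratios, merge rates
    DEVELOPMENT_PATTERN = "medium confidence"             # collaboration style, maintenance habits
    CONTEXTUAL_INSIGHT = "requires human interpretation"  # domain expertise markers

@dataclass
class Insight:
    statement: str        # the factual observation, e.g. "126 test files for 9 code files"
    evidence: list[str]   # raw data points the statement is built from
    confidence: Confidence

example = Insight(
    statement="126 test files for 9 code files",
    evidence=["tests/ directory listing", "src/ directory listing"],
    confidence=Confidence.DIRECT_OBSERVATION,
)
print(example.statement, "->", example.confidence.value)
```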

Repository Size Limits by Tier

Our system analyses repositories of all sizes, with tier-based limits to ensure optimal performance:

FREE/STARTER

Up to 500MB

Standard projects and libraries

GROWTH

Up to 2GB

Large frameworks and applications

SCALE

Up to 5GB

Enterprise systems and major projects

Note: If a repository exceeds your tier limit, the system will suggest upgrading to analyse larger repositories. We never attempt partial or incomplete analysis.
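For illustration, a minimal sketch of the tier-gating behaviour described in the note above. The limits mirror the table; the function itself is illustrative, not our implementation.

```python
# Illustrative sketch of tier gating: repositories above the tier limit get an
# upgrade suggestion rather than a partial analysis. Limits mirror the tier table.
TIER_LIMITS_MB = {"free": 500, "starter": 500, "growth": 2048, "scale": 5120}

def check_repo_size(repo_size_mb: float, tier: str) -> str:
    limit = TIER_LIMITS_MB[tier.lower()]
    if repo_size_mb > limit:
        return (f"Repository is {repo_size_mb:.0f} MB, above the {limit} MB limit for the "
                f"{tier} tier. Upgrade to analyse it; partial analysis is never attempted.")
    return "Within the tier limit: full analysis can proceed."

print(check_repo_size(3200, "growth"))
```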

Handling Edge Cases

Minimal/Empty Repos

When a repository has less than 10KB of code and fewer than 5 files, we honestly tell you it lacks sufficient content for meaningful analysis rather than generating fluff

Monorepos

Smart sampling ensures efficient analysis without timeouts, maintaining accuracy despite size (up to 5GB on Scale tier)

Documentation-Only

Properly classified with no code quality claims, focusing solely on documentation evidence
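For illustration, a minimal sketch of the edge-case checks described above. The 10KB and 5-file thresholds come from the text; the classification labels and inputs are assumptions made for the example.

```python
# Illustrative sketch of the edge-case checks above.
def classify_repository(total_code_bytes: int, code_files: int, doc_files: int) -> str:
    if code_files == 0 and doc_files > 0:
        return "documentation_only"    # documentation evidence only, no code-quality claims
    if total_code_bytes < 10 * 1024 and code_files < 5:
        return "insufficient_content"  # reported honestly; no insights are generated
    return "standard_analysis"

print(classify_repository(total_code_bytes=4_000, code_files=2, doc_files=1))
```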

Interview Question Generation

Available across all tiers, our AI generates questions that are:

Evidence-Based

Reference specific repository data

Practice-Focused

Centered on technical approaches used in real work

Context-Aware

Tailored to your hiring situation

Open-Ended

Encourage discussion, not yes/no

Example Question: "I noticed 27 of your commits were bug fixes. Walk me through your approach to debugging in a fast-moving startup environment."
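For illustration, a hypothetical sketch of how an observed pattern and a hiring context could be combined into an open-ended question like the example above. The template wording is illustrative, not our actual generation prompt.

```python
# Hypothetical sketch: combine an observed pattern with the hiring context to
# produce an open-ended, evidence-based question.
def evidence_question(observation: str, context: str) -> str:
    return (f"I noticed {observation}. "
            f"Walk me through your approach to this kind of work in a {context} environment.")

print(evidence_question("27 of your commits were bug fixes", "fast-moving startup"))
```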

The Human Element

Our analysis is designed to augment human judgment, not replace it:

  • Provide evidence for discussion, not decisions
  • Generate questions for interviews, not answers
  • Surface patterns for exploration, not conclusions
  • Augment expertise with data, not replace it

Last Updated: October 2025

Methodology Version: 2.0 - Evidence-Based Approach