FideAI

Benchmarks, harnesses, and reviewer systems

Faith-Facing AI Evaluation Infrastructure

Faith-facing AI needs public evaluation and verification infrastructure that can test model behavior, prove constraints, validate retrieval systems, and calibrate human reviewer judgment.

Research on benchmark validity, formal verification, reviewer calibration, scorer reliability, red-team design, and proof-carrying citations.

Benchmark and platform design

Public-interest model comparison, harness testing, and benchmark infrastructure for faith-facing systems.

6 open questions

Formal verification and source faithfulness

Source fidelity, proof-carrying citations, tradition-specific constraints, and verified retrieval.

8 open questions

FID-056 agenda

Formal Verification for Sacred Text Fidelity

How can faith-facing AI systems be formally checked to confirm that they quote, paraphrase, reference, and contextualize sacred texts faithfully within a specified text edition, translation, canon, and interpretive context?

FID-057 agenda

Proof-Carrying Citations for Faith-Facing AI

Can faith-facing AI answers carry checkable citation proofs that show which claims are directly supported by sources, which are inferred, which are uncertain, and which require human or tradition-specific authority?
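
One way to picture a checkable citation proof is as a labeled claim plus the evidence a verifier can test. The sketch below is illustrative only: the names `Support`, `CitedClaim`, and `check_claim`, the four labels, and the whitespace-insensitive substring test are all assumptions made for this example, not a format proposed by the agenda. It checks only that a "direct" claim's quoted span really appears in the named source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Support(Enum):
    """Hypothetical labels mirroring the four categories in the question."""
    DIRECT = "direct"                    # supported by an exact source span
    INFERRED = "inferred"                # derived from a source, not stated in it
    UNCERTAIN = "uncertain"              # no adequate support found
    NEEDS_AUTHORITY = "needs_authority"  # requires human or tradition-specific authority


@dataclass(frozen=True)
class CitedClaim:
    text: str
    support: Support
    source_id: Optional[str] = None    # key into the bounded source set
    quoted_span: Optional[str] = None  # exact span offered as evidence


def normalize(s: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(s.lower().split())


def check_claim(claim: CitedClaim, sources: Dict[str, str]) -> bool:
    """Return True if the claim's label is consistent with its attached evidence."""
    if claim.support is Support.DIRECT:
        # A direct claim must name a source and quote a non-empty span from it.
        if claim.source_id is None or not claim.quoted_span:
            return False
        source = sources.get(claim.source_id)
        return source is not None and normalize(claim.quoted_span) in normalize(source)
    if claim.support is Support.INFERRED:
        # An inferred claim must at least point at a source in the set.
        return claim.source_id in sources
    # UNCERTAIN and NEEDS_AUTHORITY claims carry no source proof to check.
    return True
```

A check like this can only confirm that a quoted span exists in the source; whether an inference is sound, or whether a question belongs to human authority, would still need reviewer judgment.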

FID-058 agenda

Tradition-Specific Constraint Formalization

How can tradition-specific boundaries, source hierarchies, doctrinal constraints, and disagreement patterns be translated into machine-checkable specifications without flattening differences across faith traditions?

FID-059 agenda

Authority-Boundary Verification for Pastoral-Adjacent AI

Can faith-facing AI systems be verified to preserve the boundaries between explanation, spiritual encouragement, moral reflection, pastoral or clerical authority, clinical or legal advice, and situations requiring human care?

FID-060 agenda

Cross-Faith Sacred Text and Source Schema

What metadata schema is needed for faith-facing AI systems to represent sacred texts, commentaries, institutional documents, oral traditions, translations, editions, and authority levels across faith traditions?
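
As a rough starting point, such a schema could be prototyped as a typed record with a small validation step. Everything in this sketch is an assumption made for illustration: the field names, the `SourceKind` values, and the rule that a sacred text must name an edition or a translation. The authority level is kept as a free-form, tradition-defined label so that it does not imply a single ranking across traditions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SourceKind(Enum):
    """Hypothetical source categories taken from the question's own list."""
    SACRED_TEXT = "sacred_text"
    COMMENTARY = "commentary"
    INSTITUTIONAL_DOCUMENT = "institutional_document"
    ORAL_TRADITION = "oral_tradition"


@dataclass
class SourceRecord:
    source_id: str
    kind: SourceKind
    tradition: str
    title: str
    language: str
    edition: Optional[str] = None
    translation: Optional[str] = None
    authority_level: Optional[str] = None  # tradition-defined label, not a global rank
    recognized_by: List[str] = field(default_factory=list)  # communities accepting the source

    def validation_errors(self) -> List[str]:
        """Return human-readable problems; an empty list means the record passes."""
        errors = []
        if not self.tradition:
            errors.append("tradition is required")
        # Illustrative rule: written sacred texts must pin down which text is meant.
        if self.kind is SourceKind.SACRED_TEXT and not (self.edition or self.translation):
            errors.append("sacred texts need an edition or a translation")
        return errors
```

Oral traditions pass without an edition here, which is one example of a per-kind rule that a real schema would need to settle with the relevant communities.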

FID-061 agenda

Theological Contradiction and Entailment Stress Tests

Can faith-facing AI systems be tested for whether their answers contradict, entail, overstate, understate, or misrepresent claims within a bounded source set and specified faith tradition?

FID-062 agenda

Verified Retrieval Pipelines for Faith-Facing RAG

How can faith-facing retrieval-augmented generation (RAG) pipelines be verified to retrieve authoritative, relevant, context-preserving sources before generating answers about sacred texts, doctrine, practice, or institutional policy?

FID-063 agenda

Human-Reviewer-to-Formal-Spec Translation

How can theologians, clergy, scholars, ministry practitioners, and community reviewers translate qualitative judgments about faith-facing AI into formal specifications that are faithful to expert intent and usable in evaluation?
