Powered by open-source models on Cloudflare Workers AI. No API key required.
First run downloads the embedding model (~25 MB), then it is cached in your browser.
Position a TESL abstract in seconds.
Paste an article abstract. ReviewLens reads it with an open-source model, then positions it against 2,788 SSCI abstracts: what the study is, where it sits in the field, what it is most like, which journals fit, and whether the area is crowded or open.
What you'll get
- 1. What is this study?
- 2. Where it fits in the field
- 3. What is it most like?
- 4. Which journals fit best?
- 5. Is this crowded or novel?
Try a sample
Runs on open-source models on Cloudflare Workers AI. No API key. Your abstract is embedded locally in your browser; of the reference corpus, only derived assets and snippets of at most 200 characters are shipped.
Verdict
Synthesized from grounded extraction and the 25 nearest of 2,788 SSCI abstracts. Open each section below for the evidence.
What is this study?
Grounded extraction from the abstract, with verbatim evidence.
Full extraction table
Where it fits in the field
Your abstract's place in the applied-linguistics landscape.
Nearest fine topic and candidates
Show the field map (2D)
Each dot is one of 2,788 SSCI abstracts, positioned by a 2D UMAP projection of its MiniLM embedding; the axes are not interpretable. Your abstract's marker is interpolated from its nearest neighbours, not a direct projection.
What is it most like?
Closest published abstracts by embedding similarity.
More similar abstracts (4–10)
Snippets are ≤200 characters. Full text is available from the source journal.
Which journals fit best?
Journals of the most similar abstracts, similarity-weighted. A positioning signal, not a recommendation.
Full ranking and method
Is this crowded or novel?
Density of close neighbours in the corpus. A crowded neighbourhood suggests incremental work; a sparse one suggests potential novelty.
How this profile compares to the field
Field-benchmark shares are a heuristic profile comparison, not a quality judgment.
Bulk screening for many abstracts. Same per-abstract pipeline, flat results table.
Separate each abstract with one or more blank lines. Up to 50 abstracts per batch; each abstract is limited to 6000 characters. You can also load a plain-text (.txt) file.
Blank-line delimited. Batch cap: 50 items.
About / Methods
ReviewLens-for-TESL is a research prototype that positions a TESL article abstract against a reference corpus of 2,788 SSCI abstracts. It opens with a synthesized verdict, then answers five questions in order: what the study is (a grounded structured extraction with verbatim evidence quotes), where it fits in the field (a 2D field-positioning map, with macro-theme and fine-topic placement as supporting detail), what it is most like (nearest neighbours), which journals fit (a similarity-weighted ranking), and whether the area is crowded or open (a corpus-density novelty signal). The extraction backend runs entirely on open-source models hosted on Cloudflare Workers AI (no user API key is required). It mirrors the validation-transparency framing of Mizumoto (2025), Automated Error Analysis (SSLA).
Pipeline
- Embedding: client-side, in your browser. The abstract is embedded with sentence-transformers all-MiniLM-L6-v2 via transformers.js (pinned to @huggingface/transformers@4.2.0), mean-pooled and L2-normalized.
- Structured extraction: server-side, via open-weight models on Cloudflare Workers AI (default Qwen3 30B-A3B; the model is selectable at run time); no API key required. Each construct and claim may carry a short verbatim evidence quote. The server checks that each quote is a near-substring of the input abstract and drops it otherwise, so the extraction is grounded and auditable.
- Positioning: cosine nearest neighbours and nearest topic/macro centroid are computed in the browser against the shipped derived embeddings.
- Verdict: the headline sentence and chips are a synthesis of values already produced by the extraction and positioning steps; every clause is restated with evidence in one of the five sections below. It is not a new, unaudited claim.
- Batch screening: the same per-abstract pipeline is run over many abstracts with a small client-side concurrency cap; it calls the same extraction endpoint once per abstract.
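The embedding and positioning steps above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the ReviewLens source: the function names (`meanPool`, `l2Normalize`, `nearestNeighbours`) and the data shapes are hypothetical, the token vectors would in practice come from the MiniLM model via transformers.js, and the pooling here ignores padding masks for brevity. Because every vector is L2-normalized, cosine similarity reduces to a dot product.

```typescript
// Hypothetical helpers for illustration only; not the actual ReviewLens code.

/** Mean pooling: average the token vectors into one sentence vector. */
function meanPool(tokenVectors: number[][]): number[] {
  if (tokenVectors.length === 0) throw new Error("no token vectors");
  const dim = tokenVectors[0].length;
  const pooled = new Array<number>(dim).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < dim; i++) pooled[i] += vec[i] / tokenVectors.length;
  }
  return pooled;
}

/** Scale a vector to unit length (L2 normalization). */
function l2Normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) throw new Error("cannot normalize a zero vector");
  return vec.map((x) => x / norm);
}

/** Dot product; equals cosine similarity when both inputs are unit vectors. */
function dot(a: number[], b: number[]): number {
  let total = 0;
  for (let i = 0; i < a.length; i++) total += a[i] * b[i];
  return total;
}

interface Neighbour {
  index: number;      // row of the abstract in the corpus matrix
  similarity: number; // cosine similarity to the query
}

/** Return the k corpus rows most similar to the query, best first. */
function nearestNeighbours(
  query: number[],
  corpus: number[][],
  k: number
): Neighbour[] {
  return corpus
    .map((vec, index) => ({ index, similarity: dot(query, vec) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);
}
```

A brute-force scan like this is usually adequate for a corpus of a few thousand vectors, which is one plausible reason the positioning step can run in the browser. The same ranking could be applied to topic or macro-theme centroids by passing the centroid matrix as `corpus`.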
Approximations and caveats
- The 2D field-positioning marker for your abstract is interpolated from the 2D positions of its nearest neighbours (a similarity-weighted centroid). It is an approximate position, not a direct UMAP projection of your abstract.
- The 2D map uses a UMAP projection of the corpus embeddings. UMAP axes are not interpretable, so no numeric axes are shown.
- The crowded/novel signal is a heuristic proxy from neighbour density over a corpus of 2,788 SSCI abstracts, which is not exhaustive; treat it as a positioning cue, not a novelty verdict.
- LLM extraction can be imperfect; treat fields as a first-pass screen to verify.
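The interpolated map marker described in the first caveat can be illustrated as a similarity-weighted centroid. The sketch below is an assumption about how such a centroid might be computed, not the tool's actual code: `PlacedNeighbour` and `interpolateMarker` are hypothetical names, and clamping negative similarities to zero is a choice made here so that weights stay non-negative.

```typescript
// Hypothetical sketch; names and the clamping rule are assumptions.

interface PlacedNeighbour {
  x: number;          // 2D UMAP coordinate of the neighbour
  y: number;
  similarity: number; // cosine similarity to the query abstract
}

/** Place the query at the similarity-weighted mean of its neighbours' 2D positions. */
function interpolateMarker(
  neighbours: PlacedNeighbour[]
): { x: number; y: number } {
  let weightSum = 0;
  let x = 0;
  let y = 0;
  for (const n of neighbours) {
    const w = Math.max(n.similarity, 0); // assumed: ignore negative similarities
    weightSum += w;
    x += w * n.x;
    y += w * n.y;
  }
  if (weightSum === 0) throw new Error("no neighbours with positive similarity");
  return { x: x / weightSum, y: y / weightSum };
}
```

Because the marker is an average of existing points, it always falls inside the region spanned by its neighbours. That is why it should be read as an approximate position: an abstract that is genuinely unlike the corpus would still be drawn among its least-dissimilar neighbours.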
Validation status
Reliability numbers (human inter-rater agreement and LLM-vs-human agreement) are pending the n = 100 double-coding study. See the Validation tab for the protocol and current status.
Copyright and data
Copyright (c) 2026 Jewoong Moon and Wong Wei Lun. All rights reserved. Only derived assets (embeddings, 2D coordinates, metadata) and snippets of at most 200 characters are shipped; full abstract text remains with the source publishers.