The scientific PDF parser ML engineers already trust. Now a one-line API. No CUDA, no 5GB model downloads, no GPU server to babysit.
2 free papers/month · No credit card · API key in 30 seconds
You just don't want to run a GPU server to use it.
We're running Marker, Mistral OCR, LlamaParse, Docling, and OpenDataLoader on 10 real papers from arXiv, bioRxiv, and published journals, and publishing the equation accuracy, citation linking, and table fidelity of each. No cherry-picking.
Benchmark drops next week. Follow along or subscribe to get the results.
Not just arXiv. Real research comes from journals, preprint servers, pharma reports, and technical PDFs that live outside arXiv. We parse all of them.
# 1. Submit a paper by URL
curl -X POST \
  https://rapid-api-host/parse-paper \
  -H "X-RapidAPI-Key: $KEY" \
  -F "url=https://arxiv.org/pdf/1706.03762"
# → {"call_id": "fc-01K...", "status": "queued"}
# 2. Poll for the result ($ID is the call_id returned above)
curl https://rapid-api-host/parse-paper/$ID \
  -H "X-RapidAPI-Key: $KEY"
# → {
# "status": "done",
# "result": {
# "title": "Attention Is All You Need",
# "markdown": "# ...$$...$$...",
# "char_count": 47112
# }
# }
Typical parse: 60–180 seconds. Jobs are submitted and polled asynchronously, so there are no 30-second request timeouts to fight.
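The polling step could look roughly like this in Python. This is a minimal sketch, not an official client. It assumes the placeholder host from the curl example, and it assumes "queued" and "done" are the statuses you will see (the only two shown above). The names `fetch_status` and `poll_until_done` and the interval and timeout defaults are illustrative, not part of the API.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: placeholder host copied from the curl example; substitute the real one.
BASE_URL = "https://rapid-api-host/parse-paper"


def fetch_status(call_id: str, api_key: str) -> dict:
    """GET the job status, mirroring the second curl call."""
    req = urllib.request.Request(
        f"{BASE_URL}/{call_id}",
        headers={"X-RapidAPI-Key": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_until_done(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    max_wait_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until its status is "done", then return the "result" object.

    Raises TimeoutError once max_wait_s seconds of sleeping have elapsed.
    Any status other than "done" is treated as still in progress, because
    the example above does not show what a failed job looks like.
    """
    waited = 0.0
    while True:
        body = fetch()
        if body.get("status") == "done":
            return body["result"]
        if waited >= max_wait_s:
            raise TimeoutError(f"job not done after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

To use it, pass a closure over your own call_id and key, for example `poll_until_done(lambda: fetch_status(call_id, api_key))`. With a 60–180 second typical parse, a 5-second interval and a 5-minute ceiling is a reasonable starting point.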
Display math $$...$$ and inline math $...$ extracted exactly. No formula-not-decoded placeholders.
Inline references in the body link to entries in the References section. No manual matching.
Real markdown tables, not images, not lost. Drop straight into your vector store.
Title, abstract, and headings extracted automatically. Perfect for chunking.
Submit, get a call_id, poll. Built for the realities of parsing 50-page papers.
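Because headings and tables come back as plain markdown, chunking for a vector store can be a few lines of code. The sketch below is one possible approach, assuming the returned markdown uses ATX headings (lines starting with `#`). The function name `chunk_by_heading` and the chunk shape are illustrative choices, not something the API returns.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split markdown into one chunk per heading.

    Text before the first heading is kept under an empty heading.
    Equations and tables stay inside whichever section they appear in.
    Limitation: a '# ' line inside a fenced code block would be treated
    as a heading; this sketch does not track fences.
    """
    chunks: list[dict] = []
    heading = ""
    body: list[str] = []

    def flush() -> None:
        # Skip the empty preamble, but keep headings that have no body text.
        if heading or any(line.strip() for line in body):
            chunks.append({"heading": heading, "text": "\n".join(body).strip()})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return chunks
```

Each chunk carries its heading, so you can prepend it to the text before embedding or store it as metadata for filtering.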
Scored ~10.5/12 on equation extraction. Closest open-source alternative scored ~5/12.
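To spot-check equation extraction on your own papers, you could pull the math spans out of the returned markdown and compare them against the PDF. This is a rough sketch that relies only on the `$$...$$` and `$...$` delimiters described above. The function name `extract_math` is illustrative, and the inline pattern will also match pairs of currency dollar signs on the same line, so treat the counts as approximate.

```python
import re

# Display math: $$ ... $$, possibly spanning several lines.
DISPLAY = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)
# Inline math: $ ... $ on a single line, searched after display math is removed.
INLINE = re.compile(r"\$([^$\n]+?)\$")


def extract_math(markdown: str) -> dict:
    """Return the display and inline math spans found in a markdown string."""
    display = [span.strip() for span in DISPLAY.findall(markdown)]
    # Blank out display blocks so their delimiters are not re-read as inline math.
    without_display = DISPLAY.sub(" ", markdown)
    inline = [span.strip() for span in INLINE.findall(without_display)]
    return {"display": display, "inline": inline}
```

Comparing `len(extract_math(md)["display"])` against the number of numbered equations in the source PDF is a quick first signal before reading the LaTeX line by line.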
Free to start. Pay only for papers you actually parse.
Free tier. No credit card. 5 papers to test it on your real workflow.
Get Your API Key →