Free, open search API powered by BM25 ranking and AI reranking via llama3.2:1b. No API key required for public use.
Base URL
https://luisearch.pages.dev
All endpoints return JSON responses. CORS is enabled, so you can call this API from any browser or server.
Authentication
Most endpoints work without authentication. An API key gives you attribution in the query logs and may unlock higher priority as the service grows.
How to include your key
Pass your key using either method:
# Query parameter
GET /api/search?q=linux&key=ls_your_key_here
# Authorization header
GET /api/search?q=linux
Authorization: Bearer ls_your_key_here
1. Request a key
Go to /register and fill in your name, email, and what you're building.
2. Wait for approval
Keys are manually reviewed and approved by the admin. You'll receive a key in ls_... format immediately, but it won't be active until approved.
3. Use your key
Pass it as ?key=ls_... or Authorization: Bearer ls_... on any request.
Passing an invalid or pending key returns 401 Unauthorized. Omitting the key entirely works fine — unauthenticated requests are allowed.
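As a rough illustration of the two ways to pass a key, the sketch below builds (but does not send) a search request with Python's standard library. The helper name build_search_request and its use_header flag are hypothetical and not part of the API; only the ?key= parameter and the Authorization: Bearer header come from the text above.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://luisearch.pages.dev"

def build_search_request(query, api_key=None, use_header=True):
    """Build (without sending) a GET request for /api/search.

    Hypothetical helper: api_key is optional, and use_header picks between
    the Authorization header and the ?key= query parameter.
    """
    params = {"q": query}
    headers = {}
    if api_key is not None:
        if use_header:
            headers["Authorization"] = f"Bearer {api_key}"
        else:
            params["key"] = api_key
    url = f"{BASE_URL}/api/search?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(url, headers=headers, method="GET")

# Header variant: the key stays out of the URL.
req = build_search_request("linux", api_key="ls_your_key_here")
print(req.full_url)
print(req.get_header("Authorization"))
```

Omitting api_key produces a plain unauthenticated request, which the text above says is allowed.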
Endpoints
GET /api/search
Search the index. Results are scored with BM25 + title boost, then AI-reranked by llama3.2:1b for better relevance. Returns up to 20 results.
Query Parameters
Name | Type | Description
q (required) | string | The search query
key (optional) | string | Your API key (or use the Authorization header)
Example Request
GET https://luisearch.pages.dev/api/search?q=linux+kernel
Example Response
{
"results": [
{
"id": 42,
"url": "https://kernel.org",
"title": "The Linux Kernel Archives",
"snippet": "The Linux kernel is the core of the Linux operating system..."
},
// up to 20 results, ordered by relevance
]
}
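A minimal Python sketch of consuming this response, assuming the body has exactly the shape shown in the example. The names SearchResult, parse_search_response, and search are illustrative, not part of the API.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import List

@dataclass
class SearchResult:
    id: int
    url: str
    title: str
    snippet: str

def parse_search_response(body: str) -> List[SearchResult]:
    """Turn a /api/search JSON body into typed results (empty list if none)."""
    data = json.loads(body)
    return [
        SearchResult(r["id"], r["url"], r["title"], r["snippet"])
        for r in data.get("results", [])
    ]

def search(query: str) -> List[SearchResult]:
    """Call the live endpoint without a key (performs a network request)."""
    url = "https://luisearch.pages.dev/api/search?" + urllib.parse.urlencode({"q": query})
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_search_response(resp.read().decode("utf-8"))

# Offline demo using the example response from this page.
sample = (
    '{"results": [{"id": 42, "url": "https://kernel.org", '
    '"title": "The Linux Kernel Archives", '
    '"snippet": "The Linux kernel is the core of the Linux operating system..."}]}'
)
for result in parse_search_response(sample):
    print(result.title, result.url)
```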
GET /api/stats
Returns index statistics. Useful for showing users how large the index is.
Example Request
GET https://luisearch.pages.dev/api/stats
Example Response
{
"pages": 5000,
"terms": 123004
}
GET /api/hosts
Returns all indexed domains with their page counts, sorted by most pages. Up to 500 domains.
Save your key immediately after registering — it's only shown once. Use it as ?key=ls_... once approved.
Response Schema
Each result object in the /api/search response contains these fields:
Field | Type | Description
id | integer | Internal page ID in the index
url | string | Full URL of the indexed page
title | string | Page title from the HTML <title> tag
snippet | string | First 300 characters of the page content
Error Codes
Status | Message | Description
400 | name and email required | Missing fields in /api/keys/register
401 | Invalid or pending API key | Key doesn't exist or isn't approved yet
401 | Unauthorized | Admin endpoint called without valid session token
404 | Not found | Endpoint doesn't exist
All errors are returned as JSON: {"error": "message here"}
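One way a client might surface these errors, sketched in Python with only the standard library. The helper names error_message and fetch_json are hypothetical. The sketch assumes the {"error": "..."} shape described above and falls back to the bare status code if the body is not JSON.

```python
import json
import urllib.error
import urllib.request

def error_message(status: int, body: bytes) -> str:
    """Extract the message from an {"error": "..."} body, if present."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid UTF-8 and invalid JSON
        return f"HTTP {status}"
    if isinstance(payload, dict) and isinstance(payload.get("error"), str):
        return f"HTTP {status}: {payload['error']}"
    return f"HTTP {status}"

def fetch_json(url: str):
    """GET a URL and decode JSON; raise RuntimeError with the API's message on failure."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        raise RuntimeError(error_message(err.code, err.read())) from err

# Offline demo with a body shaped like the 401 row in the table.
print(error_message(401, b'{"error": "Invalid or pending API key"}'))
```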
How it works
1. Deep Crawler
Starts from 400+ seed URLs and follows every link it finds via BFS traversal. Stores page title, content, and URL into SQLite on an external SSD. No per-host limit — it crawls entire domains.
2. BM25 + Title Boost
When you search, pages are scored using BM25 (the same algorithm Elasticsearch and Solr use). Pages where query terms appear in the title get a +3 score bonus per matching term.
3. AI Reranking
The top 20 BM25 results are sent to llama3.2:1b running locally via Ollama, which reorders them by relevance to your specific query. This adds semantic understanding on top of keyword matching.
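The scoring in step 2 can be sketched roughly as follows. This is not the engine's actual code: the whitespace tokenizer, the K1 and B values, and the IDF variant are common defaults assumed for illustration. Only the +3 bonus per query term found in the title comes from the description above.

```python
import math
from collections import Counter

K1 = 1.5           # assumed: common BM25 default, not stated in the docs
B = 0.75           # assumed: common BM25 default, not stated in the docs
TITLE_BONUS = 3.0  # from the docs: +3 per query term found in the title

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def bm25_scores(query, docs):
    """Score docs (dicts with "title" and "content") against query."""
    tokenized = [tokenize(d["content"]) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    terms = set(tokenize(query))
    # Document frequency: number of docs whose content contains each term.
    df = {t: sum(1 for toks in tokenized if t in toks) for t in terms}
    scores = []
    for doc, toks in zip(docs, tokenized):
        tf = Counter(toks)
        score = 0.0
        for t in terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + K1 * (1 - B + B * len(toks) / avg_len)
            score += idf * tf[t] * (K1 + 1) / norm
        # Title boost: flat bonus for every query term present in the title.
        score += TITLE_BONUS * len(terms & set(tokenize(doc["title"])))
        scores.append(score)
    return scores

docs = [
    {"title": "Linux Kernel Archives", "content": "the linux kernel source"},
    {"title": "Cooking", "content": "linux recipes for kernel hackers and cooks"},
    {"title": "Gardening", "content": "plants and soil"},
]
ranked = sorted(zip(bm25_scores("linux kernel", docs), (d["title"] for d in docs)), reverse=True)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

In this sketch the title bonus is added even when the term is absent from the page content; whether the real engine does the same is not stated.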
// C# (.NET 6 or later, top-level statements)
using System.Net.Http.Json;

// Query /api/search and return the results array (empty if the body is null).
async Task<SearchResult[]> LuiSearch(string query, string? apiKey = null)
{
    using var client = new HttpClient();
    var url = $"https://luisearch.pages.dev/api/search?q={Uri.EscapeDataString(query)}";
    if (apiKey != null) url += $"&key={Uri.EscapeDataString(apiKey)}";
    // GetFromJsonAsync matches property names case-insensitively, so "results" binds to Results.
    var resp = await client.GetFromJsonAsync<SearchResponse>(url);
    return resp?.Results ?? Array.Empty<SearchResult>();
}

var results = await LuiSearch("linux kernel");
foreach (var r in results) Console.WriteLine($"{r.Title}\n  {r.Url}\n");

// Type declarations must come after all top-level statements.
record SearchResult(int Id, string Url, string Title, string Snippet);
record SearchResponse(SearchResult[] Results);
// Kotlin
// build.gradle: implementation("com.squareup.okhttp3:okhttp:4.12.0")
//               implementation("org.json:json:20240303")
import okhttp3.OkHttpClient
import okhttp3.Request
import org.json.JSONObject
import java.net.URLEncoder

// Query /api/search and return each result's title and URL.
fun luisearch(query: String, apiKey: String? = null): List<Map<String, String>> {
    val encoded = URLEncoder.encode(query, "UTF-8")
    var url = "https://luisearch.pages.dev/api/search?q=$encoded"
    if (apiKey != null) url += "&key=${URLEncoder.encode(apiKey, "UTF-8")}"
    val client = OkHttpClient()
    val req = Request.Builder().url(url).build()
    // use {} closes the response once the body has been read.
    val body = client.newCall(req).execute().use { it.body!!.string() }
    val results = JSONObject(body).getJSONArray("results")
    return (0 until results.length()).map { i ->
        val r = results.getJSONObject(i)
        mapOf("title" to r.getString("title"), "url" to r.getString("url"))
    }
}

fun main() {
    luisearch("linux kernel").forEach { r ->
        println("${r["title"]}\n  ${r["url"]}\n")
    }
}
Embed Widget
Add a search box to your own site with a few lines of HTML.
FAQ
Is the API free to use?
Yes, completely free. No rate limits, no paywalls. This is a personal project and the API is open to everyone.
Do I need an API key?
No. The search endpoint works without any key. A key gives you attribution in the logs and may provide benefits in the future as the service evolves. If you pass an invalid key, you'll get a 401 — just don't pass a key at all if you don't have one.
How is the index built?
A custom Python crawler starts from 400+ seed URLs and follows links recursively using BFS. Every discovered page has its title and content extracted and stored in SQLite on an external SSD. The index currently has 5,000+ pages across 400+ domains.
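The crawl loop described here might look roughly like the sketch below. It is not the project's crawler: the fetch callback, the pages table layout, and the max_pages safety cap are assumptions made so the example runs offline against a small fake link graph.

```python
import sqlite3
from collections import deque

def crawl(seeds, fetch, db, max_pages=100):
    """Breadth-first crawl starting from seeds.

    fetch(url) is a hypothetical callback returning (title, content, links),
    or None if the page cannot be fetched. max_pages is a cap for this sketch.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT, content TEXT)"
    )
    queue = deque(seeds)   # FIFO queue gives BFS order
    seen = set(seeds)      # avoid re-queueing URLs already discovered
    stored = 0
    while queue and stored < max_pages:
        url = queue.popleft()
        page = fetch(url)
        if page is None:
            continue
        title, content, links = page
        db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", (url, title, content))
        stored += 1
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    db.commit()
    return stored

# Fake four-page site standing in for real HTTP fetches.
FAKE_WEB = {
    "https://a.example/": ("A", "page a", ["https://a.example/b", "https://a.example/c"]),
    "https://a.example/b": ("B", "page b", ["https://a.example/d", "https://a.example/"]),
    "https://a.example/c": ("C", "page c", []),
    "https://a.example/d": ("D", "page d", []),
}
db = sqlite3.connect(":memory:")
count = crawl(["https://a.example/"], FAKE_WEB.get, db)
print(count, [row[0] for row in db.execute("SELECT title FROM pages ORDER BY rowid")])
```

A real crawler would replace FAKE_WEB.get with an HTTP fetch plus HTML link extraction, and would write to a file-backed database rather than :memory:.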
What is AI reranking?
After BM25 picks the top 20 candidates, LuisSearch sends them to llama3.2:1b (a small LLM running locally via Ollama) and asks it to reorder them by relevance to your query. This adds semantic understanding without the cost of a large model.
Why is the first search slow sometimes?
The llama3.2:1b model needs a few seconds to warm up on first use. After the first query it's fast. If AI reranking times out, the raw BM25 results are returned instead — you still get results, just without the AI reorder.
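The timeout fallback described here can be sketched as a wrapper around any reranking function. The rerank callable below is a stand-in for the call to the local model and is assumed to return a new ordering as a list of indices. The function name, the 5-second default, and the index-based contract are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def rerank_with_fallback(query, candidates, rerank, timeout_s=5.0):
    """Return candidates in reranked order, or in their original (BM25) order
    if the reranker times out, raises, or returns an invalid ordering."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        order = pool.submit(rerank, query, candidates).result(timeout=timeout_s)
        # Accept the result only if it is a permutation of the indices.
        if sorted(order) != list(range(len(candidates))):
            raise ValueError("reranker did not return a permutation")
        return [candidates[i] for i in order]
    except Exception:
        return list(candidates)
    finally:
        # Do not block on a slow reranker; let its thread finish in the background.
        pool.shutdown(wait=False)

def reverse_reranker(query, candidates):
    # Toy reranker for the demo: simply reverses the BM25 order.
    return list(range(len(candidates) - 1, -1, -1))

bm25_order = ["page-a", "page-b", "page-c"]
print(rerank_with_fallback("linux kernel", bm25_order, reverse_reranker))
```

With this design the caller always gets a usable result list, matching the behavior described above where a reranking timeout still returns the raw BM25 results.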
Can I request specific sites to be crawled?
Not yet. The crawler is manually seeded. If you want a site added, contact the admin.
What does BM25 mean?
BM25 (Best Match 25) is a probabilistic ranking algorithm used by Elasticsearch, Solr, and Lucene. It scores documents by how often your search terms appear, adjusted for document length and how rare the terms are across the whole index. It's the industry standard for text search.
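For reference, a commonly used form of the BM25 score is shown below. Which exact variant and which parameter values this engine uses are not stated, so treat the formula as the textbook version rather than a description of the implementation.

```latex
\mathrm{score}(D, Q) = \sum_{q_i \in Q} \mathrm{IDF}(q_i)\,
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
```

Here f(q_i, D) is how often term q_i appears in document D, |D| is the document length, avgdl is the average document length in the index, N is the number of indexed documents, and n(q_i) is the number of documents containing q_i. The constants k_1 and b control term-frequency saturation and length normalization; values around k_1 = 1.2 to 2.0 and b = 0.75 are typical.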
Is the source code available?
Not yet publicly, but the engine is built in plain Python (~300 lines) with no external search dependencies. The core uses SQLite for storage, BM25 computed at query time, and Ollama for local LLM inference.