LuisSearch Public API v1.0

API Reference

Free, open search API powered by BM25 ranking and AI reranking via llama3.2:1b. No API key required for public use.


Base URL

https://luisearch.pages.dev

All endpoints return JSON. CORS is enabled, so you can call this API from any browser or server.

Authentication

Most endpoints work without authentication. An API key gives you attribution in the query logs and may unlock higher priority as the service grows.

How to include your key

Pass your key using either method:

# Query parameter
GET /api/search?q=linux&key=ls_your_key_here

# Authorization header
GET /api/search?q=linux
Authorization: Bearer ls_your_key_here

1. Request a key: Go to /register and fill in your name, email, and what you're building.
2. Wait for approval: Keys are manually reviewed and approved by the admin. You'll receive a key in ls_... format immediately, but it won't be active until approved.
3. Use your key: Pass it as ?key=ls_... or Authorization: Bearer ls_... on any request.

Passing an invalid or pending key returns 401 Unauthorized. Omitting the key entirely works fine: unauthenticated requests are allowed.
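
As a rough illustration of the two ways to pass a key, the Python sketch below builds the URL and headers for a search request without sending it. The helper name `build_search_request` and its `use_header` flag are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://luisearch.pages.dev"


def build_search_request(query, api_key=None, use_header=True):
    """Return (url, headers) for GET /api/search.

    Hypothetical helper: with no key the request is simply unauthenticated,
    which the API allows.
    """
    params = {"q": query}
    headers = {}
    if api_key:
        if use_header:
            # Authorization header method
            headers["Authorization"] = f"Bearer {api_key}"
        else:
            # Query parameter method
            params["key"] = api_key
    return f"{BASE_URL}/api/search?{urlencode(params)}", headers


url, headers = build_search_request("linux kernel", api_key="ls_your_key_here")
print(url)
print(headers)
```

Sending the key in the header keeps it out of URLs that may end up in browser history or proxy logs, which is one reason to prefer that method when both are available.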

Endpoints

GET /api/search

Search the index. Results are scored with BM25 + title boost, then AI-reranked by llama3.2:1b for better relevance. Returns up to 20 results.

Query Parameters

| Name | Type | Description |
|------|------|-------------|
| q (required) | string | The search query |
| key (optional) | string | Your API key (or use the Authorization header) |

Example Request

GET https://luisearch.pages.dev/api/search?q=linux+kernel

Example Response

{
  "results": [
    {
      "id": 42,
      "url": "https://kernel.org",
      "title": "The Linux Kernel Archives",
      "snippet": "The Linux kernel is the core of the Linux operating system..."
    },
    // up to 20 results, ordered by relevance
  ]
}
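
A minimal Python sketch of reading a search response into typed objects follows. The `SearchResult` class and `parse_search_response` helper are hypothetical names, and the sample body mirrors the example response rather than a live call.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SearchResult:
    id: int
    url: str
    title: str
    snippet: str


def parse_search_response(body: str) -> List[SearchResult]:
    """Turn a /api/search JSON body into a list of SearchResult objects."""
    data = json.loads(body)
    return [
        SearchResult(id=r["id"], url=r["url"], title=r["title"], snippet=r["snippet"])
        for r in data.get("results", [])
    ]


# Sample body copied from the documented example (not fetched from the network)
sample = json.dumps({
    "results": [
        {
            "id": 42,
            "url": "https://kernel.org",
            "title": "The Linux Kernel Archives",
            "snippet": "The Linux kernel is the core of the Linux operating system...",
        }
    ]
})

results = parse_search_response(sample)
for r in results:
    print(r.title, "-", r.url)
```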


GET /api/stats

Returns index statistics. Useful for showing users how large the index is.

Example Request

GET https://luisearch.pages.dev/api/stats

Example Response

{ "pages": 5000, "terms": 123004 }
GET /api/hosts

Returns all indexed domains with their page counts, sorted by most pages. Up to 500 domains.

Example Request

GET https://luisearch.pages.dev/api/hosts

Example Response

[
  { "host": "en.wikipedia.org", "count": 124 },
  { "host": "github.com", "count": 113 },
  // ...
]
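
One possible use of this response is a short index summary. The Python sketch below works on an already-decoded list shaped like the example; `summarize_hosts` is a hypothetical helper, and the total covers only the hosts present in the list (the endpoint returns at most 500).

```python
def summarize_hosts(hosts, top_n=3):
    """Return (total_pages, top) where top is a list of (host, count) pairs."""
    total = sum(h["count"] for h in hosts)
    # The API already sorts by page count; sorting again keeps the helper safe
    # if it is fed data from another source.
    top = sorted(hosts, key=lambda h: h["count"], reverse=True)[:top_n]
    return total, [(h["host"], h["count"]) for h in top]


# Sample data copied from the documented example
sample_hosts = [
    {"host": "en.wikipedia.org", "count": 124},
    {"host": "github.com", "count": 113},
]

total, top = summarize_hosts(sample_hosts)
print(total, top)
```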
GET /api/crawl-status

Returns the URL currently being crawled, or null if the crawler is idle.

Example Response (active)

{ "url": "https://wiki.archlinux.org/title/Pacman", "crawling": true }

Example Response (idle)

{ "url": null, "crawling": false }
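
A client that wants to follow the crawler could poll this endpoint until it reports idle. The sketch below is one way to structure that in Python; `fetch_status` is a hypothetical callable standing in for a real GET /api/crawl-status request, and the polling interval and poll limit are arbitrary assumptions.

```python
import time


def wait_until_idle(fetch_status, poll_seconds=5.0, max_polls=10, sleep=time.sleep):
    """Poll until the crawler is idle; return the URLs seen while it was active."""
    seen = []
    for _ in range(max_polls):
        status = fetch_status()
        if not status.get("crawling"):
            break
        seen.append(status["url"])
        sleep(poll_seconds)
    return seen


# Stand-in responses shaped like the documented examples (no network involved)
fake_responses = iter([
    {"url": "https://wiki.archlinux.org/title/Pacman", "crawling": True},
    {"url": None, "crawling": False},
])

seen = wait_until_idle(lambda: next(fake_responses), sleep=lambda seconds: None)
print(seen)
```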
POST /api/keys/register

Request an API key. The key is generated immediately but starts in pending status until manually approved by an admin.

Request Body (JSON)

| Field | Type | Description |
|-------|------|-------------|
| name (required) | string | Your name or app name |
| email (required) | string | Your email address |
| usecase (optional) | string | What you're building |

Example Request

POST https://luisearch.pages.dev/api/keys/register
Content-Type: application/json

{
  "name": "Luis",
  "email": "luis@example.com",
  "usecase": "personal search widget"
}

Example Response

{
  "key": "ls_a1b2c3d4e5f6...",
  "status": "pending",
  "message": "Your key is pending admin approval."
}

Save your key immediately after registering, because it is shown only once. Use it as ?key=ls_... once approved.
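
The request body can be assembled and checked on the client before it is sent. This Python sketch mirrors the documented required fields; `build_register_payload` is a hypothetical helper, and the client-side check is an assumption layered on top of the server's own 400 response.

```python
import json


def build_register_payload(name, email, usecase=None):
    """Return the JSON body for POST /api/keys/register.

    Raises ValueError early if a required field is missing, instead of
    waiting for the server's 400 response.
    """
    if not name or not email:
        raise ValueError("name and email required")
    payload = {"name": name, "email": email}
    if usecase:
        payload["usecase"] = usecase  # optional field
    return json.dumps(payload)


body = build_register_payload("Luis", "luis@example.com", "personal search widget")
print(body)

try:
    build_register_payload("", "luis@example.com")
except ValueError as exc:
    rejected = str(exc)
    print("rejected:", rejected)
```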

Response Schema

Each result object in the /api/search response contains these fields:

| Field | Type | Description |
|-------|------|-------------|
| id | integer | Internal page ID in the index |
| url | string | Full URL of the indexed page |
| title | string | Page title from the HTML <title> tag |
| snippet | string | First 300 characters of the page content |

Error Codes

| Status | Code | Description |
|--------|------|-------------|
| 400 | name and email required | Missing fields in /api/keys/register |
| 401 | Invalid or pending API key | Key doesn't exist or isn't approved yet |
| 401 | Unauthorized | Admin endpoint called without valid session token |
| 404 | Not found | Endpoint doesn't exist |

All errors are returned as JSON: {"error": "message here"}
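
Because every error shares the same JSON shape, a client can funnel them through a single check. The Python sketch below shows one way to do that; `LuisSearchError` and `raise_for_error` are hypothetical names, and the function takes a status code and body as plain values so it can sit behind any HTTP library.

```python
import json


class LuisSearchError(Exception):
    """Carries the HTTP status and the message from the "error" field."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def raise_for_error(status, body):
    """Raise LuisSearchError for any non-2xx status; do nothing otherwise."""
    if 200 <= status < 300:
        return
    try:
        message = json.loads(body).get("error", body)
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object: keep the raw text
        message = body
    raise LuisSearchError(status, message)


raise_for_error(200, '{"results": []}')  # success: no exception

try:
    raise_for_error(401, '{"error": "Invalid or pending API key"}')
except LuisSearchError as err:
    caught = err
    print(caught.status, caught.message)
```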

How it works

1. Deep Crawler: Starts from 400+ seed URLs and follows every link it finds via breadth-first (BFS) traversal. Stores each page's title, content, and URL in SQLite on an external SSD. There is no per-host limit, so it crawls entire domains.
2. BM25 + Title Boost: When you search, pages are scored using BM25 (the same algorithm Elasticsearch and Solr use). Pages where query terms appear in the title get a +3 score bonus per matching term.
3. AI Reranking: The top 20 BM25 results are sent to llama3.2:1b running locally via Ollama, which reorders them by relevance to your specific query. This adds semantic understanding on top of keyword matching.
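
The scoring step can be sketched in a few lines of Python. This is not the engine's actual code: the whitespace tokenizer, the parameters k1 = 1.5 and b = 0.75, and the Lucene-style IDF formula are assumptions chosen because they are common defaults. Only the +3 bonus per query term found in the title comes from the description above.

```python
import math
from collections import Counter

K1, B = 1.5, 0.75    # assumed BM25 parameters (common defaults)
TITLE_BONUS = 3.0    # +3 per query term found in the title, as described above


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace
    return text.lower().split()


def bm25_scores(query, docs):
    """Score docs (dicts with 'title' and 'content') against query."""
    if not docs:
        return []
    tokenized = [tokenize(d["content"]) for d in docs]
    n = len(docs)
    avg_len = (sum(len(t) for t in tokenized) / n) or 1.0
    # Document frequency: number of docs containing each term
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for doc, tokens in zip(docs, tokenized):
        tf = Counter(tokens)
        title_terms = set(tokenize(doc["title"]))
        score = 0.0
        for term in tokenize(query):
            if tf[term]:
                idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
                length_norm = K1 * (1 - B + B * len(tokens) / avg_len)
                score += idf * tf[term] * (K1 + 1) / (tf[term] + length_norm)
            if term in title_terms:
                score += TITLE_BONUS
        scores.append(score)
    return scores


docs = [
    {"title": "The Linux Kernel Archives",
     "content": "the linux kernel is the core of the linux operating system"},
    {"title": "Pacman", "content": "pacman is the arch linux package manager"},
    {"title": "Cooking basics", "content": "how to boil an egg"},
]
scores = bm25_scores("linux kernel", docs)
# Keep the 20 best candidates, the set that would go on to reranking
top = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:20]
print([docs[i]["title"] for i in top])
```

In this toy corpus the first document ranks highest because it matches both query terms in its content and in its title, while the third scores zero.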

Stack: Python · SQLite · BM25 · Ollama · llama3.2:1b · Cloudflare Pages

Code Examples

JavaScript

// Fetch search results
async function search(query, apiKey = null) {
  const url = new URL('https://luisearch.pages.dev/api/search');
  url.searchParams.set('q', query);
  if (apiKey) url.searchParams.set('key', apiKey);
  const res = await fetch(url);
  if (!res.ok) throw new Error(await res.text());
  return res.json();
}

const data = await search('linux kernel');
data.results.forEach(r => console.log(r.title, r.url));

TypeScript

interface SearchResult {
  id: number;
  url: string;
  title: string;
  snippet: string;
}

interface SearchResponse {
  results: SearchResult[];
}

async function search(query: string, apiKey?: string): Promise<SearchResponse> {
  const url = new URL('https://luisearch.pages.dev/api/search');
  url.searchParams.set('q', query);
  if (apiKey) url.searchParams.set('key', apiKey);
  const res = await fetch(url.toString());
  if (!res.ok) throw new Error(await res.text());
  return res.json();
}

const data = await search('linux kernel');
data.results.forEach(r => console.log(r.title, r.url));

Python

import requests

def search(query: str, api_key: str = None) -> list:
    params = {"q": query}
    if api_key:
        params["key"] = api_key
    res = requests.get(
        "https://luisearch.pages.dev/api/search",
        params=params,
        timeout=15
    )
    res.raise_for_status()
    return res.json()["results"]

results = search("linux kernel")
for r in results:
    print(r["title"], "-", r["url"])

curl

# Basic search
curl "https://luisearch.pages.dev/api/search?q=linux+kernel"

# With API key (query param)
curl "https://luisearch.pages.dev/api/search?q=linux&key=ls_your_key"

# With API key (header)
curl -H "Authorization: Bearer ls_your_key" \
  "https://luisearch.pages.dev/api/search?q=linux"

# Pretty print with jq
curl -s "https://luisearch.pages.dev/api/search?q=linux" | jq '.results[].title'

Go

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

type Result struct {
	ID      int    `json:"id"`
	URL     string `json:"url"`
	Title   string `json:"title"`
	Snippet string `json:"snippet"`
}

type Response struct {
	Results []Result `json:"results"`
}

func search(query string) ([]Result, error) {
	params := url.Values{"q": {query}}
	resp, err := http.Get("https://luisearch.pages.dev/api/search?" + params.Encode())
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var data Response
	// Report malformed JSON instead of silently returning no results
	if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
		return nil, err
	}
	return data.Results, nil
}

func main() {
	results, err := search("linux kernel")
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range results {
		fmt.Printf("%s\n %s\n\n", r.Title, r.URL)
	}
}

Rust

// Cargo.toml: reqwest = { features = ["json"] }, tokio, serde, serde_json
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct SearchResult {
    url: String,
    title: String,
    snippet: String,
}

#[derive(Deserialize)]
struct SearchResponse {
    results: Vec<SearchResult>,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let resp: SearchResponse = reqwest::get(
        "https://luisearch.pages.dev/api/search?q=linux+kernel"
    ).await?.json().await?;
    for r in &resp.results {
        println!("{}\n {}\n", r.title, r.url);
    }
    Ok(())
}

PHP

<?php
function luisearch(string $query, string $apiKey = ''): array {
    $url = 'https://luisearch.pages.dev/api/search?q=' . urlencode($query);
    if ($apiKey) $url .= '&key=' . urlencode($apiKey);
    $ctx = stream_context_create(['http' => ['timeout' => 15]]);
    $raw = file_get_contents($url, false, $ctx);
    $data = json_decode($raw, true);
    return $data['results'] ?? [];
}

$results = luisearch('linux kernel');
foreach ($results as $r) {
    echo $r['title'] . ' - ' . $r['url'] . "\n";
}
?>

Bash

#!/bin/bash
QUERY="${1:-linux}"
API_KEY="${2:-}"
URL="https://luisearch.pages.dev/api/search?q=$(python3 -c "import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1]))" "$QUERY")"
[ -n "$API_KEY" ] && URL="${URL}&key=${API_KEY}"
curl -s "$URL" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for r in data['results']:
    print(r['title'])
    print(' ', r['url'])
    print()
"

Swift

import Foundation

struct SearchResult: Codable {
    let id: Int
    let url: String
    let title: String
    let snippet: String
}

struct SearchResponse: Codable {
    let results: [SearchResult]
}

func luisearch(_ query: String) async throws -> [SearchResult] {
    var components = URLComponents(string: "https://luisearch.pages.dev/api/search")!
    components.queryItems = [URLQueryItem(name: "q", value: query)]
    let (data, _) = try await URLSession.shared.data(from: components.url!)
    return try JSONDecoder().decode(SearchResponse.self, from: data).results
}

let results = try await luisearch("linux kernel")
for r in results {
    print(r.title, r.url)
}

Ruby

require 'net/http'
require 'uri'
require 'json'

def luisearch(query, api_key: nil)
  uri = URI('https://luisearch.pages.dev/api/search')
  params = { q: query }
  params[:key] = api_key if api_key
  uri.query = URI.encode_www_form(params)
  res = Net::HTTP.get_response(uri)
  raise "Error #{res.code}" unless res.is_a?(Net::HTTPSuccess)
  JSON.parse(res.body)['results']
end

results = luisearch('linux kernel')
results.each { |r| puts "#{r['title']}\n #{r['url']}\n" }

Embed Widget

Add a search box to your own site with a few lines of HTML:

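<!-- LuisSearch embed widget -->
<div id="ls-widget">
  <input id="ls-q" type="text" placeholder="Search...">
  <button onclick="lsSearch()">Search</button>
  <div id="ls-results"></div>
</div>
<script>
async function lsSearch() {
  const q = document.getElementById('ls-q').value;
  const res = await fetch('https://luisearch.pages.dev/api/search?q=' + encodeURIComponent(q));
  const { results } = await res.json();
  const box = document.getElementById('ls-results');
  box.replaceChildren();
  // Build nodes with textContent so titles and snippets from crawled pages cannot inject markup
  for (const r of results) {
    const item = document.createElement('div');
    const link = document.createElement('a');
    link.href = r.url;
    link.target = '_blank';
    link.rel = 'noopener';
    link.textContent = r.title;
    item.append(link, document.createElement('br'), r.snippet);
    box.append(item);
  }
}
</script>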

FAQ

Is LuisSearch free to use?
Yes, completely free. No rate limits, no paywalls. This is a personal project and the API is open to everyone.
Do I need an API key?
No. The search endpoint works without any key. A key gives you attribution in the logs and may provide benefits in the future as the service evolves. If you pass an invalid key, you'll get a 401 — just don't pass a key at all if you don't have one.
How is the index built?
A custom Python crawler starts from 400+ seed URLs and follows links breadth-first (BFS). Every discovered page has its title and content extracted and stored in SQLite on an external SSD. The index currently has 5,000+ pages across 400+ domains.
What is AI reranking?
After BM25 picks the top 20 candidates, LuisSearch sends them to llama3.2:1b (a small LLM running locally via Ollama) and asks it to reorder them by relevance to your query. This adds semantic understanding without the cost of a large model.
Why is the first search slow sometimes?
The llama3.2:1b model needs a few seconds to warm up on first use. After the first query it's fast. If AI reranking times out, the raw BM25 results are returned instead — you still get results, just without the AI reorder.
Can I request specific sites to be crawled?
Not yet. The crawler is manually seeded. If you want a site added, contact the admin.
What does BM25 mean?
BM25 (Best Matching 25) is a probabilistic ranking function used by Lucene-based engines such as Elasticsearch and Solr. It scores documents by how often your search terms appear, adjusted for document length and for how rare the terms are across the whole index. It is a widely used baseline for keyword search.
Is the source code available?
Not yet publicly, but the engine is built in plain Python (~300 lines) with no external search dependencies. The core uses SQLite for storage, BM25 computed at query time, and Ollama for local LLM inference.
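
The timeout fallback described under "Why is the first search slow sometimes?" can be illustrated with a short Python sketch. Since the source is not public, this is only a guess at the shape of the logic: `rerank_with_fallback`, the `rerank` callable (a stand-in for the call to llama3.2:1b via Ollama), and the timeout values are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def rerank_with_fallback(candidates, rerank, timeout_seconds=5.0):
    """Return rerank(candidates), or the original BM25 order if the
    reranker is too slow or raises."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(rerank, candidates)
    try:
        return future.result(timeout=timeout_seconds)
    except Exception:
        # Timeout or reranker failure: fall back to the BM25 ordering
        return candidates
    finally:
        # Do not block the response while a slow reranker finishes
        pool.shutdown(wait=False)


def fast_reranker(items):
    # Stand-in reranker that simply reverses the order
    return list(reversed(items))


def slow_reranker(items):
    # Stand-in for a cold model that takes too long to answer
    time.sleep(0.3)
    return list(reversed(items))


bm25_order = ["a", "b", "c"]
reranked = rerank_with_fallback(bm25_order, fast_reranker)
fallback = rerank_with_fallback(bm25_order, slow_reranker, timeout_seconds=0.05)
print(reranked, fallback)
```

With the fast stand-in the reranked order is returned; with the slow one the call gives up after the timeout and the caller still receives the BM25 order.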
LuisSearch API · Built by Luis