Semantic query processing on multimodal data
ThalamusDB processes queries with semantic operators using LLMs.
SELECT H.pic
FROM HolidayPictures H, ProfilePictures P
WHERE P.name IN ('Alice', 'Bob')
  AND NLFILTER(H.pic, 'this is a picture of the beach')
  AND NLJOIN(H.pic, P.pic, 'the same person appears in both pictures');
This query retrieves beach pictures that show Alice or Bob. NLFILTER keeps only the holiday pictures that show the beach. NLJOIN matches holiday pictures with profile pictures that show the same person.
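To make the meaning of the two operators concrete, here is a minimal Python sketch of what the query computes. It is not how ThalamusDB evaluates queries: the toy rows, the lookup sets SHOWS_BEACH and SAME_PERSON, and the functions nl_filter, nl_join, and run_query are all hypothetical, and the lookup sets stand in for the LLM calls a real system would make.

```python
# Toy stand-ins for the two tables (hypothetical data).
holiday_pictures = [{"pic": "h1.jpg"}, {"pic": "h2.jpg"}, {"pic": "h3.jpg"}]
profile_pictures = [
    {"name": "Alice", "pic": "alice.png"},
    {"name": "Bob", "pic": "bob.png"},
    {"name": "Carol", "pic": "carol.png"},
]

# Hypothetical ground truth that replaces the LLM calls in this sketch.
SHOWS_BEACH = {"h1.jpg", "h3.jpg"}
SAME_PERSON = {
    ("h1.jpg", "alice.png"),
    ("h2.jpg", "bob.png"),
    ("h3.jpg", "carol.png"),
}


def nl_filter(pic: str, condition: str) -> bool:
    """Stub: a real system would ask an LLM whether `condition` holds for `pic`."""
    return pic in SHOWS_BEACH


def nl_join(left_pic: str, right_pic: str, condition: str) -> bool:
    """Stub: a real system would ask an LLM whether `condition` holds for the pair."""
    return (left_pic, right_pic) in SAME_PERSON


def run_query() -> list[str]:
    """Naive nested-loop evaluation of the example query."""
    result = []
    for p in profile_pictures:
        if p["name"] not in ("Alice", "Bob"):  # ordinary SQL predicate
            continue
        for h in holiday_pictures:
            if nl_filter(h["pic"], "this is a picture of the beach") and nl_join(
                h["pic"], p["pic"], "the same person appears in both pictures"
            ):
                result.append(h["pic"])
    return result


print(run_query())  # ['h1.jpg']
```

Only h1.jpg qualifies: it shows the beach and matches Alice's profile picture. h2.jpg matches Bob but is not a beach picture, and h3.jpg is a beach picture but only matches Carol, who is excluded by the name predicate.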
ThalamusDB processes tables and many unstructured data types.
ThalamusDB processes queries on tables, supporting all SQL types.
ThalamusDB analyzes text according to natural language instructions.
ThalamusDB analyzes images in PNG, JPG, and JPEG format via LLMs.
ThalamusDB processes audio data in WAV and MP3 format via audio-capable LLMs.
Simply store paths to pictures and audio files in text columns. ThalamusDB automatically detects the data type of referenced files and selects a suitable LLM for processing.
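As an illustration of the idea, the following Python sketch classifies the content of a text cell by file extension, using the formats listed above. This is an assumption made for the example only: the function detect_data_type and the extension sets are hypothetical, and ThalamusDB's actual detection logic may work differently (for instance by inspecting file contents).

```python
from pathlib import Path

# Formats named in the feature list above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def detect_data_type(cell_value: str) -> str:
    """Guess whether a text cell references an image, an audio file, or plain text.

    Hypothetical helper: only the file extension is checked (case-insensitively).
    """
    suffix = Path(cell_value).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return "text"


print(detect_data_type("pictures/beach.JPG"))      # image
print(detect_data_type("recordings/call_17.mp3"))  # audio
print(detect_data_type("Great hotel, loved it"))   # text
```

A system could then route each cell to an image model, an audio model, or a text model based on the returned label.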
ThalamusDB reduces costs by approximate processing.
Users can set bounds on per-query processing costs. ThalamusDB generates the best possible result with bounded overheads.
Users can set constraints on result error. ThalamusDB tries to minimize overheads while satisfying those constraints.
During processing, ThalamusDB regularly displays partial results, computed from the part of the database processed so far.
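The interplay of cost bounds, error constraints, and partial results can be sketched for a simple counting query. The Python code below is a simplified illustration under stated assumptions, not ThalamusDB's algorithm: each predicate evaluation is assumed to cost one unit, the error is measured as the gap between a lower and an upper bound on the count, and the names approximate_count, max_cost, max_error, and batch_size are hypothetical.

```python
from typing import Callable, Iterator, Sequence


def approximate_count(
    items: Sequence[str],
    predicate: Callable[[str], bool],
    max_cost: int,
    max_error: int,
    batch_size: int = 2,
) -> Iterator[tuple[int, int]]:
    """Yield (lower, upper) bounds on the number of items satisfying `predicate`.

    Processing stops once `max_cost` evaluations were spent or the gap between
    the bounds is at most `max_error`. Each yielded pair is a partial result.
    """
    total = len(items)
    matches = 0
    processed = 0
    yield matches, total  # nothing processed yet: the count lies in [0, total]
    while processed < min(total, max_cost) and total - processed > max_error:
        batch_end = min(processed + batch_size, total, max_cost)
        for item in items[processed:batch_end]:
            if predicate(item):  # stands in for an expensive LLM call
                matches += 1
        processed = batch_end
        # Every unprocessed item might still match, which gives the upper bound.
        yield matches, matches + (total - processed)


pictures = ["beach1", "city1", "beach2", "city2", "beach3", "city3"]
is_beach = lambda name: name.startswith("beach")

for lower, upper in approximate_count(pictures, is_beach, max_cost=4, max_error=0):
    print(f"matching pictures: between {lower} and {upper}")
```

With a cost bound of four evaluations, the loop prints the bounds (0, 6), (1, 5), and (2, 4) and then stops, even though the error constraint of zero is not yet met. Raising max_cost lets the bounds converge to the exact count.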
Learn about ThalamusDB in the documentation and papers.
Dive deep into the technical ideas behind ThalamusDB by reading the latest paper.
You can obtain ThalamusDB in multiple ways.
To install ThalamusDB via pip, run the following commands in the terminal:
pip install thalamusdb
thalamusdb [PathToDuckDBDatabase]
These commands install ThalamusDB and start the ThalamusDB console.
To install ThalamusDB from source, run the following commands in the terminal:
git clone https://github.com/itrummer/thalamusdb
cd thalamusdb
pip install -r requirements.txt
These commands download the ThalamusDB code and install its requirements.