This workflow vectorizes the TUSS (Terminologia Unificada da Saúde Suplementar) table by transforming medical procedures into vector embeddings ready for semantic search.
It automates the import of TUSS data, performs text preprocessing, and uses Google Gemini to generate vector embeddings. The resulting vectors can be stored in a vector database, such as PostgreSQL with pgvector, enabling efficient semantic queries across healthcare data.
Searching for medical procedures with traditional keyword matching is often imprecise, because exact-word matching misses synonyms, abbreviations, and reworded descriptions. This workflow improves search by enabling semantic similarity search, which retrieves results based on the meaning of the query instead of exact word matches.
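To make the idea of semantic similarity concrete, here is a minimal sketch of how embeddings are typically compared: each text becomes a vector, and results are ranked by cosine similarity to the query vector. The three-dimensional vectors and the `ITEM_*` codes below are made up for illustration; real embeddings come from the embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """items is a list of (code, vector); returns codes, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), code) for code, vec in items]
    scored.sort(reverse=True)
    return [code for _, code in scored]


# Toy vectors, NOT real embeddings: hypothetical placeholders only.
toy_items = [
    ("ITEM_A", [1.0, 0.0, 0.0]),
    ("ITEM_B", [0.7, 0.7, 0.0]),
    ("ITEM_C", [0.0, 0.0, 1.0]),
]
ranking = rank_by_similarity([0.9, 0.1, 0.0], toy_items)
print(ranking)
```

In the workflow itself this comparison would normally be delegated to the vector database rather than computed in application code.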
Adapt the preprocessing logic to your own language or domain-specific terms.
Swap Google Gemini for another embedding model, such as OpenAI or Cohere.
Adjust the chunking logic to control the granularity of semantic representation.
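As a rough illustration of what the preprocessing and chunking customizations above might look like, the sketch below normalizes a description and splits long text into overlapping word windows. The function names, the accent-stripping option, and the default window sizes are assumptions for this example, not the logic shipped in the workflow.

```python
import re
import unicodedata


def preprocess(text, strip_accents=False):
    """Lowercase, trim, and collapse whitespace; optionally remove accents."""
    text = unicodedata.normalize("NFC", text).strip().lower()
    if strip_accents:
        # Decompose characters, then drop the combining marks (category Mn).
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return re.sub(r"\s+", " ", text)


def chunk_words(text, max_words=50, overlap=10):
    """Split text into windows of max_words words, sharing `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


print(preprocess("  Consulta   EM  Consultório ", strip_accents=True))
print(chunk_words("a b c d e", max_words=3, overlap=1))
```

Smaller windows give finer-grained matches at the cost of more vectors to store. Short procedure descriptions will usually fit in a single chunk, so chunking matters mainly if you embed longer text.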
Prepare a source (database or CSV) with TUSS data. You need at least two fields:
CD_ITEM (Medical procedure code)
DS_ITEM (Medical procedure description)
Configure your Oracle or PostgreSQL database credentials in the Credentials section of n8n.
Make sure the pgvector extension is installed and enabled in your PostgreSQL database.
Replace the placeholder table and column names with those of your actual TUSS table.
Connect your Google Gemini credentials (via an OpenAI-compatible proxy or the official connector).
Run the workflow to vectorize all medical procedure descriptions.
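For readers who want to see what the storage side of these steps could look like outside n8n, here is a hedged sketch. It reads `CD_ITEM`/`DS_ITEM` rows from CSV text and defines pgvector SQL for creating a table, inserting embeddings, and running a cosine-distance search. The table name `tuss_embeddings`, the dimension `768`, and the `%s` placeholder style (as used by psycopg) are assumptions. The dimension must match whatever embedding model you configure, and the sample row is a placeholder, not real TUSS data.

```python
import csv
import io

TABLE = "tuss_embeddings"  # hypothetical table name
DIM = 768                  # assumption: must equal your model's output size

# Enables pgvector and creates one row per procedure (assumes no chunking).
CREATE_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS {TABLE} (
    cd_item   text PRIMARY KEY,
    ds_item   text NOT NULL,
    embedding vector({DIM})
);
"""

INSERT_SQL = (
    f"INSERT INTO {TABLE} (cd_item, ds_item, embedding) "
    "VALUES (%s, %s, %s::vector)"
)

# `<=>` is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = (
    f"SELECT cd_item, ds_item FROM {TABLE} "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)


def to_vector_literal(values):
    """Format numbers as pgvector's text form, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def read_tuss_rows(csv_text):
    """Return (CD_ITEM, DS_ITEM) tuples, failing fast if a column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {"CD_ITEM", "DS_ITEM"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    return [(row["CD_ITEM"].strip(), row["DS_ITEM"].strip()) for row in reader]


# Placeholder row for illustration only.
rows = read_tuss_rows("CD_ITEM,DS_ITEM\n00000000, placeholder description \n")
print(rows)
print(to_vector_literal([0.1, 2]))
```

With a database driver, each row would be inserted using `INSERT_SQL` and the parameters `(cd_item, ds_item, to_vector_literal(embedding))`. A search would pass the query embedding's literal and a result limit to `SEARCH_SQL`.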