Probabilistic Record Linkage Transformer Models

Unlock the Power of Connected Data

TMLink connects disparate data records across systems using advanced machine learning, so you can gain a complete, accurate view of your customers, patients, and citizens.

3-Stage Process
AI Transformer Models
4+ Industry Sectors
99% Match Accuracy

The Challenge of Data Fragmentation

In today's data-driven world, organizations face a significant challenge: data fragmentation. With information scattered across multiple sources, systems, and formats, it's difficult to get a complete and accurate view of the data.

TMLink helps overcome this challenge by connecting the dots, enabling businesses and institutions to uncover new insights, improve decision-making, and drive innovation.

[Diagram: Healthcare DB, Finance DB, Gov Records, and CRM Data feed into TMLink, which feeds Analytics. Caption: Connected & Unified. Disparate sources → Single source of truth]

How TMLink Works in Three Stages

A streamlined, three-stage pipeline that turns fragmented datasets into connected intelligence.

01

Upload Your Dataset

Securely log in and upload your record linkage dataset. TMLink supports entity fields such as names, birthdates, and IDs. The Streamlit frontend calls backend APIs that handle application logic and data storage.

Secure · Fast · Scalable
02

Link the Records

Select the columns needed for linkage and deploy TMLink's advanced probabilistic models fused with transformer embeddings. Records are matched in minutes based on names, addresses, dates of birth, and more, with processing time proportional to dataset size.

ML-Powered · Probabilistic
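
As a rough illustration of what probabilistic matching means, the sketch below scores a pair of records with classic Fellegi-Sunter agreement weights, using Python's difflib as a stand-in for fuzzy string comparison. It is not TMLink's model: the field names, the m/u probabilities, and the 0.85 similarity cutoff are assumptions chosen for illustration, and no transformer embeddings are involved.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (illustrative values, not TMLink's):
# m = P(field agrees | same entity), u = P(field agrees | different entities)
FIELD_PROBS = {
    "first_name": (0.95, 0.02),
    "last_name": (0.95, 0.01),
    "birthdate": (0.98, 0.003),
}

def agrees(a: str, b: str, cutoff: float = 0.85) -> bool:
    """Treat two values as agreeing if they are similar enough as strings."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= cutoff

def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum of log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if agrees(rec_a[field], rec_b[field]):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total

rec_a = {"first_name": "Katherine", "last_name": "Johnson", "birthdate": "1985-03-12"}
rec_b = {"first_name": "Katharine", "last_name": "Johnson", "birthdate": "1985-03-12"}
rec_c = {"first_name": "Robert", "last_name": "Smith", "birthdate": "1990-07-01"}

print(match_weight(rec_a, rec_b))  # positive: likely the same person
print(match_weight(rec_a, rec_c))  # negative: likely different people
```

A pair is then declared a match, a non-match, or a candidate for review by comparing its weight against chosen thresholds, which is why a one-letter spelling variation in a name does not prevent a link.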
03

Search & Discover

Use basic or advanced search to query your linked data. TMLink generates ad hoc queries based on probabilistic linkage thresholds and supports Text-to-SQL for advanced users. Spelling variations and minor discrepancies don't hinder the linkage process.

Basic & Advanced Search

Benefits That Drive Real Results

Connecting data unlocks insights that improve every corner of your organization.

✅

Improved Data Quality

Eliminate duplicates, correct errors, and enhance overall data accuracy by linking records across sources into a single, authoritative view.

👁️

Enhanced Customer Insights

Gain a comprehensive 360° view of customer behavior, preferences, and needs by linking records across all your data systems.

📈

Better Decision-Making

With accurate and complete data, organizations can make informed decisions and drive business growth backed by reliable intelligence.

🛡️

Compliance & Risk Management

Meet regulatory requirements and manage risk by identifying potential data breaches, security threats, and compliance gaps through linked records.

Applications Across Industries

TMLink powers connected data strategies across sectors where accuracy and completeness are mission-critical.

🏥

Healthcare

Link patient records across hospitals, clinics, and labs to improve care coordination and patient outcomes, even when records use different formats or contain data entry errors.

💼

Financial Services

Connect customer data across accounts, transactions, and institutions to enhance risk management, fraud detection, and regulatory compliance.

🏛️

Government

Link citizen data across departments and agencies to improve public services, policy-making, and program delivery at scale.

📣

Marketing & Customer Analytics

Understand customer behavior and preferences across channels to drive targeted campaigns, personalization, and business growth.

Workflow, Setup, and Operating Guide

A concise guide to the current TMLink runtime, application stages, merge behavior, and operating commands.

TMLink Documentation

Streamlit frontend, FastAPI backend, Docker runtime

This documentation covers the application flow from CSV upload through record linkage and entity search, along with the current container-based startup model.

Frontend: Streamlit · Backend: FastAPI · Runtime: Docker · Workflow: 3 Stages

At a Glance

TMLink runs as a Streamlit frontend with a FastAPI backend inside Docker. The product flow stays intentionally simple:

  1. Upload and validate one or more compatible CSV files.
  2. Run TMLink against the merged dataset.
  3. Search linked entities after processing completes.

Accepted uploads are merged before linkage runs, so the workflow operates on the current combined dataset rather than only the last file added.

Quick Start

Use master_setup.sh as the main entry point. It rebuilds the application packages, builds the Docker image, starts the container, and waits for the services to come up.

chmod +x master_setup.sh
bash master_setup.sh

  • http://localhost:8501 serves the Streamlit frontend.
  • http://localhost:8000 serves the FastAPI backend.
  • Heavy ML dependencies are installed during image build, not deferred to startup.
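
The internals of master_setup.sh are not shown here, so the following is only a sketch of what "waiting for the services to come up" can look like: poll both published ports until each accepts a TCP connection. The function names, the 120-second deadline, and the 2-second poll interval are assumptions.

```python
import socket
import time

def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def wait_for_services(ports, host="127.0.0.1", deadline_s=120.0, poll_s=2.0) -> bool:
    """Poll until every port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_s
    pending = set(ports)
    while pending:
        # Keep only the ports that still refuse connections.
        pending = {p for p in pending if not port_is_open(host, p)}
        if not pending:
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(poll_s)
    return True
```

Calling `wait_for_services([8501, 8000])` after starting the container returns True once both the frontend and the backend are reachable. An open port only shows that something is listening, not that the application behind it has finished loading.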

Workflow Stages

Stage 1: File Upload

Users upload one or more CSV files. Compatible files are merged into the current working dataset and the summary updates immediately.

  • Schema mismatches are blocked.
  • Duplicate uploads are ignored.
  • Deleting a file removes it from the merged totals.
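
The three rules above can be sketched with the standard library alone. This is an illustration, not TMLink's implementation: the WorkingDataset class, its status strings, duplicate detection by content hash, and the strict same-columns-in-same-order schema check are all assumptions.

```python
import csv
import hashlib
import io

class WorkingDataset:
    """Keeps uploaded CSV files and exposes their merged rows."""

    def __init__(self):
        self.schema = None   # column names fixed by the first accepted file
        self.files = {}      # filename -> (content hash, list of row dicts)

    def add_file(self, name: str, text: str) -> str:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if any(h == digest for h, _ in self.files.values()):
            return "duplicate"            # same content already uploaded
        reader = csv.DictReader(io.StringIO(text))
        columns = reader.fieldnames or []
        if not columns:
            return "empty"                # no header row to validate against
        if self.schema is not None and columns != self.schema:
            return "schema_mismatch"      # blocked: columns differ
        if self.schema is None:
            self.schema = columns
        self.files[name] = (digest, list(reader))
        return "accepted"

    def remove_file(self, name: str) -> None:
        self.files.pop(name, None)
        if not self.files:
            self.schema = None            # an empty dataset accepts any schema

    def merged_rows(self) -> list:
        """All rows from every accepted file, in upload order."""
        return [row for _, rows in self.files.values() for row in rows]
```

Because merged_rows() is recomputed from the files currently held, deleting a file immediately drops its rows from the totals, and linkage always sees the combined dataset rather than only the most recent upload.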

Stage 2: Link Records

The Start TMLink action runs against the merged Stage 1 dataset. Users must choose between 5 and 10 linkage fields.

  • Requires equivalent first name and last name fields after standardization.
  • Stores the result for later search operations.

Stage 3: Search Entity

After linkage completes, users can search linked records using basic search or advanced query generation backed by the saved Stage 2 similarity output.
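
To make threshold-based query generation concrete, here is a sketch that builds a parameterized SQL query over a table of scored pairs and runs it with Python's sqlite3 module. The linked_pairs table, its columns, and the sample scores are hypothetical; the actual format of the saved Stage 2 output is not described here.

```python
import sqlite3
from typing import Optional

def build_query(min_score: float, last_name: Optional[str] = None):
    """Return (sql, params) selecting linked pairs at or above min_score."""
    sql = "SELECT record_a, record_b, score FROM linked_pairs WHERE score >= ?"
    params = [min_score]
    if last_name is not None:
        sql += " AND last_name = ?"
        params.append(last_name)
    return sql + " ORDER BY score DESC", params

# Hypothetical Stage 2 output: one row per candidate pair with a similarity score.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE linked_pairs (record_a TEXT, record_b TEXT, last_name TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO linked_pairs VALUES (?, ?, ?, ?)",
    [
        ("A1", "B7", "Johnson", 0.97),
        ("A2", "B3", "Smith", 0.88),
        ("A4", "B9", "Johnson", 0.41),
    ],
)

sql, params = build_query(0.85)
high_confidence = conn.execute(sql, params).fetchall()
print(high_confidence)
```

Passing the threshold and search terms as bound parameters, not by string formatting, keeps user input out of the SQL text itself.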

Useful Commands

# Build the image
docker build -t taiwotman/tmlink:latest .

# Run the container (8501 = Streamlit frontend, 8000 = FastAPI backend)
docker run -d -p 8501:8501 -p 8000:8000 taiwotman/tmlink:latest

Troubleshooting

Port 8501 already allocated

This means an older container is still bound to the frontend port. Stop and remove the old container before starting a new one.

  • If Stage 1 blocks a file, check that the incoming CSV columns match the current schema.
  • If Stage 2 warns about unsuitable columns, verify that first name and last name equivalents are included.
  • If Docker startup is slow, wait for the image build to finish the dependency installation layer.

Book a Free Demo

Discover how TMLink can help your organization unlock the power of connected data. Schedule a personalized walkthrough with our team.

Schedule Your Demo

Fill in the form and we'll get back to you within 24 hours to confirm a time that works for you.

By submitting, you agree to be contacted by TMLink / BigCodeGen. No spam, ever.


Get in Touch

Reach out directly


Technical Blog

Medium: TMLink

Send us a Message
