Hi HN, I made a data lineage tool for AI data pipelines, as a companion to the open source ETL framework CocoIndex.
After months in private beta (and lots of love from early users), we’re excited to officially launch it today.
It offers:
- Before/after views of the data at every transformation node
- Every output field can be traced back to the exact set of input fields and operations that created it
- Lineage is first-class
- Zero pipeline data retention: it connects seamlessly to your on-prem CocoIndex server
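To make the field-level tracing idea concrete, here is a minimal Python sketch. It is not CocoIndex's API or how the tool is implemented; `Traced`, `source`, and `apply` are hypothetical names made up for illustration. Each value simply carries the set of input fields and the chain of operations that produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traced:
    """A value plus the input fields and operations that produced it (hypothetical)."""
    value: object
    inputs: frozenset = frozenset()
    ops: tuple = ()


def source(name, value):
    # An input field is its own lineage root.
    return Traced(value, frozenset([name]))


def apply(op_name, fn, *args):
    # The output inherits every input field and prior operation of its arguments,
    # then records the operation that was just applied.
    inputs = frozenset().union(*(a.inputs for a in args))
    ops = tuple(op for a in args for op in a.ops) + (op_name,)
    return Traced(fn(*(a.value for a in args)), inputs, ops)


title = source("doc.title", "  Hello ")
body = source("doc.body", "world")
clean = apply("strip", str.strip, title)
summary = apply("concat", lambda t, b: f"{t}: {b}", clean, body)

print(summary.value)           # the transformed output
print(sorted(summary.inputs))  # input fields it depends on
print(summary.ops)             # operations applied along the way
```

Asking `summary` for its `inputs` and `ops` is the toy version of clicking an output field and seeing where it came from.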
This tool is free, and you can get started by running
cocoindex server -ci main.py
with any of the CocoIndex example projects: https://github.com/cocoindex-io/cocoindex/tree/main/examples
Thanks!
Linghua