
I recently wrote a related library: given a DataFrame, it runs sklearn's RandomForest to check which columns predict other columns. The goal is to learn which relationships exist within a DataFrame. In the exploratory phase of a machine learning project we typically want to understand how the data holds together, and this tool helps with that discovery exercise. It auto-LabelEncodes text columns and supports both classification and regression. Two example Notebooks (Titanic and Boston) show what it does. Correlations (Pearson, Spearman, Kendall) can also be calculated, but the RandomForest result can show non-linear relationships that correlations don't expose. https://github.com/ianozsvald/discover_feature_relationships
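To make the idea concrete, here is a minimal sketch of the general technique using plain pandas and scikit-learn. It is not the library's actual API: the function name `pairwise_predictive_scores`, its parameters, and the synthetic data are all assumptions made for illustration, and it only covers the regression case with numeric columns (no LabelEncoding). Each column is used on its own to predict each other column, and the cross-validated R^2 is recorded and compared against the Pearson correlation.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def pairwise_predictive_scores(df, n_estimators=50, cv=3, random_state=0):
    """Hypothetical helper: score how well each single column predicts each other column.

    Returns one row per (feature, target) pair with the mean cross-validated
    R^2 of a RandomForestRegressor, floored at 0 so "worse than the mean"
    and "no relationship" look the same.
    """
    rows = []
    for target in df.columns:
        for feature in df.columns:
            if feature == target:
                continue
            X = df[[feature]].to_numpy()  # single-feature design matrix
            y = df[target].to_numpy()
            model = RandomForestRegressor(
                n_estimators=n_estimators, random_state=random_state
            )
            score = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
            rows.append(
                {"feature": feature, "target": target, "score": max(score, 0.0)}
            )
    return pd.DataFrame(rows)


# Synthetic data (an assumption for this sketch): x_squared depends on x
# non-linearly and symmetrically, so a linear correlation misses it.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=300)
df = pd.DataFrame(
    {
        "x": x,
        "x_squared": x ** 2 + rng.normal(0, 0.1, size=300),
        "noise": rng.normal(size=300),
    }
)

scores = pairwise_predictive_scores(df)
pearson = df.corr(method="pearson")  # "spearman" and "kendall" also work here

print(scores.sort_values("score", ascending=False))
print(pearson.round(2))
```

In this toy setup the score for x predicting x_squared should be high while the Pearson correlation between them stays near zero. The relationship is also asymmetric: x_squared cannot recover the sign of x, so the score in the reverse direction should be much lower, which a symmetric correlation matrix cannot express.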

