I liked the boldness of this idea.
But 'something' still needs to select the sklearn model and tune its hyper-parameters. How long can you keep all of that hidden away from the user?
The training phase can take a considerable amount of time. Have you thought about some kind of async wrapper that Smart Fruit might provide, or will the user be expected to code it up?
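To make the question concrete, here is a minimal sketch of the kind of wrapper I have in mind, assuming training is an ordinary blocking call. `train_model` and `train_async` are hypothetical names for illustration only, not part of Smart Fruit's API, and the sleep merely stands in for a slow fit:

```python
import asyncio
import time


def train_model(rows):
    """Hypothetical stand-in for a slow, blocking training call."""
    time.sleep(0.1)  # simulate an expensive fit
    return {"n_rows": len(rows)}


async def train_async(rows):
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free for other work meanwhile.
    return await asyncio.to_thread(train_model, rows)


async def main():
    # Start training in the background.
    task = asyncio.create_task(train_async([(1, 2), (3, 4)]))
    # ... other work could happen here while training runs ...
    return await task


result = asyncio.run(main())
```

Whether something like this belongs in the library or in user code is exactly the open question.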
This is more of a user experience comment: when the interface is designed to feel as if one is interacting with a DB / ORM, the user may come to assume that the outcomes will be deterministic. The returned results will remain deterministic as long as the training data, model and hyper-parameters stay the same, but it won't feel as deterministic when any of these is updated. I am not sure I have communicated my concern clearly. I am trying to understand who the intended end user of this package is.
I would propose a potential user as someone interested in some of the meta considerations and patterns of statistical reasoning, aka machine learning. There is a vast number of particulars behind how the second hand on my watch operates (e.g. vibrating quartz, digital), but I can use that mostly reliable device to investigate higher-level phenomena, like calculating the distance of planets by timing their movement. This library opens a direct line to these algorithms such that one might intuit, and apply, their high-level behavior; just as I could not time planets if consumed with the fidelity and reliability of resonating quartz, it would slow my ability to explore this kind of reasoning if I were concerned with the minutiae.
That said, all points taken. If this sparks interest in someone, as it stands, it would be on them to dig into all the considerations you've outlined.