We need living datasets for computer vision. There is no current mechanism for u...

fullstackchris · on April 14, 2023

> the data drifts

In fact, it's the data itself that is closest to reality, not the ML model...

It's a conundrum in the flow of reality for me:

Reality -> data that reflects reality (hopefully as best it can) -> train model based on data (typically takes a non-trivial time scale!) -> make decision / do stuff based on algo with newest data

If you ask me, even with all the technology we have, it seems impossible to simultaneously acquire real-time data and train a model to operate on, or make decisions from, exactly that data. It's a catch-22: there will always be a lag.

Even as humans with super big brains, we can't hope to do this outside of extremely simple tasks like "throw and catch the ball".
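
To put a rough number on the lag described above, here is a toy sketch. It assumes reality drifts at a constant rate and that retraining runs back to back, each run starting from a fresh snapshot; `simulate_staleness`, `train_steps`, and `drift_per_step` are made-up names for illustration, not anything from a real pipeline.

```python
def simulate_staleness(total_steps, train_steps, drift_per_step=1.0):
    """Per-step gap between reality and the snapshot the deployed model was trained on.

    Assumption: a new training run starts the moment the previous one finishes,
    and each run uses the data snapshot taken when it started.
    """
    gaps = []
    deployed_snapshot_t = None  # when the deployed model's data was collected
    training_started_t = 0      # snapshot time of the model currently in training
    for t in range(total_steps):
        if t - training_started_t >= train_steps:
            # Training finished: deploy it, and immediately start the next run.
            deployed_snapshot_t = training_started_t
            training_started_t = t
        if deployed_snapshot_t is not None:
            gaps.append(drift_per_step * (t - deployed_snapshot_t))
    return gaps


slow = simulate_staleness(total_steps=100, train_steps=10)
fast = simulate_staleness(total_steps=100, train_steps=1)
print(min(slow), max(slow))  # staleness swings between one and two training durations
print(min(fast), max(fast))  # faster training shrinks the gap but never to zero
```

Under these assumptions the deployed model is already one full training duration behind on the day it ships, and nearly two behind just before its replacement arrives. Shorter training shrinks the gap but does not close it.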

grumple · on April 14, 2023

It's not exactly a hard problem to continuously train a model, although it may be costly. You can even train the model on every interaction it has, but this quickly leads to degradation because users provide low-quality data, for example when they intentionally try to make chatbots say racist things.
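
As a minimal sketch of that trade-off, the toy below updates a tiny logistic regression on every interaction, with and without a quality gate. Everything in it is an assumption for illustration: `OnlineLogistic`, `looks_trustworthy`, the `max_surprise` threshold, and the synthetic "poisoned" stream are hypothetical, and a gate that trusts the model's own prediction is only one of several possible filters.

```python
import math


class OnlineLogistic:
    """Tiny logistic regression updated one example at a time (plain SGD)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def looks_trustworthy(model, x, y, max_surprise=0.9):
    """Hypothetical gate: skip labels the current model confidently disagrees with."""
    return abs(model.predict_proba(x) - y) < max_surprise


def train_on_stream(model, stream, gate=None):
    """Update on each (x, y) interaction; return how many the gate skipped."""
    skipped = 0
    for x, y in stream:
        if gate is not None and not gate(model, x, y):
            skipped += 1
            continue
        model.update(x, y)
    return skipped


clean = [([1.0], 1), ([-1.0], 0)] * 500  # well-behaved interactions
poisoned = [([1.0], 0)] * 200            # users deliberately feeding wrong labels

ungated = OnlineLogistic(1, lr=0.5)
train_on_stream(ungated, clean)
train_on_stream(ungated, poisoned)       # learns from everything

gated = OnlineLogistic(1, lr=0.5)
train_on_stream(gated, clean)
skipped = train_on_stream(gated, poisoned, gate=looks_trustworthy)

print(ungated.predict_proba([1.0]))  # dragged toward the poisoned label
print(gated.predict_proba([1.0]), skipped)
```

The catch is that this kind of gate cannot tell malicious input from genuine drift: if reality really does change, the model will reject the new data as "surprising" too, which brings back the staleness problem.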

RicoElectrico · on April 14, 2023

HuggingFace is almost there. Uploading a re-trained or fine-tuned model is trivial. Same with datasets.