This is a mistaken understanding of how GPUs are used in AI.
GPUs are commonly used for training, but nothing requires the trained model to be executed on a GPU. For example, a network could be created that simply increments a counter, which may be all that a "choose the best sort" network ends up doing.
It begins with quicksort, switches to heapsort when the recursion depth exceeds a limit based on the logarithm of the number of elements being sorted, and switches to insertion sort when the number of elements falls below some threshold.
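
The hybrid strategy described above (commonly known as introsort) can be sketched in a few lines of Python. This is only an illustration, not any particular library's implementation: the names `hybrid_sort` and `SMALL_THRESHOLD`, the threshold of 16, and the depth limit of 2 * floor(log2(n)) are all assumptions chosen for the example, and the heapsort fallback uses the standard `heapq` module on a copy of the slice rather than sorting in place.

```python
import heapq
import math

SMALL_THRESHOLD = 16  # assumed cutoff; real implementations tune this value


def _insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place; cheap for very small ranges."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def _heap_sort(a, lo, hi):
    """Sort a[lo:hi] via a binary heap (copies the slice for brevity)."""
    heap = a[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        a[i] = heapq.heappop(heap)


def _partition(a, lo, hi):
    """Lomuto partition of a[lo:hi] around the middle element."""
    mid = (lo + hi) // 2
    a[mid], a[hi - 1] = a[hi - 1], a[mid]  # move pivot to the end
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]  # pivot to final position
    return store


def _sort(a, lo, hi, depth_left):
    if hi - lo <= SMALL_THRESHOLD:
        _insertion_sort(a, lo, hi)   # few elements: insertion sort
    elif depth_left == 0:
        _heap_sort(a, lo, hi)        # recursion too deep: heapsort
    else:
        p = _partition(a, lo, hi)    # otherwise: ordinary quicksort step
        _sort(a, lo, p, depth_left - 1)
        _sort(a, p + 1, hi, depth_left - 1)


def hybrid_sort(a):
    """Sort the list a in place using the quicksort/heapsort/insertion hybrid."""
    if len(a) > 1:
        depth_limit = 2 * math.floor(math.log2(len(a)))  # assumed factor of 2
        _sort(a, 0, len(a), depth_limit)
```

Note that every switch here is a plain comparison against a counter or a size, evaluated on the CPU; nothing in the selection logic needs a GPU or a learned model.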