Well, the software will almost always fail, because it returns species names when almost no azaleas, except a few in botanical gardens, are pure species.
I tested PlantNet, and for the hybrids I photographed it listed the most dominant parent species among the top-5 candidates, but never as no. 1. I don't know how the software works. You would think it is a deep learning algorithm, but that needs to be trained on a high-quality dataset, and I don't think users are providing that, generally speaking. And in the case of azaleas, it is not even clear which species a cultivar should be labeled as.
But you are right that many azalea hybrids have identical leaves but different flowers. The satsuki Kozan, Nikko, and Nyohozan all have identical leaves, but not identical flowers. So if the software worked perfectly on a picture of just leaves, it would give a 100% match for Kozan, a 100% match for Nikko, and the same for Nyohozan, Goko, Tensho, etc. But it would give, say, a 98% match for Hakurei. So some may not be 100% identical, but they look identical even to experts. And beyond those, many more look superficially similar.
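To make the "several 100% matches" point concrete, here is a minimal sketch of ranking cultivars by leaf similarity only. The feature vectors (leaf length, width, hairiness) are invented for illustration, not measurements, and cosine similarity is just one possible way to score a match; I have no idea what PlantNet actually uses.

```python
import math

# Hypothetical leaf features: (length_cm, width_cm, hairiness). Invented values.
LEAF_FEATURES = {
    "Kozan":    (3.0, 1.0, 0.5),
    "Nikko":    (3.0, 1.0, 0.5),   # same leaves as Kozan
    "Nyohozan": (3.0, 1.0, 0.5),   # same leaves as Kozan
    "Hakurei":  (3.0, 0.6, 0.5),   # slightly different leaf (assumed)
}

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_leaf(query):
    """Return (cultivar, score) pairs, best match first."""
    scores = {name: cosine_similarity(query, feats)
              for name, feats in LEAF_FEATURES.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# A leaf photo of Kozan, reduced to the same three features.
ranking = rank_by_leaf((3.0, 1.0, 0.5))
scores = dict(ranking)
```

With leaves alone, Kozan, Nikko, and Nyohozan tie at the top and Hakurei lands just below them. The tie can only be broken with extra information, such as a flower picture.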
But the added value of such software is still clear to me. You can take a picture of just the leaves and really narrow down what it could be. If you have no idea, you at least get a few names, one of which is very close to correct.
How does deep learning work? That's a big topic. Figuring out how a trained network recognizes features in pictures is way harder than getting a deep learning network to function. But it matters, because of a famous example of a deep learning network that seemed able to tell whether a picture showed a wolf or a dog. After training it on wolf and dog pictures, it turned out that all it did was look for snow or a white background, because all the wolf pictures used for training had snow in them, while the dog pictures didn't.
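The snow shortcut is easy to reproduce in miniature. The sketch below is not a deep network; it swaps in a one-feature "decision stump" that simply picks whichever feature best separates the training labels. The tiny dataset and the feature names (`snow`, `long_snout`) are made up for illustration.

```python
# Each training sample: (features, label). All wolves are photographed in snow.
TRAIN = [
    ({"snow": 1, "long_snout": 1}, "wolf"),
    ({"snow": 1, "long_snout": 1}, "wolf"),
    ({"snow": 1, "long_snout": 0}, "wolf"),
    ({"snow": 0, "long_snout": 0}, "dog"),
    ({"snow": 0, "long_snout": 1}, "dog"),
    ({"snow": 0, "long_snout": 0}, "dog"),
]

def stump_accuracy(feature, data):
    """Accuracy of the rule 'feature present means wolf, otherwise dog'."""
    hits = sum(("wolf" if x[feature] else "dog") == y for x, y in data)
    return hits / len(data)

def fit_stump(data):
    """Pick the single feature with the best training accuracy."""
    features = data[0][0].keys()
    return max(features, key=lambda f: stump_accuracy(f, data))

chosen = fit_stump(TRAIN)

def predict(x):
    return "wolf" if x[chosen] else "dog"

# A dog photographed in snow: the learned shortcut gets it wrong.
dog_in_snow = {"snow": 1, "long_snout": 0}
prediction = predict(dog_in_snow)
```

The learner scores perfectly on its training pictures by keying on `snow`, then calls a dog in snow a wolf. User-submitted plant photos could hide the same kind of shortcut, for example if one cultivar is always photographed in the same pot or garden.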
But back to the 10,000 cultivars and training a deep learning neural network. You can't train it with species as outputs, because all your inputs are hybrids. But you also can't have all 10,000 hybrids as outputs. So what you could do instead is output a percentage per species, which is a bit iffy. So in a sense, outputting a single species is maybe still the way to go.
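For what the "percentages" idea could look like in practice, here is a rough sketch using soft labels: each hybrid's training target is a vector of species fractions instead of a single species, and the loss is cross-entropy against that vector. The species names are placeholders and the 75/25 parentage is invented; real parentage data for most cultivars is exactly the iffy part.

```python
import math

SPECIES = ["species_A", "species_B", "species_C"]  # placeholder names

def softmax(logits):
    """Turn raw network outputs into percentages that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

# Invented parentage for one hybrid: 75% species_A, 25% species_B.
target = [0.75, 0.25, 0.0]

good = softmax([2.0, 0.9, -3.0])   # output close to the parentage
bad = softmax([-3.0, 0.9, 2.0])    # output favoring the wrong species

loss_good = soft_cross_entropy(target, good)
loss_bad = soft_cross_entropy(target, bad)

# Reporting a single name just means taking the largest percentage.
top_species = SPECIES[max(range(len(good)), key=good.__getitem__)]
```

The output close to the parentage gets the lower loss, so training would push the network toward the right mix. Taking the largest percentage recovers the "dominant species" answer, which is more or less what the top-5 list already gives you.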