Thanks Rupert, this is really helpful context. I can expand the scope of the confusion matrix and the experiment, but I was worried the charts might get really hard to interpret.
I think your instincts are correct: what I’m trying to do is build a baseline I can test against, to show within reason that adding a hybrid node like a x b does not degrade the model’s visual concept of a or b. Comparing the results of the toy models against the results of our 2.22 model should show this, one way or another.
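For what it's worth, here is a rough sketch of how that comparison could be boiled down to a couple of numbers instead of a huge chart. It assumes confusion matrices with rows as true labels and columns as predictions, and it compares per-class recall for a and b with and without the hybrid class. The function names, label lists, and counts are all made up for illustration; none of it comes from the real notebook or from 2.22.

```python
import numpy as np

def per_class_recall(cm):
    """Recall per class; rows are true labels, columns are predictions (assumed layout)."""
    cm = np.asarray(cm, dtype=float)
    row_totals = cm.sum(axis=1)
    # Classes with no true examples get a recall of 0 rather than a divide-by-zero.
    return np.divide(np.diag(cm), row_totals,
                     out=np.zeros_like(row_totals), where=row_totals > 0)

def recall_change(cm_without, labels_without, cm_with, labels_with, parents):
    """Recall(with hybrid) minus recall(without hybrid) for each parent class."""
    r_without = dict(zip(labels_without, per_class_recall(cm_without)))
    r_with = dict(zip(labels_with, per_class_recall(cm_with)))
    return {p: r_with[p] - r_without[p] for p in parents}

# Hypothetical counts, purely illustrative.
cm_without = [[90, 10],
              [8, 92]]
cm_with = [[88, 7, 5],
           [6, 90, 4],
           [10, 9, 81]]

changes = recall_change(cm_without, ["a", "b"],
                        cm_with, ["a", "b", "a x b"],
                        parents=["a", "b"])
for name, delta in changes.items():
    print(f"{name}: recall change {delta:+.3f}")
```

A negative change for a or b would suggest the hybrid class is pulling examples away from its parents. How large a drop counts as real degradation is a judgment call that depends on sample sizes.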
I’ll train these toy models next week, and I’ll also re-run the Jupyter notebook that generated the confusion matrices, using predictions from 2.22 across the whole clade. You may have to get a magnifying glass out!