Bear with me. I swear this has something to do with the natural world.

I’ve been trying to figure out how biologists/ecologists/others determine if a macroscopically identical species has been introduced into a population of their lookalikes. The math required to figure it out baffles my brain. I need help.

Before you go any further: I am not a student, nor am I an expert trying to shortcut the hard work. I’m just a nerd with questionable math skills who ponders odd questions and probably won’t be able to sleep well until I figure this out…

Here is the problem:

Imagine a geographically isolated island that has a population of bears (species A). A macroscopically identical species of bear (species B) exists off the island. The two species can only be distinguished by microscopy or genetics. Established wisdom says that all the bears on the island are species A. A contrarian posits that there is at least one example of species B on the island. She wants to know how many of the bears need to be tested to prove her right.

To keep the math within bounds, let’s assume there are only 100 bears on the island.

Here are the two questions I struggle to answer:

If there is one specimen of species B on the island (99 of species A and just one B), how many of the bears does she need to test to have a better than 50/50 chance of detecting species B?

Assuming we don’t know whether there is even a single specimen of species B on the island, how many of the 100 bears does she need to test to have a better than 50/50 chance of correctly concluding that no specimens of species B are present?

I do understand that the best option is to test all the bears but budgets are limited. In both cases, our biologist needs to test the minimum number of bears to have a better than 50 percent chance of being correct.

Purely mathwise, assuming the first 98 tests returned as species A, at that point there would be one A remaining and one B remaining, giving a 50% chance of detecting B on either of the remaining two bears. If you need a better than 50/50 chance as you indicated, then the only answer is 99, with all other bears proving species A and the last remaining bear giving you a 100% chance of detecting species B.
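
Another way to count it, assuming the bears to test are picked at random and no bear is tested twice: what matters is the chance that the one B bear lands anywhere in the sample, not the odds on the last two bears. Here is a minimal Python sketch of that calculation. The function name and parameters are illustrative, not anything from the original question.

```python
from fractions import Fraction
from math import comb

def detection_probability(population, outsiders, tested):
    """Chance that a random sample (no repeats) contains at least one B bear."""
    # Probability that every tested bear is species A:
    # ways to pick `tested` bears from the A bears only / ways to pick from all bears.
    miss = Fraction(comb(population - outsiders, tested), comb(population, tested))
    return 1 - miss

# Assumed scenario from the question: 100 bears, exactly one of them species B.
for tested in (50, 51, 98, 99):
    print(tested, detection_probability(100, 1, tested))
```

Under that random-sampling assumption, testing 50 bears gives exactly a 50% chance of catching the single B bear, so 51 is the smallest sample with better than even odds.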

I can’t do the maths side. But researchers test from scat, or fur? Or autopsies.
Our Urban Caracal Project collects all dead caracals - if people notify them.

You’re thinking in terms of statistics, like surveys. Statistically, it works: “30% of people don’t like bears.” A couple of centuries of statistics based on distributions have demonstrated that properly conducted surveys and polls actually do reflect the real world.

That said, all real surveys have a margin of error that’s published with them; this is a mathematically calculated plus or minus, so in my example, the real number of people who don’t like bears would almost certainly be between 27% and 32%… BUT there’s a chance, albeit tiny, that it’s actually 35%.

BUT you asked “determine if a macroscopically identical species has been introduced into a population of their lookalikes” and that is not statistics. The statistics above can tell you the probability of any given bear being the introduced bear IF you know the number of both bears. What you’re asking for though is an absolute: determine if ANY of the bears are introduced; the only way to do that is to test all the bears, every last one of them.

That’s presuming of course they haven’t interbred, which is what I’m fighting with right now.

On your second sub-question, I see no value in the 50% probability of being correct. That’s a coin toss.

It is a coin toss but I’m sure researchers have to draw the line somewhere. Testing all available bears would give an absolute answer but it is not practical for something like identical-looking insect species. What got me pondering the probability questions were confidently espoused statements along the lines of “Anything west of the continental divide is species A and anything to the east is species B.” Obviously we can’t know that with any degree of certainty unless we test a significant portion of the population on both sides.

If 50/50 is the sticking point we could change the confidence level to 80%.
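
To see how the required sample grows with the confidence level, here is a rough Python sketch. It assumes random sampling without repeats and, importantly, an assumed number of B bears on the island; that number is a guess the biologist has to supply, and the names below are illustrative only.

```python
from fractions import Fraction
from math import comb

def min_tests(population, outsiders, target):
    """Smallest sample size whose chance of catching a B bear is better than `target`."""
    for tested in range(population + 1):
        # Chance that all tested bears turn out to be species A.
        miss = Fraction(comb(population - outsiders, tested), comb(population, tested))
        if 1 - miss > target:
            return tested
    return None  # no sample size beats the target (e.g. target of 100% or more)

# 100 bears, one assumed B bear, "better than" 50% and 80%.
print(min_tests(100, 1, Fraction(1, 2)))
print(min_tests(100, 1, Fraction(4, 5)))
# If several B bears are assumed, far fewer tests are needed.
print(min_tests(100, 5, Fraction(1, 2)))
```

With a single B bear the answer scales directly with the confidence level: 51 tests for better than 50%, 81 for better than 80%. A run of all-A results only supports "no B bears" relative to the assumed number of B bears you were looking for.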

Hmmm – Maybe most taxonomy is, in a sense, playing the odds.

Imagine that there are 10,000 bears on the island and you genetically test (genetic barcode) 10 of them. Receiving a poor peer review on the resulting paper, you return the next season and test 1000 of them. OK, obviously the more you test the more closely you approach 100% certainty about the genetic makeup of the population of bears. You arrive at a mix of A and B bears, with larger or smaller error bars depending on how many you test and how definitive your genetic testing is.
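
To put rough numbers on those error bars, here is a small Python sketch. It uses the textbook normal approximation for a sampled proportion with a finite population correction, assumes perfectly accurate tests, and uses a made-up observed share of 30% B bears purely for illustration. The approximation is crude for a sample as small as 10.

```python
from math import sqrt

def margin_of_error(sample_size, population, proportion, z=1.96):
    """Approximate 95% margin for an estimated proportion in a finite population."""
    standard_error = sqrt(proportion * (1 - proportion) / sample_size)
    # Finite population correction: shrinks to zero when every bear is tested.
    correction = sqrt((population - sample_size) / (population - 1))
    return z * standard_error * correction

# Hypothetical: 30% of the tested bears come back as species B.
for tested in (10, 1000, 10000):
    print(tested, round(margin_of_error(tested, 10_000, 0.3), 3))
```

The margin shrinks sharply between 10 and 1000 bears and reaches zero only when all 10,000 are tested.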

But there is some argument around genetic barcodes versus classical morphological taxonomy. In some people’s view (e.g. Rob DeSalle, a genomics expert at the American Museum of Natural History), the two taxonomic systems delimit species differently.

So you go back NEXT season and do a third study, a careful biomorphic evaluation of 1000 bears. It mostly confirms your other studies, but it gives a small but statistically significant difference in the distribution of A and B bears. What now?

In the construction of this amateurish scenario I’ve probably stepped in a bunch of logical and taxonomic cowpies for which I should be taken to task. Go ahead! I’m just riffing, I’m not an expert.

My point, I guess, is that it seems to me that biological nature is a mix of the discrete and the smoothly variable. I, personally, want a box called “species” to put biological beings into. Some beings don’t fit into the boxes very well. I don’t like ambiguity, but in the absence of perfect knowledge I think I’m stuck with it.

One might take a different approach depending upon whether one wanted to know only if there was at least one outside bear in the population, or also wanted to know which one. If you just want to know if the outside bear exists, you could identify an allele that the mainland bear would have that the native bears don’t, then take a pooled sample (a little bit of DNA from each individual, all in the same test tube) and test for the presence of that indicative allele. If you also needed to know which bear was from the mainland, you’d need to test each sample separately, which would be considerably more expensive.
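
As a side note on cost, here is a toy Python simulation of the pooled idea. It goes one step beyond what is described above: it uses repeated halving of the pool (a form of adaptive group testing) to narrow down which bear carries the marker. It assumes a pooled test is reliably positive whenever any sample in the pool carries the allele, which a real assay may not manage at high dilution. All names are hypothetical.

```python
def pooled_test(samples, indices):
    """Simulated pooled assay: positive if any sample in the pool carries the marker."""
    return any(samples[i] for i in indices)

def find_carrier(samples):
    """Return (index of one carrier or None, number of pooled tests used)."""
    candidates = list(range(len(samples)))
    tests = 1  # first test: everything in one tube
    if not pooled_test(samples, candidates):
        return None, tests
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        tests += 1
        if pooled_test(samples, half):
            candidates = half                     # carrier is in the tested half
        else:
            candidates = candidates[len(half):]   # carrier must be in the other half
    return candidates[0], tests

# 100 simulated bears; bear number 57 is the assumed mainland bear.
bears = [False] * 100
bears[57] = True
print(find_carrier(bears))
```

In this idealized setup a single carrier among 100 is found with at most 8 pooled tests, and a clean island is confirmed with 1.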

Are you allowed to kill the bears to test them? ;)
With no prior knowledge at all (you’re not even sure whether a B-Bear exists on the island!), better pack 100 cartridges before leaving for Bear Island. And some sandwiches.

Oh, but think of all the papers you can write based on all of the possibilities!! Such a question, if properly pursued, will at least get you tenure at your institution of choice, if not the editorship of a respected journal in the field. I mean, does the overlap/clash between classical morphological taxonomy and barcodes change significantly on the mainland vs. an island? With the (alleged) evolutionary distance between Bear A and Bear B vs. Bear A and Bears C, D, and E? And let’s not be vertebrally centric here; what about Cicada A vs. Cicada B? And plants - plants are just weird evolutionarily, you know. You’ll need a good-sized lab and at least two major grants, a hustle or tango or a whole cotillion of grad students, no teaching duties, of course, and the sort of Spousal Equivalent who is more than willing to deal with the plumber/the electrician/the emotional labor re your relatives/the food/the laundry/the lawn/the native-plant-replacement-for-the-lawn, not to mention the upkeep thereof when the invasives take over, which of course they will do at the slightest provocation…

Sorry, I got a little carried away there (but it was fun). Let’s blame it on the unseemly tropical weather we’ve been having here recently.