
Trial Threads

Another concept I recognized when trying to model code in 2d was Trial Threads.  These are parallel threads trying to complete the same task with different algorithms.  When one of them completes, it signals the other threads, which stop without completing.

You want to be able to model the IO when you do this, or your logs will look confusing: output from the winning and the losing trials interleaves.
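Here is a minimal sketch of that structure in Python. The trial names and the 0.01-second work slices are placeholders for real work; what it shows is the shape of the pattern: each trial logs under its own thread name so the interleaved output stays readable, and the first to finish sets a shared event that tells the others to stop.

```python
import logging
import random
import threading
import time

logging.basicConfig(format="%(threadName)s: %(message)s", level=logging.INFO)

done = threading.Event()        # set by whichever trial finishes first
winner_lock = threading.Lock()
winner = []                     # holds the name of the winning trial

def trial(work_seconds):
    """One trial thread: small slices of work, abandoned if another trial wins."""
    deadline = time.monotonic() + work_seconds
    while time.monotonic() < deadline:
        if done.is_set():                        # another trial already won
            logging.info("stopping: another trial finished first")
            return
        time.sleep(0.01)                         # one slice of the real work
    with winner_lock:                            # exactly one trial may win
        if not done.is_set():
            winner.append(threading.current_thread().name)
            done.set()                           # tell the other trials
    logging.info("finished")

threads = [
    threading.Thread(target=trial, name=name, args=(random.uniform(0.1, 0.4),))
    for name in ("trial-A", "trial-B")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("winner:", winner[0])
```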

The first objection is that it sounds inefficient to have parallel threads racing to finish.  But consider having two algorithms for the same task: one is O(n*log(n)) and the other is O(n*n).  If the constants on the O(n*log(n)) algorithm are sufficiently high, then for small n you should pick the other one.  How small?  And what if other variables confound how long each takes to complete?

With Trial Threads, both can run: whichever finishes first wins and the other terminates.  The task is guaranteed not to take more than O(n*log(n)); at worst you're doubling the constant inside the O.

For small inputs it might be considerably faster to run the O(n*n) algorithm, so we run both simultaneously and stop as soon as one finishes.
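Here is a sketch of that race, assuming a cooperative stop flag that each trial checks as it works (a Python thread can't be killed from outside, so the losing trial has to notice the flag and give up). With CPython's GIL the two pure-Python sorts interleave rather than run truly in parallel, but the structure is the point.

```python
import random
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

class Cancelled(Exception):
    """Raised by a trial once another trial has already finished."""

def insertion_sort(data, stop):
    """O(n*n), but tiny constants: often finishes first for small inputs."""
    out = list(data)
    for i in range(1, len(out)):
        if stop.is_set():
            raise Cancelled
        key, j = out[i], i - 1
        while j >= 0 and out[j] > key:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = key
    return "insertion", out

def merge_sort(data, stop):
    """O(n*log(n)), bigger constants: wins once n gets large."""
    if stop.is_set():
        raise Cancelled
    if len(data) <= 1:
        return "merge", list(data)
    mid = len(data) // 2
    _, left = merge_sort(data[:mid], stop)
    _, right = merge_sort(data[mid:], stop)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return "merge", merged

def race_sorts(data):
    """Run both trials; the first to finish wins, the other is told to stop."""
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(f, data, stop) for f in (insertion_sort, merge_sort)]
        finished, _ = wait(futures, return_when=FIRST_COMPLETED)
        stop.set()                  # the losing trial gives up at its next check
        return next(iter(finished)).result()

data = [random.random() for _ in range(30)]   # try a much larger n as well
name, result = race_sorts(data)
print(f"{name} sort finished first; correct: {result == sorted(data)}")
```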

Alternatively, consider an RDBMS handling a search on two columns, where probing either index is O(log(n)). The optimizer might not know which is the better index. You could start both plans and run them in parallel; whichever returns first with the rows you're looking for wins.  The query is still guaranteed to run in O(log(n)).
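One way to see why the race helps, with a hypothetical two-column equality query and plain dicts standing in for the two indexes: both plans return the same rows, but the number of candidate rows each touches can differ by orders of magnitude, and handing the two plan functions to a first-to-finish race picks the cheap one without the optimizer having to estimate selectivity up front.

```python
import random
from collections import defaultdict

random.seed(0)
# A toy table: col1 is nearly unique, col2 has only five distinct values.
rows = [{"id": i, "col1": random.randrange(100_000), "col2": random.randrange(5)}
        for i in range(50_000)]

# Two indexes; the dicts stand in for O(log(n)) B-tree probes.
index_col1, index_col2 = defaultdict(list), defaultdict(list)
for row in rows:
    index_col1[row["col1"]].append(row)
    index_col2[row["col2"]].append(row)

def plan_via_col1(c1, c2):
    """Probe the col1 index, then filter the few candidates on col2."""
    candidates = index_col1.get(c1, [])
    return [r for r in candidates if r["col2"] == c2], len(candidates)

def plan_via_col2(c1, c2):
    """Probe the col2 index, then filter the many candidates on col1."""
    candidates = index_col2.get(c2, [])
    return [r for r in candidates if r["col1"] == c1], len(candidates)

# Both plans answer WHERE col1 = ? AND col2 = ? with the same rows,
# but they touch very different numbers of candidate rows.
c1, c2 = rows[123]["col1"], rows[123]["col2"]
for plan in (plan_via_col1, plan_via_col2):
    result, touched = plan(c1, c2)
    print(f"{plan.__name__}: {len(result)} row(s), {touched} candidates scanned")
```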

You can also create threads that work in step with another process.   For instance, think of an optimizer designed to do only as much work as the queries it's optimizing. You can model the query as taking N steps to complete and have the optimizer run N steps of its own after your query returns.
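A sketch of that lock-step budget, using a Python generator as the pausable optimizer; everything here (index_builder, run_query, the step counts) is a placeholder for whatever incremental work a real optimizer would do.

```python
def index_builder():
    """Background index build as a generator: each next() is one step of work."""
    steps_done = 0
    while True:
        steps_done += 1      # placeholder for one unit of index-building work
        yield steps_done

def run_query(cost):
    """Stand-in for a query that takes `cost` steps to answer."""
    return cost

builder = index_builder()
for cost in (1024, 512, 256, 128):
    steps_taken = run_query(cost)
    # The optimizer gets exactly as many steps as the query just consumed,
    # so at worst it doubles the work the query already did.
    total = 0
    for _ in range(steps_taken):
        total = next(builder)
    print(f"query cost {steps_taken} steps; builder has done {total} steps so far")
```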

More concretely, consider an index in a database.   You could add indexes willy-nilly, but it's more efficient to build only the indexes your queries need, and you learn which indexes are useful as the queries happen.

When you get queries that look like SELECT * FROM table1 WHERE col1 = ?, you know an index on col1 would be useful.  The first such query will be O(n): a full scan.  After the query, the indexer does n steps to split the table's rows into those above and below the median value of col1.  The next time you see a similar query, it can return in half the number of steps.

This continues: with each query searching on col1, the optimizer does as many steps as the query took, building out the index.  After log(n) queries the searches will be much faster, and you know the index you're building is useful.  After a full n queries, every search should return in O(log(n)) time.
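Here is a sketch of that incremental index under the assumptions above: one column, equality queries, and "steps" counted as rows touched. The names (LazyMedianIndex, col1, payload) are made up for the illustration, and statistics.median_low stands in for a proper linear-time median selection.

```python
import statistics
from bisect import bisect_right

class LazyMedianIndex:
    """An index on one column, built only as fast as queries pay for it.

    All rows start in one unsorted bucket.  An equality lookup scans the
    single bucket whose value range covers the key, counting each row it
    touches as a step; afterwards it spends that same number of steps
    splitting the scanned bucket at its median, so the next similar query
    touches roughly half as many rows.
    """

    def __init__(self, rows, column):
        self.column = column
        self.lows = [float("-inf")]   # lows[i] = smallest value bucket i may hold
        self.buckets = [list(rows)]

    def lookup(self, value):
        i = bisect_right(self.lows, value) - 1   # descend the partial tree
        bucket = self.buckets[i]
        steps, matches = 0, []
        for row in bucket:                       # scan: one step per row touched
            steps += 1
            if row[self.column] == value:
                matches.append(row)
        self._split(i, budget=steps)             # build with the budget just spent
        return matches, steps

    def _split(self, i, budget):
        bucket = self.buckets[i]
        if budget < len(bucket) or len(bucket) < 2:
            return            # not enough budget, or nothing left to split
        # median_low stands in for a linear-time selection of the median.
        pivot = statistics.median_low(row[self.column] for row in bucket)
        below = [r for r in bucket if r[self.column] <= pivot]
        above = [r for r in bucket if r[self.column] > pivot]
        if not below or not above:
            return            # every value equal; splitting gains nothing
        self.buckets[i] = below
        self.buckets.insert(i + 1, above)
        self.lows.insert(i + 1, min(r[self.column] for r in above))

rows = [{"col1": v, "payload": v * v} for v in range(10_000)]
index = LazyMedianIndex(rows, "col1")
for q in range(15):                     # repeat SELECT ... WHERE col1 = 4242
    _, steps = index.lookup(4242)
    print(f"query {q}: {steps} steps")
# The rows scanned roughly halve each time; counting the bucket descent,
# lookups head toward O(log(n)).
```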

We could implement a lot of this in current programming languages, but it feels like it would get in the way most of the time.

A 2-dimensional graph frees us from text: it could model these patterns more clearly and use colour contrast to de-emphasize the irrelevant.