Concrete Challenge Scenarios

Taking these aspects into account, one can distinguish a multitude of scenarios, each with different dominating challenges. To indicate the range of possibilities, consider the following preliminary list of challenge scenarios:

  • Black-box systems with equivalence queries are not relevant in practice, but ideal for learning. Usually, only approximations of equivalence queries can be generated.
  • For fast black-box systems (e.g., simulated ones), the number of membership queries is not as important as in other scenarios. While this scenario seems to be of little practical use, some use cases, e.g., learning the behaviour of APIs, can come close to it.
  • In scenarios with black-box systems with high-cost membership queries (e.g., systems that are slow to respond or incur actual cost per query), it makes sense to limit the number of generated queries and to investigate ways to decrease effective costs.
  • Scenarios where data values have to be considered are common and motivate the introduction of means to abstract from infinite value domains. While manually generated, fixed abstractions may be sufficient in many cases, it may be necessary to refine such an abstraction during the learning process, if possible in an automated fashion according to system output.
  • All of these come in two flavours: man-made systems (the predominant case in practice) and randomly generated systems (frequently used for evaluation, e.g., within the ZULU challenge).
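
Two of the scenarios above lend themselves to a small illustration: approximating equivalence queries by random testing, and lowering the effective cost of membership queries by caching answers. The following Python sketch is not taken from any particular learning framework. The names CachedMembershipOracle, approximate_equivalence_query, and even_as are hypothetical, and a system is modelled, as a simplifying assumption, as a function from input words to accept/reject.

```python
import random
from typing import Callable, Dict, Optional, Sequence, Tuple

Word = Tuple[str, ...]


class CachedMembershipOracle:
    """Answers membership queries, asking the (costly) system at most once per word."""

    def __init__(self, system: Callable[[Word], bool]) -> None:
        self._system = system
        self._cache: Dict[Word, bool] = {}
        self.system_queries = 0  # queries that actually reached the system

    def query(self, word: Word) -> bool:
        if word not in self._cache:
            self.system_queries += 1
            self._cache[word] = self._system(word)
        return self._cache[word]


def approximate_equivalence_query(
    oracle: CachedMembershipOracle,
    hypothesis: Callable[[Word], bool],
    alphabet: Sequence[str],
    num_tests: int = 1000,
    max_length: int = 10,
    seed: int = 0,
) -> Optional[Word]:
    """Approximate an equivalence query by random sampling.

    Returns a word on which system and hypothesis disagree, or None if no
    disagreement was found. None does NOT prove equivalence.
    """
    rng = random.Random(seed)
    for _ in range(num_tests):
        length = rng.randint(0, max_length)
        word = tuple(rng.choice(alphabet) for _ in range(length))
        if oracle.query(word) != hypothesis(word):
            return word
    return None


def even_as(word: Word) -> bool:
    """Hypothetical target system: accepts words with an even number of 'a'."""
    return word.count("a") % 2 == 0


if __name__ == "__main__":
    oracle = CachedMembershipOracle(even_as)
    # A deliberately wrong hypothesis that accepts every word.
    counterexample = approximate_equivalence_query(oracle, lambda w: True, ["a", "b"])
    print("counterexample:", counterexample)
    print("queries sent to the system:", oracle.system_queries)
```

Random sampling is only one possible approximation. It is cheap to implement, but it tends to miss behaviour that is reachable only via long or very specific input sequences, which is why the cost and quality of approximated equivalence queries matter in the scenarios listed above.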

It is not surprising that these different scenarios need not only different learning technology but also different ways of ranking their quality. For example, the agreement of the response pattern with respect to some large test set, measured in percent as proposed by the ZULU challenge, may be well suited for randomly generated systems, but it is inadequate for man-made systems, where finding the spurious behaviour of the system is the central objective.
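
To make this point concrete, the following Python sketch computes a percentage agreement over a test set. It is only an illustration under stated assumptions, not the ZULU challenge's actual scoring code: the function agreement_percent, the example system, and the test set are all hypothetical. The example system rejects exactly those words that contain three consecutive "c" symbols (standing in for a rare spurious behaviour), while the hypothesis accepts every word.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Word = Tuple[str, ...]


def agreement_percent(
    system: Callable[[Word], bool],
    hypothesis: Callable[[Word], bool],
    test_set: Iterable[Word],
) -> float:
    """Percentage of test words on which system and hypothesis give the same answer."""
    words = list(test_set)
    if not words:
        raise ValueError("test set must not be empty")
    matches = sum(1 for w in words if system(w) == hypothesis(w))
    return 100.0 * matches / len(words)


def all_words(alphabet: List[str], max_length: int) -> List[Word]:
    """All words over the alphabet with length 0..max_length."""
    return [w for n in range(max_length + 1) for w in product(alphabet, repeat=n)]


def rare_fault_system(word: Word) -> bool:
    """Hypothetical system: rejects only words containing 'ccc'."""
    return "ccc" not in "".join(word)


def accept_all(word: Word) -> bool:
    """Hypothetical hypothesis that misses the rare behaviour entirely."""
    return True


if __name__ == "__main__":
    tests = all_words(["a", "b", "c"], 4)
    score = agreement_percent(rare_fault_system, accept_all, tests)
    print(f"agreement: {score:.2f}% on {len(tests)} test words")
```

In this example the hypothesis agrees with the system on the vast majority of test words, although it does not capture the one behaviour that would be of interest when analysing a man-made system.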

It is one of our first goals to establish good criteria and measurements for future competitions devoted to enhancing the practicality of active learning.