
System Topology Enumeration Using CPUID Extended Topology Leaf

The algorithm for system topology enumeration can be summarized as three phases of operation:

  1. Derive “mask width” constants that will be used to extract each set of Sub IDs.
  2. Gather the unique APIC IDs of each logical processor in the system, and extract/decompose each APIC ID into three sets of Sub IDs.
  3. Analyze the relationship of the hierarchical Sub IDs to establish mapping tables between the OS’s thread management services and the three hierarchical levels of the processor topology.

Table 1 Modular Structure of Deriving System Topology Enumeration Information

Table 1 shows an example of the basic structure of the three phases of system-wide topology enumeration as applied to processor topology and cache topology. Figure 1 outlines the procedure of querying CPUID leaf 11 (0BH) for the x2APIC ID and extracting the Sub IDs corresponding to the “SMT”, “Core”, and “physical package” levels of the hierarchy.

Figure 1 Procedures to Extract Sub IDs from the x2APIC ID of Each Logical Processor

System topology enumeration at the application level using CPUID involves executing the CPUID instruction on each logical processor in the system. This implies context switching using services provided by the OS. On-demand context switching by user code generally relies on a thread affinity management API provided by the OS. The capabilities and limitations of thread affinity APIs vary across OSes. For example, in some OSes the thread affinity API has a limit of 32 or 64 logical processors. Enhancements to thread affinity APIs to manage larger numbers of logical processors are expected in future OS versions.


Enumeration Architectures

An enumeration algorithm must pick an inexpensive execution plan for a given query by exploring the search space. The System-R join enumerator was designed to choose only an optimal linear join order. A software engineering consideration is to build the enumerator so that it can gracefully adapt to changes in the search space due to the addition of new transformations, the addition of new physical operators (e.g., a new join implementation), and changes in the cost estimation techniques. More recent optimization architectures have been built with this paradigm and are called extensible optimizers. Building an extensible optimizer is a tall order since it is more than simply coming up with a better enumeration algorithm. Rather, such optimizers provide an infrastructure for the evolution of optimizer design. However, generality in the architecture must be balanced with the need for efficiency in enumeration.

We briefly focus on two representative examples of such extensible optimizers: Starburst and Volcano/Cascades. Despite their differences, we can summarize some of the commonality in them: (a) use of generalized cost functions and physical properties with operator nodes; (b) use of a rule engine that allows transformations to modify the query expression or the operator trees (such rule engines also provide the ability to direct search to achieve efficiency); (c) many exposed “knobs” that can be used to tune the behavior of the system.

Starburst

Query optimization in the Starburst project at IBM Almaden begins with a structural representation of the SQL query that is used throughout the lifecycle of optimization. This representation is called the Query Graph Model (QGM). In the QGM, a box represents a query block and labeled arcs between boxes represent table references across blocks. Each box contains information on the predicate structure as well as on whether the data stream is ordered. In the query rewrite phase of optimization, rules are used to transform a QGM into another equivalent QGM. Rules are modeled as pairs of arbitrary functions: the first checks the condition for applicability and the second enforces the transformation. A forward-chaining rule engine governs the rules. Rules may be grouped in rule classes, and it is possible to tune the order of evaluation of rule classes to focus the search. Since any application of a rule results in a valid QGM, any set of rule applications guarantees query equivalence (assuming the rules themselves are valid). The query rewrite phase does not have cost information available. This forces the module either to retain alternatives obtained through rule application or to use the rules in a heuristic way (and thus compromise optimality).

The second phase of query optimization is called plan optimization. In this phase, given a QGM, an execution plan (operator tree) is chosen. In Starburst, the physical operators (called LOLEPOPs) may be combined in a variety of ways to implement higher-level operators. Such combinations are expressed in a grammar production-like language. The realization of a higher-level operation is expressed by its derivation in terms of the physical operators. In computing such derivations, comparable plans that represent the same physical and logical properties but have higher costs are pruned. Each plan has a relational description that corresponds to the algebraic expression it represents, an estimated cost, and physical properties (e.g., order). These properties are propagated as plans are built bottom-up. Thus, with each physical operator, a function is associated that shows the effect of the physical operator on each of the above properties. The join enumerator in this system is similar to System-R’s bottom-up enumeration scheme.

Volcano/Cascades

The Volcano and Cascades extensible architectures evolved from Exodus. In these systems, rules are used universally to represent knowledge of the search space. Two kinds of rules are used: transformation rules map an algebraic expression into another, and implementation rules map an algebraic expression into an operator tree. Rules may have conditions for applicability. Logical properties, physical properties, and costs are associated with plans. The physical properties and the cost depend on the algorithms used to implement operators and their input data streams. For efficiency, Volcano/Cascades uses dynamic programming in a top-down way (“memoization”). When presented with an optimization task, it checks whether the task has already been accomplished by looking up its logical and physical properties in the table of plans that have been optimized in the past. Otherwise, it applies a logical transformation rule or an implementation rule, or uses an enforcer to modify properties of the data stream. At every stage, it uses the promise of an action to determine the next move. The promise parameter is programmable and reflects the cost parameters.

The Volcano/Cascades framework differs from Starburst in its approach to enumeration: (a) These systems do not use two distinct optimization phases because all transformations are algebraic and cost-based. (b) The mapping from algebraic to physical operators occurs in a single step. (c) Instead of applying rules in a forward chaining fashion, as in the Starburst query rewrite phase, Volcano/Cascades does goal-driven application of rules.
