Tutorials Logic, IN +91 8092939553 info@tutorialslogic.com
FAQs Support
Navigation
Home About Us Contact Us Blogs FAQs
Tutorials
All Tutorials
Services
Academic Projects Resume Writing Interview Questions Website Development
Compiler Tutorials

Query Processing in DBMS

Steps in Query Processing

When a SQL query is submitted to a DBMS, it goes through several stages before results are returned:

  1. Parsing: The query is checked for syntax errors and converted into an internal representation (parse tree).
  2. Translation: The parse tree is translated into a relational algebra expression (query tree).
  3. Optimization: The query optimizer generates multiple execution plans and selects the most efficient one based on cost estimates.
  4. Evaluation: The chosen execution plan is executed by the query evaluation engine, and results are returned.
StageInputOutputKey Activity
ParsingSQL stringParse treeSyntax check, tokenization
TranslationParse treeRelational algebra expressionSemantic check, schema lookup
OptimizationRelational algebra expressionExecution planCost estimation, plan selection
EvaluationExecution planQuery resultPhysical operators, I/O

Query Tree (Relational Algebra Tree)

A query tree (also called a query evaluation tree) is a tree data structure that represents a relational algebra expression. Leaf nodes are relations (tables), and internal nodes are relational algebra operations (σ, π, ⋈, etc.).

Example: For the query SELECT name FROM students WHERE age > 20:

  • Leaf node: students (relation)
  • Internal node: σage > 20 (selection)
  • Root node: πname (projection)

Heuristic Optimization transforms the query tree to improve efficiency before cost-based analysis:

  • Push selections down: Apply σ (selection) as early as possible to reduce the number of tuples.
  • Push projections down: Apply π (projection) early to reduce tuple size.
  • Combine selections with Cartesian products: Convert σ(R × S) into a join (R ⋈ S).
  • Reorder joins: Perform joins that produce smaller intermediate results first.

Cost-Based vs Heuristic Optimization

AspectHeuristic OptimizationCost-Based Optimization
ApproachApply rules (push selections down, etc.)Estimate cost of multiple plans, pick cheapest
Statistics neededNoYes (table sizes, index info, cardinality)
QualityGood for simple queriesBetter for complex queries
SpeedFast (no cost computation)Slower (evaluates many plans)
Used byOlder/simpler systemsPostgreSQL, Oracle, SQL Server, MySQL

Join Ordering

For a query joining n tables, there are O(n!) possible join orderings. The optimizer uses dynamic programming or greedy algorithms to find a good order without evaluating all possibilities.

  • Left-deep trees: One operand of each join is always a base relation. Allows pipelining — output of one join feeds directly into the next.
  • Right-deep trees: One operand is always the result of a previous join. Requires more memory.
  • Bushy trees: Both operands can be intermediate results. Most flexible but hardest to optimize.

Key principle: Perform the join that produces the smallest intermediate result first. Use selectivity estimates (from statistics) to predict result sizes.

Execution Plans

An execution plan specifies the exact physical operations to execute a query. You can view it using:

  • MySQL: EXPLAIN SELECT ... or EXPLAIN ANALYZE SELECT ...
  • PostgreSQL: EXPLAIN (ANALYZE, BUFFERS) SELECT ...
  • SQL Server: SET SHOWPLAN_ALL ON or graphical execution plan in SSMS
Plan ComponentDescription
Seq ScanFull table scan — reads every row
Index ScanUses an index to find rows
Index Only ScanSatisfies query entirely from index (no table access)
Nested Loop JoinFor each row in outer table, scan inner table. Good for small tables.
Hash JoinBuild hash table from smaller relation, probe with larger. Good for large unsorted tables.
Merge JoinBoth inputs sorted on join key. Very efficient for sorted data.
SortSorts rows for ORDER BY or merge join
AggregateComputes GROUP BY, COUNT, SUM, etc.

Ready to Level Up Your Skills?

Explore 500+ free tutorials across 20+ languages and frameworks.