Hierarchical Data Sets
Matthew Turland wrote on his blog, “[the nested set] model is simple and intuitive, but it has a drawback: to obtain any significant portion of the hierarchy stored in a table can take up to as many queries as there are levels of depth in the hierarchy.” I like the theory that Matthew talks about and the implementation for Oracle provided by developer.com got me thinking. I wasn’t so sure that some of the analytic functions in Oracle couldn’t make this happen faster/better/cheaper, so I set out to figure it out.
One of the primary concerns with the nested set model is the requirement that N number of rows would have to be updated with each new item. This is a deal-breaker in my job and wouldn’t fly, so I’ve added an additional column to assist in the solution that identifies the level at which the item occurs.
Given the following data set:
| ID | PARENT_ID | SITE_AREA | SITE_LEVEL |
|---|---|---|---|
| 1 | NULL | Home | 0 |
| 2 | 1 | Sporting Goods | 1 |
| 3 | 2 | Tennis | 2 |
| 4 | 3 | Racquets | 3 |
| 5 | 4 | Prince | 4 |
| 6 | 4 | Wilson | 4 |
| 7 | 4 | Head | 4 |
| 8 | 4 | Yonex | 4 |
| 9 | 2 | Golf | 2 |
| 10 | 9 | Drivers | 3 |
| 11 | 10 | Ping | 4 |
| 12 | 10 | TaylorMade | 4 |
Here’s the query I came up with:
SELECT max(ltrim(sys_connect_by_path(site_area, ' > '),' > ')) FROM ( SELECT site_area, site_link, row_number() over (partition by 1 order by site_level) level_index FROM hierarchical_data START WITH id=8 CONNECT BY PRIOR parent_id=id ) START WITH level_index = 1 CONNECT BY level_index = prior level_index + 1;
And the resulting data set:
| BREAD_CRUMBS |
|---|
| Home > Sporting Goods > Tennis > Racquets > Yonex |
Let me know what you think… and be nice.
Posted: October 15th, 2007 under Oracle.
Comments: 1
Comments
Comment from Matthew Turland
Time: October 16, 2007, 3:30 am
Nice.
I was actually already aware of Oracle’s hierarchical data features, but hadn’t actually gotten around to producing the equivalent query using those features. At some point, I hope to have a driver-based component in Forkr that implements the adjacency list and nested set models as drivers, as well as a combined adjacency list/nested set model approach, an “ordered depth” approach (somewhat similar to nested set), and additional drivers for cases like Oracle where the database server supports hierarchical data sets in some way. Thanks for the trackback!


Write a comment