When you hand-write database access code, you can control very precisely what gets loaded and when. The upside of this approach is that you can be very efficient about loading as much as you need for the task at hand, and no more. The downside is everything else in the whole world, from writing SQL to materialising objects to tracking updates to being pwned by a SQL injection attack and being fired for leaking the confidential details of 4.3 million customers.
Wise developers therefore prefer to use an object-relational mapper, which does all the hard work for them and allows them to focus on core business tasks such as knocking off early and going down the pub. However, using an ORM sacrifices that fine control over exactly what data gets loaded, so you might find yourself loading bulky data or associated entities that you don’t happen to need right now. You won’t get fired for this, of course, but you may find yourself hunted through the darkened cubicles by a masked and robed figure with a viciously sharp knife and an expensive DBA certification.
The wisest developers of all therefore use LightSpeed, which provides a handy way to control what gets loaded when you load an entity, and to load different data depending on the task at hand. This feature is called named aggregates.
Eager and lazy loading
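Let’s review how LightSpeed normally loads data. By default, when you load an entity into a unit of work:
* All of the entity’s fields are loaded straight away, as part of the same query.
* Associated entities and collections are not loaded until you access them (lazy loading).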
Now sometimes one of the entity’s columns contains very bulky data like a high-resolution image or video, which you rarely need and which is expensive to load. In this case you can mark this field as lazy-loaded. Then LightSpeed will not load the field when you load the entity, but only if and when you access it, as a separate database query.
Conversely, sometimes you know that whenever you load a certain kind of entity you’re always going to be using an associated entity or entities. In this case you can mark the association as eager-loaded. Then LightSpeed will load all the required entities in a single database round-trip instead of issuing separate queries for each of the associated entities.
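So we have the following additional options:
* Lazy-loaded fields: not loaded with the entity, but fetched by a separate query if and when you access them.
* Eager-loaded associations: loaded along with the entity in a single database round-trip.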
I just can’t decide
But what if you want to use different load strategies in different contexts? Consider a video sharing Web site. When the user searches or browses for videos, you want to be able to present them with a list containing names and maybe low-res preview images, but you don’t want to load the video data itself for every item in the list, because it would take ages and you’re not going to use it. So you want to mark the video data field as lazy-loaded. But when the user selects a video to play, you want to load all the properties, including the video data. Now lazy loading the video data field would mean you ended up with two database accesses instead of one, so you want the video data to be eagerly loaded.
A similar idea applies to associations. When the user asks to see a list of contributors, you need to load the contributors, but you don’t need to load their videos (lazy loading the Videos association). When the user drills in on a particular contributor to see their details, you know you’re going to display a list of their videos, so you want to load the videos along with the contributor in a single database access (eager loading the Videos association).
The solution to this is to make the sometimes-lazy-sometimes-eager things part of a named aggregate. A field or association which is part of a named aggregate will normally be lazy-loaded. But by specifying the AggregateName property on a Query object, or using the LINQ WithAggregate() operator, you can have it eager-loaded in a particular query.
The WithAggregate() operator doesn’t affect fields or associations which are always eager — that is, normal fields and eager-loaded associations. Similarly, it doesn’t affect fields or associations which are always lazy — that is, lazy-loaded fields and normal associations. WithAggregate() affects only fields or associations which are part of a named aggregate.
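The rules above boil down to a small decision table. Here is a rough sketch of it as a toy model. This is not LightSpeed's implementation or API: the names `LoadMode`, `Member` and `IsLoadedUpFront` are made up for illustration, as is the `Video.Comments` association. An empty string stands in for a query that specifies no aggregate.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: these types are hypothetical and are not part of LightSpeed.
public enum LoadMode
{
    AlwaysEager,  // normal fields and eager-loaded associations
    AlwaysLazy,   // lazy-loaded fields and normal associations
    ByAggregate   // members that belong to one or more named aggregates
}

public sealed class Member
{
    public string Name { get; }
    public LoadMode Mode { get; }
    public HashSet<string> Aggregates { get; }

    public Member(string name, LoadMode mode, params string[] aggregates)
    {
        Name = name;
        Mode = mode;
        Aggregates = new HashSet<string>(aggregates);
    }

    // True if the member is fetched together with the entity;
    // false if it is deferred until it is first accessed.
    // Pass an empty string for a query with no aggregate.
    public bool IsLoadedUpFront(string queryAggregate)
    {
        switch (Mode)
        {
            case LoadMode.AlwaysEager: return true;
            case LoadMode.AlwaysLazy: return false;
            default: return Aggregates.Contains(queryAggregate);
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var title = new Member("Video.Title", LoadMode.AlwaysEager);
        var comments = new Member("Video.Comments", LoadMode.AlwaysLazy);
        var data = new Member("Video.Data", LoadMode.ByAggregate, "WithDetails");

        Console.WriteLine(title.IsLoadedUpFront(""));               // True
        Console.WriteLine(comments.IsLoadedUpFront("WithDetails")); // False
        Console.WriteLine(data.IsLoadedUpFront(""));                // False
        Console.WriteLine(data.IsLoadedUpFront("WithDetails"));     // True
    }
}
```

Only the last kind of member ever looks at the query's aggregate name, which is why WithAggregate() has no effect on anything that is always eager or always lazy.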
How would this look in our video sharing example? In the first case (loading the video data field), we always want to load the Video.Title field, but only optionally Video.Data. Well, ‘always load’ is the default for fields, so we don’t need to do anything to Video.Title. But we want Video.Data to be conditionally loaded, so we’ll specify it to be part of a named aggregate we’ll call “WithDetails”. This will mean it is lazy-loaded by default, but we can eager-load it by specifying the “WithDetails” aggregate in a query:
// This *doesn't* load the Data field
var videos = unitOfWork.Videos.Where(v => v.Title.StartsWith("Belgian"));

// This *does* load the Data field
var video = unitOfWork.Videos.WithAggregate("WithDetails").First(v => v.Title == "Belgian Jam-Wrestling Penguins");
Similarly, we could mark the Contributor.Videos association as belonging to the “WithDetails” aggregate, which would give us the option to load the video list or not:
// This *doesn't* load the Videos list
var contributors = unitOfWork.Contributors.Take(10);

// This *does* load the Videos list -- but watch out!
var contributor = unitOfWork.Contributors.WithAggregate("WithDetails").First(c => c.Id == userId);
However, we probably don’t want to do this as written! You see, the query aggregate propagates across associations. This is important because it allows you to load a whole object graph, not just the immediately associated objects, in one go. But in the example above, the WithDetails aggregate would get applied to the Contributor.Videos list, and you’d find yourself loading the big binary video data for every video associated with the contributor at hand! What was that stealthy footfall? Was that light glinting off a deadly blade? So in this example we’d be better off using different aggregate names for the two cases: maybe “WithVideoData” and “WithContributionList”:
// This *does* load the Videos list, but not the Video.Data fields
var contributor = unitOfWork.Contributors.WithAggregate("WithContributionList").First(c => c.Id == userId);
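To see why the aggregate name matters once propagation is involved, here is a minimal sketch of the behaviour as a toy model. Again, this is not LightSpeed code: `MemberSpec`, `EntitySpec` and `LoadedUpFront` are hypothetical names. For the sake of comparison, the model puts Contributor.Videos in both “WithDetails” and “WithContributionList”, and Video.Data in both “WithDetails” and “WithVideoData”.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: hypothetical types, not LightSpeed's API or implementation.
public sealed class MemberSpec
{
    public string Name;
    public string[] Aggregates = new string[0]; // aggregates this member belongs to
    public EntitySpec Target;                   // set only for associations
}

public sealed class EntitySpec
{
    public string Name;
    public List<MemberSpec> Members = new List<MemberSpec>();
}

public static class Propagation
{
    // Lists the aggregate-controlled members that a query with the given
    // aggregate name would load up front. Whenever an association is loaded,
    // the same aggregate name is applied to the associated entity as well.
    public static List<string> LoadedUpFront(EntitySpec entity, string aggregate)
    {
        var loaded = new List<string>();
        Collect(entity, aggregate, loaded, new HashSet<EntitySpec>());
        return loaded;
    }

    private static void Collect(EntitySpec entity, string aggregate,
                                List<string> loaded, HashSet<EntitySpec> seen)
    {
        if (!seen.Add(entity)) return; // guard against cycles in the model
        foreach (var member in entity.Members)
        {
            if (Array.IndexOf(member.Aggregates, aggregate) < 0) continue;
            loaded.Add(entity.Name + "." + member.Name);
            if (member.Target != null) Collect(member.Target, aggregate, loaded, seen);
        }
    }

    public static void Main()
    {
        var video = new EntitySpec { Name = "Video" };
        video.Members.Add(new MemberSpec
        {
            Name = "Data",
            Aggregates = new[] { "WithDetails", "WithVideoData" }
        });

        var contributor = new EntitySpec { Name = "Contributor" };
        contributor.Members.Add(new MemberSpec
        {
            Name = "Videos",
            Aggregates = new[] { "WithDetails", "WithContributionList" },
            Target = video
        });

        // Prints: Contributor.Videos, Video.Data
        Console.WriteLine(string.Join(", ", LoadedUpFront(contributor, "WithDetails")));
        // Prints: Contributor.Videos
        Console.WriteLine(string.Join(", ", LoadedUpFront(contributor, "WithContributionList")));
    }
}
```

In this model the shared “WithDetails” name drags Video.Data along with the contributor's video list, while the dedicated “WithContributionList” name stops at the list itself.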
Better living through additional aggregates
A field or association can be part of multiple aggregates. So if you did also have a scenario where you needed to get a contributor’s videos with their full data, you could define a “WithAllContributionData” aggregate and apply that to both the Contributor.Videos and Video.Data members. Then, in your query, you’d either omit WithAggregate(), specify WithAggregate("WithContributionList"), or specify WithAggregate("WithAllContributionData"), depending on the requirements of the page at hand, and how many counties were between you and the DBA.
Summary
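So here’s the deal for those of you who for some reason don’t feel the urge to wade through 1200 words on the exciting topic of database load strategies:
* By default, fields are loaded with the entity and associations are lazy-loaded.
* To make a field or association lazy by default but eager on demand, make it part of a named aggregate.
* To eager-load it in a particular query, set the AggregateName property on the Query object or use the LINQ WithAggregate() operator.
* The query aggregate propagates across associations, so use different aggregate names for loads you want to control separately.
* A field or association can be part of multiple aggregates.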
Want to try it out? LightSpeed Express is free to download, or you can get the full Professional Edition from the store.
… and so you will finally end up with a lot of aggregate names.
To the rescue, you can track the loading graph of the entities via the “@aggregate” filter in the designer, but there is no option to see which properties (fields) are part of the aggregate: you need to click and check each and every property to get an idea of which properties are loaded. And you can be sure the VS property window is always too small to show all the aggregate names.
Feature request:
* Optionally display the aggregate names beside the properties in the designer.
* While filtering the entities by an aggregate name, hide those properties which won’t be loaded via the specified aggregate name.
Don’t get me wrong: the aggregate feature is very powerful. The handling in the designer could be better.
Sörnt
Cool ideas, Soernt — not sure how easy it will be to implement them, but we will definitely look into them. Thanks!
I thought once more about my feature request:
* While filtering the entities by an aggregate name, hide those properties which won’t be loaded via the specified aggregate name.
would be better as:
* While filtering the entities by an aggregate name, draw those properties which will be loaded via the specified aggregate name in a different color (highlight).
That way I can still edit the properties which are not part of the aggregate, for example to edit the aggregate name and add the property to the loading graph.