Home » Blog

rounded header

When should I use LINQ in LightSpeed 2?

tag icon Tagged as LightSpeed

One of our LightSpeed 2 Early Adopter customers posted a query in the forums about the tradeoffs between using LINQ compared to the traditional LightSpeed Find method to implement the Repository pattern. We provide a sample of this pattern that looks something like this:

public interface ICatalogRepository : IRepository
{
  IEnumerable<Category> Categories { get; }
  IEnumerable<Supplier> Suppliers { get; }
  // other members elided for clarity
}
 
public class CatalogRepository : RepositoryBase<NorthwindUnitOfWork>, ICatalogRepository
{
  public IEnumerable<Category> Categories
  {
    get { return UnitOfWorkScope.Current.Find<Category>(); }
  }
  // other members elided for clarity
}

The customer had changed the implementation as follows:

  public IEnumerable<Category> Categories
  {
    get { from c in UnitOfWorkScope.Current.Categories select c; }
  }

and was interested to know the benefits of one approach over the other.

Well, to an extent, this is a matter of taste. As the customer had already discovered, LightSpeed generates the same SQL for the LINQ and “native” queries. The reason for this is simple: the LINQ query eventually resolves into a Find call anyway!

In fact, in general, there’s no difference in efficiency between LINQ and “native” queries. Any LINQ query eventually gets translated into a Query object and an IUnitOfWork method call, and executes with the same efficiency as if you had constructed the Query object and called the method “manually”. Consequently, not only do LINQ queries inherit the optimisation provided by the core LightSpeed engine, but they also respect tuning and behavioural settings such as caching, eager/lazy loading, soft delete, and so on.

So at the simple end of the scale it’s largely down to which syntax you prefer:

var allProducts1 = unitOfWork.Find<Product>();
var allProducts2 = from p in unitOfWork.Products select p;
var allProducts3 = unitOfWork.Products;  // equivalent to allProducts2
 
var cheapProducts1 = unitOfWork.Find<Product>(Entity.Attribute("Price") < 100);
var cheapProducts2 = from p in unitOfWork.Products where p.Price < 100 select p;
var cheapProducts3 = unitOfWork.Products.Where(p.Price < 100);  // equivalent to cheapProducts2

There are two cases where the two syntaxes are functionally different. The first is if you need to perform a query on an attribute which is only known at runtime. Perhaps your application allows the user to select which field to sort on:

string sortField = GetSortFieldFromUI();
Query query = new Query { Order = Order.By(sortField) };
var topItems = unitOfWork.Find<Widget>(query);

You can’t do this in LINQ because LINQ needs to resolve the orderby expression at compile time:

var topItems = from w in unitOfWork.Widgets orderby w.??? select w;

The second is that LINQ queries are not executed until you actually use them, for example in a foreach. For example, suppose you are writing a desktop application and you are concerned that your UI might become unresponsive during a database query. You might therefore want to control when the database access occurs, for example at startup or in a background thread. In this case, you should use the Find syntax, or call ToList() or ToArray() on the LINQ query to force it to execute then and there.

Complex Expressions in Queries

When you start dealing with more complex expressions in queries, we recommend using LINQ because it enables you to use your familiar C# or Visual Basic syntax and keep the query expressions concise. For example, consider the following LINQ query:

from sl in unitOfWork.SalesLines where sl.UnitPrice * sl.Quantity > 1000
                                 select sl;

It’s perfectly possible to express this in LightSpeed 2 — otherwise LINQ couldn’t do it — but because we expect a lot of users to adopt LINQ for these purposes, we haven’t tried to optimise the syntax at the QueryExpression level:

unitOfWork.Find<SalesLine>(
  Entity.Attribute("UnitPrice").Function(
    "*", Entity.Attribute("Quantity")) > 1000);

LINQ’s use of CLR expressions comes in doubly handy when you’re dealing with a function that is implemented differently in different SQL dialects. We wouldn’t recommend using the Find syntax for, for example, these:

// Might call LEN or LENGTH
from ct in unitOfWork.Comments where ct.Message.Length > 100;
 
// Might call YEAR, DATEPART, DATE_PART, TO_CHAR('YYYY')...
from m in unitOfWork.Members where m.JoinedOn.Year <= 2007;

Of course, this cuts both ways. If the SQL function you want to use in your query expression doesn’t have a CLR equivalent, or the CLR equivalent isn’t supported in LINQ to LightSpeed, then you must use QueryExpressions and Find instead of LINQ.

Projections

If you’re using projections, you should definitely consider using LINQ. Again, a projection query resolves down to a call to IUnitOfWork.Project(), so there’s no difference in efficiency or in the generated SQL, but in many cases LINQ makes it easier to work with the results of the projection. For example:

from p in unitOfWork.Products select new { p.Name, p.Price };  // anonymous type
from p in unitOfWork.Products select new ProductSummary 
                                         {
                                           Name = p.Name,
                                           Cost = p.Price
                                         };  // named type

When you know at design time which fields you’re working with, this is more convenient than getting the results out of an IDataReader. On the other hand, if you have a more dynamic system in which the fields you’re querying on may vary depending on user input, you won’t be able to use LINQ and should instead use IUnitOfWork.Project().

API Design Considerations

Look again at the ICategoryRepository API and the CategoryRepository implementation above. These work well if all the application code wants to do is enumerate the categories. But what if application code wants to filter the result set, or perform ordering? The repository author is faced with a decision: does he try to anticipate application requirements by creating a flexible query API, or does he try to anticipate likely queries and encapsulate them within his API? And how can he possibly anticipate the application’s future requirements — what if the application later needs to do paging as well as ordering?

Of course, the application can write LINQ queries against the Categories property:

from c in repository.Categories where c.Name == "Rayguns"
                                orderby c.Popularity
                                select c.Code

But, because Categories is IEnumerable, this query will execute on the client against the result set. So it will result all the data for every category being downloaded from the database (modulo lazy loading, of course), then being sorted and filtered in the application, which is less efficient than doing it in the database and returning only what is needed.

One possible way out of this is to reformulate the API and implementation as follows:

public interface ICatalogRepository : IRepository
{
  IQueryable<Category> Categories { get; }
}
 
public class CatalogRepository : RepositoryBase<NorthwindUnitOfWork>, ICatalogRepository
{
  public IQueryable<Category> Categories
  {
    get { from c in UnitOfWorkScope.Current.Categories select c; }
    // Equivalently: get { return UnitOfWorkScope.Current.Query<Category>(); }
    // Equivalently: get { return UnitOfWorkScope.Current.Categories; }
  }
}

Now, when the application performs a query against repository.Categories, LINQ notices that Categories is IQueryable rather than merely IEnumerable, and hands the query to LightSpeed to perform on the server. We have given application code the power to query the repository flexibly without having to anticipate its requirements in our repository API.

Another interesting side of this is when you consider other LINQ operators that can’t be efficiently implemented in terms of Find. Suppose the application needs to know how many categories there are. Again, getting the result of Find and calling Count against it would be woefully inefficient. Instead we want to call the LightSpeed IUnitOfWork.Count method. But the unit of work is encapsulated in the repository and the application code can’t get at it. So again the repository designer has to anticipate the application requirement and add a CategoryCount property to the repository API. Then next day someone needs a method to get the total number (IUnitOfWork.Calculate(“SUM”)) of sales across a certain line of products… But if the repository returns IQueryable objects, then LINQ allows application code to perform these queries against it and have them translated into IUnitOfWork.Count() and IUnitOfWork.Calculate() methods and hence into efficient SQL, with no need for additional complexity in the repository API:

int count = repository.Categories.Count();  
// generated SQL: SELECT COUNT(*) FROM Categories
 
int totalSales = repository.Products.Sum(p => p.NumberSold);
// generated SQL: SELECT SUM(NumberSold) FROM Products
 
int raygunSales = (from p in repository.Products where p.ProductType == "Raygun"
                                                 select p.NumberSold).Sum();

So you should consider using LINQ rather than Find for repository APIs where it is desirable for applications to be able to compose their own queries.

Conclusion

There are a few scenarios in which you will not be able to use LINQ: notably, if you want to use a SQL function that has no CLR equivalent, or if you need to be able to choose the fields involved in a projection, predicate or order at run time instead of design time. And you may not be able to use LINQ at all because you don’t want to migrate your project to .NET 3.5 for stability or deployment reasons. In these cases, you can of course still use the “native” IUnitOfWork interface and its Find, Project and Count methods. (And you can mix and match the LINQ syntax and native API within the same unit of work.)

In general, however, we recommend using the LINQ syntax for new LightSpeed query code if you are targeting .NET 3.5 or above. You will benefit from easier access to complex expressions and projections, and it will make it easier for developers from non-LightSpeed environments to get up to speed on your project. You can still tune your loading and caching strategies at the entity level, and because LINQ uses the LightSpeed engine, these strategies will still be respected. And because LINQ queries are translated into the equivalent calls to the LightSpeed engine, you are losing nothing in the way of efficiency or quality of generated SQL.

kick it on DotNetKicks.com

Leave a Reply

Data Products Visual Controls Community Store
LightSpeed ORM
NHibernate Designer
SimpleDB Tools
SharePoint Tools
WPF Elements
WPF Diagrams
Silverlight Elements
Forums
Blog
Register
Login
Subscribe to newsletter
Buy Now
My Account
Volume Discounts
Purchase Orders
Contact Us