queries – James Pearson

We’ve had query objects for a while now – since NAV 2013 (I think). In theory they sound great, link a bunch of tables together, aggregate some columns, get distinct values, left join, right join, cross join – executed as a single SQL query.

Why Don’t We Use Queries Much?

In practice I haven’t seen them used that much. There’s probably a variety of reasons for that:

We have limited control over the design and execution of the query at runtime. The design is pretty static making it difficult to create useful generic queries short of throwing all the fields from the relevant tables in – which feels heavy-handed
I find how the links between dataitems unintuitive
It isn’t easy to visualise the dataset that you are creating when writing the AL file
Habit – we all learnt FindFirst/Set, Repeat and Until when we first started playing with CAL development and are more comfortable working with loops than datasets. Let’s be honest, the RDLC report writing experience hasn’t done much to convert us to a set-based approach to our data

However, just because there is room for improvement doesn’t mean that we can’t find good uses for queries now. Queries are perfect for:

Using queries to select DISTINCT values in the dataset
Aggregates – min, max, sum, average, count – especially for scenarios that aren’t suited to flowfields
Date functions – convert date columns to day/month/year – which allows you to easily aggregate another column by different periods
Outer joins – it’s possible, but far more expensive, to create this kind of dataset by hand with temporary tables
Selecting the top X rows
Exposing as OData web services

It’s the last point in that list that I particularly want to talk about. We’ve been working on a solution lately where Business Central consumes its own OData web services.

What?! What kind of witchcraft is this? Why would you consume a query via a web service when you can call it directly with a query object? Hear me out…

Consuming Queries

I think you’ve got two main options for consuming queries via AL code.

Query Variable

You can define a variable of type query and specify the query that you want to run. This gives you some control over the query before you execute it – you can set filters on the query columns and set the top number of rows. Call Query.Open and Query.Read to execute the query and step through the result set.

The main downside is that you have to specify the query that you want to use at design-time. That might be fine for some specific requirement but is a problem if you are trying to create something generic.

Query Keyword

Alternatively we can use the Query keyword and execute a query by its ID. Choose whether you want the results in CSV (presumably this is popular among the same crowd that are responsible for an increase in cassette tape sales) or XML and save them either to disk or to a stream.

The benefit is that you can decide on the query that you want to call at runtime. Lovely. Unfortunately you have to sacrifice even the limited control that a query variable gave you in order to do so.

OData Queries/Pages

Accessing the query via OData moves us towards having the best of both worlds. Obviously there is significant extra overhead in this approach:

Adding the query to the web service table and publishing
Acquiring the correct URL to a service tier that is serving OData requests for your query
Creating the HTTP request with appropriate authentication
Parsing the JSON response to get at the data that you want

This is significantly more work than the other approaches – let’s not pretend otherwise. However, it does give you all the power of OData query parameters to control the query. While I’ve been talking about queries up til now almost all of this applies to pages exposed as OData services as well.

$filter: specify column name, operator and filter value that you want to apply to the query, join multiple filters together with and/or operators
$select: a comma-separated list of columns to return i.e. only return the columns that you are actually interested in
$orderBy: specify a column to order the results by – use in combination with $top to get the min/max value of a column in the dataset
$top: the number of rows to return
$skip: skip this many rows in the dataset before returning – useful if the dataset is too large for the service tier to return in a single call
$count: just return the count of the rows in the dataset – if you only want the count there is no need to parse the JSON response and count them yourself

This is a bit off-topic to what I’ve been blogging about lately but I’ve been caught out by this before and the other day so was a colleague so I thought it was worth a post.

TL;DR

Be careful of the difference between DataItemLink and DataItemTableFilter properties. DataItemLinks define the join between the dataitems in the query while DataItemTableFilters are applied to the results after the join has been processed.

Intro

In theory the query object in Business Central/NAV ought to be very useful. Instead of using nested REPEAT…UNTIL loops like we used to with the associated many round-trips to the database (or at least the cache) we should be able to create a query to join multiple tables and return all the columns we need in a single round-trip.

In practice, I’ve often found queries frustrating to work with. Sometimes because they can’t support a more complex scenario, sometimes because the parameters don’t do quite what I’d expect. Maybe my expectations are wrong. Fine, but even so, trying to “debug” a query and figure out why the query you have designed gives the results that you are getting is not fun. Not quite as bad as developing reports – but still not fun.

Scenario

Let’s imagine that for some reason we need a list of items with the total base quantity from sales invoice lines – including where that total is zero. Typically you might write some code like this:

SalesLine.SetRange("Document Type",SalesLine."Document Type"::Invoice);
SalesLine.SetRange(Type,SalesLine.Type::Item);

if Item.FindSet() then
  repeat
    SalesLine.SetRange("No.",Item."No.");
    SalesLine.CalcSums("Quantity (Base)");

    //use that result for something...

  until Item.Next() = 0;

You figure that doing a CalcSums() for each item probably isn’t going to perform too well. Surely, this is exactly the sort of thing that we have queries for?

Version One

Knowing that we need all items records, including ones that don’t have corresponding sales line records we are going to need a left join i.e. all records from table A and matching records from table B.

For starters I’m going to create a query that just shows the data we’ve got – no grouping or summing just yet.

query 50100 "Frustrating Query"
{
    QueryType = Normal;
    elements
    {
        dataitem(Item; Item)
        {
            column(No; "No.") {}
            column(Description; Description) {}

            dataitem(Sales_Line; "Sales Line")
            {
                SqlJoinType = LeftOuterJoin;
                DataItemLink = "No." = Item."No.";
                
                column(Document_Type;"Document Type") {}
                column(Document_No;"Document No.") {}
                column(Quantity;"Quantity (Base)") {}
            }
        }
    }
}

The first few results from that query look like this.

No.	Description	Document Type	Document No.	Quantity
1896-S	ATHENS Desk	Invoice	102201	1
1900-S	PARIS Guest Chair, black	Quote		0
1906-S	ATHENS Mobile Pedestal	Quote		0
1908-S	LONDON Swivel Chair, blue	Quote		0
1920-S	ANTWERP Conference Table	Order	101003	8
1920-S	ANTWERP Conference Table	Invoice	102202	4
1920-S	ANTWERP Conference Table	Invoice	102203	10
1920-S	ANTWERP Conference Table	Invoice	102205	4

Version Two

Cool. Now we need to Sum the Quantity column. I’ll remove the Document No. as we don’t want to group by that. Change the query design to this:

query 50100 "Frustrating Query"
{
    QueryType = Normal;
    elements
    {
        dataitem(Item; Item)
        {
            column(No; "No.") {}
            column(Description; Description) {}

            dataitem(Sales_Line; "Sales Line")
            {
                SqlJoinType = LeftOuterJoin;
                DataItemLink = "No." = Item."No.";
                
                column(Document_Type;"Document Type") {}
                column(Quantity;"Quantity (Base)")
                {
                    Method = Sum;
                }
            }
        }
    }
}

Now the results are:

No.	Description	Document Type	Quantity
1896-S	ATHENS Desk	Invoice	1
1900-S	PARIS Guest Chair, black	Quote	0
1906-S	ATHENS Mobile Pedestal	Quote	0
1908-S	LONDON Swivel Chair, blue	Quote	0
1920-S	ANTWERP Conference Table	Order	8
1920-S	ANTWERP Conference Table	Invoice	18

Version Three

Remember that we only wanted the sum of the base quantity for invoice lines. We’ve got a result for 1920-S order lines at the moment. That’s fine we can use the DataItemTableFilter to filter the Document Type.

At least, you’d think so. So would I…and we’d both be wrong. Adding DataItemTableFilter = “Document Type” = const(Invoice) to the query gives these results:

No.	Description	Document Type	Quantity
1896-S	ATHENS Desk	Invoice	1
1920-S	ANTWERP Conference Table	Invoice	18

Erm…where are the rest of my rows?!

Q: what has happened to items 1900-S, 1906-S and 1908-S?
A: there are no matching sales lines for those items

Q: but…that’s why we used a LeftOuterJoin. That should include items with no matching sales lines. I thought that was the point of specifying the join type?
A: yes, except DataItemTableFilter isn’t used as part of the join

Q: …eh?

Explanation

I expected, and maybe you did too, that DataItemTableFilter would be used to filter the Sales Line table before joining it to the Item table. It turns out that the join is processed first, respecting the DataItemLink properties, and the DataItemFilter property is used to filter the joined results afterwards.

In SQL terms the filters go into the HAVING clause and not the ON clause. We might have expected something like this:

SELECT Item.No_,
Item.Description,
SalesLine.[Document Type],
SUM(SalesLine.[Quantity (Base)]) AS Quantity
FROM [CRONUS International Ltd_$Item] AS Item
LEFT JOIN [CRONUS International Ltd_$Sales Line] AS SalesLine
ON SalesLine.No_ = Item.No_
AND SalesLine.[Document Type] = 2
GROUP BY Item.No_, Item.Description, SalesLine.[Document Type]

with SalesLine.[Document Type] = 2 forming part of the ON clause (the definition of the join between the tables). What you actually get is something like this:

SELECT Item.No_,
Item.Description,
SalesLine.[Document Type],
SUM(SalesLine.[Quantity (Base)]) AS Quantity
FROM [CRONUS International Ltd_$Item] AS Item
LEFT JOIN [CRONUS International Ltd_$Sales Line] AS SalesLine
ON SalesLine.No_ = Item.No_
GROUP BY Item.No_, Item.Description, SalesLine.[Document Type]
HAVING SalesLine.[Document Type] = 2

with a HAVING clause at the end which restricts the results after the tables have been joined. (The actual SQL queries you’ll see if you run SQL Server Profiler will be different – stuffed full of parameters and ISNULLs – but this is the general idea).

Conclusion

That was a long way of saying be careful how you use the DataItemTableFilter property – it might not do what you’re expecting. So how can you define an ON clause where the filter is a constant value not a field in another table? I don’t know.

As far as I can see as DataItemLink only allows you to define joins between field tables you’d need to engineer the data so that all of your joins are between fields and not constant values. I’d like to be wrong, but if I’m not this is a pretty big flaw is queries.

It’d be nice to be able add constant values into table joins for this kind of thing. While we’re wishing, it would be even better to be able to dynamically define queries at run-time, build and execute them on the fly. It seems I’m not the only one with a query wishlist: https://experience.dynamics.com/ideas/search-ideas/?q=queries&forum=e288ef32-82ed-e611-8101-5065f38b21f1

Tag: queries

Putting Queries to Use in Business Central