Recently I was tasked with doing some Crazy Assed Shit™ with
Umbraco Contour. Note, this applies to old school Contour, the
techniques outlined here won't work on Forms, as some of the
required information is no longer in the database.
The client wanted to be able to export ALL of the forms on the
site, merged into one giant CSV file. With the columns across the
top of the spreadsheet. So effectively the output would be
something like:
Form Name, Record ID, Date Submitted, Field 1, Field 2, Field 3,
etc
Now, the requirement also stated that fields with the same
caption should be merged together into one column.
Looking at the APIs, there are two options:
- Use the library methods. There were three issues with this, one
it returns XML, and two the caption names that are stored in the
XML are only correct at the time of saving. If the field name has
been updated, the caption will be wrong, so you'll end up with
extra columns that shouldn't be there, finally, you have to get the
results one form at a time, so there would be some horrific logic
around grouping the fields.
- Use the database driven API methods. This solves the stale
caption names issue, but again you have to grab the results one
form at a time, and the API is REALLY database intensive, which
makes it pretty slow for deaing with large numbers of records at
once.
So after scratching my head for a bit to think of a better way
of doing it, preferebly in a single database call, I decided that
you could do the whole thing, in one go (ish), using a
pivot query. This is a special kind of query that allows you to
turn rows into columns, they're commonly used in Excel and SQL
Server reporting.
The only issue that would cause any problems is that you can't
directly do a pivot query without knowing the column names. They
HAVE to be hard coded into the query. The only way round that is to
use dynamic SQL.
First things first, we're going to create a view that contains
all of the raw data that we need to perform our magic. This will
make the pivot query MUCH easier to read, and it also provides
around a 20% perfomance boost to the final query, as we need to use
the data twice (once to get all the captions, and once to actually
do the pivot query). Create a new view, and use the following
query:
SELECT dbo.UFRecordFields.Record, dbo.UFFields.Caption, dbo.UFRecords.Created AS [Date Submitted], dbo.UFRecords.IP, dbo.UFForms.Name AS [Contour Form Name], dbo.UFForms.Id,
CASE WHEN UFFields.PreValueProvider != '00000000-0000-0000-0000-000000000000' THEN STUFF
((SELECT ',' + a.Value
FROM UFPrevalues a LEFT JOIN
UFRecordDataString b ON CAST(a.Id AS Nvarchar(255)) = b.Value
WHERE a.Field = UFFields.Id AND b.[Key] = UFRecordFields.[Key] FOR XML PATH('')), 1, 1, '') ELSE COALESCE (CAST(UFRecordDataBit.[Value] AS Nvarchar(255)),
CAST(UFRecordDataDateTime.[Value] AS Nvarchar(255)), CAST(UFRecordDataInteger.[Value] AS NVarchar(255)), CAST(UFRecordDataLongString.[Value] AS nvarchar(MAX)),
UFRecordDataString.[Value]) END AS FieldValues
FROM dbo.UFRecordFields LEFT OUTER JOIN
dbo.UFRecordDataBit ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataBit.[Key] LEFT OUTER JOIN
dbo.UFRecordDataDateTime ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataDateTime.[Key] LEFT OUTER JOIN
dbo.UFRecordDataInteger ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataInteger.[Key] LEFT OUTER JOIN
dbo.UFRecordDataLongString ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataLongString.[Key] LEFT OUTER JOIN
dbo.UFRecordDataString ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataString.[Key] LEFT OUTER JOIN
dbo.UFFields ON dbo.UFRecordFields.Field = dbo.UFFields.Id LEFT OUTER JOIN
dbo.UFRecords ON dbo.UFRecords.Id = dbo.UFRecordFields.Record LEFT OUTER JOIN
dbo.UFForms ON dbo.UFRecords.Form = dbo.UFForms.Id
You might get an error about the query, but it should still save
(the error is because one of the functions can't be represented in
the diagram pane if you have it open). Save the view as
"vwContourFieldValues". The resulting view contains ALL of the
submitted field values for every submitted record in the database.
It also concatenates prevalue fields that allow multiple selections
into a single comma separated string.
Now that the view is in place, we can write the SQL to pull out
the pivot table. Basically, we will have to do two things. Firstly,
pull out all of the column names to use in the query, and secondly
build some dynamic SQL to use those column names in the pivot
query.
Here's the code:
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column (Contour Field Captions)
SELECT @ColumnName= ISNULL(@ColumnName + ',','')
+ QUOTENAME(Caption)
FROM (
SELECT DISTINCT Caption
FROM vwContourFieldValues
) AS Courses ORDER BY [Caption]
SET @DynamicPivotQuery =
N'SELECT [Contour Form Name], Record, [Date Submitted], IP,
' + @ColumnName + '
FROM vwContourFieldValues
PIVOT(MAX(FieldValues)
FOR Caption IN (' + @ColumnName + ')) AS PVTTable
ORDER BY [Contour Form Name] ASC, [Date Submitted] DESC'
--Execute the Dynamic Pivot Query
EXEC sp_executesql @DynamicPivotQuery
Once you have the query written, you can run it, and it will
list all of the records in the database, with all of the columns,
and all of the records with the same caption will be kept in a
single column. You can then output the results to CSV, so the
client can open it in Excel and do any crazy data mining or
reporting that they desire.
This is a pretty simple example, but you could easily modify the
query to filter the forms by date range, or allow them to choose
the forms that they want to bulk export.
I've tested this query on a site with around 70 forms, and over
150,000 records, and it took just over two minutes to run. Given
the resulting CSV file has something like 200 dynamic columns,
that's not too bad. If you run it on a site where the forms share
more fields, it's MUCH faster. You could probably make this even
faster by using temporary tables, indexed views, or other
performance optiomisations.
As I mentioned at the beginning of this article, the code here
won't work on Forms, as the Forms themselves are no longer stored
in the database. You'd have to do some crazy voodoo involving
getting all the forms, and creating a temporary lookup table with
all the GUID/Caption pairs for every form on the site. I'm not 100%
sure on how you'd get the prevalues back either, as they also don't
appear to be in the database any more.
I may try and figure it out though and post an example of how to
do this in Forms if I work it out.