Home > Blog > 2016 > 03

All Posts From : March 2016

Crazy Assed Shit With Contour, or How to Export ALL of Your Records as a Unified CSV

Recently I was tasked with doing some Crazy Assed Shit™ with Umbraco Contour. Note, this applies to old school Contour, the techniques outlined here won't work on Forms, as some of the required information is no longer in the database.

The client wanted to be able to export ALL of the forms on the site, merged into one giant CSV file. With the columns across the top of the spreadsheet. So effectively the output would be something like:

Form Name, Record ID, Date Submitted, Field 1, Field 2, Field 3, etc

Now, the requirement also stated that fields with the same caption should be merged together into one column.

Looking at the APIs, there are two options:

  • Use the library methods. There were three issues with this, one it returns XML, and two the caption names that are stored in the XML are only correct at the time of saving. If the field name has been updated, the caption will be wrong, so you'll end up with extra columns that shouldn't be there, finally, you have to get the results one form at a time, so there would be some horrific logic around grouping the fields.
  • Use the database driven API methods. This solves the stale caption names issue, but again you have to grab the results one form at a time, and the API is REALLY database intensive, which makes it pretty slow for deaing with large numbers of records at once.

So after scratching my head for a bit to think of a better way of doing it, preferebly in a single database call, I decided that you could do the whole thing, in one go (ish), using a pivot query. This is a special kind of query that allows you to turn rows into columns, they're commonly used in Excel and SQL Server reporting.

The only issue that would cause any problems is that you can't directly do a pivot query without knowing the column names. They HAVE to be hard coded into the query. The only way round that is to use dynamic SQL.

First things first, we're going to create a view that contains all of the raw data that we need to perform our magic. This will make the pivot query MUCH easier to read, and it also provides around a 20% perfomance boost to the final query, as we need to use the data twice (once to get all the captions, and once to actually do the pivot query). Create a new view, and use the following query:

SELECT dbo.UFRecordFields.Record, dbo.UFFields.Caption, dbo.UFRecords.Created AS [Date Submitted], dbo.UFRecords.IP, dbo.UFForms.Name AS [Contour Form Name], dbo.UFForms.Id, 
    CASE WHEN UFFields.PreValueProvider != '00000000-0000-0000-0000-000000000000' THEN STUFF
    ((SELECT     ',' + a.Value
        FROM UFPrevalues a LEFT JOIN
        UFRecordDataString b ON CAST(a.Id AS Nvarchar(255)) = b.Value
        WHERE a.Field = UFFields.Id AND b.[Key] = UFRecordFields.[Key] FOR XML PATH('')), 1, 1, '') ELSE COALESCE (CAST(UFRecordDataBit.[Value] AS Nvarchar(255)), 
        CAST(UFRecordDataDateTime.[Value] AS Nvarchar(255)), CAST(UFRecordDataInteger.[Value] AS NVarchar(255)), CAST(UFRecordDataLongString.[Value] AS nvarchar(MAX)), 
        UFRecordDataString.[Value]) END AS FieldValues
FROM dbo.UFRecordFields LEFT OUTER JOIN
    dbo.UFRecordDataBit ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataBit.[Key] LEFT OUTER JOIN
    dbo.UFRecordDataDateTime ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataDateTime.[Key] LEFT OUTER JOIN
    dbo.UFRecordDataInteger ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataInteger.[Key] LEFT OUTER JOIN
    dbo.UFRecordDataLongString ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataLongString.[Key] LEFT OUTER JOIN
    dbo.UFRecordDataString ON dbo.UFRecordFields.[Key] = dbo.UFRecordDataString.[Key] LEFT OUTER JOIN
    dbo.UFFields ON dbo.UFRecordFields.Field = dbo.UFFields.Id LEFT OUTER JOIN
    dbo.UFRecords ON dbo.UFRecords.Id = dbo.UFRecordFields.Record LEFT OUTER JOIN
    dbo.UFForms ON dbo.UFRecords.Form = dbo.UFForms.Id

You might get an error about the query, but it should still save (the error is because one of the functions can't be represented in the diagram pane if you have it open). Save the view as "vwContourFieldValues". The resulting view contains ALL of the submitted field values for every submitted record in the database. It also concatenates prevalue fields that allow multiple selections into a single comma separated string.

Now that the view is in place, we can write the SQL to pull out the pivot table. Basically, we will have to do two things. Firstly, pull out all of the column names to use in the query, and secondly build some dynamic SQL to use those column names in the pivot query.

Here's the code:

DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
 
--Get distinct values of the PIVOT Column (Contour Field Captions) 
SELECT @ColumnName= ISNULL(@ColumnName + ',','') 
    + QUOTENAME(Caption)
    FROM (
        SELECT DISTINCT Caption
        FROM vwContourFieldValues
    ) AS Courses ORDER BY [Caption]

    SET @DynamicPivotQuery = 
        N'SELECT [Contour Form Name], Record, [Date Submitted], IP, 
        ' + @ColumnName + '
            FROM vwContourFieldValues

            PIVOT(MAX(FieldValues) 
            FOR Caption IN (' + @ColumnName + ')) AS PVTTable
            ORDER BY [Contour Form Name] ASC, [Date Submitted] DESC'

            --Execute the Dynamic Pivot Query
            EXEC sp_executesql @DynamicPivotQuery

Once you have the query written, you can run it, and it will list all of the records in the database, with all of the columns, and all of the records with the same caption will be kept in a single column. You can then output the results to CSV, so the client can open it in Excel and do any crazy data mining or reporting that they desire.

This is a pretty simple example, but you could easily modify the query to filter the forms by date range, or allow them to choose the forms that they want to bulk export.

I've tested this query on a site with around 70 forms, and over 150,000 records, and it took just over two minutes to run. Given the resulting CSV file has something like 200 dynamic columns, that's not too bad. If you run it on a site where the forms share more fields, it's MUCH faster. You could probably make this even faster by using temporary tables, indexed views, or other performance optiomisations.

As I mentioned at the beginning of this article, the code here won't work on Forms, as the Forms themselves are no longer stored in the database. You'd have to do some crazy voodoo involving getting all the forms, and creating a temporary lookup table with all the GUID/Caption pairs for every form on the site. I'm not 100% sure on how you'd get the prevalues back either, as they also don't appear to be in the database any more.

I may try and figure it out though and post an example of how to do this in Forms if I work it out.