How to: Move/ copy InfoPath attachments in SharePoint Online
Table of contents:
Although it has been repeatedly said that the history of InfoPath is over, for many companies still building workflows’ forms using InfoPath is as obvious as using Excel. However, when doing that in Office 365 and SharePoint Online the product’s boundaries are really visible and are becoming a real pain.
I’ve been struggling some time ago with an issue related to the workflow’s instance size (here), what was directly related to the size of the InfoPath form together with attached files. The obvious solution I was thinking then was moving those attachments away from the form, into a dedicated SharePoint library, but I wasn’t able to do that easily mainly because of the algorithm that is used in InfoPath to handle file attachments.
How InfoPath handles attachments?
File attachments in InfoPath form are being uploaded directly to the form’s XML structure and encoded into a Base64 string:
But what turns out, InfoPath does not only encode file contents, but also adds additional information: header that stores such information as size, version and filename length, then the filename itself.
Although the header size is fixed, it always has 24 bytes, the amount of bytes for the filename varies on its length. That additional number of bytes is always stored in the byte no. 20 as a DWORD information and the filename itself is stored using Unicode, so the length must be multiplied by 2.
Therefore, to get the actual file contents, the Base64 string must be decoded, then the first 24 bytes + the filename length*2 have to be cut off.
Read more about programmatically encoding and decoding InfoPath attachments here: https://support.microsoft.com/en-us/help/892730/how-to-encode-and-decode-a-file-attachment-programmatically-by-using-v
Looking for a solution
During looking for the best solution I first started with Nintex, however Nintex does not have any actions allowing it to convert the Base64 strings back into the binary data, which is needed for the file to be created. Then I switched to Microsoft Flow. I was really impressed about the amount of functions allowing me to convert one data type into another. I also found a function converting Base64 to a binary.
However, after the first euphory I got stuck looking for an approach to split actual file contents from the header. Having in mind, that the header size is not fixed, I wasn’t able to simply calculate how many first characters from Base64 I should cut off (using a simple equation, that each character corresponds to 6 bits, and 8 bits is a byte and so on…). Flow does not have as well any function that would allow to work on the binary data in a form of an array.
And then it dawned on me! I opened my Azure portal and navigated to Function Apps. This service really allows to extend capabilities of Flows and Logic Apps (and Nintex workflows as well) into the areas not available before.
What is Azure Function?
Basically this is a service allowing execution of a small code, function, method in a cloud. It is charged either per consumption (so for each run of the function, there may be delays, price list) or as an app service (hosted on a VM that you own in Azure).
What is also super cool about Azure Functions is the variety of languages the function can be written in. You can choose between JavaScript, PHP, Python, PowerShell, Bash and of course C# and F# (source). The code can be written using the built-in editor, or using Visual Studio (source).
Functions can also be invoked in many ways (source). I chose HTTP trigger, so that I was able to use it via the “HTTP Request” action.
The solution
I wrote a simple function, that does the following:
- Receives the Base64 string using the POST request;
- Decodes it into the byte array, extracts header and filename;
- Encodes file content’s back into Base64 string;
- Returns filename together with the Base64 string back to the requester.
The code itself is pretty easy, based on what can be found in Technet:
using System.Net; using System.IO; using System.Text; public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log) { log.Info("C# HTTP trigger function processed a request."); // Get request body dynamic requestBody = await req.Content.ReadAsAsync<object>(); string base64String = requestBody?.data; byte[] attachmentNodeBytes = Convert.FromBase64String(base64String); // Byte 20 contains a DWORD with the length of the // filename buffer. The filename is stored using Unicode // so the length is must be multiplied by 2. int fnLength = attachmentNodeBytes[20] * 2; byte[] fnBytes = new byte[fnLength]; // The actual filename starts at byte 24 for (int i = 0; i < fnLength; ++i) { fnBytes[i] = attachmentNodeBytes[24 + i]; } // Convert the filename bytes to a string. The string // terminates with \0 so the actual filename is the // original filename minus the last character ! char[] charFileName = UnicodeEncoding.Unicode.GetChars(fnBytes); string fileName = new string(charFileName); fileName = fileName.Substring(0, fileName.Length - 1); // The file is located after the header, which is 24 bytes long // plus the length of the filename. byte[] fileContents = new byte[attachmentNodeBytes.Length - (24 + fnLength)]; for (int i = 0; i < fileContents.Length; ++i) { fileContents[i] = attachmentNodeBytes[24 + fnLength + i]; } string fileContentsInBase64 = Convert.ToBase64String(fileContents); return base64String == null ? req.CreateResponse(HttpStatusCode.BadRequest, "Please pass a name on the query string or in the request body") : req.CreateResponse(HttpStatusCode.OK, fileName + "###" + fileContentsInBase64); }
The Flow is also not complicated. It does the following:
- Receives a HTTP Request containing information: InfoPath filename, name of the attachment field in InfoPath, target library name;
- Reads form’s file contents, using the xpath function gets Base64 strings and stores them in array;
- Then for each item from an array, it calls my Azure Function to split file contents from its name;
- Then creates a file in the given library, using obtained from Azure Function filename and file contents;
- Then replaces the Bae64 string in the InfoPath form with an empty string and updating the form so that the attachment is removed.
The expression using xpath function to get list of attachments looks as below:
xpath(xml(body('Get_file_content_using_path')), concat('//*[contains(local-name(), "', triggerBody()['text_1'], '")]/text()'))
Where the “text_1” is one of the flow’s input parameters, the name of the attachments control.
Results (yeah, I know Azure Function’s run history wasn’t refreshed when I opened it, but it did execute for real):
Possible development
The possible next steps to develop the solution may be:
- Insertion of a hyperlink to the file, uploaded into library, back to the form, to a dedicated field in the form, next to the attachment control;
- Creation of subfolders, ex. for each form flow can create a separate subfolder using its name, to store its files.
How do you like the solution? Please share your comments!
Charles
This article was a huge help to me. I always find the Microsoft documentation lacking in clear explanation. Your article was very clear a nice to follow. This helped me complete an important project. Thank you
-Charles
Tomasz Poszytek
Thank you for your kind appreciation 🙂 It’s always a wonderful feeling knowing that my work helps others.
Carolee
I would be very grateful for your help with the specifics of this Flow… There is a Gif in the body of the blog that will not display for me… Also, I am very experienced with SharePoint and thought I knew Flow pretty well, but I am not a hardcore developer… I could use some details…. Could you help me please?
Tomasz Poszytek
Sure I will. Let’s catch up via linkedin.
Dorinda Reyes
I could really use help with this as well I have an urgent need to get the documents out of several thousand InfoPath forms.
Thank you
Tomasz Poszytek
Please use contact form to reach me.
PowerAppFlow
Hi Tomasz! Thank you for a great post! This was very helpful!
Just wondering how you solved multiple attachments within same form (xml file)?
Tomasz Poszytek
Possibly by iterating one by one and merging the outcome 🙂
Stuart Shutts
i was able to do this easily in Nintex, Copy to Sharepoint from Infopath attachment, in Nintex you do a create/edit XML, go to your form, find the attachment field, then copy that as the source for the ‘Copy to Sharepoint’ object.
Dan KNnott
Thanks so much for this. I’ve been struggling for the flow part of your solution for a few days now. Is there any change of sharing or screenshots of the specifics of the flow actions.
Tomasz Poszytek
Yes, there are some changes in the UI, but that shouldn’t affect the overall approach. What’s the issue?
manjula
The Power automate workflow steps are not clear can you please post them step by step also Using the Azure function is free or its chargeable please let us know
Tomasz Poszytek
Unfortunately I don’t have access to it anymore. You need to try re-build it on your own. Azure Function requires Azure Subscription. Here you can find more about Azure Functions pricing: https://docs.microsoft.com/en-us/azure/azure-functions/pricing
Brendan Horner
For those who ask about multiple attachment fields and potentially picture attachment controls – you can loop through all fields in the xml form to find any that are greater than, say, 500 characters and don’t have a single space in them to process them. For image attachments, picture attachment controls do not add the file header – their entire base64 string should be processed as the file content and you generate a filename. The first 4 bytes of a proper infopath attachment control will be 199,73,70,65 every time. So you can check for those and process the way you explained…if they don’t exist, convert the file directly.
Tomasz Poszytek
Thanks!
Miguel Isidro
Hello Tomasz,
Is it possible to have greater detail of the Flow in this solution?
Or alternatively, make available the export of the Flow for us to download?
The Azure funtion part is clear for me, but not clear is the way the flow gets the file attachment and calls the Azure Functiln to get the byte array of the attachment.
Thamls
Tomasz Poszytek
Hi, unfortunately I have built it years ago and the flow is gone. I can try to build it again, if you’re interested write me please a message.
Miguel Isidro
Hello Tomasz,
If you can rebuild the flow and update the post with the flow details step by step it would be a major help!
Thanks,
Miguel
Tomasz Poszytek
Please reach out to me via the contact form 🙂
Miguel Isidro
Hello Tomasz,
If you can rebuild the flow and update the post with the flow details step by step it would be a great help!
Thanks,
Miguel
Rockie
Gretings Tomasz!
Can You tell something about fees – how much does it costs to store Azure function in the cloud?
Want to use a free webhook instead of paid Premium Connector..
Best regards, Gennady
Tomasz Poszytek
That all depends on the consumption. Take a look at Azure Function costs calculator: https://azure.com/e/d792fa5baf4f4b9c9061c15da53cb759.
Rockie
Thank You much Tomasz!
>> Azure Function costs calculator
Wow, interesting!
They say “The first 400,000 GB/s of execution and 1,000,000 executions are free.”.. Looks like for my case (not more than 300 launches / per month) it costs 0$..
Best regards, Gennady