Typical large message documents:
- Large flat file documents with a high volume of records, occasionally batched
- Large flat file documents wrapped in a single CDATA section node in an XML document
- Large XML documents with thousands to millions of "rows" that were batched together
- EDI interchanges where the file or data is to be processed independently or in aggregate
- Large flat file documents with a header at the start and a trailer at the end of the file and thousands to millions of records in between, where each record needs to be processed separately from the others but the entire sequence must be processed in order to complete properly
Transforming a document with a map is a memory-intensive operation. In BizTalk 2006/2004, BizTalk Server passes the message stream to the .NET XslTransform class, which loads the document into a .NET XPathDocument object for processing; BizTalk 2002/2000 used the DOM instead. Loading the document into an XPathDocument can expand the original file size in memory by a factor of 10 or more.
XPathDocument caches information about the nodes of the XML along with the data itself to allow faster access, but this carries a high performance penalty because of the redundant data held in the objects. This expansion accounts for more than 90% of the Out Of Memory (OOM) exceptions that cause orchestrations and receive/send ports to fail.
The expansion may be more pronounced when mapping flat files, because flat files must be parsed into XML before they can be transformed.
Note:
A 1 MB document, combined with JIT-compiled product and user code assemblies and other messages flowing through the process, may be enough to grow the process to 200-500 MB in memory.
Because BizTalk converts data into XML for internal processing, flat files (non-XML files) need more attention. Flat file formats are designed to be as compact as possible in order to minimize cost, whereas XML explicitly treats compactness as a non-goal and gives readability a much higher priority.
The best recommendation is not to send data larger than 1 MB into BizTalk without some form of custom processing or machines with a large amount of memory.
If possible, transform the XML file before passing it on to a BizTalk Server orchestration.
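One possible form of the custom processing mentioned above is to split a large batched XML document into smaller documents before it reaches BizTalk. The following is a minimal Python sketch of that idea, not a BizTalk component. The function name split_rows, the wrapper element name "Batch" and the sample "Orders"/"Order" element names are hypothetical, and the sketch assumes that every row is a direct child of the root element.

```python
import io
import xml.etree.ElementTree as ET


def split_rows(source, row_tag, rows_per_chunk, wrapper_tag="Batch"):
    """Yield small XML documents, each wrapping at most rows_per_chunk rows.

    Assumption: every row element is a direct child of the root element.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root
    rows = []
    for event, elem in context:
        if event == "end" and elem.tag == row_tag:
            elem.tail = None  # ignore whitespace between rows
            rows.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # detach parsed rows so they can be garbage collected
            if len(rows) == rows_per_chunk:
                yield f"<{wrapper_tag}>{''.join(rows)}</{wrapper_tag}>"
                rows = []
    if rows:  # final, partially filled chunk
        yield f"<{wrapper_tag}>{''.join(rows)}</{wrapper_tag}>"


# Usage: three rows split into chunks of at most two rows each.
sample = b"<Orders><Order id='1'/><Order id='2'/><Order id='3'/></Orders>"
chunks = list(split_rows(io.BytesIO(sample), "Order", rows_per_chunk=2))
```

Because the input is read incrementally and each row is released once it has been serialized, memory use depends on the chunk size rather than on the size of the whole input file.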
Another approach is to use distinguished fields or property promotion in the process. An orchestration does not load the data of the message stream unless it is required, so it can fetch and update the right value without loading the whole message into memory. This is a powerful means of manipulating key fields without loading the whole document into memory.
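The general idea of reading one key field without materializing the whole document can be illustrated outside BizTalk. The Python sketch below is only an analogy and does not show how BizTalk implements distinguished fields or promoted properties. The function name read_key_field and the sample "Invoice"/"InvoiceId" element names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def read_key_field(source, field_tag):
    """Return the text of the first field_tag element, or None if absent."""
    # iterparse reads the stream incrementally instead of loading it at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == field_tag:
            return elem.text  # stop here; later parts of the stream are not read
    return None


# Usage: the key field sits in the header, ahead of many line items.
document = (
    b"<Invoice><Header><InvoiceId>INV-42</InvoiceId></Header><Lines>"
    + b"<Line>x</Line>" * 20000
    + b"</Lines></Invoice>"
)
invoice_id = read_key_field(io.BytesIO(document), "InvoiceId")
```

When the key field appears near the start of the document, as in this sample, only the first part of the stream is consumed before the function returns.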
Adjust the message size threshold above which documents are buffered to the file system during mapping. To modify the size threshold, create a DWORD value named TransformThreshold at the following location in the BizTalk Server registry:
HKLM\Software\Microsoft\BizTalk Server\3.0\Administration\TransformThreshold
Enter the new threshold as a decimal number of bytes. For example, enter 2097152 to increase the message size threshold to 2 MB (from the default of 1 MB). Increase this value on systems with a large amount of available memory to improve throughput. Buffering documents to disk conserves memory at a slight cost to overall throughput.
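The byte values above follow from 1 MB = 1024 * 1024 bytes. As a small aid, the Python sketch below computes the decimal value for a threshold given in megabytes, and also the eight-digit hexadecimal form that a Windows .reg file uses for DWORD values. The helper names are hypothetical, and the sketch only does the arithmetic; it does not touch the registry.

```python
def transform_threshold_bytes(megabytes):
    """Convert a threshold in megabytes to the decimal byte count."""
    return int(megabytes * 1024 * 1024)


def reg_dword(value):
    """Format a value the way a .reg file writes a DWORD (8 hex digits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return f"dword:{value:08x}"


# Usage: the 1 MB default and the 2 MB example from the text.
default_bytes = transform_threshold_bytes(1)  # 1048576
two_mb_bytes = transform_threshold_bytes(2)   # 2097152
two_mb_dword = reg_dword(two_mb_bytes)        # "dword:00200000"
```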