How to transfer large DataSet - GenuineChannels/GenuineChannels GitHub Wiki

DataSet is not a file and should be sent specifically.

By Dmitry Belikov

I would divide all DataSets being transferred into two broad categories: DataSets that contain changes and DataSets that contain plain data.

If you need to send a DataSet with changes, then obtain the DataSet via the DataSet.GetChanges method and send it directly. Such DataSets are usually small and it will not cause any problems. Or, if it is huge, serialize it into a stream and read my previous article.

Here I will try to show the efficient way of sending really huge DataSets with plain data. Suppose you need to send about 3000 rows. You fill it up and call the DataSet.AcceptChanges method. Also add here the mandatory requirement to service several clients concurrently without decreasing the performance and you will understand my feelings when several of my customers started experiencing this problem. :-)

First of all, try to perform a small test:

using System;
using System.Data;
using System.Runtime.Remoting.Messaging;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.IO;

// create DataSet structure
DataSet dataSet = new DataSet();
DataTable dataTable = new DataTable();
dataSet.Tables.Add(dataTable);

dataTable.Columns.Add("id", typeof(Int32));
dataTable.Columns.Add("smallString", typeof(String));
dataTable.Columns.Add("largeString", typeof(String));

// fill it up with data
const int TOTAL_ROWS = 500;
for ( int i = 0; i < TOTAL_ROWS; i++)
{
    DataRow dataRow = dataTable.NewRow();
    dataRow["id"] = i + 1;
    dataRow["smallString"] = "Small string number " + (i + 1);  

    const int LARGE_STRING_LENGTH = 120;
    StringBuilder builder = new StringBuilder(LARGE_STRING_LENGTH);
    for ( int o = 0; o < LARGE_STRING_LENGTH; o++)
        builder.Append((char) (short) ((o % 26) + 'A'));
    dataRow["largeString"] = builder.ToString();  

    dataTable.Rows.Add(dataRow);
}
dataSet.AcceptChanges();

// test direct serialization
const int NUMBER_OF_TRIES = 30;
DateTime started = DateTime.Now;
BinaryFormatter binaryFormatter = new BinaryFormatter(new
     RemotingSurrogateSelector(), new StreamingContext
     (StreamingContextStates.Other));

for ( int i = 0; i < NUMBER_OF_TRIES; i++)
{
    // serialize
    MemoryStream memoryStream = new MemoryStream();
    binaryFormatter.Serialize(memoryStream, dataSet);      

    // deserialize
    memoryStream.Position = 0;
    DataSet deserializedDataSet = 
         (DataSet) binaryFormatter.Deserialize(memoryStream);
}
DateTime finished = DateTime.Now;

Console.WriteLine("Direct serialization took {0} milliseconds.", (finished - started).TotalMilliseconds);

// test xml serialization
started = DateTime.Now;
binaryFormatter = new BinaryFormatter(new RemotingSurrogateSelector(),
    new StreamingContext(StreamingContextStates.Other));
for ( int i = 0; i < NUMBER_OF_TRIES; i++)
{
    // xml serialization
    string xml = dataSet.GetXml();
    MemoryStream memoryStream = new MemoryStream();
    binaryFormatter.Serialize(memoryStream, dataSet.Clone());
    binaryFormatter.Serialize(memoryStream, xml);      

    // deserialize
    memoryStream.Position = 0;
    DataSet deserializedDataSet = 
        (DataSet) binaryFormatter.Deserialize(memoryStream);
    string deserializedXml = 
        (string) binaryFormatter.Deserialize(memoryStream);
    StringReader reader = new StringReader(deserializedXml);
    deserializedDataSet.ReadXml(reader, XmlReadMode.IgnoreSchema);
}
finished = DateTime.Now;

Console.WriteLine("XML serialization took {0} milliseconds.", (finished - started).TotalMilliseconds);

Console.WriteLine("Press ENTER to exit.");
Console.ReadLine();

In the first case I showed how .NET Remoting would behave if you sent the DataSet as a parameter (or return it as a result).

In the second case I sent the DataSet structure and its XML content separately.

Of course, the second approach is faster. Usually 8 seconds vs 5.7 seconds on my machine.

It doesn't matter though because the difference is not so big. Another issue is important. It is rather a bad idea to send the entire XML content at one call. Instead you should send a little at a time as I showed in my previous article.

You can download the sample with source code here. Unpack it and open the Known\Known.sln solution in Visual Studio.

The interface allowing clients to receive XML content looks like this:

public interface IPartStringProvider
{
    /// <summary>
    /// Returns the next portion or null if the entire string was
    /// downloaded.
    /// </summary>
    /// <returns>Next substring or null.</returns>
    string GetNextPortion();
}

The server implementation takes a string and then sends portions from it:

public class SmartStringProvider : MarshalByRefObject, IPartStringProvider
{
    public SmartStringProvider(string strToGive)
    {
        this.str = strToGive;
    }   

    private string str;
    private int currentPosition = 0;

    /// <summary>
    /// It's about 64 Kb because of UNICODE.
    /// </summary>
    public const int PORTION_SIZE = 32768;  

    /// <summary>
    /// Returns the next portion or null if the entire string was
    /// downloaded.
    /// </summary>
    /// <returns>Next substring or null.</returns>
    public string GetNextPortion()
    {
      lock(this)
      {
          if (currentPosition >= str.Length)
              return null;    

          int sizeToSend = Math.Min(str.Length - currentPosition,
              PORTION_SIZE);
          string result = this.str.Substring(currentPosition,
              sizeToSend);
          currentPosition += sizeToSend;    
          return result;
      }
    }
}

The server returns the DataSet containing the structure and the XML content provider.

public static IPartStringProvider GetMyDataHandler(out DataSet dataSetWithSchema)
{
   lock(_dataSet)
   {
      dataSetWithSchema = _dataSet.Clone();
      SmartStringProvider smartStringProvider = 
         new SmartStringProvider(_dataSet.GetXml());
      return smartStringProvider;
   }
}

And this is how the client downloads it:

DataProvider dataProvider = new DataProvider();
DataSet dataSet = null;
IPartStringProvider stringProvider = 
    dataProvider.GetMyData(out dataSet);

StringBuilder builder = new StringBuilder();

// download the entire string
string nextPortion = null;
while ( (nextPortion = stringProvider.GetNextPortion()) != null )
     builder.Append(nextPortion);

StringReader stringReader = 
         new StringReader(builder.ToString());

dataSet.ReadXml(stringReader, XmlReadMode.IgnoreSchema);

Console.WriteLine("Download finished. Total rows: {0}.\r\nPress ENTER to exit.",
     dataSet.Tables[0].Rows.Count);
Console.ReadLine();
⚠️ **GitHub.com Fallback** ⚠️