Buffer.BlockCopy and very long arrays (more than Int32.MaxValue el


Buffer.BlockCopy and very long arrays (more than Int32.MaxValue el Luca Martinetti - Phatsoft
5/31/2007 5:58:02 AM
Hi,

Has anybody ever needed to copy a very long array (more than Int32.MaxValue
elements) on x64?

I found this solution:

private static void LongBufferBlockCopy<T>(T[] src, long srcOffset, T[] dst,
    long dstOffset, long count)
{
    if (srcOffset < Int32.MaxValue && dstOffset < Int32.MaxValue &&
        count < Int32.MaxValue)
    {
        // Buffer.BlockCopy takes byte offsets and a byte count, so this
        // branch matches the element-based loop below only when T is byte
        Buffer.BlockCopy(src, (int)srcOffset, dst, (int)dstOffset, (int)count);
    }
    else
    {
        for (long i = 0; i < count; i++)
        {
            // TODO: check performance
            dst.SetValue(src.GetValue(srcOffset + i), dstOffset + i);
        }
    }
}

Is there a cleaner, faster solution?
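
One candidate worth checking is the Array.Copy overload that takes long indices and a long length, which works in elements rather than bytes. The sketch below is only an illustration: the LongArrayCopy class and its Copy method are hypothetical names, and whether the runtime actually accepts values above Int32.MaxValue is an assumption to verify on the target framework.

```csharp
using System;

// Hypothetical helper, not from the thread.
static class LongArrayCopy
{
    // Copies 'count' elements (not bytes) using 64-bit offsets.
    public static void Copy<T>(T[] src, long srcOffset, T[] dst, long dstOffset, long count)
    {
        // Array.Copy has an overload with long source index, destination
        // index and length; it validates the ranges itself.
        Array.Copy(src, srcOffset, dst, dstOffset, count);
    }
}
```

A call such as LongArrayCopy.Copy(source, 1L, target, 0L, 3L) copies three elements starting at the second element of source.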

Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Luca Martinetti - Phatsoft
5/31/2007 6:51:02 AM
Hi Marc,

I'm currently trying to implement a MemoryStream that supports addressing of
more than 2 GB. So I need a very large byte[] and need to copy it on resize.

Any ideas?

--
Phatsoft Inc.


Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Luca Martinetti - Phatsoft
5/31/2007 7:10:03 AM
Sure! I'd love to see that code!

You can mail me at luca@domainsbot.com.

Thanks so much!

L.


Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Luca Martinetti - Phatsoft
5/31/2007 7:56:04 AM
Thanks for your help!
--
Phatsoft Inc.


Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 12:16:03 PM
OK - I haven't profiled / optimised this at all - it is a little (not
much) slower than MemoryStream; I'd guess that caching the current
buffer (rather than going to the list each time) might help... but
here's what I have:

using System;
using System.Collections.Generic;
using System.IO;

public sealed class BigMemoryStream : Stream {
    private long position, length;
    private readonly int bufferSize;
    private List<byte[]> buffers;
    private const int DEFAULT_BUFFER_SIZE = 2 ^ 20;
    public BigMemoryStream() : this(DEFAULT_BUFFER_SIZE, null) { }
    public BigMemoryStream(byte[] data) : this(DEFAULT_BUFFER_SIZE, data) { }
    public BigMemoryStream(int bufferSize) : this(bufferSize, null) { }
    public BigMemoryStream(int bufferSize, byte[] data) {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(
            "bufferSize", "bufferSize must be a positive integer");
        this.bufferSize = bufferSize;

        if (data != null) {
            int blocks = data.Length / bufferSize, rem = data.Length % bufferSize;
            if (rem > 0) blocks++;
            buffers = new List<byte[]>(blocks);
            int sourceOffset = 0;
            for (int i = 0; i < blocks; i++) {
                byte[] blockData = new byte[bufferSize];
                // the last block may be only partially filled
                int bytes = Math.Min(bufferSize, data.Length - sourceOffset);
                Buffer.BlockCopy(data, sourceOffset, blockData, 0, bytes);
                sourceOffset += bytes;
                buffers.Add(blockData);
            }
            length = data.Length;
        } else {
            buffers = new List<byte[]>();
        }
    }

    public override bool CanRead { get { return buffers != null; } }
    public override bool CanWrite { get { return buffers != null; } }
    public override bool CanSeek { get { return buffers != null; } }
    public override long Length { get { return length; } }
    public override long Position {
        get { return position; }
        set {
            if (value == Position) return;
            // Position == Length (end of stream) is a valid place to be
            if (value < 0 || value > Length) throw new
                ArgumentOutOfRangeException("Position");
            position = value;
        }
    }
    public override long Seek(long offset, SeekOrigin origin) {
        switch (origin) {
            case SeekOrigin.Begin:
                Position = offset; break;
            case SeekOrigin.End:
                Position = Length + offset; break;
            case SeekOrigin.Current:
                Position += offset; break;
            default:
                throw new ArgumentException(
                    "Seek operation not recognised", "origin");
        }
        return Position;
    }
    private void CheckDisposed() {
        if (buffers == null) throw new
            ObjectDisposedException(typeof(BigMemoryStream).Name);
    }
    public override void SetLength(long value) {
        CheckDisposed();
        if (value < 0) throw new ArgumentOutOfRangeException("value");
        int blocks = (int) (value / bufferSize), rem = (int) (value % bufferSize);
        if (rem > 0) blocks++;
        int blockIncrease = blocks - buffers.Count;
        if (blockIncrease > 0) {
            while (blockIncrease > 0) {
                buffers.Add(new byte[bufferSize]);
                blockIncrease--;
            }
        } else if (blockIncrease < 0) {
            buffers.RemoveRange(blocks, -blockIncrease);
            buffers.TrimExcess();
        }
        bool wipeExcess = value < length;
        length = value;
        if (wipeExcess) {
            // clear the stale bytes after the new end of the stream
            int wipeIndex = (int) (length % bufferSize);
            if (buffers.Count > 0 && wipeIndex > 0) {
                Array.Clear(buffers[buffers.Count - 1], wipeIndex,
                    bufferSize - wipeIndex);
            }
        }
        if (position > length) position = length;
    }
    public override int Read(byte[] buffer, int offset, int count) {
        CheckDisposed();
        int totalBytes = 0;
        long maxRead = length - position;
        if (count > maxRead) count = (int) maxRead;
        while (position < length && count > 0) {
            int bufferIndex = (int) (position / bufferSize),
                bufferOffset = (int) (position % bufferSize);
            int bytes = bufferSize - bufferOffset;
            if (bytes > count) bytes = count;
            Buffer.BlockCopy(buffers[bufferIndex], bufferOffset,
                buffer, offset, bytes);
            offset += bytes;
            position += bytes;
            totalBytes += bytes;
            count -= bytes;
        }
        return totalBytes;
    }
    public override void Write(byte[] buffer, int offset, int count) {
        CheckDisposed();
        // make some space...
        if (position + count > length) SetLength(position + count);
        while (count > 0) {
            int bufferIndex = (int) (position / bufferSize),
                bufferOffset = (int) (position % bufferSize);
            int bytes = bufferSize - bufferOffset;
            if (bytes > count) bytes = count;
            Buffer.BlockCopy(buffer, offset, buffers[bufferIndex],
                bufferOffset, bytes);
            offset += bytes;
            position += bytes;
            count -= bytes;
        }
    }

    protected override void Dispose(bool disposing) {
        if (disposing) {
            buffers = null;
        }
        base.Dispose(disposing);
    }

    public override void Flush() { }
}
Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 12:55:04 PM
***** BY ECK!

Muppet warning... 2 ^ 20... is (whoops) 22 ;-p Oddly enough, a buffer
for every 22 bytes is a bit of an overhead!!!

Change the line to:

private const int DEFAULT_BUFFER_SIZE = 1048576;

And it now out-performs MemoryStream for large streams - and that is
before optimisation ;-p

Shame on me ;-(

Marc
Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 2:24:57 PM
And one other comment...

All this is well and good, and will be great when I get my 128 GB of
RAM, but until then, you might give serious consideration to using a
FileStream in a temp area. At the end of the day, streams are pipes,
not buckets. You can carry data around in a pipe if you try hard
enough, but it isn't the best tool for the job; pipes are for pumping
contents through. Unless you have a glut of memory dedicated to your
app, the paging from having that much data loaded in your process
might outweigh the benefits of having it in-memory.

Marc
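
For illustration, a temp-file-backed stream along those lines might look like the sketch below. The TempStreams class and CreateTempStream method are hypothetical names, and the 64 KB buffer size is an arbitrary assumption; FileOptions.DeleteOnClose asks for the file to be removed once the stream is closed.

```csharp
using System;
using System.IO;

// Hypothetical helper, not from the thread.
static class TempStreams
{
    // Returns a seekable read/write stream backed by a temp file that is
    // deleted when the stream is closed.
    public static FileStream CreateTempStream()
    {
        // GetTempFileName creates an empty file and returns its path.
        string path = Path.GetTempFileName();
        return new FileStream(path, FileMode.Open, FileAccess.ReadWrite,
            FileShare.None, 64 * 1024, FileOptions.DeleteOnClose);
    }
}
```

The returned stream can be used wherever a seekable Stream is expected, and its Length and Position are already 64-bit.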
Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValue el Marc Gravell
5/31/2007 2:26:39 PM
I'm sure there is a good reason, but perhaps an option is to use a
series of smaller arrays? In a rectangular (not jagged) formation.
Perhaps via a wrapper class with a "this[long index]" indexer that
identifies the correct array index and Int32 offset from the Int64
index? It would be a little slower, but would have less overhead in
terms of trying to find a single block of memory that size.

Copying would then involve 1 block-copy (or Clone()) per inner array?

Marc
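
A minimal sketch of such a wrapper, assuming fixed-size inner arrays. The ChunkedArray name, its constructor parameters and the Clone method are hypothetical and not taken from the thread.

```csharp
using System;

// Hypothetical wrapper: a long-indexed array stored as equally sized chunks.
public sealed class ChunkedArray<T>
{
    private readonly T[][] chunks;
    private readonly int chunkSize;
    private readonly long length;

    public ChunkedArray(long length, int chunkSize)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        this.length = length;
        this.chunkSize = chunkSize;
        // Round up so a partial final chunk is still allocated.
        long count = (length + chunkSize - 1) / chunkSize;
        chunks = new T[count][];
        for (long i = 0; i < count; i++) chunks[i] = new T[chunkSize];
    }

    public long Length { get { return length; } }

    // Maps the Int64 index to (chunk number, Int32 offset inside the chunk).
    public T this[long index]
    {
        get { CheckIndex(index); return chunks[index / chunkSize][index % chunkSize]; }
        set { CheckIndex(index); chunks[index / chunkSize][index % chunkSize] = value; }
    }

    // Copying is one Array.Copy per inner array.
    public ChunkedArray<T> Clone()
    {
        ChunkedArray<T> copy = new ChunkedArray<T>(length, chunkSize);
        for (int i = 0; i < chunks.Length; i++)
            Array.Copy(chunks[i], copy.chunks[i], chunkSize);
        return copy;
    }

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= length) throw new IndexOutOfRangeException();
    }
}
```

The division and modulo on every access are the "little slower" part; the benefit is that no single allocation has to be larger than one chunk.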

Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 2:58:56 PM
I'm not sure that necessitates a large *single* array... you just need
the read and write methods to watch for breaks between the inner
arrays... I have done something similar when writing (for R&D purely)
a reader/writer (producer/consumer) stream, which used multiple
buffers rather than a single buffer. This approach also makes it far
more efficient to resize, as you can just expand / trim the last
buffer, without copying everything.

Let me know if you want more of an example.

Marc

Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 3:09:30 PM
i.e. (pseudocode; broad terms)

int Read(byte[] buffer, int offset, int count) {
    int totalCopied = 0;
    while (count > 0 && data remaining) {
        find bytesInCurrentBuffer using BUFFER_SIZE and currentBufferOffset
        int copyCount = min of count, bytesInCurrentBuffer
        block copy that much to "offset"
        offset += copyCount; count -= copyCount; totalCopied += copyCount
        if buffer now empty, move to next buffer and reset buffer offset to 0
    }
    return totalCopied;
}

Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
5/31/2007 3:14:46 PM
I'll have a stab at a multi-buffer MemoryStream on the train... as for
the producer/consumer stream - that may be lost to the depths of time
;-p But it wasn't that hard if you really want... the fun bit was the
sync working.

Marc

Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Ben Voigt [C++ MVP]
6/1/2007 8:40:12 AM


Try (1 << 20) instead.
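
For reference, a small sketch of the two operators side by side. The BufferSizes class and its constant names are hypothetical, used only to hold the values.

```csharp
using System;

// Hypothetical holder for the two expressions discussed in the thread.
static class BufferSizes
{
    // ^ is bitwise XOR in C#: 00010 XOR 10100 = 10110, which is 22.
    public const int Xor = 2 ^ 20;

    // Left shift: 1 moved 20 bits to the left is 2 to the power 20.
    public const int Shift = 1 << 20;
}
```

BufferSizes.Shift has the same value as the 1048576 literal used in the correction.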


Re: Buffer.BlockCopy and very long arrays (more than Int32.MaxValu Marc Gravell
6/1/2007 2:28:33 PM

Cheers - I was in a hurry when I realised my school-boy error, and
didn't want to compound things by posting another error - so a literal
was the obvious... The problem with dealing in multiple languages /
syntax is that you forget to translate... ;-p

But yes... left shift - "d'oh!" to myself...

Cheers,

Marc