Hashing a file larger than 5 GB throws an out-of-memory exception

I am converting an IBrowserFile to a stream of about 6 GB. I am trying to hash the file in chunks, but I get an out-of-memory exception:
public static async Task<byte[]> hashmultiblock(Stream stream)
{
int chunckSize = 30000000;
int totalChunks = (int)(stream.Length / chunckSize);
if (stream.Length % chunckSize != 0)
totalChunks++;
using (SHA256 hashAlgorithm = SHA256.Create())
{
int offset = 0;
for (int i = 0; i < totalChunks; i++)
{
long position = (i * (long)chunckSize);
int toRead = (int)Math.Min(stream.Length - position + 1, chunckSize);
byte[] buffer = new byte[toRead];
if (i == totalChunks)
hashAlgorithm.TransformFinalBlock(buffer.ToArray(), 0, toRead);
else
hashAlgorithm.TransformBlock(buffer.ToArray(), 0, toRead, null, -1);
}
return hashAlgorithm.Hash;
}
}
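Two things stand out in the code as posted: the loop never reads from the stream, so each new buffer is all zeros, and `i == totalChunks` can never be true inside `for (i < totalChunks)`, so the final block is never submitted. It also allocates a fresh 30 MB array (plus a `ToArray()` copy) on every iteration. The usual shape is one small reusable buffer that is filled from the stream and fed to the hash until the stream ends. Below is a minimal sketch of that shape in Java, where `MessageDigest.update` plays the role of `TransformBlock` and `digest()` the role of the final block; the class and method names are made up for illustration. It does not address how the stream is obtained: if the whole upload is first copied into memory, no hashing loop will help.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library.
final class StreamHasher {
    /** Hashes everything readable from the stream using one reusable buffer. */
    static byte[] sha256(InputStream in, int bufferSize)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[bufferSize]; // allocated once, reused for every chunk
        int read;
        while ((read = in.read(buffer)) != -1) {
            // Feed only the bytes actually read; a read may return fewer than requested.
            digest.update(buffer, 0, read);
        }
        return digest.digest(); // finalizes the hash
    }

    /** Lower-case hex rendering of a byte array. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // A deliberately tiny buffer shows that chunk size does not affect the result.
        System.out.println(hex(sha256(new ByteArrayInputStream(data), 2)));
    }
}
```

Memory use here is bounded by `bufferSize` regardless of file length. A buffer of a few tens of kilobytes is usually enough; 30 MB chunks buy nothing for hashing.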

Related

Processing heterogeneous information as it arrives from a Socket

A server sends heterogeneous information as a stream of bytes. Strings, for instance, are sent as 4 bytes indicating the length, followed by the characters. This means that my client app must read 4 bytes as an int (which implies that the 4 bytes are available) and then read as many bytes as indicated (the strings are latin1-encoded).
So far I have tried two methods: reading synchronously with a RawSocket, and reading the full data dump asynchronously with Socket.listen and processing it later. The first method blocks the application; the second is wasteful because it requires storing all the data before processing it.
How can I asynchronously read N bytes from a Socket, process them, then read M bytes, process them, and so on?

This sounds like you need a ring buffer/byte queue where you can append more data when it arrives, then consume as much as needed from the head when it's available.
There are different ways to implement one, depending on how much you want to avoid copying the bytes. The simplest is a growing backing buffer with copying. The second approach keeps the original lists and combines them only when you read.
Here's a sample implementation:
// Copyright 2021 Google LLC.
// SPDX-License-Identifier: BSD-3-Clause
import "dart:typed_data" show Uint8List;
import "dart:collection" show Queue;
/// A cyclic buffer of bytes.
///
/// Bytes can be appended to the end of the buffer using [append]
/// and consumed from the start of the buffer by [read].
class ByteQueue {
Uint8List _buffer;
int _start = 0;
int _length = 0;
/// Creates a buffer with zero bytes.
///
/// If [initialCapacity] is provided, the buffer will start out
/// with that many bytes of initial capacity. It will not need to
/// grow until that capacity is exhausted.
ByteQueue({int initialCapacity = 256}) : _buffer = Uint8List(initialCapacity);
int get _end {
var end = _start + _length;
if (end > _buffer.length) end -= _buffer.length;
return end;
}
/// Number of bytes currently in the buffer.
///
/// This is the maximal number that can be read by [read] and [peek].
int get length => _length;
int operator [](int index) {
RangeError.checkValidIndex(index, _length);
var i = _start + index;
if (i >= _buffer.length) i -= _buffer.length;
return _buffer[i];
}
/// Writes circular buffers into other circular buffers.
///
/// If [end] \< [start], the range of source wraps around at the end of the list.
/// If [offset] + ([end] - [start]) is greater than `target.length`, then
/// the write wraps around past the end of the list.
static void _write(
Uint8List source, int start, int end, Uint8List target, int offset) {
int length = end - start;
if (length >= 0) {
if (offset + length <= target.length) {
target.setRange(offset, offset + length, source, start);
} else {
var firstPart = target.length - offset;
target.setRange(offset, target.length, source, start);
target.setRange(0, length - firstPart, source, start + firstPart);
}
} else {
var firstPart = source.length - start;
_write(source, start, source.length, target, offset);
_write(source, 0, end, target, offset + firstPart);
}
}
static int _limit(int value, int limit) =>
value < limit ? value : value - limit;
/// Copies the next [count] bytes of the buffer into [target].
///
/// The bytes are *not* removed from the buffer, and can be read again.
/// The bytes are written starting at [offset] in [target].
void peek(int count, Uint8List target, [int offset = 0]) {
RangeError.checkValueInInterval(count, 0, _length, "count");
RangeError.checkValueInInterval(offset, 0, target.length, "offset");
if (target.length < count + offset) {
throw ArgumentError.value(
target, "target", "Must have room for $count elements");
}
var end = _limit(_start + count, _buffer.length);
_write(_buffer, _start, end, target, offset);
}
/// Returns the first byte of the buffer.
///
/// The buffer is not modified.
int peekByte() {
if (_length == 0) throw StateError("No element");
return _buffer[_start];
}
/// Consumes a single byte from the head of the buffer.
int readByte() {
if (_length == 0) throw StateError("No element");
var byte = _buffer[_start];
_start = _limit(_start + 1, _buffer.length);
_length -= 1;
return byte;
}
/// Consumes the next [count] bytes of the buffer and moves them into [target].
///
/// The bytes are removed from the head of the buffer.
/// The bytes are written starting at [offset] in [target].
void read(int count, Uint8List target, [int offset = 0]) {
RangeError.checkValueInInterval(count, 0, _length, "count");
RangeError.checkValueInInterval(offset, 0, target.length, "offset");
if (target.length < count + offset) {
throw ArgumentError.value(
target, "target", "Must have room for $count elements");
}
var end = _limit(_start + count, _buffer.length);
_write(_buffer, _start, end, target, offset);
_start = _limit(_start + count, _buffer.length);
_length -= count;
}
/// Removes the first [count] bytes from the buffer.
///
/// Can be useful after a [peek] has turned out to be the bytes
/// that you need, or if you know that the following bytes are
/// not useful.
void remove(int count) {
RangeError.checkValueInInterval(count, 0, _length, "count");
_start = _limit(_start + count, _buffer.length);
_length -= count;
}
/// Appends [bytes] to the end of the buffer.
void append(Uint8List bytes) {
var newLength = _length + bytes.length;
if (newLength > _buffer.length) {
_grow(newLength);
}
_write(bytes, 0, bytes.length, _buffer, _end);
_length = newLength;
}
void _grow(int newLength) {
var capacity = _buffer.length;
do {
capacity *= 2;
} while (capacity < newLength);
var newBuffer = Uint8List(capacity);
_write(_buffer, _start, _end, newBuffer, 0);
_buffer = newBuffer;
_start = 0;
}
}
// Or another one with the same interface,
// but which doesn't copy bytes into the buffer, only out of it.
class ByteQueue2 {
final Queue<Uint8List> _source = Queue();
int _length = 0;
// The offset into the first element of _source that we haven't consumed.
int _start = 0;
int get length => _length;
void append(Uint8List bytes) {
_source.add(bytes);
_length += bytes.length;
}
int peekByte() {
if (_length == 0) throw StateError("No element");
return _source.first[_start];
}
int readByte() {
if (_length == 0) throw StateError("No element");
var first = _source.first;
var byte = first[_start];
_start += 1;
if (_start >= first.length) {
_source.removeFirst();
_start = 0;
}
_length -= 1;
return byte;
}
void peek(int count, Uint8List target, [int offset = 0]) {
RangeError.checkValueInInterval(count, 0, _length, "count");
RangeError.checkValueInInterval(offset, 0, target.length, "offset");
if (offset + count > target.length) {
throw ArgumentError.value(target, "target",
"Must have length >= ${offset + count}, was: ${target.length}");
}
var start = _start;
for (var source in _source) {
if (count == 0) return;
var length = source.length - start;
if (count <= length) {
target.setRange(offset, offset + count, source, start);
return;
}
target.setRange(offset, offset + length, source, start);
start = 0;
offset += length;
count -= length;
}
}
void read(int count, Uint8List target, [int offset = 0]) {
RangeError.checkValueInInterval(count, 0, _length, "count");
RangeError.checkValueInInterval(offset, 0, target.length, "offset");
if (offset + count > target.length) {
throw ArgumentError.value(target, "target",
"Must have length >= ${offset + count}, was: ${target.length}");
}
var start = _start;
while (count > 0) {
var source = _source.first;
var length = source.length - start;
if (count < length) {
target.setRange(offset, offset + count, source, start);
_start = start + count;
_length -= count;
return;
}
target.setRange(offset, offset + length, source, start);
offset += length;
count -= length;
_length -= length;
start = _start = 0;
_source.removeFirst();
}
}
}
(No promises.)
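To show how such a queue is consumed for the protocol in the question (a 4-byte length, then that many latin1 bytes), here is a minimal sketch in Java. It uses `java.nio.ByteBuffer` as the growing buffer instead of the classes above, and it assumes the length prefix is big-endian, which the question does not state. `FrameParser` and `feed` are names invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for "4-byte length + latin1 bytes" frames.
final class FrameParser {
    private ByteBuffer pending = ByteBuffer.allocate(256); // kept in write mode between calls

    /** Appends a newly arrived chunk and returns every string that is now complete. */
    List<String> feed(byte[] chunk) {
        if (pending.remaining() < chunk.length) {
            // Grow: copy the unread bytes into a larger buffer.
            int needed = pending.position() + chunk.length;
            ByteBuffer bigger = ByteBuffer.allocate(Math.max(needed, pending.capacity() * 2));
            pending.flip();
            bigger.put(pending);
            pending = bigger;
        }
        pending.put(chunk);

        List<String> complete = new ArrayList<>();
        pending.flip(); // switch to read mode
        while (pending.remaining() >= 4) {
            pending.mark();
            int length = pending.getInt(); // assumption: big-endian prefix
            if (length < 0) throw new IllegalStateException("negative length prefix");
            if (pending.remaining() < length) {
                pending.reset(); // body not fully here yet; keep the prefix for next time
                break;
            }
            byte[] body = new byte[length];
            pending.get(body);
            complete.add(new String(body, StandardCharsets.ISO_8859_1));
        }
        pending.compact(); // move unread bytes to the front, back to write mode
        return complete;
    }

    public static void main(String[] args) {
        FrameParser parser = new FrameParser();
        System.out.println(parser.feed(new byte[] {0, 0, 0, 2, 'h'})); // incomplete frame
        System.out.println(parser.feed(new byte[] {'i'}));             // frame completes
    }
}
```

The caller simply passes every chunk from the socket callback to `feed` and handles whatever comes back, so nothing blocks and nothing is stored beyond the bytes of the frame currently being assembled.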

Merge sort printing a strange result

I am having an issue with my merge sort: when I print out my sortedArray, it only contains [ 0.0, 0.0, ... ]. I'm not sure whether the error is in my sort code, in my print line, or has something to do with doubles. The code I am using is posted below.
By calling System.out.println(toString(sortedArray)) I get an even more obscure answer.
Thanks for any help.
package mergesort;
import java.util.Arrays;
import java.util.Random;
public class mergesort {
public static void main(String[] args) {
double[] array = getIntArray();
long before = System.nanoTime();
double[] sortedArray= mergeSort(array);
System.out.println("Sorting took "+ (System.nanoTime() - before) +" nanoseconds ");
System.out.println(toString(array) + "\n\n" + toString(sortedArray) + "\n main method completed in: " + (System.nanoTime() - before) + " nanoseconds.");
}
private static String toString(double[] array) {
StringBuilder sb = new StringBuilder("[ ");
double len = array.length;
for(int i = 0; i < len - 1; i++) {
sb.append(array[i] + ", ");
}
sb.append(array[(int) (len - 1)] + " ]");
return sb.toString();
}
public static double[] mergeSort(double[] array) {
if (array.length <= 1) {
return array;
}
int half = array.length / 2;
return merge(mergeSort(Arrays.copyOfRange(array, 0, half)),
mergeSort(Arrays.copyOfRange(array, half, array.length)));
}
private static double[] merge(double[] ds, double[] ds2) {
int len1 = ds.length, len2 = ds2.length;
int totalLength = len1 + len2;
double[] result = new double[totalLength];
int counterForLeft =0,counterForRight=0,resultIndex=0;
while(counterForLeft<len1 || counterForRight < len2){
if(counterForLeft<len1 && counterForRight < len2){
if(ds[counterForLeft]<= ds2[counterForRight]){
result[resultIndex++] =(int) ds[counterForLeft++];
} else {
result[resultIndex++] =(int) ds2[counterForRight++];
}
}else if(counterForLeft<len1){
result[resultIndex++] = (int) ds[counterForLeft++];
}else if (counterForRight <len2){
result[resultIndex++] =(int) ds2[counterForRight++];
}
}
return result;
}
private static double[] getIntArray() {
double[] array = new double[10000];
Random random = new Random();
for(int i = 0; i < 10000; i++) {
array[i] = (random.nextDouble() * .99999);
}
return array;
}
}

In the merge method, when copying from one of the input arrays to the results, you cast to int. For example:
result[resultIndex++] =(int) ds[counterForLeft++];
All your doubles are in the range [0...1), so the result of casting any of them to int is zero. Just get rid of those casts, and you will keep your numbers in the merge result.
As an additional tip, it is much easier to debug small problems than large ones. It fails for any array of size 2 or more, so you should have been debugging with size 2, not 10000.
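For reference, here is a sketch of the merge with the casts removed, run on a two-element input as suggested. The class name `MergeSortFixed` and the restructured loops are mine; the behaviour is otherwise the same as the code in the question.

```java
import java.util.Arrays;

// Hypothetical class name; same algorithm as the question, without the (int) casts.
final class MergeSortFixed {
    static double[] mergeSort(double[] array) {
        if (array.length <= 1) {
            return array;
        }
        int half = array.length / 2;
        return merge(mergeSort(Arrays.copyOfRange(array, 0, half)),
                     mergeSort(Arrays.copyOfRange(array, half, array.length)));
    }

    static double[] merge(double[] left, double[] right) {
        double[] result = new double[left.length + right.length];
        int l = 0, r = 0, out = 0;
        // Take the smaller head element while both inputs still have elements.
        while (l < left.length && r < right.length) {
            result[out++] = (left[l] <= right[r]) ? left[l++] : right[r++];
        }
        // Copy whatever remains; values stay doubles throughout.
        while (l < left.length) result[out++] = left[l++];
        while (r < right.length) result[out++] = right[r++];
        return result;
    }

    public static void main(String[] args) {
        double[] input = {0.75, 0.25};
        System.out.println(Arrays.toString(mergeSort(input))); // prints [0.25, 0.75]
    }
}
```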

How do I use BER encoding with object System.DirectoryServices.Protocols.BerConverter.Encode("???", myData)

I need to encode and decode BER data. .NET has the class System.DirectoryServices.Protocols.BerConverter
The static method requires me to enter a string in the first parameter as shown below
byte[] oid = { 0x30, 0xD, 0x6, 0x9, 0x2A, 0x86, 0x48, 0x86, 0xF7, 0xD, 0x1, 0x1, 0x1, 0x5, 0x0 }; // Object ID for RSA
var result2 = System.DirectoryServices.Protocols.BerConverter.Decode("?what goes here?", oid);
BER encoding is used in LDAP, Certificates, and is commonplace in many other formats.
I'll be happy with information telling me how to Encode or Decode on this class. There is nothing on Stack Overflow or the first few pages of Google (or Bing) regarding this.
Question
How do I convert the byte array above to the corresponding OID using BER decoding?
How can I parse (or attempt to parse) SubjectPublicKeyInfo ASN.1 data in DER or BER format?
It seems the DER encoding\decoding classes are internal to the .NET framework. If so, where are they? (I'd like to ask connect.microsoft.com to make these members public)

How do I convert the byte array above to the corresponding OID using BER decoding?
After you have extracted the OID byte array, you can convert it to an OID string using OidByteArrayToString(). I have included the code below, since I couldn't find a similar function in the .NET libraries.
How can I parse (or attempt to parse) SubjectPublicKeyInfo ASN.1 data in DER or BER format?
I was not able to find a TLV parser in the .NET SDK either. Below is an implementation of a BER TLV parser, BerTlv. Since DER is a subset of BER, parsing will work the same way. Given a BER-TLV byte[] array, it will return a list of BerTlv objects that support access of sub TLVs.
It seems the DER encoding\decoding classes are internal to the .NET framework. If so, where are they? (I'd like to ask connect.microsoft.com to make these members public)
Maybe somebody else can answer this question.
Summary
Here is an example of how you can use the code provided below. I have used the public key data you provided in your previous post. The BerTlv should probably be augmented to support querying like BerTlv.getValue(rootTlvs, '/30/30/06');.
public static void Main(string[] args)
{
string pubkey = "MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDrEee0Ri4Juz+QfiWYui/E9UGSXau/2P8LjnTD8V4Unn+2FAZVGE3kL23bzeoULYv4PeleB3gfmJiDJOKU3Ns5L4KJAUUHjFwDebt0NP+sBK0VKeTATL2Yr/S3bT/xhy+1xtj4RkdV7fVxTn56Lb4udUnwuxK4V5b5PdOKj/+XcwIDAQAB";
byte[] pubkeyByteArray = Convert.FromBase64String(pubkey);
List<BerTlv> rootTlvs = BerTlv.parseTlv(pubkeyByteArray);
BerTlv firstTlv = rootTlvs.Where(tlv => tlv.Tag == 0x30).First();//first sequence (tag 30)
BerTlv secondTlv = firstTlv.SubTlv.Where(tlv => tlv.Tag == 0x30).First();//second sequence (tag 30)
BerTlv oid = secondTlv.SubTlv.Where(tlv => tlv.Tag == 0x06).First();//OID tag (tag 06)
string strOid = OidByteArrayToString(oid.Value);
Console.WriteLine(strOid);
}
Output:
1.2.840.113549.1.1.1
OID Encode/Decode
public static byte[] OidStringToByteArray(string oid)
{
string[] split = oid.Split('.');
List<byte> retVal = new List<byte>();
//root arc
if (split.Length > 0)
retVal.Add((byte)(Convert.ToInt32(split[0])*40));
//first arc
if (split.Length > 1)
retVal[0] += Convert.ToByte(split[1]);
//subsequent arcs
for (int i = 2; i < split.Length; i++)
{
int arc_value = Convert.ToInt32(split[i]);
Stack<byte> bytes = new Stack<byte>();
// do/while so that an arc value of 0 still emits a single 0x00 byte
do
{
byte val = (byte) ((arc_value & 0x7F) | (bytes.Count == 0 ? 0x0:0x80));
arc_value >>= 7;
bytes.Push(val);
} while (arc_value != 0);
retVal.AddRange(bytes);
}
return retVal.ToArray();
}
public static string OidByteArrayToString(byte[] oid)
{
StringBuilder retVal = new StringBuilder();
//first byte
if (oid.Length > 0)
retVal.Append(String.Format("{0}.{1}", oid[0] / 40, oid[0] % 40));
// subsequent bytes
int current_arc = 0;
for (int i = 1; i < oid.Length; i++)
{
current_arc = (current_arc << 7) | (oid[i] & 0x7F);
//check if last byte of arc value
if ((oid[i] & 0x80) == 0)
{
retVal.Append('.');
retVal.Append(Convert.ToString(current_arc));
current_arc = 0;
}
}
return retVal.ToString();
}
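As a cross-check of the decoding rule (7 value bits per byte, high bit set on every byte of an arc except the last), here is the same idea as a small Java sketch applied to the OID content bytes from the question. `OidDecoder` is a name invented for this example. Unlike the C# above, it treats the first two arcs as one base-128 value (40*X + Y), which only matters for OIDs whose second arc does not fit in the first byte.

```java
// Hypothetical helper: decodes the content bytes of an OBJECT IDENTIFIER (no tag/length).
final class OidDecoder {
    static String decode(byte[] content) {
        if (content.length == 0) {
            throw new IllegalArgumentException("empty OID content");
        }
        StringBuilder sb = new StringBuilder();
        long arc = 0;
        boolean first = true;
        for (byte b : content) {
            arc = (arc << 7) | (b & 0x7F);      // append the low 7 bits
            if ((b & 0x80) == 0) {              // high bit clear: last byte of this arc
                if (first) {
                    // The first value packs two arcs as 40*X + Y, with X limited to 0..2.
                    long x = Math.min(arc / 40, 2);
                    sb.append(x).append('.').append(arc - 40 * x);
                    first = false;
                } else {
                    sb.append('.').append(arc);
                }
                arc = 0;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Content bytes of the OID inside the question's array (after the 06 09 header).
        byte[] rsa = {0x2A, (byte) 0x86, 0x48, (byte) 0x86, (byte) 0xF7, 0x0D, 0x01, 0x01, 0x01};
        System.out.println(decode(rsa)); // prints 1.2.840.113549.1.1.1
    }
}
```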
BER-TLV Parser
class BerTlv
{
private int tag;
private int length;
private int valueOffset;
private byte[] rawData;
private List<BerTlv> subTlv;
private BerTlv(int tag, int length, int valueOffset, byte[] rawData)
{
this.tag = tag;
this.length = length;
this.valueOffset = valueOffset;
this.rawData = rawData;
this.subTlv = new List<BerTlv>();
}
public int Tag
{
get { return tag; }
}
public byte[] RawData
{
get { return rawData; }
}
public byte[] Value
{
get
{
byte[] result = new byte[length];
Array.Copy(rawData, valueOffset, result, 0, length);
return result;
}
}
public List<BerTlv> SubTlv
{
get { return subTlv; }
}
public static List<BerTlv> parseTlv(byte[] rawTlv)
{
List<BerTlv> result = new List<BerTlv>();
parseTlv(rawTlv, result);
return result;
}
private static void parseTlv(byte[] rawTlv, List<BerTlv> result)
{
for (int i = 0, start=0; i < rawTlv.Length; start=i)
{
//parse Tag
bool constructed_tlv = (rawTlv[i] & 0x20) != 0;
bool more_bytes = (rawTlv[i] & 0x1F) == 0x1F;
while (more_bytes && (rawTlv[++i] & 0x80) != 0) ;
i++;
int tag = Util.getInt(rawTlv, start, i-start);
//parse Length
bool multiByte_Length = (rawTlv[i] & 0x80) != 0;
//in the long form, the low 7 bits give the number of length bytes that follow
int length = multiByte_Length ? Util.getInt(rawTlv, i+1, rawTlv[i] & 0x7F) : rawTlv[i];
i = multiByte_Length ? i + (rawTlv[i] & 0x7F) + 1: i + 1;
i += length;
byte[] rawData = new byte[i - start];
Array.Copy(rawTlv, start, rawData, 0, i - start);
//valueOffset is relative to rawData, which begins at start
BerTlv tlv = new BerTlv(tag, length, i - length - start, rawData);
result.Add(tlv);
if (constructed_tlv)
parseTlv(tlv.Value, tlv.subTlv);
}
}
}
Here is a utility class containing some functions used in the class above. It is included to make clear how the parser works.
class Util
{
public static string getHexString(byte[] arr)
{
StringBuilder sb = new StringBuilder(arr.Length * 2);
foreach (byte b in arr)
{
sb.AppendFormat("{0:X2}", b);
}
return sb.ToString();
}
public static byte[] getBytes(String str)
{
byte[] result = new byte[str.Length >> 1];
for (int i = 0; i < result.Length; i++)
{
result[i] = (byte)Convert.ToInt32(str.Substring(i * 2, 2), 16);
}
return result;
}
public static int getInt(byte[] data, int offset, int length)
{
int result = 0;
for (int i = 0; i < length; i++)
{
result = (result << 8) | data[offset + i];
}
return result;
}
}

How can I adapt the Speex echo canceller to process float samples?

Good day! How can I use float samples for echo cancellation processing? I tried to change the interface and body of the central function:
from
void speex_echo_cancellation(SpeexEchoState *st, const spx_int16_t *rec, const spx_int16_t *play, spx_int16_t *out);
to
void float_speex_echo_cancellation(SpeexEchoState *st, const float rec[], const float play[], float out[]);
and from
...
for (i=0;i<st->frame_size;i++)
{
spx_word32_t tmp_out;
tmp_out = SUB32(EXTEND32(st->input[chan*st->frame_size+i]), EXTEND32(st->e[chan*N+i+st->frame_size]));
tmp_out = ADD32(tmp_out, EXTEND32(MULT16_16_P15(st->preemph, st->memE[chan])));
if (in[i*C+chan] <= -32000 || in[i*C+chan] >= 32000)
{
if (st->saturated == 0)
st->saturated = 1;
}
out[i*C+chan] = (spx_int16_t)WORD2INT(tmp_out); /* the line I changed */
st->memE[chan] = tmp_out;
}
...
to
...
for (i=0;i<st->frame_size;i++)
{
spx_word32_t tmp_out;
tmp_out = SUB32(EXTEND32(st->input[chan*st->frame_size+i]), EXTEND32(st->e[chan*N+i+st->frame_size]));
tmp_out = ADD32(tmp_out, EXTEND32(MULT16_16_P15(st->preemph, st->memE[chan])));
if (in[i*C+chan] <= -32000 || in[i*C+chan] >= 32000)
{
if (st->saturated == 0)
st->saturated = 1;
}
out[i*C+chan] = /*(spx_int16_t)WORD2INT(*/tmp_out/*)*/; /* the changed line */
st->memE[chan] = tmp_out;
}
...
and from
static inline void filter_dc_notch16(const spx_int16_t *in, spx_word16_t radius, spx_word16_t *out, int len, spx_mem_t *mem, int stride)
{
int i;
spx_word16_t den2;
den2 = (spx_word16_t)(radius*radius + .7f*(1.f-radius)*(1.f-radius));
for (i=0;i<len;i++)
{
spx_int16_t vin = in[i*stride];
spx_word32_t vout = mem[0] + SHL32(EXTEND32(vin),15);
mem[0] = mem[1] + 2*(-vin + radius*vout);
mem[1] = SHL32(EXTEND32(vin),15) - MULT16_32_Q15(den2,vout);
out[i] = SATURATE32(PSHR32(MULT16_32_Q15(radius,vout),15),32767);
}
}
to
static inline void float_filter_dc_notch16(const /*spx_int16_t*/spx_word16_t *in, spx_word16_t radius, spx_word16_t *out, int len, spx_mem_t *mem, int stride)
{
int i;
spx_word16_t den2;
den2 = /*(spx_word16_t)*/(radius*radius + .7f*(1.f-radius)*(1.f-radius));
for (i=0;i<len;i++)
{
/*spx_int16_t*/spx_word16_t vin = in[i*stride];
spx_word32_t vout = mem[0] + SHL32(EXTEND32(vin),15);
mem[0] = mem[1] + 2*(-vin + radius*vout);
mem[1] = SHL32(EXTEND32(vin),15) - MULT16_32_Q15(den2,vout);
out[i] = /*SATURATE32(*/PSHR32(MULT16_32_Q15(radius,vout),15)/*,32767)*/;
}
}
So I removed the conversion of the float output to short int, but now I get this warning:
speex_warning("The echo canceller started acting funny and got slapped (reset). It swears it will behave now.");
which points to st->screwed_up reaching 50 and all output samples being set to zero:
...
if (!(Syy>=0 && Sxx>=0 && See >= 0)
|| !(Sff < N*1e9 && Syy < N*1e9 && Sxx < N*1e9)
)
{ st->screwed_up += 50; for (i=0;i<st->frame_size*C;i++) out[i] = 0; }
...
What can I do?

Why do you want to use float samples?
Standard linear PCM audio is represented as integer samples at the chosen bit depth: 8-bit, 16-bit, and so on.
Where do you get that input from?
If I were you I would just convert whatever you got to shorts and provide it to Speex so it can work with it.
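The conversion itself is short. Here is a minimal sketch in Java, assuming the float samples are normalized to the range [-1.0, 1.0] (the question does not say how they are scaled); `PcmConvert` and its methods are names invented for this example. The same arithmetic carries over to C, with the results passed to `speex_echo_cancellation` as `spx_int16_t` arrays.

```java
import java.util.Arrays;

// Hypothetical helper for converting between normalized floats and 16-bit PCM.
final class PcmConvert {
    /** Converts floats in [-1.0, 1.0] to 16-bit samples, clamping out-of-range input. */
    static short[] floatToInt16(float[] samples) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            out[i] = (short) Math.round(clamped * 32767.0f);
        }
        return out;
    }

    /** Converts 16-bit samples back to floats in roughly [-1.0, 1.0]. */
    static float[] int16ToFloat(short[] samples) {
        float[] out = new float[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] / 32767.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] input = {0.0f, 1.0f, -1.0f, 2.0f}; // the last value is out of range
        System.out.println(Arrays.toString(floatToInt16(input))); // prints [0, 32767, -32767, 32767]
    }
}
```

Clamping before the cast matters: casting an out-of-range value straight to a 16-bit integer wraps around and produces loud clicks.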

QRCode encoding and decoding problem

I want to split a file ( a docx file) and use the individual fragments of the file to encode a QRCode such that when the qrcodes are read in sequence, it reproduces the original file.
I was able to split the file and create a set of QR codes, but when I attempted to recreate the file, the decoder threw the following error message:
"Invalid number of finder pattern detected"
I am using the library from http://www.codeproject.com/KB/cs/qrcode.aspx.
My encoder code
private List<Bitmap> Encode(String content, Encoding encoding, int QRCodeCapacity,
System.Drawing.Color qrCodeBackgroundColor, System.Drawing.Color qrCodeForegroundColor,
int qrCodeScale, int NoOfQRcodes)
{
List<Bitmap> _qrcodesImages = new List<Bitmap>();
byte[] _filebytearray = encoding.GetBytes(content);
for (int k = 0,l=0; k < NoOfQRcodes; k++)
{
byte[] _tempByteArray = _filebytearray.Skip(l).Take(QRCodeCapacity).ToArray();
bool[][] matrix = calQrcode(_tempByteArray);
SolidBrush brush = new SolidBrush(qrCodeBackgroundColor);
Bitmap image = new Bitmap((matrix.Length * qrCodeScale) + 1, (matrix.Length * qrCodeScale) + 1);
Graphics g = Graphics.FromImage(image);
g.FillRectangle(brush, new Rectangle(0, 0, image.Width, image.Height));
brush.Color = qrCodeForegroundColor;
for (int i = 0; i < matrix.Length; i++)
{
for (int j = 0; j < matrix.Length; j++)
{
if (matrix[j][i])
{
g.FillRectangle(brush, j * qrCodeScale, i * qrCodeScale, qrCodeScale, qrCodeScale);
}
}
}
_qrcodesImages.Add(image);
l += QRCodeCapacity;
}
return _qrcodesImages;
}
