Assumptions
All exceptions must be caught and handled by a caller or wrapper (of SimFile). In the case of SimZip (wrapper of SimFile), this would be Main.
All sweeps involved in a file compression have the same number of samples.
(necessary for array regularity)
FIXED Feb 2009 - Generalizing the above, each component to be compressed must model as a rectangular array.
Exceptions
System.OverflowException: Arithmetic operation resulted in an overflow.
(occurred when trying to allocate an array with a negative size - which was interpreted as an extremely large integer)
ReadVersionHeader
System.ArgumentOutOfRangeException: Index was out of range.
Must be non-negative and less than the size of the collection.
GetByteBlockScale
System.ArgumentException
Message="Destination array was not long enough. Check destIndex and length, and the array's lower bounds." ... (occurred during Array.Copy)
Utility.ByteArrayConverter.SetChar
Utility.ByteArrayConverter.SetInt
System.ArgumentException: Empty path name is not legal.
PackFileComponent
System.ArgumentException: Destination array is not long enough to copy all the items in the collection. Check array index and length.
Utility.ByteArrayManipulator.ConvertCoordinates
System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection.
DataTools.WriteJpgRaw
System.IndexOutOfRangeException: Index was outside the bounds of the array.
Utility.ByteArrayManipulator.PickPeriodicValues
Utility.ByteArrayManipulator.Transpose
Utility.StreamProcessor.TransformTranspose
System.DivideByZeroException: Attempted to divide by zero.
Utility.ByteArrayManipulator.PickPeriodicValues
System.FormatException: Input string was not in a correct format.
Utility.ByteArrayConverter.ReadValueFromCsvStream
System.NullReferenceException: Object reference not set to an instance of an object.
Model.BuildDictHisto
System.ObjectDisposedException: Cannot access a closed file
SimFile.Sweep.StreamWrite
System.IO.IOException: The process cannot access the file 'F:\My Documents\Visual Studio 2005\Projects\SimFile\SimFile\fileHdr.raw' because it is being used by another process.
IsolateFileHeaderCompressed
System.IO.FileNotFoundException: Could not find file 'F:\My Documents\Visual Studio 2005\Projects\SimFile\SimFile\data_image.cim'.
ReadVersionHeader
Utility.FileProcessor.Catenate
PackFileComponent
DataTools.ConvertFileRaw
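The OverflowException listed first came from allocating an array with a negative size. A minimal sketch of a guard that checks the size before allocating is shown here; the class AllocationGuard and the method TryAllocate are hypothetical names, not part of SimFile.

```csharp
using System;

public static class AllocationGuard
{
    // Hypothetical helper (not part of SimFile): reports failure instead of
    // letting "new byte[size]" throw when a size field was read as negative.
    public static bool TryAllocate(int size, out byte[] buffer)
    {
        if (size < 0)
        {
            buffer = null;
            return false;
        }
        buffer = new byte[size];
        return true;
    }
}
```

A caller such as ReadVersionHeader could then return Io.Result.BadFormat when the guard reports failure, which matches the validation described under Known issues.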
Future work: SimFile
· StreamToArray2D and its callers should be modified so that callees work correctly from the current stream pointer (rather than forcing or (worse) assuming Position == 0).
... Other methods may need to be identified and modified similarly.
· Methods that take an output block, stream, or file should be modified to take an array of inputs and an array of outputs, with the precondition that the correspondence is either n-to-n corresponding or 1-to-n scalar. (Perhaps using the params keyword.)
· Sweep:: public void StreamWriteData( Stream outStream, FileContent fc, int start, int count ) {
// Prepare and write the Sweep data.
// Review this part later. Not sure this index-changing is all good.
// .. Review this with all Stream Io methods.
(should probably return Io.Result.BadIndex for some of these conditions)
· Amend StreamRead methods so that if Io.Result.Underflow is returned, the data up to the point of underflow has still been read.
· Exceptions - review lower-level methods for finer-grained handling.
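Two of the items above (working from the current stream pointer, and keeping the data read before an underflow) can be sketched together. This is only an illustration under assumptions: StreamReadSketch and ReadFromCurrent are hypothetical names, and the real methods would return an Io.Result rather than a byte count.

```csharp
using System.IO;

public static class StreamReadSketch
{
    // Reads up to buffer.Length bytes starting at the stream's current
    // Position (no seek to 0). Returns the number of bytes actually read;
    // a result smaller than buffer.Length signals underflow, and the bytes
    // read before the underflow remain in the buffer.
    public static int ReadFromCurrent(Stream inStream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = inStream.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;   // End of stream reached early.
            total += n;
        }
        return total;
    }
}
```

Because the method never touches Position itself, a caller can read a header and then hand the same stream to the next reader without rewinding.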
Future work: SimZip
· Sweeps must be allowed to have variable number of samples, even if compression suffers.
· Experiment with command-line switches (the previous set did well in prelim testing, but not for SimZip)
· Review LZMA source-code to incorporate direct streaming rather than using the .exe
· StreamToArray2D and its callers should be modified so that callees work correctly from the current stream pointer (rather than forcing or (worse) assuming Position == 0).
... Other methods may need to be identified and modified similarly.
· One file component (I think it's byte 3) is so self-similar, that even lzma's rendering of 138 bytes overhead is probably more than needed. You could use a very simple run-length encoding, reducing the compressed file size by ~100 bytes.
· Methods that take an output block, stream, or file should be modified to take an array of inputs and an array of outputs, with the precondition that the correspondence is either n-to-n corresponding or 1-to-n scalar. (Perhaps using the params keyword.)
· Exceptions - review lower-level methods for finer-grained handling.
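The run-length idea for the highly self-similar component could look like the sketch below. It is a generic (count, value) byte-pair scheme chosen for illustration, not the project's format; the class name RunLength is hypothetical, and the ~100 byte saving quoted above is the note's own estimate, not something this sketch verifies.

```csharp
using System;
using System.Collections.Generic;

public static class RunLength
{
    // Encodes data as (count, value) byte pairs; each count is 1..255.
    public static byte[] Encode(byte[] data)
    {
        List<byte> output = new List<byte>();
        int i = 0;
        while (i < data.Length)
        {
            byte value = data[i];
            int run = 1;
            while (i + run < data.Length && data[i + run] == value && run < 255) run++;
            output.Add((byte)run);
            output.Add(value);
            i += run;
        }
        return output.ToArray();
    }

    // Expands (count, value) pairs back into the original bytes.
    public static byte[] Decode(byte[] encoded)
    {
        if (encoded.Length % 2 != 0)
            throw new ArgumentException("Encoded length must be even.");
        List<byte> output = new List<byte>();
        for (int i = 0; i < encoded.Length; i += 2)
            for (int k = 0; k < encoded[i]; k++) output.Add(encoded[i + 1]);
        return output.ToArray();
    }
}
```

This scheme only wins when runs are long; data with few repeats doubles in size, so it would apply to that one component only.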
Future work: Utility
· StreamToArray2D and its callers should be modified so that callees work correctly from the current stream pointer (rather than forcing or (worse) assuming Position == 0).
... Other methods may need to be identified and modified similarly.
· Methods that take an output block, stream, or file should be modified to take an array of inputs and an array of outputs, with the precondition that the correspondence is either n-to-n corresponding or 1-to-n scalar. (Perhaps using the params keyword.)
· Exceptions - review lower-level methods for finer-grained handling.
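A sketch of the n-to-n or 1-to-n precondition using the C# params keyword follows. MultiStreamSketch and CopyAll are hypothetical names; the bool return stands in for an Io.Result (for example Mismatch), and the 1-to-n case assumes the single input is seekable. Stream.CopyTo requires .NET Framework 4 or later.

```csharp
using System.IO;

public static class MultiStreamSketch
{
    // Hypothetical signature: copies inputs to outputs, accepting either
    // n inputs for n outputs, or 1 input applied to every output.
    // Returns false when the precondition is violated.
    public static bool CopyAll(Stream[] inputs, params Stream[] outputs)
    {
        bool oneToMany = inputs.Length == 1;
        if (!oneToMany && inputs.Length != outputs.Length) return false;

        // Remember where the single input started, rather than assuming 0.
        long start = oneToMany ? inputs[0].Position : 0;
        for (int i = 0; i < outputs.Length; i++)
        {
            Stream src = oneToMany ? inputs[0] : inputs[i];
            if (oneToMany) src.Position = start;   // Re-read for each output.
            src.CopyTo(outputs[i]);
        }
        return true;
    }
}
```

With params, a caller can write CopyAll(inputs, outA, outB) without building the output array by hand.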
Io.Result
namespace Utility
class StreamProcessor
¬ TransformTranspose ¬ {StreamToArray2D}
Underflow
Overflow
OK
¬ TransformCopyStream
OK
¬ CopyStreamMinusPrefix ¬ {ReadByteStream}
BadIndex
OK
¬ Transform(Stream,Stream,int[])
Mismatch
Underflow
Overflow
NoAction
OK
class FileProcessor
¬ CopyFileMinusPrefix
¬ Decatenate ¬ {StreamProcessor.ReadByteStream}
BadIndex
OK
namespace SimFile
class SimFile.Sweep
¬ StreamWrite
NoAction
OK
¬ StreamWriteHdr
BadFormat
OK
¬ StreamReadHdr
BadFormat
Underflow
Overflow
OK
¬ StreamRead ¬ StreamReadData
Underflow
Overflow
NoAction
OK
class SimFile
¬ FileWrite ¬ {StreamWrite}
¬ FileRead
¬ LoadByteBlock
OK
¬ StreamWriteHdr
¬ ReadVersionHeader
BadFormat
OK
¬ StreamWriteData
NoAction
OK
¬ StreamReadHdr
BadFormat
Underflow
Overflow
OK
¬ StreamReadData
Overflow
NoAction
OK
¬ StreamRead
Overflow
OK
class SimZip
¬ FindJaggedContent ¬ {UnpackFileComponent(Stream,FileContent,out int[])} ¬ {}
¬ UnpackFileComponent(string,string,FileContent,out int[],StreamProcessor.Transform,int[])
¬ UnpackFileComponent(string,FileContent,out int[])
¬ PackFileComponent
¬ TransformIntegerDataJg
¬ TransformSignalBt3Jg
¬ TransformSignalBt012Jg
OK
¬ Compress
NoAction
OK
¬ Decompress
¬ ReadVersionHeader
¬ ReadDataSizeHeader
¬ InvTransformIntegerDataJg
¬ InvTransformSignalBt3Jg
¬ InvTransformSignalBt012Jg
BadFormat
OK
¬ TransformIntegerData
¬ TransformSignalBt3
¬ TransformSignalBt012
¬ (transformRectangular within TransformJaggedFileComponent)
¬ InvTransformIntegerData
¬ InvTransformSignalBt3
¬ InvTransformSignalBt012
¬ (invTransformRectangular within InvTransformJaggedFileComponent)
Mismatch
OK
¬ (transform within UnpackFileComponent)
Mismatch
Underflow
Overflow
OK
¬ (transform within PackFileComponent)
BadFormat
Mismatch
Underflow
Overflow
OK
¬ TransformJaggedFileComponent
Mismatch
Underflow
NoAction
OK
¬ InvTransformJaggedFileComponent
Underflow
Overflow
OK
class DataTools
¬ ReadSignal
Underflow
Overflow
OK
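For reference, the result values named in the tree above can be collected into one enumeration. This is a sketch under assumptions: the member names come from these notes, but their order, their numeric values, and the behavior of GetResultTag (returning the member name) are guesses.

```csharp
public static class Io
{
    // Members are the values named in these notes; their order and
    // numeric values here are assumptions.
    public enum Result
    {
        OK,
        NoAction,
        BadFormat,
        BadIndex,
        Mismatch,
        Underflow,
        Overflow
    }

    // Assumed behavior: the tag is simply the enum member's name.
    public static string GetResultTag(Result r)
    {
        return r.ToString();
    }
}
```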
Io.Result vs. Exceptions
Any method that opens a Stream must try-catch all code between the opening and closing of that Stream, and must close the Stream in a finally block.
In general, no specific standard objects derived from Exception are caught. Some methods of SimFile and SimZip catch Exception objects and re-throw them as SimFileException or SimZipException (see below).
Neither Io.Result values nor Exceptions have been built into classes AnalyzeData.Model and SimFile.DataTools. These are prototype and test-code classes.
Methods of classes within the Utility namespace return Io.Result values and do not explicitly generate native Exceptions.
Classes SimFile and SimZip:
Methods that originate Io.Result values just return those values and do not generate native Exceptions.
Methods that receive an Io.Result value from a callee without interpreting it return that value and do not generate an Exception.
Methods that receive and interpret Io.Result values from callees must generate a SimFileException or SimZipException, accordingly.
Each Main method must try-catch all code that invokes methods of any other class.
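The rules above can be illustrated with one small method that opens a Stream, interprets a callee's result, and always closes the Stream. This is a sketch, not the project's code: the Result enum is a two-value stand-in for Io.Result, ReadHeader stands in for a lower-level method, and FileReadHeader is a hypothetical name.

```csharp
using System;
using System.IO;

public enum Result { OK, Underflow }   // Stand-in for Io.Result.

public class SimFileException : Exception
{
    public SimFileException(string message) : base(message) { }
    public SimFileException(string message, Exception inner) : base(message, inner) { }
}

public static class FileReadSketch
{
    // Stand-in for a lower-level method that originates a Result value
    // and does not throw on a short read.
    static Result ReadHeader(Stream s, byte[] header)
    {
        int total = 0;
        while (total < header.Length)
        {
            int n = s.Read(header, total, header.Length - total);
            if (n == 0) return Result.Underflow;
            total += n;
        }
        return Result.OK;
    }

    // Opens a Stream, so everything up to the close sits in try/finally.
    // It interprets the callee's Result, so it throws SimFileException.
    public static byte[] FileReadHeader(string fileSpec, int nBytes)
    {
        byte[] header = new byte[nBytes];
        Stream inStream = File.OpenRead(fileSpec);
        try
        {
            Result rtn = ReadHeader(inStream, header);
            if (rtn != Result.OK)
                throw new SimFileException("FileReadHeader: '" + rtn + "' signal from ReadHeader");
        }
        catch (SimFileException) { throw; }   // Already wrapped; pass through.
        catch (Exception ex)
        {
            throw new SimFileException("in FileReadHeader(string,int)", ex);
        }
        finally
        {
            inStream.Close();
        }
        return header;
    }
}
```

A Main method would then wrap the call to FileReadHeader in its own try-catch, as the last rule requires.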
Known issues
· The size of each file addressed by this code is limited to the number of bytes expressible by a signed 32-bit integer (2 GB).
· Not yet a full-service file modification facility. Example: after modifying the "fileDesc" field of a sweep, nDescChars must also be modified by hand. (Should use a Property that automatically updates the length of the string.)
· Version header info is not Unicode-compliant.
· Minimal data validation is performed during StreamWrite and StreamRead. Some integer data is validated because such integers may be used as indices. If a negative integer is encountered in one of certain locations where only a non-negative integer is functional, then an Io.Result.BadFormat error is returned.
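The Property suggested for fileDesc could be sketched as follows. The class name SweepHeaderSketch is hypothetical; only the field names fileDesc and nDescChars come from these notes, and the real sweep header holds more fields than shown.

```csharp
public class SweepHeaderSketch
{
    private string fileDesc = "";

    public string FileDesc
    {
        get { return fileDesc; }
        set { fileDesc = value ?? ""; }   // Treat null as an empty description.
    }

    // Derived from FileDesc, so it cannot fall out of step with the string.
    public int NDescChars
    {
        get { return fileDesc.Length; }
    }
}
```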
Matlab: m (method 0)
1. Collect rows odd/even (while floating) -- check again to see if this really helps
2. msb <- convert to bytes
3. msbx <- transpose
4. msbxg3 <- group row 3 (this means pulling out byte 3)
5. msbxg3.12 <- group rows 1,2
5a. msbxg3.12s <- collect odd/even cor looks like jpg! (but not much)
6. msbxg3.345 <- group rows 3,4,5
7. msbxg4 <- group row 4 (this means pulling out byte 4)
8. msbxg12 <- group rows 1, 2 (no correlation)
8a. msbxg12_1 <- first 129 rows
8b. msbxg12_1f <- convert to float
8a. msbxg12_1fi <- invert
8c. msbxg12_1fi_byt <- convert to bytes & check correlation
----
8a. msbxg12.4 <- truncate rightmost col(s) to leave a multiple of 4 (1 col truncated in this case)
8b. msbxg12t <- the truncated col(s) (4020 x 1 in this case)
8c. msbxg12.4.1 <- the first (4n x n) matrix from top of source (512 x 128 in this case)
8d. msbg12.4.1 <- transpose (n x 4n) (128 x 512 in this case)
8d. Convert 8, 8d to binary.
8e. Convert 8d to float. (Now have n x n square matrix) (128 x 128 in this case).
8f. msbg12.4.1i <- invert matrix, check correlation
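Steps 2 through 4 above (convert to bytes, transpose, pull out one byte row) amount to splitting the samples into byte planes. The C# sketch below assumes 32-bit float samples, which matches the four byte offsets B0 to B3 used elsewhere in these notes; the class name BytePlanes is hypothetical, and byte order follows BitConverter on the running machine.

```csharp
using System;

public static class BytePlanes
{
    // Splits 32-bit floats into 4 rows: row k holds byte k of every sample.
    public static byte[][] Split(float[] samples)
    {
        byte[][] planes = new byte[4][];
        for (int k = 0; k < 4; k++) planes[k] = new byte[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            byte[] b = BitConverter.GetBytes(samples[i]);
            for (int k = 0; k < 4; k++) planes[k][i] = b[k];
        }
        return planes;
    }

    // Inverse of Split: reassembles the samples from the 4 byte rows.
    public static float[] Join(byte[][] planes)
    {
        int n = planes[0].Length;
        float[] samples = new float[n];
        byte[] b = new byte[4];
        for (int i = 0; i < n; i++)
        {
            for (int k = 0; k < 4; k++) b[k] = planes[k][i];
            samples[i] = BitConverter.ToSingle(b, 0);
        }
        return samples;
    }
}
```

Grouping by byte offset puts similar bytes (for example, the sign and exponent byte of neighboring samples) next to each other, which is the property the grouping steps exploit before compression.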
Matlab: o (method 1)
1. o.byt <- convert to bytes
2. obt <- row 129 (truncate to modulo 4)
3. ob4 <- truncated to modulo 4 (128 in this case)
... can get (7) 128 x 512 blocks, plus nearly another.
4. ob4.1 <- first block of 128 x 512 bytes. Check correlation, should be zero.
4a. o4.1 <- convert to float, 128 x 128
4b. o4.1i <- invert matrix, check precision & correlation
Method 1 - Compress
Use C# for conversion to polar --> m.raw, o.raw, m.byt.csv
##o.raw --> ConvertFileRaw --> o.byt.csv
m.byt.csv --> shuffle.m --> msb.byt.csv
msb.byt.csv --> make_msb_file.m --> msb.finalxx.byt.csv
msb.finalxx.byt.csv --> ConvertFileRaw --> msb.finalxx.raw
m.raw --> shuffle.m --> msb.raw
lzma e msb.finalxx.raw msb.finalxx.lzma
lzma e o.raw o.lzma
Method 1 - Decompress
lzma d o.lzma o.bck.raw
lzma d msb.finalxx.lzma msb.finalxx.bck.raw
msb.finalxx.bck.raw --> ConvertFileRaw --> msb.finalxx.bck.byt.csv
msb.finalxx.bck.byt.csv --> unmake_msb_file --> msb.bck.byt.csv
msb.bck.byt.csv --> unshuffle --> m.bck.byt.csv
m.bck.byt.csv --> ConvertFileRaw --> m.bck.raw
fc o.raw o.bck.raw > fco.txt
fc m.raw m.bck.raw > fcm.txt
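The two fc commands check that the round trip reproduces the original files. The same check could be done in code with a byte-for-byte comparison, sketched here; FileCompareSketch and FilesEqual are hypothetical names.

```csharp
using System.IO;

public static class FileCompareSketch
{
    // Returns true when both files have the same length and contents.
    public static bool FilesEqual(string pathA, string pathB)
    {
        using (FileStream fa = File.OpenRead(pathA))
        using (FileStream fb = File.OpenRead(pathB))
        {
            if (fa.Length != fb.Length) return false;
            int x;
            while ((x = fa.ReadByte()) != -1)
            {
                if (x != fb.ReadByte()) return false;
            }
            return true;
        }
    }
}
```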
Rectangular - Compress
Start with original.sim
LoadByteBlock( SweepContent.FileHdr )
fileHdr.raw <-- WriteByteBlock( outStream, FileFormat.raw )
SwpHdrs.Compress --> swphdr.final.lzma
Short.Compress --> short.final.lzma
LoadByteBlock( SweepContent.ComplexData )
ByteOffset[] bar = { ByteOffset.B0, ByteOffset.B1, ByteOffset.B2 };
abcde012.raw <-- WriteByteBlockRaw( outStream, FileFormat.Raw , SignalId.All, ComponentId.All, bar );
ByteOffset[] bar = { ByteOffset.B3 };
abcde3.raw <-- WriteByteBlockRaw( outStream, FileFormat.CsvByte , SignalId.All, ComponentId.All, bar );
abcde3.byt.csv --> make_abcde3_nis --> abcde3.nis.byt.csv
abcde3.nis.byt.csv --> ConvertFileRaw --> abcde3.nis.raw
lzma e abcde3.nis.raw abcde3.nis.lzma
lzma e fileHdr.raw fileHdr.lzma
Concatenate fileHdr.lzma, swphdr.final.lzma, short.final.lzma, abcde3.nis.lzma, abcde012.raw --> original.cim
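The final concatenation step can be sketched as below. This is not Utility.FileProcessor.Catenate itself: the class and method here are hypothetical, the actual layout of the .cim file is not described in these notes, and returning the part sizes is an assumption about what a later split would need. Stream.CopyTo requires .NET Framework 4 or later.

```csharp
using System.IO;

public static class CatenateSketch
{
    // Appends each part to outFile in order and returns each part's length,
    // which a reader would need in order to split the parts again.
    public static long[] Catenate(string outFile, params string[] parts)
    {
        long[] sizes = new long[parts.Length];
        using (FileStream dst = File.Create(outFile))
        {
            for (int i = 0; i < parts.Length; i++)
            {
                using (FileStream src = File.OpenRead(parts[i]))
                {
                    sizes[i] = src.Length;
                    src.CopyTo(dst);
                }
            }
        }
        return sizes;
    }
}
```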
Rectangular - Decompress
##lzma e abcde012.raw abcde012.lzma
Short.Decompress
SwpHdrs.Decompress
lzma d abcde3.nis.lzma abcde3.nis.bck.raw
abcde3.nis.bck.raw --> ConvertFileRaw --> abcde3.nis.bck.byt.csv
abcde3.nis.bck.byt.csv --> unmake_abcde3_nis --> abcde3.bck.byt.csv
(abcde3.bck.byt.csv --> ConvertFileRaw --> abcde3.bck.raw)
Scraps
string tag = Io.GetResultTag( rtn );
string msg = "Sweep.StreamRead(Stream,FileContent,int,int): '"+tag+"' signal from Sweep.StreamReadHdr";
throw new SimFileException( msg );
-------------------
for ( int n = 0, m = 0; n < nBytesSweepHdr; n += m ) {
m = inStream.Read( bufSweepHdr, n, nBytesSweepHdr - n );
if ( m == 0 ) break; // End of stream before the header was complete (underflow).
}
> SimFile.exe!Compress.SimFile.StreamReadHdr(System.IO.Stream inStream = {System.IO.FileStream}) Line 1307 C#
SimFile.exe!Compress.SimFile.FileRead(string fileSpec = "..\\..\\data_image.sim") Line 1380 + 0x9 bytes C#
SimFile.exe!Compress.SimFile.Compress(string inFile = "..\\..\\data_image.sim", string outFile = "..\\..\\data_image.cim", bool szChkFlg = false) Line 479 + 0xb bytes C#
SimFile.exe!Compress.SimFile.Test.Main(string[] args = {Dimensions:[0]}) Line 2261 + 0x17 bytes C#
switch ( rtn ) {
case Io.Result.BadFormat :
case Io.Result.BadIndex :
case Io.Result.Mismatch :
case Io.Result.Underflow :
case Io.Result.Overflow :
case Io.Result.NoAction :
foreach ( string file in cmFiles ) File.Delete( file );
return rtn;
case Io.Result.OK :
default :
}
switch ( rtn ) {
case Io.Result.BadFormat :
case Io.Result.BadIndex :
case Io.Result.Underflow :
case Io.Result.Overflow : return rtn;
case Io.Result.OK :
default :
}
switch ( rtn ) {
case Io.Result.BadFormat :
case Io.Result.Mismatch :
case Io.Result.Underflow :
case Io.Result.Overflow :
case Io.Result.NoAction : return rtn;
case Io.Result.OK :
default :
}
switch ( rtn0 ) {
case Io.Result.Underflow :
case Io.Result.Overflow : return false;
case Io.Result.OK :
default :
switch ( rtn1 ) {
case Io.Result.Underflow :
case Io.Result.Overflow : return false;
case Io.Result.OK :
default : return true;
}
}
switch ( rtn ) {
case Io.Result.BadFormat :
case Io.Result.NoAction : return rtn;
case Io.Result.OK :
default :
if ( szAry.Length == 3 ) {
//if ( szAry[1] != szAry[2] ) rtn = true; // Jagged array detected.
if ( szAry[1] != szAry[2] ) isJag = true; // Jagged array detected.
}
}
fc == FileContent.SwpHdr
fc == FileContent.IntegerData
case Io.Result.Overflow :
if ( i + 1 < mySweeps.Count ) {
rtn = Io.Result.OK;
break;
} else {
return rtn; // Supposed to be finished.
}
} catch ( Exception ) {
inStream.Close();
throw; // Re-throw without resetting the stack trace.
}
------------------------------
// Declared before the try block so that it is in scope in the catch blocks.
string emsg = "Call to SimFile.StreamWrite(Stream, FileContent) from SimFile.FileWrite(string)";
try {
rtn = StreamWrite( outStream, FileContent.All );
} catch ( NullReferenceException ex ) {
throw new SimFileException( emsg, ex );
} catch ( Exception ex ) {
throw new SimFileException( emsg, ex );
} finally {
outStream.Close();
}
} catch ( Exception ex ) {
string emsg = "in SimZip.Compress(string,string,bool)";
throw new SimZipException( emsg, ex );
} finally {
fs.Close();
}
} catch ( Exception ex ) {
string emsg = "in SimZip.PackFileComponent(string,string,ZipContent,Pair<int,int[]>,StreamProcessor.Transform,int[])";
throw new SimZipException(emsg, ex);
} finally {
s3Stream.Close();
}
} catch ( Exception ex ) {
string emsg = "in SimZip.InvTransformSignalBt3(Stream,Stream,int[])";
throw new SimZipException( emsg, ex );
} finally {
foreach ( Stream s in sAry ) s.Close();
}
-----------------
rtn = PackFileComponent(
cmFile0, cmFile1, ZipContent.FileHdr, unk, StreamProcessor.TransformCopyStream, bufSize
);
------------
//aryCfg.Second[0] = NbrSweeps * nComponentsPerSample;
szAry[1] = (int) ( outStream.Length / (NbrSweeps * nComponentsPerSample) );
mySzAry[0] = szAry[1] * nBytesInt16PerSample;
mySzAry[1] = szAry[0];
Short - Compress
· LoadByteBlock( SweepContent.IntegerData );
· WriteByteBlock( outStream, FileFormat.CsvByte ) --> short.byt.csv
· short.byt.csv --> make_short_file --> short.final.byt.csv
· short.final.byt.csv --> ConvertFileRaw --> short.final.raw
· lzma e short.final.raw short.final.lzma
Sweep Headers - Compress
· LoadByteBlock( SweepContent.Header );
· WriteByteBlock( outStream, FileFormat.CsvByte ) --> swphdr.byt.csv
· swphdr.byt.csv --> make_swphdr_file --> swphdr.final.byt.csv
· swphdr.final.byt.csv --> ConvertFileRaw --> swphdr.final.raw
· lzma e swphdr.final.raw swphdr.final.lzma
Tasks
legend: Ö = completed; ÿ = in progress; · = priority; ° = later
--------------------------------------------------------------
Ö Apply version header with flags to compressed file.
Ö Compare sizes after decompression.
Ö StreamRead methods must check nRead and loop - be sure you fix this.
Ö Use Io.Result rather than returning integers
Ö Everywhere using stream.Length, be sure you *expect* a long (don't cast) - may require code revision - or just put it on a list of known issues.
Ö Remove 2nd parameter from ReadZipComponent (or document why it's left in).
Ö Accommodate uneven Sweep Headers.
Ö Test with uneven Sweep Headers.
Ö Test with smaller .sim files.
Ö Review previous results and try command-line switches.
Ö Update use of Io.Result return values for Stream methods.
Ö Test with bad-format files - (1) Compress finds a non-Sim file, (2) Decompress finds a non-SimZip file
Ö Strip out "brute-parallel" pre-compression code.
Ö Reorganize code base, segregate Compress/Decompress code.
Ö private data + Properties
Ö exception handling
Ö Build .exe file for Compress/Decompress.
Ö Validate command-line arguments
Ö Re-test smaller files (~15 minutes).
Ö accept jagged signal data
Ö Update SimZip API.
Ÿ Update DataTools API
· Update Model API.
· Update Matlab API.
ð Update project paper.
· Submit blue form to Jen Aaseth to request room & time for presentation.
· Prepare poster board(s) for Grad Showcase presentation.
· Fix the project and .exe to be named "SimZip" (~15 minutes).
· Update slides for ES Dept. presentation.
ÿ Update SimFile API (including comments for GetByteBlockScale, etc.)
Ÿ Update Utility API
· Apply LGPL (GNU Lesser General Public License) to source code.
° configure data file directory path
° debug log
° Test size-check feature in Compress. From Main, catch and handle "NoAction" exception as failure of size-check.
° Remove my neophyte ByteArrayConverter class. Replace with standard C# facilities.