I've been beating my head over this one and would appreciate any help.
What is happening: I'm making successive calls to Istream.Read to parse an Excel file, and everything is running perfect up to one point. Once that point is reached, the next call to Read reads in stuff from who knows where!
I've edited just about everything out except what is needed for debugging, so this will be easy to see if you take the time to load it.
1) Save the bas and xls files to the same directory.
2) Load the bas file and set a breakpoint on line 66. (ghettoDebugAssert = ghettoDebugAssert)
3) Run the program in debug mode until the first break at position &H8B0. This is the last correct read incase you want to step through it. It will read a number format into an array.
4) The next break is at position &H8CE which is directly after the number format in 3. I've placed a zero seek just beforehand so you can verify that qNewPos is reading the correct position in the file.
The very next read into the Identifier type should read in 2 words which are 00E0 and 0014 once you take into account endians, but it reads in 041E and 0018 instead. I've attached the binary Workbook file if you want to fire it up in a hex editor to follow along. I've searched the entire file for this 1E041800 sequence but it does not exist. Can you account for the sudden garbage read? FYI I have manually checked the cbRead and hr result on every read call and they all return expected values so it can't be that.
I guess my next step will be to load this up in olly to see what is going on. Any help you can provide would be a lifesaver.
I havn't looked at your program or data David, and I don't know if this comment will help, but about two weeks ago I was working with IStorage and IStream interfaces a good bit, and I discovered somewhat to my dismay that while the analogy to directories and files is pretty close, its not exact. For example, there is no counterpart that I know of to the Win Api GetFileSize() function. When an IStream is created its allocated in a 'block' type manner, say for example 2048 bytes. If you Write, say, 256 bytes to it, there is no way to recall that number. If you read past the 256 bytes you wrote into the 2048 byte buffer, you'll get S_OK return values, even though you've read beyond what you wrote. So I was wondering if your bizarre read was reading some control code of COM marking the end of a IStream/IStorage block?
In terms of the problem I was having with not being able to obtain the count of bytes written into an IStream, I decided to open a seperate IStream within my IStorage to record the count of bytes written into other IStreams within the IStorage. That worked out quite well.
To be perfectly honest, I didn't know there was any published info on the internal structure of *.xls files. I know there was at one time, but I thought it was all to be done now through Excel interfaces.
Quote
For example, there is no counterpart that I know of to the Win Api GetFileSize() function. When an IStream is created its allocated in a 'block' type manner, say for example 2048 bytes. If you Write, say, 256 bytes to it, there is no way to recall that number.
You can use the Stat method of the IStream interface, that fills a STATSTG structure with diverse information. One of the members of this structure, cbSize, is a QUAD that specifies the size, in bytes, of the stream or byte array.
That was my first thought too Jose, but it doesn't seem to work. The number that gets returned is the block size of the IStream object, whose rough counterpart would be a sector on a file system I guess. In fact, I asked the question over on the Code Project Forum, and the reply there I got was that it doesn't seem to be possible to determine the size written into a IStream after it is closed. The individual there told me he checked one of his company's applications that uses IStreams, and they were storing the length written into the first four bytes. So that's how I came up with my idea of just creating a seperate IStream within the IStorage to persist metadata, as it were, relating to other IStreams within the IStorage. By the way, I downloaded and examined your IStorage/IStream examples and was really happy to see you had implemented those Structured Storage Interfaces in your Includes. I had been working on this in C++ but when I did a PowerBASIC translation it was exactly the same. I know everyone is always thanking you for doing all those wonderful translations you do, and sometimes I don't because so many other folks are always thanking you, but I too appreciate them.
If you want I can easily post an example demonstrating the problem I described above. Maybe there is a way to do it but I sank about a day into it and finally did what I did and it worked out OK.
That sounds strange. There is an API function available in Shlwapi.inc, but is a wrapper that calls the Stat method and returns the cbSize value:
DECLARE FUNCTION IStream_Size IMPORT "SHLWAPI.DLL" ALIAS "IStream_Size" ( _
BYVAL pstm AS IStream _ ' __in IStream *pstm
, BYREF pui AS QUAD _ ' __out ULARGE_INTEGER *pui
) AS LONG ' HRESULT
If it also fails, then a possible workaround can be to call the Read method of the IStream interface with a buffer greater than the possible length of the stream (with a maximum of 2147483647& characters because PB quads are signed) and then the size will be the number of characters read returned in the pcbRead parameter.
I have a wrapper function in Ole2Utils.inc that has worked in my tests:
' ========================================================================================
' Reads numchars at the current Seek position. If numchars = 0 or -1, it returns all the
' content of the stream.
' ========================================================================================
FUNCTION IStream_ReadText (BYVAL pIStream AS IStream, BYVAL numchars AS LONG) AS STRING
LOCAL hr AS LONG
LOCAL plibNewPosition AS QUAD
LOCAL pcbRead AS DWORD
LOCAL tstatstg AS STATSTG
LOCAL strText AS STRING
IF ISNOTHING(pIStream) THEN EXIT FUNCTION
IF numchars < -1 THEN EXIT FUNCTION
IF numchars = 0 THEN numchars = -1
IF numchars = -1 THEN
hr = pIStream.Seek(0, %STREAM_SEEK_SET, plibNewPosition)
IF hr <> %S_OK THEN EXIT FUNCTION
hr = pIStream.Stat(tstatstg, %STATFLAG_NONAME)
IF hr <> %S_OK THEN EXIT FUNCTION
IF tstatstg.cbsize < 1 THEN EXIT FUNCTION
IF tstatstg.cbsize <= 2147483647& THEN
numchars = tstatstg.cbsize
ELSE
numchars = 2147483647&
END IF
END IF
strText = SPACE$(numchars)
hr = pIStream.Read(STRPTR(strText), LEN(strText), pcbRead)
IF hr <> %S_OK THEN EXIT FUNCTION
IF pcbRead < numchars THEN strText = LEFT$(strText, pcbRead)
FUNCTION = strText
END FUNCTION
' ========================================================================================
Hi Jose!
I just dug up the code I was working with a couple weeks ago that pretty concisely demonstrates the problem. I've just glanced through your post above quiuckly and am going to examine it in more detail after I post this. Itsounds like you have some ideas there that might as you say be a work around. Anyway, this code is unfortunately C++, but I'm sure you'll understand it. I call the IStream::Stat method to get the cbSize data which is quite awkward in C/C++ due to the LARGE_INTEGER struct (PB makes this much easier!!!!). The particular IStream I previously wrote 16 bytes into, i.e., four four byter integers. The Index struct just has one four byte member. I arbitrarily ran the loop up to 8 and you can see the output. Disheartingly, pcbRead comes out with '4' for all reads - even beyond the end of the IStream (only first four are ledgitimate). S_OK was returned for all reads (even beyond end). And of course cbSize returned 4096. When I saw this all I could say/think was WOW! THIS IS A PROBLEM!
//CmpStr10 -- compiled as C++
#include <objbase.h>
#include <stdio.h>
#include <string.h>
typedef struct tagINDEX
{
DWORD dwPlotRecord; // 4
}INDEX; //==
// 4
int main(void)
{
wchar_t szFile[]=L"C:\\Code\\VStudio\\VC++6\\Projects\\COM\\CompStor\\CmpStr07\\Release\\Data.dat";
IStream* pIdxStream=NULL;
IStorage* pStorage=NULL;
char szBuffer[128];
ULONG pcbRead=0;
DWORD grfMode;
STATSTG sts;
unsigned i;
HRESULT hr;
INDEX idx;
wprintf(L"szFile = %s\n",szFile);
grfMode=STGM_DIRECT | STGM_READ | STGM_SHARE_EXCLUSIVE;
hr=StgOpenStorage(szFile,NULL,grfMode,NULL,0,&pStorage);
if(SUCCEEDED(hr))
{
printf("StgOpenStorage() Succeeded!\n");
hr=pStorage->OpenStream(L"Index", NULL, STGM_READ | STGM_SHARE_EXCLUSIVE, 0, &pIdxStream);
if(SUCCEEDED(hr))
{
printf("pStorage->OpenStream(Index) Succeeded!\n");
hr=pIdxStream->Stat(&sts,1);
if(SUCCEEDED(hr))
{
printf("pIdxStream->Stat(&sts,1) Succeeded!\n");
printf("sts.cbSize.LowPart = %u\n",sts.cbSize.LowPart);
printf("sts.cbSize.HighPart = %u\n",sts.cbSize.HighPart);
}
printf("\ni\tpcbRead\tidx.dwPlotRecord\tHRESULT\n");
printf("===============================================\n");
for(i=1; i<=8; i++)
{
hr=pIdxStream->Read(&idx,sizeof(INDEX),&pcbRead);
switch(hr)
{
case S_OK:
strcpy(szBuffer,"S_OK");
break;
case S_FALSE:
strcpy(szBuffer,"S_FALSE");
break;
default:
strcpy(szBuffer,"Don't Know What hr Means!");
break;
}
printf("%u\t%u\t%u\t\t\t%s\n",i,pcbRead,idx.dwPlotRecord,szBuffer);
}
pIdxStream->Release();
}
pStorage->Release();
}
getchar();
return 0;
}
/*
szFile = C:\Code\VStudio\VC++6\Projects\COM\CompStor\CmpStr07\Release\Data.dat
StgOpenStorage() Succeeded!
pStorage->OpenStream(Index) Succeeded!
pIdxStream->Stat(&sts,1) Succeeded!
sts.cbSize.LowPart = 4096
sts.cbSize.HighPart = 0
i pcbRead idx.dwPlotRecord HRESULT
===============================================
1 4 1 S_OK
2 4 3 S_OK
3 4 4 S_OK
4 4 5 S_OK
5 4 0 S_OK
6 4 0 S_OK
7 4 0 S_OK
8 4 0 S_OK
*/
Here's the link at www.codeproject.com where I asked the question...
http://www.codeproject.com/Messages/3655596/Find-Count-Of-Bytes-Written-To-An-IStream-modified.aspx
its...
"Find Count Of Bytes Written To An IStream"
Quote from: Frederick J. Harris on November 17, 2010, 04:20:22 PM
To be perfectly honest, I didn't know there was any published info on the internal structure of *.xls files. I know there was at one time, but I thought it was all to be done now through Excel interfaces.
They have been published for a while now. I've been slowly working at them off and on. You can check out two posts I made on pb forums here:
Part 1 - http://www.powerbasic.com/support/pbforums/showthread.php?t=43427
Part 2 - http://www.powerbasic.com/support/pbforums/showthread.php?t=43986
You can find links to the documentation in part 1. Part 2 has the full code from the OP. The thing is that the Test.xls both here and in my example in the pbforums are almost identical. The one in the pb forums works perfectly and dumps all cell data to a text file. All I did to cause this error was change the format of the date and then all of a sudden Istream doesn't work any more. I am quite confident that my program above has not reached the end of the stream. I extracted the binary Workbook dump using 7-Zip and it confirms my belief that I am not trying to read past the end of the stream. There is a call to IStream_GetSize which returns &H4253 which is greater than &H8CE where the error occurs.
I had originally thought I had a misunderstanding from the BIFF docs about the layout of the xls file when I started getting crashes once I changed the date layout. Now that I've finally taken the time to track this error down I don't believe that is the case. The crash occurred because due to the garbage read it was causing array out of bounds errors.
I did try loading the program in olly to debug it, but it looks like PB obfuscates the asm output because I've never seen anything like that. That avenue is too much time for what it's worth.
My next thought would be to either see if I get the same behavior in C++ or abandon Istream entirely and just read the data directly.
Edit: I can and do use Excel interfaces, but I deal with Excel files in such a volume that Excel causes too much slowdown. I've already integrated xlsx parsing using XML parsers and ECMA 376. I am very close now to getting what I want from xls files. Once that is done it will be a dream come true.
Quote
All I did to cause this error was change the format of the date and then all of a sudden Istream doesn't work any more.
I know this isn't funny to you David, but to me it is! We use Excel *.xls files in a major way in my Timber Sale System here where I work. Whenever I get a call that something crashed or isn't working right the first thing I do is check my log file and check any date info (there are only a few in quite a lot of data). Almost invariably that's where the problem is. Excel and dates will absolutely brutalize you!
Here's basically that same C++ program converted to PowerBASIC Console Compiler 5. I've attached the doc compound file it reads, which contains mostly binary data. Its in Data.zip and named Data.dat. Only the 1st four reads are valid data - everything else is garbage. The output follows the program, but anyone should be able to reproduce it him/herself from the attached file.
#Compile Exe
#Dim All
#Include "ObjIdl.inc"
Function PBMain() As Long
Local hr,iValue,pcbRead As Long
Local pIdxStream As IStream
Local pStorage As IStorage
Local strFile As String
Local stats AS STATSTG
Register i As Long
strFile = UCode$("Data.dat")
hr=StgOpenStorage(Strptr(strFile), Nothing, %STGM_DIRECT Or %STGM_READ Or %STGM_SHARE_EXCLUSIVE, %NULL, 0, pStorage)
If SUCCEEDED(hr) Then
Print "StgOpenStorage() Succeeded!"
strFile = UCode$("Index")
hr=pStorage.OpenStream(Byval Strptr(strFile), %NULL, %STGM_READ Or %STGM_SHARE_EXCLUSIVE, %NULL, pIdxStream)
If SUCCEEDED(hr) Then
Print "pStorage.OpenStream(Index) Succeeded!"
hr=pIdxStream.Stat(stats, %STATFLAG_NONAME)
If SUCCEEDED(hr) Then
Print "pIdxStream.Stat(stats,1) Succeeded!"
Print "stats.cbSize = " stats.cbSize
Print
Print " i iValue pcbRead HRESULT"
Print " =================================================="
For i=1 To 12
hr=pIdxStream.Read(Varptr(iValue),SizeOf(iValue),pcbRead)
If SUCCEEDED(hr) Then
Print i, iValue, pcbRead, "SUCCEEDED"
Else
Print i, iValue, pcbRead, "FAILED"
End If
Next i
End If
Set pIdxStream = Nothing
End If
Set pStorage = Nothing
End If
Waitkey$
PBMain=0
End Function
'StgOpenStorage() Succeeded!
'pStorage.OpenStream(Index) Succeeded!
'pIdxStream.Stat(stats,1) Succeeded!
'stats.cbSize = 4096
'
' i iValue pcbRead HRESULT
' ==================================================
' 1 1 4 SUCCEEDED
' 2 3 4 SUCCEEDED
' 3 4 4 SUCCEEDED
' 4 5 4 SUCCEEDED
' 5 0 4 SUCCEEDED
' 6 0 4 SUCCEEDED
' 7 0 4 SUCCEEDED
' 8 0 4 SUCCEEDED
' 9 0 4 SUCCEEDED
' 10 0 4 SUCCEEDED
' 11 0 4 SUCCEEDED
' 12 0 4 SUCCEEDED
This works:
#COMPILE EXE
#DIM ALL
#INCLUDE "ObjIdl.inc"
TYPE INDEX
dwPlotRecord AS DWORD
END TYPE
' ========================================================================================
' Main
' ========================================================================================
SUB TestCreateStorage
LOCAL hr AS LONG
LOCAL pStorage AS IStorage
LOCAL pStream AS IStream
LOCAL wszName AS STRING
wszName = UCODE$("Data.dat" & $NUL)
hr = StgCreateDocFile(STRPTR(wszName), _
%STGM_CREATE OR %STGM_DIRECT OR %STGM_SHARE_EXCLUSIVE OR %STGM_READWRITE, _
%NULL, pStorage)
IF FAILED(hr) THEN
? "StgCreateDocFile failure: " & HEX$(hr)
EXIT SUB
END IF
wszName = UCODE$("Index" & $NUL)
hr = pStorage.CreateStream(STRPTR(wszName), _
%STGM_DIRECT OR %STGM_CREATE OR %STGM_READWRITE OR %STGM_SHARE_EXCLUSIVE, _
0, 0, pStream)
IF FAILED(hr) THEN
? "IStorage.CreateStream failure"
EXIT SUB
END IF
LOCAL i AS LONG
LOCAL idx AS INDEX
LOCAL cbWritten AS DWORD
FOR i = 1 TO 8
idx.dwPlotRecord = i
hr = pStream.Write(VARPTR(idx), SIZEOF(INDEX), cbWritten)
? "hr = " hr, "cbWritten = " cbWritten
NEXT
? "Storage created"
pStream = NOTHING
pStorage = NOTHING
END SUB
' ========================================================================================
' ========================================================================================
SUB TestOpenStorage
LOCAL hr AS LONG
LOCAL pStorage AS IStorage
LOCAL pStream AS IStream
LOCAL wszName AS STRING
wszName = UCODE$("Data.dat" & $NUL)
hr = StgOpenStorageEx(STRPTR(wszName), %STGM_SHARE_EXCLUSIVE OR %STGM_READ, _
%STGFMT_STORAGE, 0, BYVAL %NULL, %NULL, $IID_IStorage, pStorage)
IF FAILED(hr) THEN
? "StgOpenStorageEx failure: " & HEX$(hr)
EXIT SUB
END IF
wszName = UCODE$("Index" & $NUL)
hr = pStorage.OpenStream(STRPTR(wszName), %NULL, _
%STGM_SIMPLE OR %STGM_READ OR %STGM_SHARE_EXCLUSIVE, _
%NULL, pStream)
IF FAILED(hr) THEN
? "IStorage,OpenStream failure: " & HEX$(hr)
EXIT SUB
END IF
LOCAL i AS LONG
LOCAL idx AS INDEX
LOCAL cbRead AS DWORD
FOR i = 1 TO 8
hr = pStream.Read(VARPTR(idx), SIZEOF(INDEX), cbRead)
? "hr = " hr, "cbRead = " cbRead, "Data = " idx.dwPlotRecord
NEXT
pStream = NOTHING
pStorage = NOTHING
END SUB
' ========================================================================================
' ========================================================================================
FUNCTION PBMAIN
' TestCreateStorage ' Unrem this to create the file
TestOpenStorage
WAITKEY$
END FUNCTION
' ========================================================================================
That's interesting work you are doing with the BIFF format David. I'll have to think on it for awhile to see if I can make use of it. At tis time I'm strictly using automation to read/write Excel data. It does really bug me though! I've written this horrendously fast PowerBASIC Api program to crunch our somewhat voluminous data, and the bottleneck is always Excel. What a slug!
I kind of have a love/hate relationship with Excel. I love it because lots of folks have used it to make all these forms we use, and it didn't take me too long to figure out that every time somebody makes a form to enter some kind of data, that's one data entry screen I don't have to create myself. The hate part comes in because its a big slow pig. So yea, what you are doing is interesting. The last word I had on it was that it wasn't documented anymore, and only COM was the 'official' access to it. But that comment is only something I picked up on some C++ discussion forum somewhere. I havn't researched it like you.
Quote from: Frederick J. Harris on November 17, 2010, 07:55:35 PM
I know this isn't funny to you David, but to me it is! We use Excel *.xls files in a major way in my Timber Sale System here where I work. Whenever I get a call that something crashed or isn't working right the first thing I do is check my log file and check any date info (there are only a few in quite a lot of data). Almost invariably that's where the problem is. Excel and dates will absolutely brutalize you!
Most of my data scrubbing code relates to Excel dates. It's because Excel only stores numbers or text. Dates are just a display effect. Today I helped someone debug an Excel macro that wasn't working because all of the dates "11/15/2010" were actually stored as text so comparisons were invalid. The only way to reliably convert it was to convert it manually by loading an array and using variable = CDbl(CDate(variable)) - 1462. 1462 only if the workbook uses 1904 dates. No combination of copy/paste and formatting would remove the text flag and make excel parse the date outside of VB.
Quote from: Frederick J. Harris on November 17, 2010, 08:21:26 PM
That's interesting work you are doing with the BIFF format David. I'll have to think on it for awhile to see if I can make use of it. At tis time I'm strictly using automation to read/write Excel data. It does really bug me though! I've written this horrendously fast PowerBASIC Api program to crunch our somewhat voluminous data, and the bottleneck is always Excel. What a slug!
Are you familiar with the Range.Value2 property? It's the fastest way to extract from and put into Excel files using automation.
And the Stat method returns a correct value:
#COMPILE EXE
#DIM ALL
#INCLUDE "ObjIdl.inc"
TYPE INDEX
dwPlotRecord AS DWORD
END TYPE
' ========================================================================================
' Main
' ========================================================================================
SUB TestCreateStorage
LOCAL hr AS LONG
LOCAL pStorage AS IStorage
LOCAL pStream AS IStream
LOCAL wszName AS STRING
wszName = UCODE$("Data.dat" & $NUL)
hr = StgCreateDocFile(STRPTR(wszName), _
%STGM_CREATE OR %STGM_DIRECT OR %STGM_SHARE_EXCLUSIVE OR %STGM_READWRITE, _
%NULL, pStorage)
IF FAILED(hr) THEN
? "StgCreateDocFile failure: " & HEX$(hr)
EXIT SUB
END IF
wszName = UCODE$("Index" & $NUL)
hr = pStorage.CreateStream(STRPTR(wszName), _
%STGM_DIRECT OR %STGM_CREATE OR %STGM_READWRITE OR %STGM_SHARE_EXCLUSIVE, _
0, 0, pStream)
IF FAILED(hr) THEN
? "IStorage.CreateStream failure"
EXIT SUB
END IF
LOCAL i AS LONG
LOCAL idx AS INDEX
LOCAL cbWritten AS DWORD
FOR i = 1 TO 8
idx.dwPlotRecord = i
hr = pStream.Write(VARPTR(idx), SIZEOF(INDEX), cbWritten)
? "hr = " hr, "cbWritten = " cbWritten
NEXT
? "Storage created"
pStream = NOTHING
pStorage = NOTHING
END SUB
' ========================================================================================
' ========================================================================================
SUB TestOpenStorage
LOCAL hr AS LONG
LOCAL pStorage AS IStorage
LOCAL pStream AS IStream
LOCAL wszName AS STRING
wszName = UCODE$("Data.dat" & $NUL)
hr = StgOpenStorageEx(STRPTR(wszName), %STGM_SHARE_EXCLUSIVE OR %STGM_READ, _
%STGFMT_STORAGE, 0, BYVAL %NULL, %NULL, $IID_IStorage, pStorage)
IF FAILED(hr) THEN
? "StgOpenStorageEx failure: " & HEX$(hr)
EXIT SUB
END IF
wszName = UCODE$("Index" & $NUL)
hr = pStorage.OpenStream(STRPTR(wszName), %NULL, _
%STGM_SIMPLE OR %STGM_READ OR %STGM_SHARE_EXCLUSIVE, _
%NULL, pStream)
IF FAILED(hr) THEN
? "IStorage,OpenStream failure: " & HEX$(hr)
EXIT SUB
END IF
LOCAL stats AS STATSTG
hr = pStream.Stat(stats, %STATFLAG_NONAME)
? "Stat - hr = " hr, "cbSize = " stats.cbSize
LOCAL i AS LONG
LOCAL idx AS INDEX
LOCAL cbRead AS DWORD
FOR i = 1 TO 8
hr = pStream.Read(VARPTR(idx), SIZEOF(INDEX), cbRead)
? "hr = " hr, "cbRead = " cbRead, "Data = " idx.dwPlotRecord
NEXT
pStream = NOTHING
pStorage = NOTHING
END SUB
' ========================================================================================
' ========================================================================================
FUNCTION PBMAIN
' TestCreateStorage ' Unrem this to create the file
TestOpenStorage
WAITKEY$
END FUNCTION
' ========================================================================================
Probably it is because you both are using the deprecated StgOpenStorage function instead of StgOpenStorageEx. If I try to use StgOpenStorage in my computer, it returns a %STG_E_OLDFORMAT error.
Quote from: José Roca on November 17, 2010, 08:52:32 PM
Probably it is because you both are using the deprecated StgOpenStorage function instead of StgOpenStorageEx. If I try to use StgOpenStorage in my computer, it returns a %STG_E_OLDFORMAT error.
Thanks for looking, Jose. I updated to StgOpenStorageEx but I still get the same behavior. Here is the updated section of code that uses StgOpenStorageEx.
wszName = UCODE$("Test.xls" & $NUL)
hr = StgOpenStorageEx(STRPTR(wszName), %STGM_SHARE_EXCLUSIVE OR %STGM_READ, _
%STGFMT_STORAGE, 0, BYVAL %NULL, %NULL, $IID_IStorage, pStorage)
IF FAILED(hr) THEN
MSGBOX "StgOpenStorage failure: " & HEX$(hr)
EXIT FUNCTION
END IF
wszName = UCODE$("Workbook" & $NUL)
hr = pStorage.OpenStream(STRPTR(wszName), %NULL, %STGM_SIMPLE OR %STGM_READ OR %STGM_SHARE_EXCLUSIVE, %NULL, pStream)
IF FAILED(hr) THEN
wszName = UCODE$("Book" & $NUL)
hr = pStorage.OpenStream(STRPTR(wszName), %NULL, %STGM_SIMPLE OR %STGM_READ OR %STGM_SHARE_EXCLUSIVE, %NULL, pStream)
IF FAILED(hr) THEN
MSGBOX "IStorage.OpenStream failure: " & HEX$(hr)
EXIT FUNCTION
END IF
END IF
I haven't tested it yet but I'm just about sure I've found the source of the problem. First let me state I tried your code and it works Jose. However, only on your file created with your code. If I use your code to read my file, it still doesn't work. That of course suggested to me that the effect I'm seeing had something to do with the way I created the IStream. Here are the flags I used...
grfMode=STGM_SIMPLE | STGM_CREATE | STGM_READWRITE | STGM_SHARE_EXCLUSIVE;
hr=StgCreateDocfile(szFile,grfMode,0,&pStorage);
I believe the STGM_SIMPLE flag is the culpret.
STGM_SIMPLE STGM_SIMPLE is a mode that provides a much faster implementation of a compound file in a limited, but frequently used case. It is described in detail in the following Remarks section.
This mode is useful for applications that perform complete save operations. It has the following constraints:
There is no support for substorages. Access to streams follows a linear pattern. Once a stream is released, that stream cannot be opened for read/write
operations again. The IStorage::OpenStream method is not supported in this implementation. The storage and stream objects cannot be marshaled.
Each stream is at least 4096 bytes in length. If fewer than 4096 bytes are written into a stream by the time the stream is released, the stream will be extended to contain 4096 bytes. In this compound file implementation, only a subset of the methods of IStorage and IStream are available.
Specifically, in simple mode, supported IStorage methods are QueryInterface, AddRef, Release, CreateStream, Commit, and SetClass. In addition, SetElementTimes is supported with a NULL name, allowing applications to set times on a root storage in simple mode.
Supported IStream methods are QueryInterface, AddRef, Release, Seek, and SetSize. Also, Read and Write methods on ISequentialStream are supported.
I don't think the StgCreateDocfile() verses StgCreateDocfileEx() is the problem. I'll test it now but I'm about sure that's it.
Did you know you can use ODBC to read *.xls data David (doesn't even require Excel to be installed)?
Quote from: Frederick J. Harris on November 17, 2010, 10:03:31 PM
Did you know you can use ODBC to read *.xls data David (doesn't even require Excel to be installed)?
Yea, but it is inefficient for my needs. I run table detection, data validation, and some cleaning routines on the data I import. Importing ODBC into my structs and then back into another DB just is not any faster IMO. The problem becomes even worse when I have to work with large files. The memory has to be loaded from disk or network and then moved before I can even work with it, and I might not even need all of it. You just can't beat being able to read the files directly and BIFF is far superior to xlsx because you can grab one cell value out of a 1GB file with barely any parsing.
Quote
Are you familiar with the Range.Value2 property? It's the fastest way to extract from and put into Excel files using automation.
No, I'll have to check it out.
Say David, How hard is it using your BIFF file access techniques is it to create a blank *.xls file, the way it would be if you just opened Excel and got a blank Book1.xls and saved it?
I fooled around with this quite a bit using ODBC but wasn't successful. You can create a blank Microsoft Access database with ODBC (SQLConfigDataSource() I think), but all I could get out of ODBC in that regard (with Excel) was the creation of a blank directory of all things!
Quote from: Frederick J. Harris on November 17, 2010, 10:40:26 PM
No, I'll have to check it out.
Here is a VBA example. Keep in mind that the array will always have 2 dimensions and that it will always be "1 to ..." Keep this in mind because variable(x, y) does not always correspond to row, col. Try this on a very large range and you will be impressed.
Sub ArrayExample()
Dim ArrayVals() As Variant
Dim sht As Worksheet
Dim row As Long
Dim col As Long
Set sht = ActiveSheet
''Populating example
'Create an array, fill it, and put it in excel
'The column must be declared even if there is only one. It would be ArrayVals(1 to MAXROWS, 1 to MAXCOLS) in all cases.
ReDim ArrayVals(1 To 65536, 1 To 32) As Variant
For row = 1 To 65536
For col = 1 To 32
ArrayVals(row, col) = row * col
Next
Next
sht.Range("A1:AF65536").Value2 = ArrayVals
''Reading in example
'This reads in the used range into a 2 dimension array
ArrayVals = sht.UsedRange.Value2
For row = 1 To UBound(ArrayVals, 1)
For col = 1 To UBound(ArrayVals, 2)
If IsNumeric(ArrayVals(row, col)) Then
ArrayVals(row, col) = ArrayVals(row, col) * 2
End If
Next
Next
sht.UsedRange.Value2 = ArrayVals
End Sub
Quote from: Frederick J. Harris on November 17, 2010, 10:52:48 PM
Say David, How hard is it using your BIFF file access techniques is it to create a blank *.xls file, the way it would be if you just opened Excel and got a blank Book1.xls and saved it?
I fooled around with this quite a bit using ODBC but wasn't successful. You can create a blank Microsoft Access database with ODBC (SQLConfigDataSource() I think), but all I could get out of ODBC in that regard (with Excel) was the creation of a blank directory of all things!
I haven't got to that yet. So far I'm just reading files. I haven't seen a hello world example.
Thanks David. I'll check that out.
I'm like a bull in a china shop. I came into this thread with a half baked idea of what might be wrong with David's code, and as far as I know I'm about the only one who's benifited from all this. David's still got his original problem and Jose mostly fixed mine.