FreeBASIC CWstr

Started by Juergen Kuehlwein, April 09, 2018, 11:39:00 PM

José Roca

My suggestion was to allow the use of unicode string literals with FreeBasic. It won't work with PowerBasic because it only supports ansi source code files.

Theo Gottwald

As far as I know the Source-Code for Freebasic is available for download - not?
Also from what i saw on the WEB-Site the programmers run away from it already.
So if there are problems, why not make an own version?

Currently i am still fine with PB as it has the best and easiest Strings of all types.

José Roca

#32
Which problems? The problem is with PowerBasic.

> Also from what i saw on the WEB-Site the programmers run away from it already.

The same that run away from other sites. Low-level compilers are not for everybody.

> Currently i am still fine with PB as it has the best and easiest Strings of all types.

Free Basic strings are much faster than PB strings. The problem is that it does not support dynamic unicode strings natively, but my class adds support for them, and they are faster than the PB ones.

Anyway, the main reason some PB programmers use Free Basic is that it supports 64 bit. Now that I have mastered it, I don't miss PB at all.

With my framework, there is nothing that you can do with PB that you can't do with FB.

http://www.jose.it-berater.org/WinFBX/WinFBX.html

And Paul is adding a Visual Designer to his editor for FB.

Patrice Terrier

Quote
Now that I have mastered it, I don't miss PB at all.

He he, that's almost the same for all those who made the effort to go outside of their comfort zone.
Better late than never ;)
Patrice Terrier
GDImage (advanced graphic addon)
http://www.zapsolution.com

Juergen Kuehlwein

José,


one more question about CWSTR and CBSTR. You said CBSTRs are necessary for COM, but in general you recommend using CWSTR, because CWSTRs are faster. Do you always convert automatically from CWSTR to CBSTR whenever necessary in COM, or is it the user´s responsibility to pass the correct data type?

In other words: in an application implementing COM would i have to convert all the CWSTRs i´m using to CBSTRs before i can pass them to a COM method, or is this done automatically for me and i can pass just a CWSTR (which then gets converted to CBSTR, and CBSTR only exists to make this automatic conversion possible)? According to some tests i ran automatic conversion seems to take place. But i´m not absolutely sure, if i can rely on that under all circumstances, or if i found a code sample where by chance it works.   


Thanks


JK

José Roca

#35
There is automatic conversion between CWSTR and CBSTR. Therefore, you can pass a CWSTR to a custom procedure that expects a CBSTR, but not to a procedure that expects a BSTR (without the leading "C"), no matter whether it has been declared as BSTR, AFX_BSTR, WSTRING PTR or ANY PTR. For these kinds of procedures, usually COM methods, use a CBSTR, e.g. DIM cbs AS CBSTR, and pass cbs or cbs.sptr to IN parameters and cbs.vptr to OUT/INOUT parameters.

It is not possible to simulate a BSTR using our own allocated memory because BSTRs are managed by the COM library and must be allocated/freed with SysAllocString/SysFreeString. Any attempt to cheat will cause problems sooner or later.
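
To make the allocation rule concrete, here is a minimal sketch that allocates and frees a raw BSTR with the COM functions named above. It assumes a Windows target and that the stock FreeBASIC headers "windows.bi" and "win/ole2.bi" declare the SysAllocString family; the variable names are only illustrative, and the CAST is there merely to avoid depending on how the headers define OLECHAR.

```freebasic
#include once "windows.bi"
#include once "win/ole2.bi"   ' assumption: declares SysAllocString and friends

' Source text held in an ordinary fixed-length WSTRING (not a BSTR)
DIM wszText AS WSTRING * 16 = "Hello"

' Allocate the BSTR through the COM allocator, never with ALLOCATE or NEW
DIM bs AS BSTR = SysAllocString(CAST(ANY PTR, @wszText))
IF bs <> NULL THEN
   ' Both lengths are read from the prefix that COM stores before the characters
   PRINT "Characters: "; SysStringLen(bs)
   PRINT "Bytes:      "; SysStringByteLen(bs)
   ' Release it with the matching COM function
   SysFreeString(bs)
END IF
```

For the five-character text above this should report 5 characters and 10 bytes. A wrapper class such as CBSTR exists so that this allocate/free pairing does not have to be written by hand.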

Juergen Kuehlwein

Quote
usually COM methods


Ok - but where from do i know EXACTLY that a BSTR is expected? Each and every COM method or property?


Your code sample


...
dim t as CWSTR = "Hello World"

  #INCLUDE ONCE "Afx/AfxCom.inc"
  #INCLUDE ONCE "Afx/AfxSapi.bi"
  using Afx
  ' // The COM library must be initialized to call AfxNewCom
  CoInitialize NULL
  ' // Create an instance of the Afx_ISpVoice interface
  DIM pSpVoice AS Afx_ISpVoice PTR = AfxNewCom("SAPI.SpVoice")
  DIM pCComPtrSpVoice AS CComPtr = pSpVoice
  ' // Call the Speak method
  pSpVoice->Speak(t, 0, NULL)
  ' // Uninitialize the COM library
  CoUninitialize
  PRINT
  PRINT "Press any key..."
  SLEEP


works like a charm with a CWSTR!


Is there a FreeBASIC version of your typelib browser btw. and does it tell me when i MUST pass a CBSTR as you described?


So to be on the safe side in general and especially when dealing with COM i could go with CBSTR and implement CWSTR only when speed matters. Or are there problems as well and it is better to have CWSTR as a standard and implement CBSTR only where necessary ? But when is CBSTR REALLY necessary and is the speed gain with CWSTR really worth the conversions needed then.

As  far as i understand it up to now, i could take both of it as my standard Unicode dynamic string type, but which one should i take - the faster one or the more universal one? Currently i tend to use CBSTR as a standard and use CWSTR only, if speed is important.

Would automatic conversion cost much time, if a custom procedure had been written for CBSTR (in/out) and i passed a CWSTR instead? How could i define parameters (in CBSTR) and return type (out CBSTR) in order to minimize speed loss, if a CWSTR is passed and expected as return type instead of a CBSTR?


Tell me, if i´m wrecking your nerves, but as always, i want nothing more than the optimum ;-)


JK,

José Roca

> Ok - but where from do i know EXACTLY that a BSTR is expected? Each and every COM method or property?

Reading the documentation.

> Your code sample works like a charm with a CWSTR!

Indeed, because this method expects a null terminated unicode string, not a BSTR!
See: https://msdn.microsoft.com/en-us/library/ee125024(v=vs.85).aspx

Automation interfaces work with BSTR and Variants, but low-level COM interfaces can work with any data type, and you need to read the documentation.

> Is there a FreeBASIC version of your typelib browser btw. and does it tell me when i MUST pass a CBSTR as you described?

There is one, TLB_100 (available at https://github.com/JoseRoca/WinFBX), although it is always advisable to read the documentation because sometimes the low-level interfaces expect or return a pointer to a null terminated unicode string allocated with CoTaskMemAlloc, which must be freed with CoTaskMemFree, and you can use neither CBSTR nor CWSTR with them directly.

> As  far as i understand it up to now, i could take both of it as my standard Unicode dynamic string type, but which one should i take - the faster one or the more universal one?

The loss of speed won't be significant in most cases, but if you have to do thousands of string concatenations, as is the case with my TypeLib Browser, then it will be noticeable. CWSTR is so fast that when you click the type library to parse, the code is generated instantaneously (there is no separate option to generate the code because it is not needed).

When I need to work with COM and BSTRs, I use CBSTR; otherwise, I use CWSTR. But if you don't need to work with big strings or do heavy string manipulation, you can use either of them or even mix them. They work transparently, so if a function returns a CWSTR you can assign it to a CBSTR or to a STRING, and vice versa. You can also mix CBSTR, CWSTR, STRING and WSTRING when concatenating strings.

José Roca

#38
> How could i define parameters (in CBSTR) and return type (out CBSTR) in order to minimize speed loss, if a CWSTR is passed and expected as return type instead of a CBSTR?

If you declare a parameter as BYREF CBSTR and you pass a CWSTR, a conversion will be performed. It is unavoidable, but the loss of speed won't be significant. It is as if, in PowerBasic, you had a parameter declared as WSTRING and passed a STRING variable: the STRING must be converted to unicode. With normal usage, you don't have to worry. You may need CWSTR in the cases in which you would use StringBuilder in PowerBasic. CWSTR is faster than CBSTR because it uses string builder techniques to minimize memory allocations.
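
As a rough illustration of what "string builder techniques" means, the following sketch keeps a buffer with spare capacity and doubles it when it runs out, so most appends never reach the allocator. TinyBuilder and all of its member names are hypothetical; this is not the CWSTR source code, only the general idea in plain FreeBASIC.

```freebasic
' Hypothetical sketch: TinyBuilder is NOT the CWSTR implementation
TYPE TinyBuilder
   m_pBuffer   AS WSTRING PTR   ' heap buffer, null-terminated once allocated
   m_nLength   AS LONG          ' characters currently stored
   m_nCapacity AS LONG          ' characters the buffer can hold (terminator excluded)
   m_nGrowths  AS LONG          ' how many times the buffer had to be reallocated
   DECLARE DESTRUCTOR ()
   DECLARE SUB Append (BYREF ws AS CONST WSTRING)
END TYPE

DESTRUCTOR TinyBuilder ()
   IF m_pBuffer THEN DEALLOCATE(m_pBuffer)
END DESTRUCTOR

SUB TinyBuilder.Append (BYREF ws AS CONST WSTRING)
   DIM nAdd AS LONG = LEN(ws)
   IF nAdd = 0 THEN EXIT SUB
   IF m_nLength + nAdd > m_nCapacity THEN
      ' Grow geometrically so that repeated appends rarely hit the allocator
      DIM nNewCap AS LONG = IIF(m_nCapacity = 0, 16, m_nCapacity * 2)
      DO WHILE nNewCap < m_nLength + nAdd
         nNewCap *= 2
      LOOP
      DIM p AS WSTRING PTR = REALLOCATE(m_pBuffer, (nNewCap + 1) * SIZEOF(WSTRING))
      IF p = 0 THEN EXIT SUB   ' allocation failed: leave the content unchanged
      m_pBuffer = p
      m_nCapacity = nNewCap
      m_nGrowths += 1
   END IF
   ' Copy the new characters and their terminator after the used part
   *(m_pBuffer + m_nLength) = ws
   m_nLength += nAdd
END SUB

DIM sb AS TinyBuilder
FOR i AS LONG = 1 TO 1000
   sb.Append("abc")
NEXT
PRINT "Length:"; sb.m_nLength; "  reallocations:"; sb.m_nGrowths
```

With a starting capacity of 16 and doubling, the 1000 appends above end at 3000 characters after only 9 reallocations, whereas a type that reallocates on every concatenation would do it 1000 times.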


Juergen Kuehlwein

Thanks José, for your patience and your explanations!


JK

Juergen Kuehlwein

#40
Hi José,


CWstr (just like CBstr) have an upper limit of DWORD (2^32) size, is this correct ? In theory you could make it even larger (in 64 bit), if you defined all "UINT" as "UINTEGER", or is the restriction to 2^32 on purpose?


next question, this code gpfs:


#Define UNICODE
#include once "windows.bi"
#include once "Afx\AfxStr.inc"

#define ustring cwstr


declare function FB_MAIN as uinteger


END FB_MAIN


'***********************************************************************************************


FUNCTION FB_MAIN AS uinteger
'***********************************************************************************************
'
'***********************************************************************************************
dim i as long
dim s as string       = "ЙЦУК"
dim z as zstring * 64 = "ЙЦУКЙЦУКЙЦУК"
dim w as Wstring * 64 = "ЙЦУ"
dim u as Ustring      = "ЙЦУКЕ"
dim x as Ustring      = "ЙЦУКЕ"



'ods("start")
'  for i = 1 to 1000
'    x = rset$(s, 10)
'    x = rset$(z, 10)
'    x = rset$(w, 10)
'    x = rset$(u, 10)
'  next i
'ods("end")


using afx

'ods("start")
  for i = 1 to 1000
    x = AfxStrRSet(s, 10, " ")
    x = AfxStrRSet(z, 10, " ")
    x = AfxStrRSet(w, 10, " ")
    x = AfxStrRSet(u, 10, " ")
  next i
'ods("end")


  function = 0


end function


'***********************************************************************************************



while this code:



PRIVATE FUNCTION RSet_ overload (BYREF wszMainStr AS WSTRING, BYVAL nStringLength AS ULONG, BYref wszPadCharacter AS WSTRING = " ") AS ustring
'***********************************************************************************************

'***********************************************************************************************
dim cws as ustring = wstring(nStringLength, LEFT(wszPadCharacter, 1)) + wszMainstr

static c as long

  c = c + 1

'outputdebugstringa("ustring"+str(c))
  RETURN right(**cws, nStringLength)


END FUNCTION



succeeds.


(last sentence withdrawn ...)


JK



José Roca

> CWstr (just like CBstr) have an upper limit of DWORD (2^32) size, is this correct ? In theory you could make it even larger (in 64 bit), if you defined all "UINT" as "UINTEGER", or is the restriction to 2^32 on purpose?

Yes, it is on purpose. The maximum size (in characters) is 2147483647, as with all the FreeBasic string types (String, ZString, WString) and also with COM BSTRs.

José Roca

#42
> next question, this code gpfs:

Worked with 32 bit, but GPFed with 64 bit. 32 bit and 64 bit use different assemblers. Thanks for reporting it.

I have changed the code to:


PRIVATE FUNCTION AfxStrRSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS WSTRING = " ") AS CWSTR
   DIM cwsPadChar AS CWSTR = wszPadCharacter
   IF cwsPadChar = "" THEN cwsPadChar = " "
   cwsPadChar = LEFT(cwsPadChar, 1)
   DIM cws AS CWSTR = SPACE(nStringLength)
   FOR i AS LONG = 1 TO LEN(cws)
      MID(**cws, i, 1) = cwsPadChar
   NEXT
   MID(**cws, nStringLength - LEN(wszMainStr) + 1, LEN(wszMainStr)) = wszMainStr
   RETURN cws
END FUNCTION


I have also changed the code for AfxStrLSet and AfxStrCSet:


' ========================================================================================
' Returns a string containing a left-justified (padded) string.
' If the optional parameter wszPadCharacter is not specified, the function pads the string with
' space characters to the right. Otherwise, the function pads the string with the first
' character of wszPadCharacter.
' Example: DIM cws AS CWSTR = AfxStrLSet("FreeBasic", 20, "*")
' ========================================================================================
PRIVATE FUNCTION AfxStrLSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS WSTRING = " ") AS CWSTR
   DIM cwsPadChar AS CWSTR = wszPadCharacter
   IF cwsPadChar = "" THEN cwsPadChar = " "
   cwsPadChar = LEFT(cwsPadChar, 1)
   DIM cws AS CWSTR = SPACE(nStringLength)
   FOR i AS LONG = 1 TO LEN(cws)
      MID(**cws, i, 1) = cwsPadChar
   NEXT
   MID(**cws, 1, LEN(wszMainStr)) = wszMainStr
   RETURN cws
END FUNCTION
' ========================================================================================



' ========================================================================================
' Returns a string containing a centered (padded) string.
' If the optional parameter wszPadCharacter is not specified, the function pads the string with
' space characters on both sides. Otherwise, the function pads the string with the first
' character of wszPadCharacter.
' Example: DIM cws AS CWSTR = AfxStrCSet("FreeBasic", 20, "*")
' ========================================================================================
PRIVATE FUNCTION AfxStrCSet (BYREF wszMainStr AS CONST WSTRING, BYVAL nStringLength AS LONG, BYREF wszPadCharacter AS WSTRING = " ") AS CWSTR
   DIM cwsPadChar AS CWSTR = wszPadCharacter
   IF cwsPadChar = "" THEN cwsPadChar = " "
   cwsPadChar = LEFT(cwsPadChar, 1)
   DIM cws AS CWSTR = SPACE(nStringLength)
   FOR i AS LONG = 1 TO LEN(cws)
      MID(**cws, i, 1) = cwsPadChar
   NEXT
   MID(**cws, (nStringLength - LEN(wszMainStr)) \ 2 + 1, LEN(wszMainStr)) = wszMainStr
   RETURN cws
END FUNCTION
' ========================================================================================


Juergen Kuehlwein

Quote
Yes, it is on purpose. The maximum size (in characters) is 2147483647, as all the FreeBasic string types (String, ZString, WString) and also as COM BSTR.

But in theory you could make it even larger (in 64 bit), if you defined all "UINT" as "UINTEGER", or are there still other reasons, why  it wouldn´t work then ?


Comparing your code for RSET and mine:
1.), i don´t understand why you first fill a CWstr with spaces and overwrite these spaces with the pad character?
2.) why would you suppress a null string as pad character? If i pad with a null string then it shouldn´t be padded at all - makes sense to me.
3.) and if the string to pad (e.g. 11 characters) is larger than the resulting string (e.g 10 characters), only pad characters are returned with your code.

FB´s RSET returns the leftmost 10 characters of the string to pad (truncating on the right side), which is quite unexpected to me. PB´s RSET$ returns the rightmost 10 characters (the string to pad is truncated from left to right in this case), which seems to be the most logical choice to me and is, what my code does too.
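
To show the two truncation policies side by side, here is a small sketch using only intrinsic FreeBASIC string functions. RSetKeepLeft and RSetKeepRight are hypothetical helper names made up for this comparison; they are not part of FreeBASIC, PowerBASIC or WinFBX, and they use plain STRING to keep the example short.

```freebasic
' Hypothetical helpers for illustration only

' Right-justify; an over-long text keeps its LEFTMOST nLen characters
' (the behaviour described above for FB's RSET)
FUNCTION RSetKeepLeft (BYREF s AS CONST STRING, BYVAL nLen AS LONG, BYREF padChar AS CONST STRING = " ") AS STRING
   IF LEN(s) >= nLen THEN RETURN LEFT(s, nLen)
   RETURN STRING(nLen - LEN(s), padChar) + s
END FUNCTION

' Right-justify; an over-long text keeps its RIGHTMOST nLen characters
' (the behaviour described above for PB's RSET$)
FUNCTION RSetKeepRight (BYREF s AS CONST STRING, BYVAL nLen AS LONG, BYREF padChar AS CONST STRING = " ") AS STRING
   RETURN RIGHT(STRING(nLen, padChar) + s, nLen)
END FUNCTION

PRINT "["; RSetKeepLeft("ABCDEFGHIJK", 10); "]"    ' [ABCDEFGHIJ]
PRINT "["; RSetKeepRight("ABCDEFGHIJK", 10); "]"   ' [BCDEFGHIJK]
PRINT "["; RSetKeepRight("abc", 10, "*"); "]"      ' [*******abc]
```

Both helpers give the same result whenever the text fits; they only differ once the text is longer than the requested length.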

JK

José Roca

> But in theory you could make it even larger (in 64 bit), if you defined all "UINT" as "UINTEGER", or are there still other reasons, why  it wouldn´t work then ?

The FreeBasic intrinsic string functions won't work with them. Also you can't allocate a BSTR bigger than 2147483647 characters.