Converting between strings and byte arrays in ASP

Reference: http://www.moretechtips.net/2008/10/convert-string-to-bytes-and-vice-versa.html

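Converting a string to a byte array takes a single GetBytes()-style call in .NET or Java, but it is not that easy in ASP.

The functions below use ADODB.Stream to convert a string to bytes, and to convert bytes back to a string.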

Old-fashioned developers use SQL Server varchar/text fields to store strings in multiple languages, each in a 1-byte encoding from the Windows character tables:

Windows-1252 : English, French, Spanish, German, Italian (Western European characters)...

Windows-1251 : Russian, Bulgarian, Serbian, Ukrainian

Windows-1253 : Greek

Windows-1256 : Arabic

.....

Of course, a field in a 1-byte encoding can hold English plus the characters of only one other language (unlike UTF-8), just like a file saved in a 1-byte encoding.

To learn about character sets, you should read:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

In that case, the ASP page's codepage should stay at the default, 1252:

<% @ LANGUAGE=VBScript CODEPAGE=1252 %>

Setting Response.Charset or the HTML charset meta tag correctly will then make the text display properly, and the HTML page is smaller than the same page in UTF-8, because each non-ASCII character takes one byte instead of two or more.
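
As a minimal sketch of that setup, assuming the page shows Russian text kept in varchar fields (the choice of windows-1251 here is only an example), the top of such a page might look like this:

```asp
<%@ LANGUAGE=VBScript CODEPAGE=1252 %>
<%
' The page codepage stays at 1252, so bytes from varchar fields pass through unchanged.
' Response.Charset tells the browser which 1-byte charset those bytes really are.
' Assumption: the content is Russian, hence windows-1251.
Response.Charset = "windows-1251"
%>
<html>
<head>
<!-- Same declaration inside the document, for pages saved to disk -->
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
</head>
```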

Of course, running a site like that in IIS requires that the setting under Windows > Control Panel > Regional and Language Options > Advanced be English, or strings read from SQL Server will be corrupted.

A disadvantage is that you can't show more than one non-English language on the same page without using escape codes, which is only suitable for small pieces of text (such as a link to the home page in another language).

However, if you need to output a UTF-8 file (text, XML, RSS, ...) from a non-UTF-8 page, remember that strings are Unicode in memory. Suppose you read a string from SQL Server using the settings mentioned above, for example:

- we have the string "привет", which is "hi" in Russian

- saved in a varchar field in SQL Server, it looks like "ïðèâåò"

- when you read that string into memory using ADO, it still looks like "ïðèâåò", because a VB string can't know it is Russian (it was read from a varchar field and the default codepage is 1252, so it is treated as Western European characters)

- so, to convert it to Russian, use an ADO Stream:

AlterCharset("ïðèâåò","windows-1252", "windows-1251")

- after that, it is held in memory as "привет"

- when written to a UTF-8 file it will be "привет", but if you skip the conversion step it will be "ïðèâåò"
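
The reason the steps above work is that both spellings are the same six bytes read under two different code pages. The following sketch makes that visible. The helper name `HexBytes` is hypothetical (not part of the original code), and the strings are built with ChrW so the result does not depend on how the .asp file itself is saved.

```asp
<%
' Hypothetical helper: encode Str in Charset and return its bytes as hex pairs.
Function HexBytes(Str, Charset)
    Dim Stream, Bytes, i, Result
    Set Stream = Server.CreateObject("ADODB.Stream")
    Stream.Type = 2              ' 2 = adTypeText
    Stream.Charset = Charset
    Stream.Open
    Stream.WriteText Str
    Stream.Position = 0          ' rewind before switching the stream type
    Stream.Type = 1              ' 1 = adTypeBinary
    Bytes = Stream.Read
    Stream.Close
    Set Stream = Nothing

    Result = ""
    For i = 1 To LenB(Bytes)
        ' AscB(MidB(...)) reads one byte; pad to two hex digits
        Result = Result & Right("0" & Hex(AscB(MidB(Bytes, i, 1))), 2) & " "
    Next
    HexBytes = Trim(Result)
End Function

Dim Garbled, Russian
' "ïðèâåò" as Unicode code points
Garbled = ChrW(&HEF) & ChrW(&HF0) & ChrW(&HE8) & ChrW(&HE2) & ChrW(&HE5) & ChrW(&HF2)
' "привет" as Unicode code points
Russian = ChrW(&H43F) & ChrW(&H440) & ChrW(&H438) & ChrW(&H432) & ChrW(&H435) & ChrW(&H442)

' Both lines are expected to print: EF F0 E8 E2 E5 F2
Response.Write HexBytes(Garbled, "windows-1252") & "<br>"
Response.Write HexBytes(Russian, "windows-1251")
%>
```

The sketch does not handle an empty string, for which Stream.Read returns Null.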

Enough talking; here is the code.

For this code to work in VB6, add a reference to the Microsoft ActiveX Data Objects 2.5+ Library and change [Dim Stream : Set Stream=Server.CreateObject("ADODB.Stream") ] to [Dim Stream As New ADODB.Stream]

<%

Const adTypeBinary = 1

Const adTypeText = 2

' accept a string and convert it to a byte array in the selected charset

Function StringToBytes(Str,Charset)

Dim Stream : Set Stream = Server.CreateObject("ADODB.Stream")

Stream.Type = adTypeText

Stream.Charset = Charset

Stream.Open

Stream.WriteText Str

Stream.Flush

Stream.Position = 0

' rewind stream and read Bytes

Stream.Type = adTypeBinary

StringToBytes= Stream.Read

Stream.Close

Set Stream = Nothing

End Function

' accept a byte array and convert it to a string using the selected charset

Function BytesToString(Bytes, Charset)

Dim Stream : Set Stream = Server.CreateObject("ADODB.Stream")

Stream.Charset = Charset

Stream.Type = adTypeBinary

Stream.Open

Stream.Write Bytes

Stream.Flush

Stream.Position = 0

' rewind stream and read text

Stream.Type = adTypeText

BytesToString= Stream.ReadText

Stream.Close

Set Stream = Nothing

End Function

' This will alter charset of a string from 1-byte charset(as windows-1252)

' to another 1-byte charset(as windows-1251)

Function AlterCharset(Str, FromCharset, ToCharset)

Dim Bytes

Bytes = StringToBytes(Str, FromCharset)

AlterCharset = BytesToString(Bytes, ToCharset)

End Function

%>
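
A possible way to use these functions for the UTF-8 output case described earlier is sketched below. It assumes the listing above is included in the same page. The helper `SaveTextAsUtf8` and the file name hello.txt are hypothetical, and the IIS account is assumed to have write permission on the target folder.

```asp
<%
' Hypothetical helper: write Text to the file at Path, encoded as UTF-8.
Sub SaveTextAsUtf8(Text, Path)
    Dim Stream : Set Stream = Server.CreateObject("ADODB.Stream")
    Stream.Type = 2                  ' 2 = adTypeText
    Stream.Charset = "utf-8"
    Stream.Open
    Stream.WriteText Text
    Stream.SaveToFile Path, 2        ' 2 = adSaveCreateOverWrite
    Stream.Close
    Set Stream = Nothing
End Sub

Dim Garbled, Fixed
' "ïðèâåò", as it arrives from a varchar field under codepage 1252
Garbled = ChrW(&HEF) & ChrW(&HF0) & ChrW(&HE8) & ChrW(&HE2) & ChrW(&HE5) & ChrW(&HF2)

' Reinterpret the same bytes as windows-1251 to get the real Russian string
Fixed = AlterCharset(Garbled, "windows-1252", "windows-1251")

' Without the AlterCharset step the file would contain the garbled text
SaveTextAsUtf8 Fixed, Server.MapPath("hello.txt")
%>
```

Note that ADODB.Stream writes a byte order mark at the start of a UTF-8 file, which some consumers of the file may need to skip.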