在asp中字串與 byte array 互轉
參考網址 http://www.moretechtips.net/2008/10/convert-string-to-bytes-and-vice-versa.html
將字串轉換成byte array,這在.net或java中,只需一個getbyte()函式即可,但在asp中並不容易。
下面的函式是透過ado.stream的功能,可以將字串轉換成bytes,以及反過來轉換。
Old fashioned guys use SQL Server varchar/text fields to store strings of multiple languages that uses 1 byte encoding like [Windows Character Table] :
Windows-1252 : English, French , Spanish, German,Italian,Spanish (Western European characters)...
Windows-1251 : Russian,Bulgarian,Serbian,Ukrainian
Windows-1253 : Greek
Windows-1256 : Arabic
.....
Of course, 1 byte encoding field can contain English + only one other language characters - unlike UTF-8) , just as a file encoded in 1-byte encoding..
To know about Character sets, you should check :
in that case : Asp pages codepage should remain as default = 1252
<% @ LANGUAGE=VBScript CODEPAGE=1252 %>
and setting the Response.Charset or HTML meta tag for charset correctly will show things right.. and HTML page size is smaller than the same one in UTF-8
of course running a site like that in IIS, will require that windows > Control Panel > Regional and Language options > Advanced > must be English or Strings read from SQL server will be corrupted...
A disadvantage is that you can't show more than one language (Other than English) in the same page without using escape codes... ,which suitable is for small text (a link to other language home page)
but, if you need to output UTF-8 file (text,Xml ,RSS,..) from non UTF-8 page, you must remember that Strings are Unicode in memory, so if you read a string from SQL Server using settings as mentioned before , and as an example :
- if we have that string "привет" which is "hi" in Russian
- and saved in varchar field in SQL Server it will look like "ïðèâåò"
- when you read that string in memory using ado it will look like "ïðèâåò" , cause VB string can't know it is Russian ( it is readed from varchar and default codepage is 1252 ,so it thinks it is Western European characters)
- So To Convert it to Russian will use ADO Stream :
AlterCharset("ïðèâåò","windows-1252", "windows-1251")
- After that it would be saved in memory as "привет"
- and when written to UTf-8 file , it will be "привет" , but if u don't do the Conversion step it will be "ïðèâåò"
enough talking , here is the code
For this code to work in VB6, you will need to add a reference to the Microsoft ActiveX Data Objects 2.5+ Library and change [Dim Stream : Set Stream=Server.CreateObject("ADODB.Stream") ] to [Dim Stream as new ADODB.Stream]
<%
Const adTypeBinary = 1
Const adTypeText = 2
' accept a string and convert it to Bytes array in the selected Charset
Function StringToBytes(Str,Charset)
Dim Stream : Set Stream = Server.CreateObject("ADODB.Stream")
Stream.Type = adTypeText
Stream.Charset = Charset
Stream.Open
Stream.WriteText Str
Stream.Flush
Stream.Position = 0
' rewind stream and read Bytes
Stream.Type = adTypeBinary
StringToBytes= Stream.Read
Stream.Close
Set Stream = Nothing
End Function
' accept Bytes array and convert it to a string using the selected charset
Function BytesToString(Bytes, Charset)
Dim Stream : Set Stream = Server.CreateObject("ADODB.Stream")
Stream.Charset = Charset
Stream.Type = adTypeBinary
Stream.Open
Stream.Write Bytes
Stream.Flush
Stream.Position = 0
' rewind stream and read text
Stream.Type = adTypeText
BytesToString= Stream.ReadText
Stream.Close
Set Stream = Nothing
End Function
' This will alter charset of a string from 1-byte charset(as windows-1252)
' to another 1-byte charset(as windows-1251)
Function AlterCharset(Str, FromCharset, ToCharset)
Dim Bytes
Bytes = StringToBytes(Str, FromCharset)
AlterCharset = BytesToString(Bytes, ToCharset)
End Function
%>