ASP Tutorial: A Worked Example of Scraping Online Real-Estate Listings with ASP
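Because ASP is still a scripting language, the only way to improve its performance is to lean heavily on components. Every page pays the cost of being interpreted at request time, and we cannot tell what state the components behind it are in. Attached is a concrete example of a page that scrapes the listings:
<%@ LANGUAGE="VBSCRIPT" CODEPAGE="936" %>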
<!--#include file="conn.asp"-->
<!--#include file="inc/function.asp"-->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta http-equiv="refresh" content="300;URL=steal_house.asp">
</head>
<body>
<%
On Error Resume Next
Server.ScriptTimeout = 999999
'========================================================
' Character-encoding conversion: decode a byte array into a string
'========================================================
Function BytesToBstr(body, code)
    Dim objstream
    Set objstream = Server.CreateObject("adodb.stream")
    objstream.Type = 1 ' binary
    objstream.Mode = 3 ' read/write
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2 ' text
    objstream.Charset = code
    BytesToBstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function
' Return the position of strng within wstr (case-insensitive);
' falls back to Len(wstr) when strng is not found
Function Newstring(wstr, strng)
    Newstring = InStr(LCase(wstr), LCase(strng))
    If Newstring <= 0 Then Newstring = Len(wstr)
End Function
' String replacement function
Function ReplaceStr(ori, str1, str2)
    ReplaceStr = Replace(ori, str1, str2)
End Function
'====================================================
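' Fetch url, decode the response with the given charset, and return the text
' that runs from the start marker up to (not including) the end marker
Function ReadXml(url, code, start, ends)
    Dim oSend, startPos, endPos
    Set oSend = CreateObject("Microsoft.XMLHTTP")
    oSend.open "GET", url, False
    oSend.send
    ReadXml = BytesToBstr(oSend.responseBody, code)
    Set oSend = Nothing
    ' Local position variables keep the ByRef marker arguments intact
    startPos = InStr(ReadXml, start)
    If startPos > 0 Then ReadXml = Mid(ReadXml, startPos)
    endPos = InStr(ReadXml, ends)
    If endPos > 0 Then ReadXml = Left(ReadXml, endPos - 1)
End Function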
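' Return the text of body between the start and end markers
Function SubStr(body, start, ends)
    Dim startPos, endPos
    startPos = InStr(body, start)
    If startPos = 0 Then Exit Function
    ' Skip past the start marker itself
    SubStr = Mid(body, startPos + Len(start))
    endPos = InStr(SubStr, ends)
    If endPos > 0 Then SubStr = Left(SubStr, endPos - 1)
End Function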
Dim getcont, NewsContent
Dim url, title
url = "http://www.***.com" ' news site URL
getcont = ReadXml(url, "gb2312", "<table class=k2 border=""0""", "</table>")
getcont = RegexHtml(getcont)
Dim KeyId, NewsClass, City, Position, HouseType, Level, Area, Price, Demostra
Dim ContactMan, Contact
For i = 2 To UBound(getcont)
    Response.Write(getcont(i) & "__<br>")
    ' Cut the href value out of the anchor tag
    tempLink = Mid(getcont(i), InStr(getcont(i), "href=""") + 6, InStr(getcont(i), """ onClick") - 10)
    tempLink = Replace(tempLink, "../", "")
    Response.Write(i & ":" & tempLink & "<br>")
    NewsContent = ReadXml(tempLink, "gb2312", "<td valign=""bottom"" width=""400"">", "<hr width=""760"" noshade size=""1"" color=""#808080"">")
    NewsContent = RemoveHtml(NewsContent)
    ' Strip line breaks and whitespace so each label sits directly next to its value
    NewsContent = Replace(NewsContent, vbCrLf, "")
    NewsContent = Replace(NewsContent, vbNewLine, "")
    ' The search strings of the next replacements did not survive in the published
    ' listing; a space and the &nbsp; entity are assumed here
    NewsContent = Replace(NewsContent, " ", "")
    NewsContent = Replace(NewsContent, "&nbsp;", "")
    NewsContent = Replace(NewsContent, Chr(10), "")
    NewsContent = Replace(NewsContent, Chr(13), "")
    '=============== getContent =======================
    Response.Write(NewsContent)
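    ' Each field is the text between its own label and the next label on the page.
    ' The Chinese labels must match the scraped page exactly, so they are kept as-is.
    KeyId = SubStr(NewsContent, "列号:", "信息种别:") ' serial number
    NewsClass = SubStr(NewsContent, "种别:", "地点乡村:") ' listing category
    City = SubStr(NewsContent, "乡村:", "衡宇详细地位:") ' city
    Position = SubStr(NewsContent, "地位:", "衡宇范例:") ' exact location
    HouseType = SubStr(NewsContent, "范例:", "楼层:") ' property type
    Level = SubStr(NewsContent, "楼层:", "利用面积:") ' floor
    Area = SubStr(NewsContent, "面积:", "房价:") ' usable area
    Price = SubStr(NewsContent, "房价:", "其他申明:") ' price
    Demostra = SubStr(NewsContent, "申明:", "接洽人:") ' other notes
    ContactMan = SubStr(NewsContent, "接洽人:", "接洽体例:") ' contact person
    Contact = SubStr(NewsContent, "接洽体例:", "信息") ' contact details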
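    Response.Write("Serial number: " & KeyId & "<br>")
    Response.Write("Category: " & NewsClass & "<br>")
    Response.Write("City: " & City & "<br>")
    Response.Write("Exact location: " & Position & "<br>")
    Response.Write("Property type: " & HouseType & "<br>")
    Response.Write("Floor: " & Level & "<br>")
    Response.Write("Usable area: " & Area & "<br>")
    Response.Write("Price: " & Price & "<br>")
    Response.Write("Other notes: " & Demostra & "<br>")
    Response.Write("Contact person: " & ContactMan & "<br>")
    Response.Write("Contact details: " & Contact & "<br>")
    ' The source read aa(i), which is never defined; the anchor of the current
    ' link, getcont(i), is assumed to be what was meant
    title = RemoveHTML(getcont(i))
    Response.Write("title:" & title)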
    ' Skip listings whose serial number is already cached in Application.
    ' The flag is reset for every listing, and the Contents collection is 1-based.
    ifexit = False
    For n = 1 To Application.Contents.Count
        If Application.Contents(n) = KeyId Then
            ifexit = True
        End If
    Next
    If Not ifexit Then
        Application(Time & i) = KeyId
        ' Add to the database
        '====================================================
        Set rs = Server.CreateObject("adodb.recordset")
        rs.Open "select top 1 * from news order by id desc", conn, 3, 3
        rs.AddNew
        rs("NewsClass") = NewsClass
        rs("City") = City
        rs("Position") = Position
        rs("HouseType") = HouseType
        rs("Level") = Level
        rs("Area") = Area
        rs("Price") = Price
        rs("Demostra") = Demostra
        rs("ContactMan") = ContactMan
        rs("Contact") = Contact
        rs.Update
        rs.Close
        Set rs = Nothing
    End If
    '==================================================
Next
' Collect up to 16 regex matches with their <td>/<tr> tags stripped
' (the page above does not call this function)
Function RemoveTag(body)
    Set regEx = New RegExp
    regEx.Pattern = "<.*?</>"
    regEx.IgnoreCase = True
    regEx.Global = True
    Set Matches = regEx.Execute(body)
    Dim i, arr(15), ifexit
    i = 0
    j = 0
    For Each Match In Matches
        TempStr = Match.Value
        TempStr = Replace(TempStr, "<td>", "")
        TempStr = Replace(TempStr, "</td>", "")
        TempStr = Replace(TempStr, "<tr>", "")
        TempStr = Replace(TempStr, "</tr>", "")
        arr(i) = TempStr
        i = i + 1
        If (i >= 15) Then
            Exit For
        End If
    Next
    Set regEx = Nothing
    Set Matches = Nothing
    RemoveTag = arr
End Function
' Return the <a ...>...</a> anchors found in body as an array (at most 48)
Function RegexHtml(body)
    Dim r_arr(47), r_temp
    Set regEx2 = New RegExp
    regEx2.Pattern = "<a.*?</a>"
    regEx2.IgnoreCase = True
    regEx2.Global = True
    Set Matches2 = regEx2.Execute(body)
    iii = 0
    For Each Match In Matches2
        If iii > UBound(r_arr) Then Exit For ' r_arr holds 48 entries
        r_arr(iii) = Match.Value
        iii = iii + 1
    Next
    RegexHtml = r_arr
    Set regEx2 = Nothing
    Set Matches2 = Nothing
End Function
'======================================================
conn.Close
Set conn = Nothing
%>
</body>
</html>
function.asp
<%
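'**************************************************
' Function: gotTopic
' Purpose:  truncate a string; a Chinese character counts as two
'           characters and an English character as one
' Params:   str    ---- the original string
'           strlen ---- the length to truncate to
' Returns:  the truncated string
'**************************************************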
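Function gotTopic(str, strlen)
    If str = "" Then
        gotTopic = ""
        Exit Function
    End If
    Dim l, t, c, i
    ' Decode the common HTML entities so each one counts as a single character
    str = Replace(Replace(Replace(Replace(str, "&nbsp;", " "), "&quot;", Chr(34)), "&gt;", ">"), "&lt;", "<")
    str = Replace(str, "?", "")
    l = Len(str)
    t = 0
    For i = 1 To l
        c = Abs(Asc(Mid(str, i, 1)))
        If c > 255 Then
            t = t + 2 ' double-byte (Chinese) character
        Else
            t = t + 1
        End If
        If t >= strlen Then
            gotTopic = Left(str, i) & "…"
            Exit For
        Else
            gotTopic = str
        End If
    Next
    ' Re-encode the entities before returning
    gotTopic = Replace(Replace(Replace(Replace(gotTopic, " ", "&nbsp;"), Chr(34), "&quot;"), ">", "&gt;"), "<", "&lt;")
End Function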
'=========================================================
' Function: RemoveHTML(strHTML)
' Purpose:  strip HTML tags
' Params:   strHTML -- the string to strip HTML tags from
'=========================================================
Function RemoveHTML(strHTML)
    Dim objRegExp, Match, Matches
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ' Match closed <...> tags
    objRegExp.Pattern = "<.+?>"
    ' Run the match
    Set Matches = objRegExp.Execute(strHTML)
    ' Walk the match collection and remove every matched item
    For Each Match In Matches
        strHTML = Replace(strHTML, Match.Value, "")
    Next
    RemoveHTML = strHTML
    Set objRegExp = Nothing
    Set Matches = Nothing
End Function
%>
conn.asp
<%
On Error Resume Next
Set conn = Server.CreateObject("adodb.connection")
con = "driver={Microsoft Access Driver (*.mdb)};dbq=" & Server.MapPath("stest.mdb")
conn.Open con
Sub connclose
    conn.Close
    Set conn = Nothing
End Sub
%>