老秘网_材夜思范文

Title: Web Page Scraper (Super Simple Version)

Author: 福建老秘    Time: 2010-7-20 19:53
protected void btn_click(object sender, EventArgs e)
{
    // Method 1: download the raw bytes and decode them explicitly as gb2312.
    //System.Net.WebClient wc = new System.Net.WebClient();
    //byte[] b = wc.DownloadData("http://www.baidu.com");
    //string html = System.Text.Encoding.GetEncoding("gb2312").GetString(b);
    //html = html.Substring(html.IndexOf("<p id=\"lg\">") + "<p id=\"lg\">".Length);
    //html = html.Substring(0, html.IndexOf("</p>"));
    //Response.Write(html);

    // Method 2: read the response as a stream.
    // Fetch the whole page. The using blocks release the connection even if reading fails.
    // Note: Encoding.Default follows the server's system code page on .NET Framework.
    string html;
    using (System.Net.WebClient wc = new System.Net.WebClient())
    using (System.IO.Stream sm = wc.OpenRead("http://www.baidu.com"))
    using (System.IO.StreamReader sr = new System.IO.StreamReader(sm, System.Text.Encoding.Default, true, 256000))
    {
        html = sr.ReadToEnd();
    }

    // Extract the wanted content by rule: everything between <p id="lg"> and the next </p>.
    const string startTag = "<p id=\"lg\">";
    int start = html.IndexOf(startTag);
    if (start < 0) return; // start marker not found, so there is nothing to output
    html = html.Substring(start + startTag.Length);

    int end = html.IndexOf("</p>");
    if (end < 0) return; // closing tag not found
    html = html.Substring(0, end);

    Response.Write(html);
}
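
The "extract by rule" step above (find a start marker, then cut at the next end marker) could be pulled out into a small reusable helper. The following is only a rough sketch: the class name HtmlSnippet and the method ExtractBetween are made up for illustration and are not part of the original post. It assumes the markers appear literally in the page and returns null when either one is missing, instead of letting Substring throw.

```csharp
using System;

// Hypothetical helper (not from the original post): returns the text between
// the first occurrence of startMarker and the next occurrence of endMarker,
// or null when either marker is missing.
public static class HtmlSnippet
{
    public static string ExtractBetween(string source, string startMarker, string endMarker)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Ordinal comparison: the markers are literal markup, not culture-sensitive text.
        int start = source.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;
        start += startMarker.Length;

        // Search for the end marker only after the start marker.
        int end = source.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return null;

        return source.Substring(start, end - start);
    }
}
```

In the button handler this would be called as HtmlSnippet.ExtractBetween(html, "<p id=\"lg\">", "</p>"), writing the result to the response only when it is not null. Plain string matching like this breaks as soon as the page markup changes, so it only suits quick one-off scraping.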


Author: 福建老秘    Time: 2010-7-20 20:00

http://hereson.javaeye.com/blog/207468





