老秘网_材夜思范文

标题: 网页采集程序(超级简单版) [打印本页]

作者: 福建老秘    时间: 2010-7-20 19:53
标题: 网页采集程序(超级简单版)
1 V9 j2 G6 p$ Y/ _, g+ S
" W( `& `/ z3 e/ |7 K4 Z# N- m0 t
网页采集程序(超级简单版)
$ g8 d6 U. L# _
; [5 H' C3 K U- S y( w& c" u- c

网页采集程序(超级简单版)
01 protected void btn_click(object sender, EventArgs e) 

N3 r Q, k# L* a7 @2 V

02         { 

! T" B) n2 [. P7 ~) [

03             //方法一: 

: w+ h6 r" w$ @! y4 O4 T% [

04             //System.Net.WebClient wc = new System.Net.WebClient(); 

' R6 [; B; k# P- G, r2 a

05             //byte[] b = wc.DownloadData("http://www.baidu.com"); 

& V9 T6 a) B" d. C

06             //string html = System.Text.Encoding.GetEncoding("gb2312").GetString(b); 

5 X' z3 Y! T) W" v) K6 ?

07             //html = html.Substring(html.IndexOf("<p id=\"lg\">") + "<p id=\"lg\">".Length); 

3 a0 S. ]2 W3 S7 u

08             //html = html.Substring(0, html.IndexOf("</p>")); 

# l4 _2 e" ` l7 C

09             //Response.Write(html); 

8 [# s4 S/ A/ [# z

10   

, B' i+ M, \# ~/ b: @7 x+ w

11             //方法二: 

; b2 T/ m2 }$ ]9 x& C7 L& n' }1 u; U

12         //获取整个网页 

1 k; r/ ^* D3 t. J" U( C7 s* }

13             System.Net.WebClient wc = new System.Net.WebClient(); 

$ C( d: C1 V* ^* ?" O% {- q$ q

14             System.IO.Stream sm = wc.OpenRead("http://www.baidu.com"); 

* r7 m3 w* C2 N" w3 l0 |: m; A

15             System.IO.StreamReader sr = new System.IO.StreamReader(sm, System.Text.Encoding.Default, true, 256000); 

6 D0 s5 Q+ w! X- X6 F, q1 R7 Y1 ]

16             string html = sr.ReadToEnd(); 

7 ]. K- L) ^) v3 {# Y

17             sr.Close(); 

' x4 k Z5 Z6 T: r3 t; u

18             //根据规则获取想要的内容 

3 {% ]3 D# J0 B4 g( j* {9 q

19             html = html.Substring(html.IndexOf("<p id=\"lg\">") + "<p id=\"lg\">".Length); 

* d. I2 c! i; l+ B" y

20             html = html.Substring(0, html.IndexOf("</p>")); 

/ ?0 N* n1 G0 {0 |% G" I

21             Response.Write(html); 

$ O, K: z4 S! e3 X

22         }


作者: 福建老秘    时间: 2010-7-20 20:00

http://hereson.javaeye.com/blog/207468






欢迎光临 老秘网_材夜思范文 (https://laomiw.com/) Powered by Discuz! X3.4