Download source code of a website in C#

Posted by Raffael on 01/05/2013. Posted in C#.Net

With the help of the following short C# snippet, you can retrieve the HTML source of any website as a string. Such a function is useful, for example, if you want to parse information from a website for later use.

For the snippet to work, you have to add the following two using directives at the top of your source file.

using System.Net;
using System.IO;

The function itself looks like this:

public string getHTML(string url)
{
 //Create request for given url
 HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

 //Get the response and wrap its stream in a reader; the using blocks
 //close both of them, even if reading throws an exception
 using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
 using (StreamReader sr = new StreamReader(response.GetResponseStream()))
 {
  //Read response stream (html code) and return source
  return sr.ReadToEnd();
 }
}

A function call could look like this:

string html = getHTML("http://www.code-bude.net");
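
On newer .NET versions, HttpWebRequest is marked obsolete in favor of HttpClient. The following is a minimal sketch of the same idea with HttpClient and basic error handling, not a drop-in replacement for the function above. The class name HtmlDownloader and the methods IsHttpUrl and GetHtmlOrNullAsync are hypothetical names chosen for illustration. The methods are static, so they can be called from a static Main method without creating an object first.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper class, not part of the original snippet.
public static class HtmlDownloader
{
    // One shared HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Returns true only for absolute http:// or https:// URLs.
    public static bool IsHttpUrl(string url)
    {
        Uri uri;
        return Uri.TryCreate(url, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Returns the page source, or null if the URL is invalid or the request fails.
    public static async Task<string> GetHtmlOrNullAsync(string url)
    {
        if (!IsHttpUrl(url))
        {
            return null;
        }

        try
        {
            return await Client.GetStringAsync(url);
        }
        catch (HttpRequestException)
        {
            // Network error or non-success status code (for example 404).
            return null;
        }
    }
}
```

Inside an async method, a call could look like this:

string html = await HtmlDownloader.GetHtmlOrNullAsync("http://www.code-bude.net");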

If you’ve got suggestions or even problems with the snippet, just write a comment.

11 Comments

Gracieita says:
10 years ago

I want to do something like that with a complete C# solution, to migrate a Bootstrap version. Any ideas, please? Thanks in advance.

Freedom says:
10 years ago

Is getHTML supposed to be declared static? My program won’t work without it.

Jerry says:
11 years ago

Hello Raffi, it works well for a simple URL (e.g. http://www.code-bude.net).

But for a URL that contains parameters (e.g. http://www.code-bude.net/search?query=abc), it seems to return only the HTML code of http://www.code-bude.net.

Is there a way to read the HTML source code from a URL that contains parameters?

Regards,

Jerry

AV Awesome says:
11 years ago

Man!!!
You are awesome……….
Legen….. wait for it…….
Dary….

Aerox says:
12 years ago

Hey, I need help. I have a URL that works when I open it in the browser, but when I use it in the C# code it shows a different site. There is a problem with the date when I open it in my C# program.
This is the link:
http://ch.tilllate.com/de/events/applyfilter/2013-11-08?ref=calendar-horizontal

Hardik says:
12 years ago

Raffi , dude I love you ! :*

- Raffi says:
  12 years ago
  
  Thanks, but now please tell me that you’re a beautiful woman. ;)
  
Paritosh says:
12 years ago

How can I get all the pages of a website and find a keyword across the whole website?

- Raffi says:
  12 years ago
  
  You have to start with the root domain. Then parse the received HTML code with regular expressions to extract all URLs/links. Save the links in a List, for example, and fetch the HTML code for each of them. Parse those pages for links as well, and repeat until you have the HTML of all pages. Save the HTML code of every page, then search it for the keyword with string.IndexOf or another regular expression.
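
The steps described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class SiteSearch and its methods are hypothetical names, the href pattern is deliberately simplified and will miss unusual markup, the crawl stays on the host of the start URL, and a maxPages limit is added so that it always terminates. The download parameter is any function that returns the HTML for a URL, for example getHTML from the article.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original snippet.
public static class SiteSearch
{
    // Simplified pattern: matches href="..." or href='...' and captures the target.
    private static readonly Regex HrefPattern = new Regex(
        "href\\s*=\\s*[\"']([^\"']+)[\"']", RegexOptions.IgnoreCase);

    // Collects the http(s) links on a page that stay on the same host.
    public static List<string> ExtractLinks(string html, Uri pageUri)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            Uri link;
            // Resolves relative links such as "/about" against the page URL.
            if (Uri.TryCreate(pageUri, match.Groups[1].Value, out link)
                && (link.Scheme == Uri.UriSchemeHttp || link.Scheme == Uri.UriSchemeHttps)
                && link.Host == pageUri.Host)
            {
                // Drop the "#fragment" part so the same page is not queued twice.
                links.Add(link.GetLeftPart(UriPartial.Query));
            }
        }
        return links;
    }

    // Crawls from startUrl and returns the URLs of all pages containing the keyword.
    public static List<string> FindKeyword(
        string startUrl, string keyword, Func<string, string> download, int maxPages)
    {
        var hits = new List<string>();
        var visited = new HashSet<string>();
        var queue = new Queue<string>();
        queue.Enqueue(new Uri(startUrl).AbsoluteUri);

        while (queue.Count > 0 && visited.Count < maxPages)
        {
            string url = queue.Dequeue();
            if (!visited.Add(url))
            {
                continue; // already processed
            }

            string html = download(url);
            if (html.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hits.Add(url);
            }

            foreach (string link in ExtractLinks(html, new Uri(url)))
            {
                if (!visited.Contains(link))
                {
                    queue.Enqueue(link);
                }
            }
        }
        return hits;
    }
}
```

A call could look like this, with getHTML passed in as the download function:

List<string> pages = SiteSearch.FindKeyword("http://www.code-bude.net", "snippet", getHTML, 50);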
  
  Greets
  
krish says:
12 years ago

Hello sir,
The code you posted is very useful for extracting the entire source code. Now I need C#.NET code to extract specific tags from given URLs. For example, how can I extract the code inside the “script” tags?

- Raffi says:
  12 years ago
  
  You could use regular expressions to extract specific parts of the source code. Example: http://www.dotnetperls.com/paragraph-html
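
For the “script” tags asked about, a regular expression along these lines might work. This is a minimal sketch rather than a robust HTML parser: the class TagExtractor and the method ExtractScripts are hypothetical names, and the pattern assumes reasonably well-formed markup.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original snippet.
public static class TagExtractor
{
    // Returns the inner content of every <script>...</script> element in the html.
    public static List<string> ExtractScripts(string html)
    {
        var scripts = new List<string>();

        // Singleline lets "." match line breaks; ".*?" stops at the first closing tag.
        var pattern = new Regex(
            "<script\\b[^>]*>(.*?)</script\\s*>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        foreach (Match match in pattern.Matches(html))
        {
            scripts.Add(match.Groups[1].Value);
        }
        return scripts;
    }
}
```

Combined with the function from the article, a call could look like this:

List<string> scripts = TagExtractor.ExtractScripts(getHTML("http://www.code-bude.net"));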
  
