DotNetNewsgroup.com  
web access to complete list of Microsoft.NET newsgroups
start date: Wed, 22 Aug 2007 11:40:26 -0700,    posted on: microsoft.public.dotnet.framework.aspnet

Thread Index
  1    am
          2    Peter Bromberg [C# MVP]
          3    am
          4    am
          5    (Walter Wang [MSFT])


Interactive web page archiving app: need guidance   
Hi,

I need some guidance regarding the best way to build an app for archiving 
web page content via an interactive browse session. I have a fair bit of 
experience building ASP.NET apps using VS.NET 2005/2008 and almost no 
experience building Windows forms apps, which is what this may end up being.

I have built many ASP.NET apps that scrape remote webpage content and save 
it as webpages, MHTs, or PDFs. The common weakness of all of them is that 
they are somewhat limited in cases where there are dependencies on login, 
cookies, or state.

What I would like to build would be an intranet app something like the 
following:

1) The user would have what appears to be a web browser session.
2) The cookie store specific to the local machine would be accessible to the 
app.
3) The user would browse around, logging in as necessary. When they saw a 
web page they wanted to keep, they would press a button marked "SAVE THIS 
PAGE."
4) The application would save the page to a network share and enter some 
information to our database.

I believe there is a Windows Forms control that will allow items #1, 2, 
and 3. Is that correct? Can someone suggest a starting point?

ALTERNATELY, I could imagine an ASP.NET application designed as follows:

1) Some control would emulate a web browser session. The user would interact 
with that, logging in as necessary.
2) Somehow the app would intercept and save the actual byte stream being 
sent to the "browser."
3) Somehow the app would reassemble that byte stream into a collection of 
files that could be saved.

As a learning thing I'd be really interested to know if this alternate 
option can be done. As a get-the-work-done thing, I would appreciate any 
guidance on the first scenario I described.
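
As a rough illustration of step 3 of the alternate design, reassembly can be thought of as mapping each captured URL to a file under an archive folder. The sketch below is in Python rather than .NET purely to keep it short. It assumes the byte streams have already been captured into a dictionary keyed by URL (how to intercept them is not shown), and `local_path_for` and `write_capture` are hypothetical helper names. Query strings are ignored, so two URLs that differ only in their query would overwrite each other.

```python
import os
from urllib.parse import urlsplit


def local_path_for(url):
    """Map a URL to a relative path of the form host/dir/file."""
    parts = urlsplit(url)
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"  # give directory-style URLs a file name
    # Drop empty, "." and ".." segments so the result stays inside the archive.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    if not segments:
        segments = ["index.html"]
    host = parts.netloc.replace(":", "_")  # a port colon is not valid in Windows names
    return os.path.join(host, *segments)


def write_capture(captured, root):
    """Write each captured url -> bytes response under root; return the paths written."""
    written = []
    for url, body in captured.items():
        target = os.path.join(root, local_path_for(url))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(body)
        written.append(target)
    return written
```

In this sketch `root` would be the archive folder (for example a path on the network share), and the returned list is what would be recorded in the database.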

Thank you,
-KF
Date:Wed, 22 Aug 2007 11:40:26 -0700   Author:  

RE: Interactive web page archiving app: need guidance   
I think you are looking for the WebBrowser control, which offers most or all 
of the features you describe. It should be in the Toolbox; if not, you can add 
it.
-- Peter
Recursion: see Recursion
site:  http://www.eggheadcafe.com
unBlog:  http://petesbloggerama.blogspot.com
BlogMetaFinder:    http://www.blogmetafinder.com

Date:Wed, 22 Aug 2007 13:50:01 -0700   Author:  

Re: Interactive web page archiving app: need guidance   
Thanks Peter. Does that control assume the "identity" of the currently 
logged in user and plug into the IE cookie store for that user?

Date:Wed, 22 Aug 2007 14:00:30 -0700   Author:  

Limits of programmatic saves and WebBrowser control Re: Interactive web page archiving app: need guidance   
I've looked into this at greater length. From browsing around the net, it 
looks like it is not going to be so simple a problem as you might imagine. 
For what are probably valid security reasons, it is not easy to coax the 
WebBrowser control into saving the page without user input. There are a lot 
of schemes for grabbing the HTML/textual content of the page, but I need the 
assets as well: CSS and image files.
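
To make the "assets as well" requirement concrete, here is a minimal sketch of finding the CSS and image files a page refers to, given HTML that has already been grabbed. It is written in Python with only the standard library, not against the WebBrowser control. `AssetCollector` and `find_assets` are hypothetical names, and the sketch only looks at `<img src>` and `<link rel="stylesheet" href>`; images referenced from inside a stylesheet would not be found.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of images and stylesheets referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        ref = None
        if tag == "img":
            ref = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            ref = attrs.get("href")
        if ref:
            url = urljoin(self.base_url, ref)  # resolve relative references
            if url not in self.assets:         # keep first occurrence only
                self.assets.append(url)


def find_assets(html, base_url):
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

Each URL returned would still have to be fetched with the same cookies as the browsing session, which is the hard part described above.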

For my purposes, it would probably be sufficient if the "File-->Save As" 
prompt that was raised could be prepopulated with a filename and path that 
was generated programmatically; the user could simply hit "Return" and be 
done. However, it would be best if this could be completely freed of the 
need for interaction by the user.
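
The programmatically generated filename and path could look something like the following sketch. It is Python, not WinForms code, and it does not touch the Save As dialog; it only shows one possible naming scheme. The `host/timestamp_title.mht` layout and the helper name `archive_name` are assumptions made for illustration.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit


def archive_name(url, title, saved_at):
    """Build a relative archive path from the page URL, its title and the save time."""
    host = urlsplit(url).hostname or "unknown-host"
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower() or "untitled"
    stamp = saved_at.strftime("%Y%m%d-%H%M%S")
    return f"{host}/{stamp}_{slug[:60]}.mht"
```

For example, `archive_name("http://www.example.com/a?b=1", "Quarterly Report: Q2 / 2007", datetime(2007, 8, 22, 15, 20, 23))` gives `www.example.com/20070822-152023_quarterly-report-q2-2007.mht`, which would then be joined to the root of the network share.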
Date:Wed, 22 Aug 2007 15:20:23 -0700   Author:  

RE: Interactive web page archiving app: need guidance   
Hi KF,

You may want to check out the following resource:

#Harvesting Web Content into MHTML Archive - The Code Project - C# Libraries
http://www.codeproject.com/cs/library/mhtmllib.asp
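
For orientation, an MHTML archive is a MIME multipart/related message that holds the page and its assets, each part labelled with the URL it came from. The sketch below is not the API of the linked library; it is a minimal Python illustration using the standard `email` package, and `build_mhtml` is a hypothetical name. Whether a given browser opens the result correctly has not been verified here.

```python
from email import encoders, message_from_bytes
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(page_url, html, assets):
    """Pack a page and its assets into one multipart/related message.

    assets is a list of (url, mime_type, body_bytes) tuples.
    """
    archive = MIMEMultipart("related", type="text/html")
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url  # the first part is the page itself
    archive.attach(root)
    for url, mime_type, body in assets:
        maintype, subtype = mime_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(body)
        encoders.encode_base64(part)     # binary assets travel as base64
        part["Content-Location"] = url
        archive.attach(part)
    return archive.as_bytes()
```

The bytes returned could be written to a `.mht` file; parsing them back with `message_from_bytes` is a quick way to check that every asset made it into the archive.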


Regards,
Walter Wang (wawang@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

==================================================
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
==================================================

This posting is provided "AS IS" with no warranties, and confers no rights.
Date:Thu, 23 Aug 2007 11:19:38 GMT   Author:  
