start date: Wed, 22 Aug 2007 11:40:26 -0700,
posted on: microsoft.public.dotnet.framework.aspnet
Interactive web page archiving app: need guidance
Hi,
I need some guidance regarding the best way to build an app for archiving
web page content via an interactive browse session. I have a fair bit of
experience building ASP.NET apps using VS.NET 2005/2008 and almost no
experience building Windows forms apps, which is what this may end up being.
I have built many ASP.NET apps that scrape remote webpage content and save
them as webpages, MHTs, or PDFs. The common weakness of all of them is that
they are somewhat limited in cases where there are dependencies on login,
cookies, or state.
What I would like to build would be an intranet app something like the
following:
1) The user would have what appears to be a web browser session.
2) The cookie store specific to the local machine would be accessible to the
app.
3) The user would browse around, logging in as necessary. When they saw a
web page they wanted to keep, they would press a button marked "SAVE THIS
PAGE."
4) The application would save the page to a network share and enter some
information to our database.
I believe there is a Windows Forms control that will
allow items #1, 2, and 3. Is that correct? Can someone suggest a starting
point?
ALTERNATELY, I could speculatively imagine an ASP.NET application designed
as follows:
1) Some control would emulate a web browser session. The user would interact
with that, logging in as necessary.
2) Somehow the app would intercept and save the actual byte stream being
sent to the "browser."
3) Somehow the app would reassemble that byte stream into a collection of
files that could be saved.
As a learning thing I'd be really interested to know if this alternate
option can be done. As a get-the-work-done thing, I would appreciate any
guidance on the first scenario I described.
Thank you,
-KF
Date:Wed, 22 Aug 2007 11:40:26 -0700
Author: kenfine@newsgroup.nospam
|
RE: Interactive web page archiving app: need guidance
I think you are looking for the WebBrowser control, which offers most or all
of the features you describe. It should be in the Toolbox; if not, you can add
it.
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com
Date:Wed, 22 Aug 2007 13:50:01 -0700
Author: Peter Bromberg [C# MVP]
|
Re: Interactive web page archiving app: need guidance
Thanks, Peter. Does that control assume the "identity" of the currently
logged-in user and plug into the IE cookie store for that user?
Date:Wed, 22 Aug 2007 14:00:30 -0700
Author: kenfine@newsgroup.nospam
|
Limits of programmatic saves and WebBrowser control Re: Interactive web page archiving app: need guidance
I've looked into this at greater length. From browsing around the net, it
looks like it is not going to be so simple a problem as you might imagine.
For what are probably valid security reasons, it is not easy to coax the
WebBrowser control into saving the page without user input. There are a lot
of schemes for grabbing the HTML/textual content of the page, but I need the
assets as well: CSS and image files.
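To make "the assets as well" concrete: a complete save needs every stylesheet, image, and script URL on the page, resolved against the page's own address. The following is only a rough sketch of that step, written in Python for brevity rather than in .NET. The names `AssetCollector` and `find_assets` are made up for this example, and it does not follow assets referenced from inside CSS files (for instance `url(...)` backgrounds).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of stylesheets, images, and scripts in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Resolve relative references against the page address.
            absolute = urljoin(self.base_url, url)
            if absolute not in self.assets:
                self.assets.append(absolute)


def find_assets(html, base_url):
    """Return the de-duplicated asset URLs in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

Each URL returned would still have to be fetched with the same cookies as the browsing session, which is the hard part for pages behind a login.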
For my purposes, it would probably be sufficient if the "File-->Save As"
dialog could be prepopulated with a filename and path that
was generated programmatically; the user could simply hit "Return" and be
done. However, it would be best if this could be completely freed of the
need for interaction by the user.
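For the programmatically generated filename mentioned in this post, the name could be derived from the page URL, its title, and the time of the save. A minimal sketch in Python, assuming a hypothetical helper `archive_filename` and a `.mht` extension; it only builds the name and says nothing about whether the Save As dialog can be prefilled with it.

```python
import re
from datetime import datetime
from urllib.parse import urlparse


def archive_filename(url, title, saved_at, extension=".mht"):
    """Build a filesystem-safe name from the save time, host, and page title."""
    host = urlparse(url).netloc or "unknown-host"
    # Keep a conservative character set; Windows forbids \ / : * ? " < > |
    host = re.sub(r"[^A-Za-z0-9.-]+", "-", host)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-")[:60].strip("-")
    if not slug:
        slug = "untitled"
    stamp = saved_at.strftime("%Y%m%d-%H%M%S")
    return f"{stamp}_{host}_{slug}{extension}"


name = archive_filename(
    "http://www.example.com/reports/q2?id=7",
    "Quarterly Report: Q2/2007",
    datetime(2007, 8, 22, 14, 0, 30),
)
print(name)
```

Putting the timestamp first keeps files sorted by save time in a directory listing, and the same string can serve as the key recorded in the database.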
Date:Wed, 22 Aug 2007 15:20:23 -0700
Author: kenfine@newsgroup.nospam
|
RE: Interactive web page archiving app: need guidance
Hi KF,
You may want to check out the following resource:
#Harvesting Web Content into MHTML Archive - The Code Project - C# Libraries
http://www.codeproject.com/cs/library/mhtmllib.asp
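For background, an MHTML (.mht) file is a MIME multipart/related message (RFC 2557): the HTML is the first part, and each image or stylesheet follows as its own part labelled with a Content-Location header. The sketch below is not the code from the article above; it is a small Python illustration of the file format, and `build_mhtml` and its parameters are hypothetical names chosen for this example.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(page_url, html, resources):
    """Pack a page and its assets into MHTML bytes.

    resources is a list of (url, content_type, data) tuples, where data is
    the raw bytes already downloaded for that asset.
    """
    archive = MIMEMultipart("related", type="text/html")
    archive["Subject"] = "Archived page"

    # The root part is the HTML document itself.
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url
    archive.attach(root)

    # Every asset becomes a base64 part keyed by its original URL.
    for url, content_type, data in resources:
        maintype, subtype = content_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)
        part["Content-Location"] = url
        archive.attach(part)

    return archive.as_bytes()
```

Because each part carries the URL it was fetched from, a viewer can resolve the page's references against the archive without rewriting the HTML.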
Regards,
Walter Wang (wawang@online.microsoft.com, remove 'online.')
Microsoft Online Community Support
==================================================
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
Date:Thu, 23 Aug 2007 11:19:38 GMT
Author: Walter Wang [MSFT]