Off-Topic: Parse an html file and transfer the text found

Leo Cacciari leo.cacciari at gmail.com
Wed Aug 6 15:46:12 UTC 2008


On Wed, 06/08/2008 at 11:18 -0400, John Toliver wrote:
> I want to send a pastebin because I think it's html with javascript
> embedded, but I'm not sure......
Please try not to top-post.

It all depends on what the JavaScript is for. If it is some REST thing,
then you have a problem: the "visible" content of the page would
depend on the interaction of those REST components with the server,
and parsing the HTML+JavaScript alone will lead you nowhere.

On the other hand, if the JavaScript is only there to produce some visual
effect, without adding to the data you are interested in, then it is
easy to eliminate it at parsing time.
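
A minimal sketch of that second case, assuming Python and its standard
html.parser module (you did not say which language you are using, so
this is only an illustration). The names TextExtractor and extract_text
are made up for the example. It collects the text of the page and
ignores whatever sits inside <script> or <style> elements:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text from a page, ignoring <script> and <style> content."""

    SKIP = {"script", "style"}  # elements whose content is not page text

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # non-empty pieces of text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep the data only when we are outside every skipped element.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def extract_text(html):
    """Return the text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling extract_text() on the contents of your file would give you the
text with the scripts dropped. It will not help in the first case,
since text that the scripts fetch from the server is never in the file
to begin with.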

Enjoy

-- 
Leo Cacciari
