Difference between revisions of "Navigating to HTML page"
Revision as of 08:02, 16 July 2007
The next step is to retrieve the HTML pages created in the step above. Here I have used the Ruby library 'open-uri' to retrieve the web page and another library, 'hpricot' (http://code.whytheluckystiff.net/hpricot), to edit these pages and translate the HTML markup into ConTeXt markup.
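To make the navigation step concrete, here is a minimal sketch of how the links on a retrieved page could be collected and turned into absolute URLs. It uses only the Ruby standard library so that it runs without a network connection or extra gems: a simple regular expression stands in for hpricot's parsing, the helper name `extract_links` is hypothetical, and the `example.org` address and the sample HTML are placeholders rather than anything from the real server.

```ruby
require 'uri'

# Hypothetical helper (not part of the original script): collect the href
# targets of all <a> tags in an HTML string and resolve them against the
# URL the page was fetched from. A regular expression is used here only
# as a stand-in for a real HTML parser such as hpricot.
def extract_links(html, base_url)
  html.scan(/<a\s[^>]*?href\s*=\s*["']([^"']+)["']/i)
      .flatten
      .map { |href| URI.join(base_url, href).to_s }
      .uniq
end

# Placeholder for a page fetched from the server; in the real script the
# HTML would come from open-uri instead of a string literal.
sample = <<~HTML
  <html><body>
    <a href="chapter1/">Chapter 1</a>
    <a href="chapter2/index.html">Chapter 2</a>
    <a href="/AnnRep07/appendix.html">Appendix</a>
  </body></html>
HTML

links = extract_links(sample, 'http://example.org/AnnRep07/')
puts links
```

Resolving every link against the base URL means relative and absolute hrefs can be handled the same way when each linked page is fetched in turn. A real parser is the safer choice on pages whose markup is less regular than this sample.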
#!/usr/bin/ruby
# scan_page.rb = Retrieves the HTML page of interest from the server,
# navigates to links within the main page and constructs a
# ConTeXt document

require 'rubygems'
require 'open-uri'     # the open-uri library
require 'hpricot'      # the hpricot library
require 'scrape_page'  # user-defined function to filter HTML into ConTeXt

# scans the home page and lists
# all the directories and subdirectories
doc = Hpricot(open("http://ipa.dd.re.ss/AnnRep07"))