Click and navigate to chapters and sections

From Wiki
Jump to navigation Jump to search

<< HTML_and_ConTeXt | HTML_to_ConTeXt >

In this example, we created new pages for chapters and sections so that each part of the document could be authored by a different person. In Informl new pages are indicated by the CSS class name "existingWikiWord" as shown in the following figure.

Wiki prev2.jpg.

  <a class="existingWikiWord"
      APCC Research and Development Projects

Knowing this, I have used the following 'hpricot' code to click on chapter and section links to retrieve their contents.

chapters= (doc/"p/a.existingWikiWord")

# we need to navigate one more level into the web page
# let us discover the links for that
chapters.each do |ch|
  chap_link = ch.attributes['href']
  # using inner_html we can create subdirectories

  chap_name = ch.inner_html.gsub(/\s*/,"")
  chap_name_org = ch.inner_html

  # We create chapter directories
  system("mkdir -p #{chap_name}")
  fil.write "\\input #{chap_name}  \n"
  `rm #{chapFil}`,"a")
  cFil.write "\\chapter{ #{chap_name_org} } \n"
  # We navigate to sections now
  sections= (doc2/"p/a.existingWikiWord")
  sections.each do |sc|
    sec_link = sc.attributes['href']
    sec_name = sc.inner_html.gsub(/\s*/,"")

    `rm #{secFil}`,"a")
    `rm #{sechFil}`,"a")

After navigating to sections (h1 elements in HTML) retrieve their contents and send it to the ruby function "scrape_page.rb" for filtering.

    #  scrape_the_page(sec_link,"#{chap_name}/#{sec_name}")
    cFil.write "\\input #{chap_name}/#{sec_name} \n"
fil.write "\\stoptext \n"