|RFx ID :||19919670|
|Tender Name :||Whole of Domain Web Harvest Department of Internal Affairs (NZ)|
|Reference # :||DIA/2018/005|
|Open Date :||Tuesday, 17 July 2018 10:00 AM (Pacific/Auckland UTC+12:00)|
|Close Date :||Tuesday, 14 August 2018 3:00 PM (Pacific/Auckland UTC+12:00)|
|Tender Type :||Request for Proposals (RFP)|
|Tender Coverage :||Sole Agency|
|Required Pre-qualifications :||None|
|Alternate Physical Delivery Address :|
|Alternate Physical Fax Number :|
The National Library of New Zealand is part of the Department of Internal Affairs and is seeking Respondents interested in partnering with the Library to undertake a crawl, in January 2019, of the publicly accessible New Zealand web domain. The New Zealand web domain is broadly defined as those resources belonging to the country-code top-level domain (i.e. .nz) and other resources identified as being hosted within New Zealand.
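As an illustration only, the scope definition above could be expressed as a simple host check. This is a minimal sketch, not the Library's specification: the function name `in_scope` and the `nz_hosted_hosts` parameter are hypothetical, and how resources "hosted within New Zealand" are identified is not stated in this notice, so the sketch assumes they arrive as a supplied set of host names.

```python
from urllib.parse import urlsplit


def in_scope(url, nz_hosted_hosts=frozenset()):
    """Return True if the URL's host is under .nz or is in a supplied
    set of hosts identified (by some other means) as hosted in New Zealand.

    Both the function name and nz_hosted_hosts are illustrative assumptions.
    """
    # hostname is None for strings that do not parse as a URL with a host
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return host == "nz" or host.endswith(".nz") or host in nz_hosted_hosts
```

A host such as `example.nz.evil.com` is rejected, because the check looks at the end of the host name rather than searching for ".nz" anywhere in the URL.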
There are effectively two phases to this work. First, the crawl is the managed, systematic browsing of the Internet via software, following links found from starting URLs (seeds). Second, the resulting harvest is the set of collected artefacts of the crawl. These are the Internet resources (HTML documents, images, stylesheets, multimedia, social media, etc.) that are collected and packaged in such a way that allows faithful replay of the original webpages.
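The crawl phase described above can be sketched as a breadth-first traversal from the seeds. This is a toy illustration under stated assumptions, not a description of any Respondent's solution: the `crawl` function, the `fetch_links` callback and the `max_depth` parameter are hypothetical names, and the link graph is an in-memory dictionary so that no network access is needed. Packaging the harvest for replay is out of scope for the sketch.

```python
from collections import deque


def crawl(seeds, fetch_links, max_depth=2):
    """Breadth-first crawl from the seeds; returns URLs in visit order.

    fetch_links(url) is a hypothetical callback that returns the links
    found on a page. max_depth is the number of link hops followed
    from a seed (seeds are at depth 0).
    """
    unique_seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(unique_seeds)
    queue = deque((seed, 0) for seed in unique_seeds)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Toy link graph standing in for real pages (illustrative data only).
LINKS = {
    "a.nz/": ["a.nz/about", "b.nz/"],
    "a.nz/about": ["a.nz/"],
    "b.nz/": ["b.nz/deep"],
    "b.nz/deep": ["b.nz/deeper"],
}


def toy_fetch(url):
    return LINKS.get(url, [])
```

With the toy graph, `crawl(["a.nz/"], toy_fetch, max_depth=2)` visits `a.nz/`, `a.nz/about`, `b.nz/` and `b.nz/deep`, and stops before `b.nz/deeper` because that page is three hops from the seed.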
The Library is looking for Respondents with a proven solution, technical and support infrastructure, and a track record in web crawling at scale. Respondents must be able to crawl according to breadth and depth specifications that will be supplied by the Library.
It is important for the Successful Respondent to run a transparent crawl process so that the Library has visibility into the crawl while it is underway. This should include the flexibility to add seeds and/or stop further crawling of a particular domain during the crawl.
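One way to picture the controls described above is a crawl queue that accepts new seeds, blocks domains and reports its state while the crawl is running. The sketch below is an assumption about how such controls might look, not a requirement of this RFP; the class `ControlledFrontier` and its methods are hypothetical names.

```python
from collections import deque
from urllib.parse import urlsplit


class ControlledFrontier:
    """Hypothetical crawl queue that can be adjusted during a crawl."""

    def __init__(self, seeds):
        self._queue = deque()
        self._seen = set()
        self._blocked = set()
        for seed in seeds:
            self.add_seed(seed)

    def add_seed(self, url):
        """Queue a new seed unless it has already been queued."""
        if url not in self._seen:
            self._seen.add(url)
            self._queue.append(url)

    def block_domain(self, domain):
        """Stop further crawling of a domain and its subdomains."""
        self._blocked.add(domain.lower())

    def _is_blocked(self, url):
        host = (urlsplit(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d) for d in self._blocked)

    def next_url(self):
        """Return the next URL to fetch, skipping blocked domains; None when empty."""
        while self._queue:
            url = self._queue.popleft()
            if not self._is_blocked(url):
                return url
        return None

    def status(self):
        """Snapshot of the crawl state, for reporting to an observer."""
        return {
            "queued": len(self._queue),
            "seen": len(self._seen),
            "blocked_domains": sorted(self._blocked),
        }
```

For example, after `block_domain("b.nz")` any queued `b.nz` URLs are skipped by `next_url()`, while a seed added afterwards with `add_seed("https://c.nz/")` is still returned.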
It is essential to achieve a harvest that is a good representation of the New Zealand web domain: all provided seeds must be crawled, and captures must be as complete as possible, to allow the faithful replay of the original webpages.
This project will enable the National Library of New Zealand (NLNZ) to collect a snapshot of the entire New Zealand web space, as it has been doing since 2008. By providing the crawling service for this project, you will be playing a key role in building this large and growing historical collection of New Zealand websites and social media.