wespiva — Web Spider Validator
Web Spider Validator, wespiva for short, is a combination of
- a web spider (robot, crawler), which traverses webpages that link to each other,
- and an XHTML validator, which checks whether a page contains only valid tags, attributes and allowed attribute values.
Description
The purpose of this tool is to ensure high-quality, standards-compliant websites.
Xenu's Link Sleuth is a great tool for spidering and finding dead links, but it does not validate a page.
The w3.org-Validator is a great validation tool, but it only checks a single page at a time and is often overloaded and slow.
wespiva overcomes these restrictions by spidering and validating in one pass.
This tool assists in the transition of bigger sites to XHTML.
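The idea of spidering and checking in one pass can be illustrated with a small Python sketch. This is not wespiva's actual code (wespiva is a .NET application); it only checks well-formedness rather than full XHTML validity, and the function name `spider_and_check`, the `fetch` callback and the `example.test` pages are assumptions made up for the illustration.

```python
from collections import deque
from urllib.parse import urljoin, urldefrag
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

def spider_and_check(start_url, fetch):
    """Breadth-first crawl from start_url.

    fetch(url) is a caller-supplied (hypothetical) function returning the page
    source as a string, or None if the page cannot be retrieved (dead link).
    """
    visited = set()
    dead_links = []   # (referring page, missing target)
    malformed = {}    # url -> parser error message
    queue = deque([(None, start_url)])
    while queue:
        referrer, url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        source = fetch(url)
        if source is None:
            dead_links.append((referrer, url))
            continue
        try:
            root = ET.fromstring(source)
        except ET.ParseError as err:
            # Not well-formed: record the reason, no links can be extracted
            malformed[url] = str(err)
            continue
        for element in root.iter():
            # Accept <a> with or without the XHTML namespace
            if element.tag in ("a", XHTML_NS + "a") and element.get("href"):
                target, _fragment = urldefrag(urljoin(url, element.get("href")))
                queue.append((url, target))
    return visited, dead_links, malformed

# Tiny in-memory "site" so the sketch runs without network access
SITE = {
    "http://example.test/": '<html><body><a href="a.html">A</a>'
                            '<a href="missing.html">M</a></body></html>',
    "http://example.test/a.html": '<html><body><p>unclosed</body></html>',
}
visited, dead_links, malformed = spider_and_check("http://example.test/", SITE.get)
```

A real crawler would also restrict itself to the start host and fetch pages over HTTP; both are left out here to keep the sketch short.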
Download
Although wespiva is programmed not to harm any computer, there is a chance of a crash caused by accident or by a programming error in the application or in one of the .NET functions it uses, which could damage your system. So that I cannot be held liable for any negative consequences resulting from the use of this program, you may only use it if you accept the following rules:
- You back up your system before installing and using it, or you run it at your own risk in a virtual machine.
- You will not hold me responsible for damages (lost time, crashed computer, etc.) unless the damage was caused intentionally.
Click here to download wespiva Version 0.1.7
(115 KB ZIP file, 2008-08-21)
wespiva Version 0.1.6
(100 KB ZIP file, 2007-09-14)
Installation
Prerequisites
wespiva runs on Windows with the .NET Framework 2.0 installed.
How to run
Just unzip the single file in the zip-archive
and start using it.
Frequently asked questions for wespiva
- Will there be a Mono version?
- Probably yes, if someone pays for it. If no one is willing to pay, there is no big demand for it.
- How many pages can be checked in one run?
- I've used it to check sites with more than 50,000 elements in less than 15 minutes. The duration depends mainly on the line speed and the responsiveness of the webserver delivering the pages.
- Why Validation?
- I'll let others speak here:
Samples
Features
- easy to use
- easy to install (just a single exe file)
- fast (can check over 50,000 elements in less than 15 minutes)
- detects dead links
- finds validation issues
- generates easy-to-understand reports
- generates a sitemap in the standard sitemap format
- can be called from the command line for automated periodic checking of a site
- spidering and validation are done in a background thread, so the GUI stays responsive
- convenient configuration, for example a grace period can be set
- runnable from the command line:
c:\ wespiva.exe "www.wissing.com" "example@example.not"
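To illustrate the sitemap feature listed above, here is a minimal Python sketch that writes a list of crawled URLs as a sitemap. It assumes that "standard sitemap format" means the sitemaps.org XML protocol (version 0.9); the function name `build_sitemap` and the sample URLs are hypothetical and do not come from wespiva itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed to be the format meant above)
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per unique URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([
    "http://example.test/",
    "http://example.test/a.html",
    "http://example.test/",  # duplicates are dropped
])
```

Only the mandatory `loc` element is written; optional elements such as `lastmod` are omitted in this sketch.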
Known Bugs
- It's not a bug, but a limitation: only well-formed pages are validated. If a page is not well-formed, the cause of the offending error is shown. Please correct these errors first. Spidering is unaffected, but validation stops at these errors.
- This version does not show the correct result of the check of external links.
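What "well-formed" means here can be tried out with a short Python sketch that uses the standard XML parser. It is only an illustration of the concept, not how wespiva performs the check internally; the function name `well_formedness_error` is made up for this example.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(source):
    """Return None if source is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(source)
    except ET.ParseError as err:
        line, column = err.position  # location of the first offending error
        return line, column, str(err)
    return None

good = "<html><body><p>ok</p></body></html>"
bad = "<html><body><p>missing end tag</body></html>"

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

The parser stops at the first error, which mirrors the behaviour described above: the error has to be corrected before the rest of the page can be checked.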
Future Features
- Proxy support
- https support
- Online version
- Multi-threading (for more than the GUI)
- Text extraction
- Style/CSS extraction
- JavaScript extraction
- Thumbnails of every web page and graphic resource
- robots.txt conformance
- checking of inline anchor hrefs (like #top)
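As an illustration of what robots.txt conformance from the list above could look like, the following Python sketch filters a list of URLs against a robots.txt file. It is not part of wespiva; the function name `allowed_urls` and the user agent string "wespiva" are assumptions chosen for the example.

```python
from urllib.robotparser import RobotFileParser

def allowed_urls(robots_txt, urls, user_agent="wespiva"):
    """Return the URLs that the given robots.txt content permits for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return [url for url in urls if parser.can_fetch(user_agent, url)]

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

crawlable = allowed_urls(ROBOTS_TXT, [
    "http://example.test/index.html",
    "http://example.test/private/report.html",
])
```

A spider that respects robots.txt would run such a filter before queueing a link for download.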
Other nice Validators
They are really good, but don't let you check whole sites:
