For this homework assignment, you will create an HTMLLinkParser class that extracts the URL from the href attribute of every <a> (anchor) tag in a block of HTML.
You have been given some starter code for HTMLLinkParser below. You need to modify the regular expression to capture the link in the href attribute of any <a> anchor tags in the text. Go to http://htmlhelp.com/reference/html40/special/a.html to see examples of these tags.
Your regular expression must handle mixed case, extra whitespace (including line breaks), and additional attributes before or after href. For example, all of the following are valid links:
<a href="http://www.usfca.edu/">
<a href="http://docs.python.org/library/string.html?highlight=string#module-string">
<A HREF="HTTP://WWW.USFCA.EDU">
<A hREf="http://www.usfca.edu">
<a href = "http://www.usfca.edu" >
<a name="home" href="index.html">
<a href="index.html" name="home">
<a name="home" target="_top" href="index.html" id="home" accesskey="A">
<a href =
"http://www.usfca.edu">
<A
HrEF = "index.html" naMe=home >
You can assume the HTML is valid, and that the value for the href attribute will always be quoted and URL encoded.
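As a rough illustration of how the cases above interact, here is a minimal sketch of one possible pattern. It is not the starter code and is not guaranteed to pass every test: the class name LinkRegexSketch and the method listLinks are hypothetical, and the sketch assumes href values are always wrapped in double quotes, as in the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the provided starter code.
final class LinkRegexSketch {

    // <a, whitespace, optionally other attributes followed by whitespace,
    // then href, optional whitespace around '=', and a double-quoted value.
    // \s and [^>] both match line breaks, so tags may span several lines.
    // Assumption: the href value is always double-quoted.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+(?:[^>]*?\\s+)?href\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the text, in order of appearance.
    static List<String> listLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1)); // group 1 is the text between the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href = \"http://www.usfca.edu\" >USF</a> "
                + "<A\nHrEF = \"index.html\" naMe=home >Home</A>";
        System.out.println(listLinks(html));
        // prints [http://www.usfca.edu, index.html]
    }
}
```

The CASE_INSENSITIVE flag covers variants such as HREF and hREf, and requiring whitespace immediately after <a keeps the pattern from matching other tags whose names merely start with the letter a.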
To help you test your regular expression, you have been provided with JUnit tests in the class HTMLLinkTester. To receive 100% on this homework, you must pass all of these unit tests. You will need to add JUnit 4 to your build path in Eclipse for the unit tests to work.
You must submit this homework to your SVN repository using Eclipse. Your homework, including the src directory and all required source code files, must be at the following location:
https://www.cs.usfca.edu/svn/<username>/cs212/homework03/
Replace <username> with your CS username in all lowercase letters.
The explicit locations required for key files are:
https://www.cs.usfca.edu/svn/<username>/cs212/homework03/src/HTMLLinkParser.java
https://www.cs.usfca.edu/svn/<username>/cs212/homework03/src/HTMLLinkTester.java
Your code must be committed by 11:59pm on Friday, February 15, 2013. Late homework is not accepted.