Commons Feedparser - Tag List report

Tag List Report

The following document contains the listing of user tags found in the code. Below is the summary of the occurrences per tag.

Tag	Total number of occurrences
@todo	0
FIXME	71
TODO	2

Each tag is detailed below:

FIXME

Number of occurrences found in the code: 71

org.apache.commons.feedparser.AtomFeedParser	Line
what if there is no type attribute specified? What's the default?	145
get xml:base to expand the URIs.	283
move this code to MetaFeedParser...	343
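
On the question of the default type: under Atom 1.0 (RFC 4287), a text construct or content element without a type attribute is treated as if it had type "text". The sketch below assumes that rule. The class and method names are hypothetical and not part of the Feedparser API, and older Atom 0.3 feeds may need a different default.

```java
// Hypothetical helper, not part of the Feedparser code base.
final class AtomTypeDefaults {

    /**
     * Returns the effective type of an Atom text construct.
     * Assumes RFC 4287 semantics: a missing or blank type attribute means "text".
     */
    static String effectiveType(String typeAttribute) {
        if (typeAttribute == null || typeAttribute.trim().isEmpty()) {
            return "text";
        }
        return typeAttribute.trim();
    }
}
```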

org.apache.commons.feedparser.BaseParser	Line
unify this with RSSFeedParser.getChildElementTextByName	103
this can be rewritten to use getChild()	134

org.apache.commons.feedparser.ContentDetector	Line
look for the RDF namespace and the RSS DTD namespace	111

org.apache.commons.feedparser.FeedFilter	Line
return an object here so that I can flag a bozo bit.	74
this isn't actually true. We should leave the BOM and remove the prolog anyway due to the fact that this will still break the parser. Come up with some tests for UTF-16 to see if I can get it to break and then update this method.	105
note that when I was benchmarking this code that this showed up as a MAJOR bottleneck so we might want to optimize it a little more.	185

org.apache.commons.feedparser.FeedParserImpl	Line
when this is a JDOM or XML parser Exception we should detect when we're working with an XHTML or HTML file and then parse it with an XFN/XOXO event listener.	81
if we return the WRONG content type here we will break. getBytes()... UTF-16 and UTF-32 especially. We should also perform HTTP Content-Type parsing here to preserve the content type. This can be fixed by integrating our networking API from NewsMonster.	98
if this is XHTML we need to handle this with either an XFN or an XOXO directory parser. There might be more metadata we need to parse here. (also I wonder if this could be a chance to do autodiscovery).	172
if this is an UNKNOWN format we need to throw an UnsupportedFeedException (which extends FeedParserException)	179

org.apache.commons.feedparser.HTMLFeedParser	Line
only convert to using XFN if these types of links are detected. If it's just a plain XHTML file then we shouldn't use this interface. Also FeedVersion needs to be called.	44
only include onItem when we have at least ONE XFN relation that is valid.	73
when this current rel is NOT part of any XFN spec we should not be using the feed parser listener because it might just be a nofollow link or such.	84

org.apache.commons.feedparser.MetaFeedParser	Line
this should be refactored into a new class called MetaFeedParser to be used by both Atom and RSS. Also the date handling below needs to be generic.	40
make sure RSS .9 is working and 0.91. I just need to confirm but I think they are working correctly	51

org.apache.commons.feedparser.RSSFeedParser	Line
migrate this to XPath	151
if this is a GUID and isPermalink=false don't use it as the permalink.	157
move to the onContent API defined within the AtomFeedParser and deprecate this body handling.	208
with malformed XML this could throw an NPE. Luckily this format is rare now.	222
move to the onContent API defined within the AtomFeedParser and deprecate this body handling.	230
move to the onContent API defined within the AtomFeedParser and deprecate this body handling.	254
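
The GUID note at line 157 could be handled along these lines. In RSS 2.0 the isPermaLink attribute on guid is optional and defaults to true. This is a minimal sketch with hypothetical names. Preferring the guid over link when both are usable is an assumption, not something the report states.

```java
// Hypothetical helper, not part of the Feedparser code base.
final class GuidPermalink {

    /**
     * Picks the permalink for an RSS item.
     *
     * @param link            text of the item's link element, may be null
     * @param guid            text of the item's guid element, may be null
     * @param isPermaLinkAttr raw value of guid's isPermaLink attribute, null when absent
     */
    static String resolvePermalink(String link, String guid, String isPermaLinkAttr) {
        // RSS 2.0: a missing isPermaLink attribute means true.
        boolean guidIsPermalink = isPermaLinkAttr == null
                || "true".equalsIgnoreCase(isPermaLinkAttr.trim());
        if (guid != null && guidIsPermalink) {
            return guid;
        }
        // The guid is only an opaque identifier here, so fall back to link.
        return link;
    }
}
```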

org.apache.commons.feedparser.locate.AnchorParser	Line
we do NOT obey base right now and this is a BIG problem!	33
what if there are HTML comments here? We would parse links within comments which isn't what we want.	53
how do we pass back the content of the href?	56
we SHOULD be using this but it's not working right now.	78
won't work with single quotes	118
won't work with <a /> parse( "<a href=\"http://peerfear.org\" rel=\"linux\" title=\"linux\" >adf</a>", listener );	119
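
The quoting limitations noted at lines 118 and 119 could be approached with a pattern that accepts either quote style and looks only at the attribute, so an anchor written as `<a ... />` is matched as well. This is an illustrative sketch using java.util.regex, not the AnchorParser implementation. The class name is hypothetical, and it still does not skip HTML comments or honor a base element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Feedparser code base.
final class HrefExtractor {

    // Group 1 captures a double-quoted value, group 2 a single-quoted one.
    private static final Pattern HREF = Pattern.compile(
            "href\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')",
            Pattern.CASE_INSENSITIVE);

    /** Returns every quoted href value found in the given markup, in order. */
    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            hrefs.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return hrefs;
    }
}
```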

org.apache.commons.feedparser.locate.AnchorParserListener	Line
Pass a fourth attribute that is the body of the anchor here.	39

org.apache.commons.feedparser.locate.EntityDecoder	Line
see FeedFilter.java for a list of all valid HTML entities. I should replace them with character literals in this situation.	35
there are a LOT more of these and we need an exhaustive collection.	44
	51
(performance): do I have existing code that does this more efficiently?	67

org.apache.commons.feedparser.locate.FeedLocator	Line
if we were GIVEN an RSS/Atom/OPML/etc file then we should just attempt to use this and return a FeedList with just one entry. Parse it first I think to make sure it's valid XML and then move forward. The downside here is that it would be wasted CPU if it's HTML content.	81
add UNIT TESTS for Yahoo Groups and Flickr	116

org.apache.commons.feedparser.locate.LinkLocator	Line
if it's at the same directory level we should prioritize it. for example:	80
What happens if the Feed Parser is used to aggregate feeds on the localhost? This will break that. Brad Neuberg, bkn3@columbia.edu	97
we should assert that these feeds are from the SAME domain, not a link to another feed.	109
This is a hack, Brad Neuberg, bkn3@columbia.edu	210

org.apache.commons.feedparser.locate.ProbeLocator	Line
This doesn't seem like the right place for this. Can you document this more? It's cryptic. Brad Neuberg, bkn3@columbia.edu.	155

org.apache.commons.feedparser.locate.ResourceExpander	Line
What happens if resource is a "file://" scheme?	81
Brad says this method is totally broken.	265
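
For the "file://" question, one option is to delegate expansion to java.net.URI, whose resolve method applies the same relative-reference rules to any hierarchical scheme. The sketch below is an assumption about how expansion might be done, not a description of ResourceExpander. The class and method names are hypothetical, and URI.create throws IllegalArgumentException for strings that are not valid URIs (for example, ones containing spaces).

```java
import java.net.URI;

// Hypothetical helper, not part of the Feedparser code base.
final class ResourceExpansionSketch {

    /** Resolves a possibly relative reference against a base URI string. */
    static String expand(String base, String relative) {
        // An absolute reference is returned unchanged by resolve().
        return URI.create(base).resolve(relative).toString();
    }
}
```

For example, expanding `index.rdf` against `http://example.com/blog/index.html` yields `http://example.com/blog/index.rdf`.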

org.apache.commons.feedparser.locate.TestAnchorParser	Line
this won't work because it has an image	82
what about unit tests which have multiple lines ?	85
don't find anchors in comments. doTest( 0, "file:tests/anchor/anchor6.html" );	97
won't work with <a />	104

org.apache.commons.feedparser.locate.blogservice.Blosxom	Line
This might be fragile, but it is used across all of the Blosxom blogs I have looked at so far. Brad Neuberg, bkn3@columbia.edu	69

org.apache.commons.feedparser.locate.blogservice.ExpressionEngine	Line
No way to detect this type of weblog right now	56
Implement	77

org.apache.commons.feedparser.locate.blogservice.GreyMatter	Line
Implement	80

org.apache.commons.feedparser.locate.blogservice.Manila	Line
No way to detect this type of weblog right now	56

org.apache.commons.feedparser.locate.blogservice.MovableType	Line
Implement	80

org.apache.commons.feedparser.locate.blogservice.iBlog	Line
No way to detect this type of weblog right now	56

org.apache.commons.feedparser.network.BaseResourceRequest	Line
this needs to use the cache.	230

org.apache.commons.feedparser.network.NetworkException	Line
java.lang.NumberFormatException: For input string: "fie" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Integer.parseInt(Integer.java:468) at java.lang.Integer.parseInt(Integer.java:518) at org.peerfear.newsmonster.network.NetworkException.getResponseCode(NetworkException.java:142) at ksa.robot.FeedTask._doTaskLogFailure(FeedTask.java:264) at ksa.robot.FeedTask.run(FeedTask.java:202) at ksa.robot.TaskThread.doProcessTask(TaskThread.java:298) at ksa.robot.TaskThread.run(TaskThread.java:111)	117
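
The trace above shows Integer.parseInt failing on the non-numeric input "fie" inside getResponseCode. A defensive parse might look like the sketch below. How NetworkException actually obtains the string is not shown in this report, so the helper, its name, and the -1 sentinel are all assumptions.

```java
// Hypothetical helper, not part of the Feedparser code base.
final class ResponseCodes {

    /** Sentinel returned when no numeric response code can be extracted. */
    static final int UNKNOWN = -1;

    /** Parses an HTTP response code, returning UNKNOWN instead of throwing. */
    static int parseResponseCode(String raw) {
        if (raw == null) {
            return UNKNOWN;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Input such as "fie" is not a number; report it as unknown.
            return UNKNOWN;
        }
    }
}
```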

org.apache.commons.feedparser.network.ResourceRequestFactory	Line
(should this be a linked list?)	77
remove this until we figure out how to do proxy authentication. java.net.Authenticator.setDefault ( new Authenticator() );	204

org.apache.commons.feedparser.network.URLCookieManager	Line
How can we make sure to delete older sites...?! no need for this to grow to infinite size.	27
merge these... new cookies into the site cookies	94
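
One way to keep the per-site cookie store from growing without bound is a least-recently-used map built on java.util.LinkedHashMap. This sketch is an assumption about a possible fix rather than the URLCookieManager design. The class name and the site-keyed layout are hypothetical, and the map is not thread-safe without external synchronization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Feedparser code base.
final class BoundedSiteCookies<V> extends LinkedHashMap<String, V> {

    private static final long serialVersionUID = 1L;

    private final int maxSites;

    BoundedSiteCookies(int maxSites) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxSites = maxSites;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Called after each insertion; evicts the least recently used site when over the cap.
        return size() > maxSites;
    }
}
```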

org.apache.commons.feedparser.network.URLResourceRequest	Line
do smart user agent detection. if this is a .html file we can set it to use Mozilla and if not we can use NewsMonster _urlConnection.setRequestProperty( "Referer", REFERER );	136
performance improvement... don't write to disk and then read from disk.	348

org.apache.commons.feedparser.sax.RSSFeedParser	Line
move to a FastStringBuffer that's not synchronized.	315
it might be possible to call an item again without a member and the value from the LAST item is used... this needs to be a fatal error and we need to clear ...	475
is there a more efficient way to clear a buffer than this?	486
also only do this if it's necessary and content has actually been added. This will save some performance.	488
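
The buffer notes above could be addressed with java.lang.StringBuilder, which is the unsynchronized counterpart of StringBuffer, cleared with setLength(0) only when something was appended. The sketch below uses hypothetical names and is not the sax.RSSFeedParser code. Clearing on every read also avoids carrying a value over from the previous item.

```java
// Hypothetical helper, not part of the Feedparser code base.
final class ElementTextBuffer {

    // StringBuilder has no synchronization overhead, unlike StringBuffer.
    private final StringBuilder buffer = new StringBuilder();

    /** Appends a chunk of character data, as delivered by a SAX characters() callback. */
    void append(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
    }

    /** Returns the collected text and resets the buffer for the next element. */
    String takeAndClear() {
        if (buffer.length() == 0) {
            return ""; // nothing was added, so there is nothing to copy or clear
        }
        String text = buffer.toString();
        buffer.setLength(0); // resets the length without allocating a new buffer
        return text;
    }
}
```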

org.apache.commons.feedparser.test.TestProbeLocator	Line
Test this	159
Test this	164
We should be able to pass this test when we expand resources inside of the Feed Parser; we don't currently do this yet, Brad Neuberg, bkn3@columbia.edu	190
use the IO package from NewsMonster for this.	545

TODO

Number of occurrences found in the code: 2

org.apache.commons.feedparser.FeedFilter	Line
undeclared namespace prefixes should be expanded to their common form. 'rdf', 'atom', 'xhtml' etc. Considering that there will only be a handful H and then 4^36 different possibilities, the probability will only be H in 4^36, which is pretty good odds that we won't have a false positive.	84

org.apache.commons.feedparser.MetaFeedParserListener	Line
what does RSS 0.91, 0.9, etc provide?	124
