2015-01-11

CPAN Pull Request Challenge (PRC) 2015 - Introduction / January (Plack::Session::State::URI)

I have signed myself up for the CPAN Pull Request Challenge (PRC) 2015 as organized by Neil Bowers. The idea is that every month you are assigned, at random, a CPAN module that is hosted on GitHub, and you have to submit at least one pull request for it. I heard about it a few days ago when Olaf Alders announced it to the Toronto Perl Mongers list (again?). It sounded like a neat idea so I decided to sign up.

My January assignment is Plack::Session::State::URI. I began by creating an issue on the GitHub repository requesting feedback and guidance from the author. Unfortunately, I haven't received a response yet. Fortunately, the module is very tiny. Unfortunately, it is built on Plack, with which I have only limited experience (though at least I have some...). There are a few things that could obviously use some work.

  • There is very little documentation. In fact, there's so little documentation that I just have to infer what it does myself. Fortunately, I have a background in Web programming so it was no real mystery. Unfortunately, that still makes it difficult for me to test it. And worse, it makes it difficult for me to imagine how changes I make will affect users of the module.
  • There are very few tests. Granted, the module only has about 60 SLOC so how much testing is really needed? That said, I believe that several more tests could be possible. Many of them will probably require a richer understanding of the Plack::Middleware framework than I possess. It would be lovely to acquire that extra needed experience, and perhaps this is a good opportunity to do it, but I'll need to try to balance it with the other aspects of life.
  • Parsing HTML with a regular expression. Of course, regular expressions cannot properly parse HTML, SGML, or XML. You need a proper parser to do that correctly. Several such modules exist on CPAN. Unfortunately I only have experience using XML parsers in Perl, and HTML is far too forgiving for an XML parser to get along with it. I'll need to learn to use an HTML parser with the power to edit the document contents, preferably without altering the formatting of the existing document. This is probably the most useful thing that I could do for this project at the present time, but I'll need to identify the module to use and learn to use it properly first. If I do pursue this problem then I can also add test cases that break the regular expression parsing and use that to verify that my fix works.
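
To make the regular expression concern concrete, here is a minimal sketch of the kind of test case I have in mind. It is written in Python purely for illustration (the module itself is Perl), and the pattern `NAIVE_HREF` is a hypothetical stand-in, not the expression the module actually uses. It compares a naive pattern against the standard library's `html.parser` on a small sample that contains a single-quoted attribute, an upper-case attribute, and a commented-out link.

```python
import re
from html.parser import HTMLParser

# Hypothetical naive pattern (an assumption for illustration only):
# it matches just lower-case, double-quoted href attributes.
NAIVE_HREF = re.compile(r'href="([^"]*)"')


def hrefs_by_regex(html):
    """Return every href value the naive pattern finds, in order."""
    return NAIVE_HREF.findall(html)


class HrefCollector(HTMLParser):
    """Collect href values from real <a> start tags only."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def hrefs_by_parser(html):
    """Return href values found by a proper HTML tokenizer."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


SAMPLE = (
    '<a href="/one">one</a>\n'
    "<a href='/two'>two</a>\n"
    '<A HREF="/three">three</A>\n'
    '<!-- <a href="/commented-out">old</a> -->\n'
)

if __name__ == "__main__":
    print("regex: ", hrefs_by_regex(SAMPLE))
    print("parser:", hrefs_by_parser(SAMPLE))
```

On this sample the naive pattern misses the single-quoted and upper-case links and wrongly picks up the one inside the comment, while the parser reports exactly the three live links. A Perl test for the module could follow the same shape: feed in awkward but legal HTML and check which links get rewritten.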

Since I haven't received a response yet from the module maintainer I have begun doing my best to come up with work to do. We all have personal preferences when it comes to code formatting, structure, and style. I've tried to avoid rewriting too much code, but at the same time I've tried to address things that I believe could be improved. I expect the author will not like some of the changes that I've made so I'm trying to make it clear that I can rebase as necessary to keep the changes that he likes while getting rid of the changes that he doesn't. I'm also trying to keep my commits nice and small so they're easy to understand and hopefully are small enough that controversial changes are isolated and easy to leave out.

I currently have a work-in-progress wip/tidy branch. This will probably eventually become my first pull request for this module. That branch is not considered safe yet. I am currently force pushing to it as I edit history and try to get things just right. I'm pushing mostly as a backup mechanism. In order to publish something that the module maintainer can review without worrying about the branch getting destroyed out from under him, I have published a static snapshot to preview/tidy/1 and submitted an issue to document it. The idea is that until the wip/tidy branch is ready to be merged I will continue to rebase it and publish snapshots to incrementing integer revisions as in preview/tidy/2, preview/tidy/3, ... preview/tidy/n. When it's finally done I'll probably delete wip/tidy and publish a tidy branch instead for the pull request. I have lots of experience with Git, but I have no real experience with pull requests. I mostly use Git by myself so I haven't had a need for pull requests and generally don't have to worry about people tripping over history edits either. One of the things I hope to learn from this experience is how to do nice pull requests and also how to manage history in the public eye without sacrificing history cleanliness or causing undue pain for others.

Hopefully I hear back from the maintainer soon so I have an idea which direction to go in. I quite enjoy writing (just look around you!) so I'd be happy to add rich documentation to this module too. I just don't know exactly how robust it is and don't want to go making documentation up for it. It would be nice to get feedback directly from the author or others who have used it so that I know better what the strengths and weaknesses are and could document them appropriately. I wouldn't want to lure people into wasting time with this module if it can't do what they need, and similarly I wouldn't want to chase them away if it will do exactly what they need. Perhaps I should work on setting up a basic Web site with it and seeing just how well it works. That is quite a lot of work though. There are advantages to it, but these days I am typically not a ball of energy at the end of a work day. We'll just have to see what I can accomplish in the coming weeks.