Yapip - Yet Another [HTTP] Proxy in Perl.

--------------------------------------------------------------------------------
Introduction
--------------------------------------------------------------------------------

Kyle R. Burton
Fri Aug 17 09:37:08 EDT 2001

I've started a secondary project in Pas to provide an integrated regression
testing tool.  Justin Bedard and I previously created a regression testing
tool that included a [modified] CGI proxy.  It recorded what you did in your
browser as a Perl program that could play back the actions you took.
Starting with the recorded script, you could modify it to create a
regression test for specific functionality on the website.  Our original
work is available here:

    http://www.bgw.org/projects/guts/

The CGI based proxy is kind of slow (which is no fault of the program, but a
consequence of it being CGI based), so I've started work on a custom proxy
that's object oriented and uses IO multiplexing instead of forking to handle
connections.

Starting from scratch has provided a few added benefits over the proxy
programs I found while searching on the web.  Since the proxy is an object,
it can easily be subclassed to implement new functionality (to record
regression testing scripts, for example).  This should allow the proxy to
live several different lives: primarily as a simple proxy, and secondarily
as a tool for recording and generating regression tests.

The IO multiplexing nature of this implementation is significantly faster
than CGI or forking style proxies.  Because IO multiplexing needs only one
process, the proxy can more easily track what its clients are doing, as well
as provide a 'control' style connection.  The drawback is that while the
proxy is busy handling data from one connection, all the others block.
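
As a rough illustration of the single-process design described above, here
is a minimal sketch of one pass through a select()-style loop, written with
the core IO::Select and IO::Socket::INET modules.  The helper name
service_ready_handles and the %buffers hash are hypothetical, chosen for this
example; this is not the code in Org::Bgw::HTTP::Proxy, only an assumption
about how such a loop could look.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Socket::INET;

# Hypothetical helper: one pass of a select()-based loop.  Accepts new
# clients on $listener and appends pending bytes from existing clients to
# %$buffers (keyed by file descriptor).  Returns the number of handles
# that were ready during this pass.
sub service_ready_handles {
    my ($select, $listener, $buffers, $timeout) = @_;

    my @ready = $select->can_read($timeout);
    for my $fh (@ready) {
        if ($fh == $listener) {
            # New connection: register it with the selector, no fork needed.
            my $client = $listener->accept or next;
            $select->add($client);
            $buffers->{ fileno($client) } = '';
        }
        else {
            # While this read is being handled, every other client waits.
            my $n = sysread($fh, my $chunk, 4096);
            if ($n) {
                $buffers->{ fileno($fh) } .= $chunk;
            }
            else {
                # EOF or read error: forget the client.
                $select->remove($fh);
                delete $buffers->{ fileno($fh) };
                close $fh;
            }
        }
    }
    return scalar @ready;
}
```

A caller would create a listening IO::Socket::INET, wrap it in
IO::Select->new($listener), and call the helper repeatedly.  Because all
state lives in one process, a plain hash is enough to track every client.
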
This blocking is acceptable for our purposes: the proxy is designed for use
mainly as a testing tool, not as a high performance gateway or caching
proxy.  It will not be an issue unless there are many simultaneous users on
the proxy.  In most instances where we see this software being used
(tinkering, testing, observation), it should perform extremely well.

The RFC for HTTP includes information on how HTTP proxies should behave.
This proxy was not created based on that RFC.  It probably should be.  It
attempts to modify HTTP requests as little as possible, only re-writing the
request URI as necessary.  The RFC for HTTP/1.1 can be found here:

    http://www.faqs.org/rfcs/rfc2068.html

Http Sniffer by Tim Meadowcroft (another proxy written in Perl) was used
extensively as an example when writing this code.  Http Sniffer is available
from:

    http://www.schmerg.com/HttpSniffer.pl.txt

Yapip - Yet Another [HTTP] Proxy in Perl.
------------------------------------------------------------------------------
  - more of a pass through than a full HTTP proxy
  - ideal for subclassing to perform extended tasks (like logging, or
    request observation, etc.)
  - captures HTTP header and post data separately
  - currently captures client requests
  - will capture server responses

The following environment variables need to be set before you can
successfully run the proxy:

    PAS_BASE=/path/to/the/installation/of/pas
    PERL5LIB=$PAS_BASE/src

To run it, use the proxy shell:

    [mortis@malevolence pas]$ perl -MOrg::Bgw::HTTP::Proxy::Shell -e shell
    args:
    trying to connect to proxy localhost:8081...failed : Connection refused
    You are not connected to a proxy, try 'help' or 'connect'
    proxy> help
    proxy> start
    launching server...started
    trying to connect to proxy localhost:8081..connected
    proxy> status
    proxy> status
    Pid:                  6091
    Sid:                  6091
    Current Time:         Fri Aug 17 09:52:53 2001
    Start Time:           Fri Aug 17 09:52:43 2001
    Elapsed Time:         10
    Connections:          0
    Control Connections:  1
    Requests Processed:   0
    Logfile:              /home/mortis/projects/pas//logs/pas.log
    Loglevel:             9
    proxy> stop
    shutting down...
    proxy> quit
    exiting...
    [mortis@malevolence pas]$

You can also run the proxy directly (without the shell):

    [mortis@malevolence pas]$ perl -MOrg::Bgw::HTTP::Proxy -e run_proxy
    args:
    daemonizing...forked.  Child is: 6095.
    [mortis@malevolence pas]$

For debugging, run it from the command line and instruct it not to
daemonize:

    [user@host dir]$ perl -MOrg::Bgw::HTTP::Proxy -e run_proxy -- -nodaemon

Once the proxy is running you can use the shell to establish a control
connection to it.  Starting or stopping the shell has no effect on the
proxy; the shell can connect to or disconnect from the proxy arbitrarily.

The proxy outputs information to the PAS logfile.  Tail the logfile to watch
what it's doing:

    [mortis@malevolence pas]$ tail -f $PAS_BASE/logs/pas.log

To use the proxy, configure your browser to use an HTTP proxy, and point it
at the proxy on the port reported by the proxy when it started (or in the
status).

TODO
--------------------------------------------------------------------------------
  - implement response header parsing
  - implement client ip tracking
    . reportable through the control connection
  - implement request/response tracking by ip
    . initiate 'recording' through the shell interface
  - streaming monitor connections
    . connect to the proxy, issue a 'stream ip' request, and any requests
      made by that ip are sent directly to you as they come in - turns the
      connection from an interactive one to a streaming connection.
  - implement regression script recording
    . initiate 'recording' through the shell interface
  - For the app server, switch on 'test mode' or 'debug mode', and then from
    inside Pas's base page object, we can establish a connection to the
    proxy and communicate with it to record request/response information
    (there is a lot of potential for this idea)
  - how else can we facilitate the regression testing process?
    . automatically parsing out HTML?  form elements?  Possibly with
      HTML::Parser (though this is slow)
      - lists of anchor tags (for further spidering?)
      - names (and sizes) of textfields
      - names and contents of select elements (popup/dropdown lists)
      - names and contents of submit buttons
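
As a possible starting point for the "implement response header parsing"
item in the TODO list above, here is a minimal sketch in plain Perl with no
modules beyond the core.  The function name parse_response_head and the
shape of the hash it returns are assumptions made for this example; they are
not part of Org::Bgw::HTTP::Proxy.  Joining repeated headers with ", " is a
simplification.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw HTTP response into status line fields,
# a header hash, and the body.  Returns nothing if the blank line that
# ends the headers has not arrived yet or the status line is malformed.
sub parse_response_head {
    my ($raw) = @_;

    # Headers end at the first empty line; everything after is the body.
    my ($head, $body) = $raw =~ /\A(.*?)\r?\n\r?\n(.*)\z/s
        or return;

    my ($status_line, @lines) = split /\r?\n/, $head;
    my ($version, $code, $reason) =
        ($status_line // '') =~ m{\A(HTTP/\d+\.\d+)\s+(\d{3})\s*(.*)\z}
        or return;

    my (%headers, $last);
    for my $line (@lines) {
        if (defined $last && $line =~ /\A[ \t]+(.*)\z/) {
            # Continuation line: fold into the previous header's value.
            $headers{$last} .= " $1";
        }
        elsif ($line =~ /\A([^:\s]+):[ \t]*(.*)\z/) {
            # Header names are case-insensitive, so store them lower-cased.
            $last = lc $1;
            $headers{$last} =
                exists $headers{$last} ? "$headers{$last}, $2" : $2;
        }
    }

    return {
        version => $version,
        code    => $code,
        reason  => $reason,
        headers => \%headers,
        body    => $body,
    };
}
```

Because the helper returns nothing until the full header block is present,
a caller can keep appending data read from the server connection to a
buffer and retry the parse after each read.
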