There is a free ETF portfolio back-testing tool at ETFReplay:
http://www.etfreplay.com/combine.aspx
Using that site, I back-tested a portfolio of 25% each GLD, TLT, SHY, and VTI from January 3, 2006 to May 10, 2011. (That was the longest time frame I could use, given the inception date of the youngest ETF.)
Over that period the portfolio had a total return of 63.7%, volatility of 9.0%, a Sharpe ratio of 0.65, and a CAGR of 9.7%.
The same back-test with VO substituted for VTI gave a total return of 66.4%, volatility of 9.3%, a Sharpe ratio of 0.66, and a CAGR of 10.0%.
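As a sanity check, the CAGR figures can be reproduced from the total returns and the length of the test period. This is only a rough sketch: the `cagr` helper and the 365.25-day year are my own assumptions, not necessarily how ETFReplay does the calculation.

```python
from datetime import date

def cagr(total_return, start, end):
    """Annualize a cumulative return (given as a fraction) over [start, end].

    Assumes a 365.25-day year; the site may use a different convention.
    """
    years = (end - start).days / 365.25
    return (1.0 + total_return) ** (1.0 / years) - 1.0

# Back-test window and total returns quoted in the post.
START, END = date(2006, 1, 3), date(2011, 5, 10)
for label, total in [("VTI portfolio", 0.637), ("VO portfolio", 0.664)]:
    print(f"{label}: CAGR {cagr(total, START, END):.1%}")
```

Under that assumption the results round to the same 9.7% and 10.0% reported above.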
This seems underwhelming, but VO did gain 35.6% over that period, while VTI gained only 24.8%. Both are absolutely swamped by GLD, which gained 174.8%. If GLD had not been so strong (and long and short Treasuries were also very strong, at 29.2% and 22.4%, respectively), the beneficial effect of using VO rather than VTI would be more evident.
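To see why the difference is so small, here is a rough approximation that simply weights each ETF's quoted return by 25%. It assumes buy-and-hold with no rebalancing, which is my simplification; the `RETURNS` table and `buy_and_hold_return` helper are illustrative names, and ETFReplay's own method may differ.

```python
# Per-ETF total returns quoted in the post (as fractions),
# January 3, 2006 to May 10, 2011.
RETURNS = {"GLD": 1.748, "TLT": 0.292, "SHY": 0.224, "VTI": 0.248, "VO": 0.356}

def buy_and_hold_return(tickers):
    """Equal-weight total return, assuming the portfolio is never rebalanced."""
    weight = 1.0 / len(tickers)
    return sum(weight * RETURNS[t] for t in tickers)

with_vti = buy_and_hold_return(["GLD", "TLT", "SHY", "VTI"])
with_vo = buy_and_hold_return(["GLD", "TLT", "SHY", "VO"])
print(f"with VTI: {with_vti:.1%}, with VO: {with_vo:.1%}, gap: {with_vo - with_vti:.1%}")
```

This gives roughly 62.8% and 65.5%, close to but not the same as the site's 63.7% and 66.4%. The gap of about 2.7 points is just one quarter of the 10.8-point difference between VO and VTI, while GLD alone contributes about 43.7 points to either portfolio.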
Of course, that is an extremely limited back-test period, and probably a very unusual one at that!