Chuck wrote:
Consensus wins by virtue of avoiding extremes. In the process, it is consistently mediocre. A really good forecaster can beat consensus consistently.
In my mind, consensus is essentially a human ensemble of forecasts. Take the average of all forecasts issued by the forecasters and you get consensus. I don't think consensus wins by avoiding extremes. It actually does well no matter how extreme individual forecasters go (and maybe even because they go extreme, sometimes in both directions). It covers the phase space.
I am hard pressed to say that consensus is mediocre, though it is possible. Rather, consensus tends to do quite well over the long term. (Wishing I had access to my old UAlbany forecast contest results.) That being said, a good forecaster knows WHEN to beat a model that forecasters rely on and knows HOW to beat it. Playing the game, you can easily make good bets or poor bets consistently, but making big leaps in "skill" requires risk taking. This comment applies mostly to the numbers (Max/Min temperature, POP, precip amount).
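To make the "human ensemble" idea concrete, here is a minimal sketch of how a consensus forecast could be built and scored. Everything in it is an assumption for illustration: the forecaster names, the temperatures, and the choice of mean absolute error as the score are all made up and are not taken from any real contest.

```python
from statistics import mean

# Hypothetical max-temperature forecasts (deg F) for five days.
# All names and numbers are invented for illustration only.
forecasts = {
    "forecaster_a": [71, 65, 80, 58, 90],
    "forecaster_b": [75, 60, 77, 63, 84],
    "forecaster_c": [68, 66, 85, 55, 88],
}
observed = [72, 63, 81, 60, 87]


def mean_abs_error(predicted, actual):
    """Average of |forecast - observation| over all days."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Consensus is the day-by-day average of every individual forecast.
consensus = [mean(day) for day in zip(*forecasts.values())]

individual_mae = {name: mean_abs_error(f, observed) for name, f in forecasts.items()}
consensus_mae = mean_abs_error(consensus, observed)

for name, err in individual_mae.items():
    print(f"{name}: MAE = {err:.2f}")
print(f"consensus: MAE = {consensus_mae:.2f}")
```

When forecasters miss in opposite directions, their errors partly cancel in the average, so the consensus error can never be larger than the average of the individual errors. A single forecaster can still beat consensus, which is the point about knowing when and how to take the risk.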
Consensus on thunderstorm forecasting might be a different story, and supercells would be in another league entirely.
I too am a big fan of the dropsonde, and I was very much disappointed that the Learjet dropsonde was not going to be used. (I tried to keep up with the mobile upsonde reports to see their progress. I really don't have much to say on whether they succeeded in their mission or not; publications will have to tell that story.) I think what will be missed most, though, is a tremendous variety of environmental soundings. Not that the upsonde teams failed in any way. I think they have already published a paper on an MCS ...clearly not the goal of V2 alone.
But I think an opportunity was missed to really explore the environment around supercells, which would have required a lot more forecasting precision: initiation location and time, and the relevant mechanisms teased out to plan sample flight patterns (across fronts, boundaries, moisture discontinuities, etc.).
The stated goals that were outlined in the essay:
- How, when, and why do tornadoes form? Why some are violent and long lasting while others are weak and short lived?
- What is the structure of tornadoes? How strong are the winds near the ground? How exactly do they do damage?
- How can we learn to forecast tornadoes better?
I am less certain they can answer why a tornado was short lived (since we have no proof it should have lasted longer) or long lived (since we have no proof it should have been shorter lived). That's where sample size becomes an issue (and where the project will prove its worth). Are 40 tornadoes enough? Do you need to have 40 similar tornadoes paired with 40 similar storms to draw some conclusions? Only time will tell.
The last goal sets the bar very high and is another reason why a dropsonde platform would have been so key. Researchers have opted to use the RUC model to circumvent this. It is a decent quality model that has proved its worth, but it is still a model. It is always desirable to have observations, since it is the small details that tend to matter most. And more soundings over a larger area are exactly what would help. Spatial and temporal coverage need to improve, but when V2 is over and the mobile radars go home we are left with the "old" network to work with. It would be nice to have perspective and quantification on how good or uncertain some proximity soundings are, and whether these proximity soundings tell us something more than that tornadoes are possible. We still have no clue why some storms produce them, why some storms don't, and why some storms don't rotate when we think rotation is possible. With the help of model sensitivity studies, they should be able to differentiate whether storm-scale or environmental processes are dominating.
Has anyone shown how poorly or how well we already forecast tornadoes? I believe there is one study showing the lead time of warnings, but I am unaware of anything else that truly shows skill at tornado forecasting. I will have to check into that.
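For what it is worth, one common way to put numbers on yes/no forecast skill is the 2x2 contingency table and the scores derived from it: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). The sketch below only shows how those scores are computed. The function name is my own and the counts are invented; they are not real tornado verification statistics.

```python
def contingency_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from yes/no forecast verification counts.

    hits         = event forecast and event occurred
    misses       = event occurred but was not forecast
    false_alarms = event forecast but did not occur
    """
    pod = hits / (hits + misses)                  # fraction of events that were forecast
    far = false_alarms / (hits + false_alarms)    # fraction of forecasts that did not verify
    csi = hits / (hits + misses + false_alarms)   # hits relative to all forecast or observed events
    return pod, far, csi


# Hypothetical counts, for illustration only.
pod, far, csi = contingency_scores(hits=30, misses=10, false_alarms=60)
print(f"POD = {pod:.2f}, FAR = {far:.2f}, CSI = {csi:.2f}")
```

None of these scores uses the count of correct "no tornado" forecasts, which is handy for rare events where that count would swamp everything else. They say nothing about lead time, so they would complement rather than replace the warning lead-time study.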