Hacker News | Statistical Learning as the Ultimate Agile Development Tool (Peter Norvig)

	Statistical Learning as the Ultimate Agile Development Tool (Peter Norvig) (videolectures.net)
	60 points by jbr 3 days ago | 7 comments

5 points by andrewpbrett 3 days ago | link

It's a very interesting talk - but the title is misleading (title is the same in the submission as in the source). Norvig focuses on statistical learning and how that can be used to infer the meaning of words and the appropriate results in searches, particularly image searches. How these approaches tie into agile development, or any software development really, is given at best a cursory pass with a couple mentions that I, for one, wasn't able to follow. I kept waiting for the other shoe to drop and it never did.

I'd love to be told that I'm wrong or that I'm missing something here, so if anyone wants to enlighten me on the connection, please go for it.

7 points by jbr 3 days ago | link

My take: What I extracted from the talk regarding agile software development is his point about keeping specifics out of the code. If the operational specifics of your application are kept in "data form" instead of precise a priori knowledge, it is easier to be "agile" (adaptive). Agile is an incredibly loaded word, but I think he was referring to lightweight, working software that is easily adaptable.

As far as how it's software development: It's an architectural choice whether you're going to a) hardcode domain specifics into your code, b) keep some of the domain specifics out of the code, but neatly organized and structured, or c) leave the data messy but extract meaning from a large corpus statistically.
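To make the three options concrete, here is a minimal sketch. The task (fixing a misspelled word), the function names, and the tiny corpus are all hypothetical illustrations and are not taken from the talk. Option (c) here only tries adjacent-letter swaps and picks whichever candidate is most frequent in the corpus, which is a deliberately crude stand-in for real statistical learning.

```python
from collections import Counter

# (a) Domain specifics hardcoded into the logic itself.
def fix_hardcoded(word):
    if word == "teh":
        return "the"
    if word == "recieve":
        return "receive"
    return word

# (b) Domain specifics kept out of the code, in a tidy structure.
CORRECTIONS = {"teh": "the", "recieve": "receive"}

def fix_table(word, table=CORRECTIONS):
    return table.get(word, word)

# (c) No curated list at all: count words in a messy corpus and let
# the frequencies decide.
def build_counts(corpus):
    return Counter(corpus.lower().split())

def adjacent_swaps(word):
    # Every string obtained by swapping two neighbouring letters.
    return {
        word[:i] + word[i + 1] + word[i] + word[i + 2:]
        for i in range(len(word) - 1)
    }

def fix_statistical(word, counts):
    candidates = adjacent_swaps(word) | {word}
    # Prefer the most frequent candidate; on a tie, keep the original word.
    best = max(candidates, key=lambda w: (counts[w], w == word))
    return best if counts[best] > 0 else word

# Hypothetical toy corpus; a real one would be orders of magnitude larger.
counts = build_counts("the cat saw the dog receive the ball")
```

Adapting (a) to a new misspelling means editing code, (b) means editing a table, and (c) only needs more text in the corpus.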

2 points by jbr 3 days ago | link

I'm particularly interested in the idea of distribution/statistical-test driven software testing. Instead of concrete assertions (truth assertions, equality assertions), we would make assertions about descriptive statistics or other tests. Or just assert that if we do something stochastic N times, we should expect some truth/equality test to pass at least M times.

I'm not sure what to do with that idea, but that's actually the core of why I posted this. Any other hackers out there excited by this idea?
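A rough sketch of what the two kinds of assertion described above might look like. The helper names `assert_passes_at_least` and `assert_mean_within` are made up for illustration and do not come from any existing test framework; the thresholds are picked loosely so the checks are very unlikely to fail by chance, not derived from a proper significance test.

```python
import random

def assert_passes_at_least(trial, n, m):
    """Run a zero-argument stochastic `trial` n times and require
    that it returns True at least m times."""
    passes = sum(1 for _ in range(n) if trial())
    assert passes >= m, f"only {passes}/{n} trials passed, needed {m}"
    return passes

def assert_mean_within(sample, expected, tolerance):
    """Assert on a descriptive statistic (the mean) instead of on
    any single value."""
    mean = sum(sample) / len(sample)
    assert abs(mean - expected) <= tolerance, (
        f"mean {mean:.4f} not within {tolerance} of {expected}"
    )
    return mean

# Hypothetical system under test: succeeds about 90% of the time.
rng = random.Random(0)  # seeded so the run is reproducible

def flaky_operation():
    return rng.random() < 0.9

# "Do something stochastic N times, expect it to pass at least M times."
assert_passes_at_least(flaky_operation, n=1000, m=800)

# "Make assertions about descriptive statistics."
sample = [rng.random() for _ in range(10_000)]
assert_mean_within(sample, expected=0.5, tolerance=0.05)
```

The gap between the expected pass rate and M is the knob: too tight and the test flakes, too loose and it stops catching regressions.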

1 point by pgbovine 3 days ago | link

some research projects that might be relevant (although not quite what you were thinking about):

statistical bug isolation: http://pages.cs.wisc.edu/~liblit/pldi-2005/

CHESS - automated finding of concurrency bugs: http://research.microsoft.com/en-us/projects/chess/

you could start from there and traverse thru related work

1 point by tristan_juricek 3 days ago | link

Yes; I'm a lot weaker in my statistical understanding, but it seems like it would be a fantastic way of slicing up the testing pie.

Unit tests are kind of handy if you've got some sort of "fencepost" style error where you need to figure out a piece of logic that was overlooked. But that's about it.

This sort of testing would be way more relevant, because it's a way of focusing on "what (data) do we do 'well' right now?"

I'm thinking that this sort of system might be useful as a basis for performance or concurrency testing, for example.

1 point by jwr 3 days ago | link

The videolectures.net site is frustrating. On the one hand, it is a great collection of interesting talks. On the other hand, its annoying use of Windows Media means that I have to watch the talks in the browser, while I would much rather download them for watching on my iPhone.

Does anybody have a solution that would allow me to download these, convert (possibly with VisualHub) and watch on an iPhone/iPod?

4 points by macmac 3 days ago | link

VLC will capture the stream and convert it to h.264 at the same time.