Monday, July 16, 2018

If It Isn't Trivial, It Must Be Impossible

I was listening to an interview with data scientist Hilary Mason (currently General Manager, Machine Learning at Cloudera.com) the other day. As she was talking about how many seemingly disparate problems across a variety of domains use essentially the same underlying prediction algorithms, she said something like the following (paraphrased):
"Framing the question is the hard part. Once you've framed the question appropriately, finding the answer is usually either trivial or impossible."
That struck me as quite profound. If you can't articulate what you are trying to understand or predict, it will be very difficult to determine what information would enhance that understanding or enable that prediction. In the educational realm, this is the essence of backward design: start with the goal and work backward.

If you spend the time to really identify the question that you are interested in, you should get a sense of whether the answer is even knowable given the data that you have access to now or could collect in the future. By spending a chunk of time up front thinking through your question, you'll come away with one of three things: an analysis plan, a game plan for collecting more data, or a recognition that you will need to be satisfied with not knowing.

 
