This Week’s Wisdom
“What do you care what other people think?”
Most people think this quote came from Richard Feynman, but it didn't. It came from his first wife, Arline Greenbaum, who had an incredible sense of playfulness and optimism despite fighting a deadly disease, from which she eventually passed away.
Feynman’s marriage to his first wife was short and full of obstacles, due both to Arline’s illness and to Feynman’s demanding schedule (he was working on the Manhattan Project). However, whenever they planned a date, whether on the hospital terrace or by post, they did amusing things together, spurred on by Arline’s question: “What do you care what other people think?” Their life together was short but wonderful, and it was Arline’s love of life that created so many sweet early memories for Richard Feynman.
So, what do you care what other people think?
This Week’s Operation
Can data science be managed to deliver return on investment?
I was a bit confused the other day when reading a book chapter on “Maximising data science”, which is about how to do effective data science in the enterprise. One of the chapter’s key conclusions is:
“Data science will always be experimental. It cannot be managed to deliver a predetermined goal or return on investment”.
My interpretation of the author’s point is this: “Data science is an experiment, which involves iteratively refining a model based on observations. There is always the potential for ‘failure’ in a data science project. Only after the experiment is completed and the model's performance is evaluated will the team know whether it is worth productionising.”
Although there is some merit in this argument, this way of seeing data science is incomplete. I think there are two forms of data science:
Data science as part of a data product: the key KPIs for a data science team would be how many data-science-enabled features are developed, and how well these features satisfy the user stories.
Data science as a knowledge discovery capability: the KPIs would be how many hypotheses are tested, and how accurate the results are.
The author’s viewpoint applies only to the first form, data science as part of a data product. Data science’s value here is realised only when the model is productionised and its predictions are embedded in the business process.
In the second form, the ROI of data science doesn’t depend on productionising the model. For example, it is still valuable to run a one-off simulation for the marketing department to show that discount package A will generate more revenue than discount package B.
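To make the discount example concrete, here is a minimal sketch of what such a one-off simulation could look like in Python. Everything in it is an assumption made up for illustration: the function name `simulate_revenue`, the price, the discount levels, the conversion rates and the visitor counts. A real analysis would estimate these from the marketing department's own data.

```python
import random
from statistics import mean


def simulate_revenue(price, discount, conversion_rate, visitors, runs, seed):
    """Return the simulated revenue of each run for one discount package."""
    rng = random.Random(seed)  # fixed seed so the one-off result is reproducible
    revenues = []
    for _ in range(runs):
        # Each visitor buys independently with probability conversion_rate.
        buyers = sum(rng.random() < conversion_rate for _ in range(visitors))
        revenues.append(buyers * price * (1 - discount))
    return revenues


# Hypothetical inputs: A is a shallow discount, B a deeper one that converts
# slightly better. None of these figures come from real data.
package_a = simulate_revenue(price=100.0, discount=0.10, conversion_rate=0.05,
                             visitors=1000, runs=500, seed=1)
package_b = simulate_revenue(price=100.0, discount=0.30, conversion_rate=0.06,
                             visitors=1000, runs=500, seed=2)

# Share of paired runs in which package A earns more than package B.
a_wins = sum(a > b for a, b in zip(package_a, package_b)) / len(package_a)

print(f"Mean revenue A: {mean(package_a):.0f}")
print(f"Mean revenue B: {mean(package_b):.0f}")
print(f"A beats B in {a_wins:.0%} of runs")
```

The deliverable here is the comparison itself, not a deployed model: once marketing has the answer, the script has done its job.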
So my point is, you can manage data science to deliver a predetermined goal and ROI. You just need to define its form (and purpose) clearly in advance. Taking this to another level, whenever someone tells me any form of science cannot be managed to deliver a predetermined goal or return on investment, I would stay very alert. Would you?
This Week’s Impact
ImageNet bringing computer vision to a new chapter
About a decade ago, a Stanford professor invested in a seemingly career-ending project: download millions of images from the internet and teach a computer to recognise them.
At that time, research on computer vision focused on solving specific problems well (e.g. stereo vision, image retrieval, 3D reconstruction). However, Fei-Fei Li saw the need to establish a clear North Star problem: building a computer’s general ability to categorise objects. She also saw that good research required good resources, and while other colleagues were focusing on the maths, she spent three years downloading and categorising images from the internet. This was the birth of ImageNet.
ImageNet revolutionised computer vision by addressing the challenge of large-scale image recognition. Its brilliance lies in its comprehensive and meticulously curated collection, enabling researchers to develop highly accurate models capable of understanding diverse visual inputs.
ImageNet provided a robust dataset that propelled the training results of deep convolutional neural networks (CNNs). It’s interesting to note that CNNs were not a new algorithm at the time. However, it was only once the ImageNet dataset was available, and a team of scientists used GPUs to train CNNs on it, that the breakthrough occurred.
ImageNet not only democratised access to a treasure trove of quality labelled images, but also inspired a new generation of data scientists to push the boundaries of what's possible in artificial intelligence.
That’s it for this week! If you enjoyed or were puzzled by the content, please leave a comment so we can continue the discussion. Throw in a like or a share as well if you know someone who may enjoy this newsletter. Thanks!