Fishing is very special to me—especially when I go with my son, Connor.
It’s father-and-son time: a chance to check in, see how things are going, get insights into his life and, perhaps, offer a bit of counsel. It is, after all, my job as his father to offer Connor advice and help drive real-world results: to identify the data points that hold the most value and use what they communicate to enhance his life path and the experiences he’ll have along it.
Sometimes we fish in the ocean.
Sometimes we fish in the lake.
And sometimes we fish in a local pond.
This choice of locale carries its own lessons.
When we fish in the ocean, we need to charter a boat and a captain, and we are never sure what we will catch…if anything. The Atlantic Ocean is massive, and out on it we must deal with waves, currents, and whitecaps.
When we go fishing in the lake, it is not as expensive as fishing in the ocean—and we have a greater likelihood of knowing what we will catch.
When my son and I fish in the pond, we know what we will catch. It also takes less time to catch a fish and there is no need to rent a boat. Fishing in the pond is simple, easy, and fun compared to fishing in the ocean or a lake.
As you may have suspected, I see a metaphor in this that relates to my “day jobs”—and some of the philosophy I bring to that work.
Big Data is like fishing in the ocean: volumes of structured and unstructured data so massive that they are difficult to process with traditional database and software techniques. In most organizations, the data is too voluminous to move quickly through system processing, or it exceeds current processing capacity.
Big Data is high volume and high variety. That is to say, it requires new technologies and techniques to capture, store, and analyze it. This information is used to enhance decision-making, provide insight and discovery, and support and optimize processes. It is always challenging and costly to collect, manage, and use, and it is not necessarily relevant to the specific problem you need to resolve.
Gartner defines Big Data as “high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”
We must be aware that the data we have is not necessarily the data we really need to drive value.
Data lakes are like fishing in a lake—not as large as an ocean and with a more concentrated type of data. The data lake storage repository holds a vast amount of raw data, in its native format, until it is needed.
While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data whose purpose is not yet defined. You can store your data “as is,” without having to structure it first, and then run different types of analytics on it.
Gartner refers to Data Lakes, in broad terms, as “enterprise-wide data management platforms for analyzing disparate sources of data in its native format.”
The data we capture is missing the context and framework to drive insights.
Data pond is a term I crafted many years ago during my undergraduate studies at St. John’s University in New York City. A well-realized data pond can provide critical insights and vital clarity that are almost impossible to find with larger volumes of data. You can have data without information, but you cannot have information without data.
That said, there is zero value in information if it doesn’t drive actionable insights. Why do we think bigger is better and more is better than less?
I think less is better, more is waste, and bigger is not better.
Bigger is just…bigger: more costly, harder to deal with, and extremely difficult to mine for the real insights that will help lead an organization to success.
Initiated in 1958, Project Mercury was the United States’ first man-in-space program. The objectives of the program, which made six manned flights from 1961 to 1963, were highly specific: orbit a manned spacecraft around Earth, investigate man’s ability to function in space and recover both man and spacecraft safely. The computers used on that project utilized 300 kilobytes of memory.
My point?
If a spacecraft could be operated on less memory than it takes to store a snapshot of my kids, we can certainly do more with less and drive real, actionable insights through data ponds.
Small enough for human comprehension, data ponds offer an accessible volume and format that is informative and, most importantly, actionable.
It is not about the data.
It is about the insights that will drive value.
This is the end game, nothing more, nothing less.
Why fish in the ocean when you have all you can eat in the pond next door?
Fish in the pond with me and my son, not in the ocean with Captain Ahab or in the lakes with the Loch Ness Monster: You will find the fish you’re looking for faster and easier at a lower cost.
Best of all, you can tell all your friends about the insights you learned about life and business while fishing—without getting lost in a sea of data.