From Punch Cards to the "Modern Data Stack"

From Punch Cards to the "Modern Data Stack"
QlikView, the earliest example of dashboard-driven analytics 

A journey from the origins of computing and data analytics to what we now call the "Modern Data Stack". What comes next?

The Origins of Computing and Data Analytics

The origins of computing and data analytics began in the mid-1950s and started taking shape with the introduction of SQL in 1970:

This file is sourced from Wikimedia and licensed under the Creative Commons Attribution 2.0 Generic license.
This file is sourced from Wikimedia and licensed under the Creative Commons Attribution 2.0 Generic license.
  • 1960: Punch Cards
This file is sourced from Wikimedia and licensed under the Creative Commons Attribution 2.0 Generic license.
  • 1970: Structured Query Language (SQL)
  • 1970s: Interactive Financial Planning Systems - Create a language to “allow executives to build models without intermediaries”
  • 1972: C, LUNAR - One of the earliest applications of modern computing, a natural language information retrieval system, helped geologists access, compare and evaluate chemical-analysis data on moon rock and soil composition
  • 1975: Express - The first Online Analytical Processing (OLAP) system, intended to analyse business data from different points of view
  • 1979: VisiCalc - The first spreadsheet computer program
This file is sourced from Wikimedia and licensed under the Creative Commons Attribution 2.0 Generic license.
  • 1980s: Group Decision Support Systems - “Computerized Collaborative Work System”

The “Modern Data Stack”

The "Modern Data Stack" is a set of technologies and tools used to collect, store, process, analyse, and visualise data in a well-integrated cloud-based platform. Although QlikView was pre-cloud, it is the earliest example of what most would recognise as an analytics dashboard used by modern platforms like Tableau and PowerBI:

  • 1994: QlikView - “Dashboard-driven Analytics”
This file is sourced from Wikimedia and licensed under the Creative Commons Attribution 2.0 Generic license.
  • 2003: Tableau
  • 2009: Wolfram Alpha - “Computational Search Engine”
  • 2015: PowerBI
  • 2017: ThoughtSpot - “Search-driven Analytics”

Paper, Query Languages, Spreadsheets, Dashboards, Search, what next?

Some of the most innovative analytics applications, at least in terms of user experience, convert human language to some computational output:

  • Text-to-SQL: A tale as old as time, LUNAR was first developed in the 70s to help geologists access, compare, and evaluate chemical-analysis data using natural language. Salesforce WikiSQL introduced the first extensive compendium of data built for the text-to-SQL use case but only contained simple SQL queries. The Yale Spider dataset introduced a benchmark for more complex queries, and most recently, BIRD introduced real-world “dirty” queries and efficiency scores to create a proper benchmark for text-to-SQL applications.
  • Text-to-Computational-Language: Wolfram Alpha, ThoughtSpot
  • Text-to-Code: ChatGPT Advanced Data Analysis

Is "Conversation-Driven Data Analytics" a natural evolution?

  • UX of modern analytics interfaces like search and chat are evolving, becoming more intuitive, enabled by NLP and LLMs
  • Analytics interfaces have origins in enabling decision-makers, but decision-makers are still largely reliant on data analysts
  • Many decision-maker queries are ad-hoc, best suited to “throwaway analytics”
  • Insight generation is a creative process where many insights are gained in conversations about data, possibly with peers
  • The data analytics workflow is disjointed, from the imagination of analysis to the presentation of results

Acknowledgements

Dates for the section "The Origins of Computing and Data Analytics" thanks to https://web.paristech.com/hs-fs/file-2487731396.pdf and http://dssresources.com/history/dsshistoryv28.html.