Quick Introduction to DataModel
What is DataModel?
DataModel is a minimalistic, in-browser representation of tabular data designed to work seamlessly with Muze for visualization. It provides a consistent data format with powerful transformation capabilities inspired by SQL's data manipulation philosophy.
At its core, DataModel supports relational algebra operators that enable:
- Selection (filtering rows)
- Projection (selecting columns)
- Grouping and aggregation operations
While DataModel is optimized for use with Muze, it can be used independently as an in-browser tabular data store for analysis, visualization, or general data management needs.
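To make the three operators concrete, here is a minimal sketch in plain JavaScript over an array of row objects. The helpers `selectRows`, `projectColumns`, and `groupAverage` are hypothetical names used only for illustration; they are not part of the DataModel API and only mirror the ideas behind selection, projection, and grouping.

```javascript
// Plain-JavaScript illustration of the three relational operators.
// None of these helpers belong to DataModel; they only mirror the ideas.
const cars = [
  { Name: "chevrolet chevelle malibu", Miles_per_Gallon: 18, Acceleration: 12 },
  { Name: "buick skylark 320", Miles_per_Gallon: 15, Acceleration: 11.5 },
  { Name: "plymouth satellite", Miles_per_Gallon: 18, Acceleration: 11 },
  { Name: "amc rebel sst", Miles_per_Gallon: 16, Acceleration: 12 },
];

// Selection: keep only the rows that satisfy a predicate.
function selectRows(rows, predicate) {
  return rows.filter(predicate);
}

// Projection: keep only the named columns of every row.
function projectColumns(rows, columns) {
  return rows.map((row) =>
    Object.fromEntries(columns.map((col) => [col, row[col]]))
  );
}

// Grouping: bucket rows by a key column, then average a numeric column
// within each bucket.
function groupAverage(rows, keyColumn, valueColumn) {
  const buckets = new Map();
  for (const row of rows) {
    const key = row[keyColumn];
    const bucket = buckets.get(key) || { sum: 0, count: 0 };
    bucket.sum += row[valueColumn];
    bucket.count += 1;
    buckets.set(key, bucket);
  }
  return [...buckets].map(([key, { sum, count }]) => ({
    [keyColumn]: key,
    [valueColumn]: sum / count,
  }));
}

const efficient = selectRows(cars, (row) => row.Miles_per_Gallon > 16);
const namesOnly = projectColumns(cars, ["Name"]);
const avgAccelByMpg = groupAverage(cars, "Miles_per_Gallon", "Acceleration");
```

Each helper returns a new array and leaves `cars` unchanged, which matches the spirit of operators that derive new data rather than editing it in place.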
Key Features
- WebAssembly-Powered Performance
  - Fast data operations
  - Handles large datasets efficiently
  - Minimal performance degradation as data size grows
- Relational Operations
  - Row filtering
  - Column selection
  - Data grouping and aggregation
  - Column creation from calculations
  - Multi-column sorting
- Immutable Architecture
  - Each operation creates a new DataModel instance
  - Operations form a Directed Acyclic Graph (DAG)
  - Any node in the graph (each node is a DataModel instance) can be visualized with Muze
- Event Propagation
  - Changes automatically flow through the operation DAG
  - Events propagate from parent to derived DataModels
  - Proper semantic relationships are maintained across DataModels
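The immutable architecture and event propagation described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyNode` and its methods (`select`, `project`, `on`, `emit`) are hypothetical and do not reflect how DataModel is implemented internally. It shows each operation producing a new node linked to its parent, and an event travelling from a parent to every node derived from it.

```javascript
// Toy model of an immutable operation graph. ToyNode is hypothetical and is
// not the DataModel class; it only mimics the parent/child structure.
class ToyNode {
  constructor(rows, parent = null, label = "root") {
    this.rows = Object.freeze([...rows]); // the row list is never edited in place
    this.parent = parent;
    this.label = label;
    this.children = [];
    this.listeners = [];
    if (parent) parent.children.push(this); // record the edge parent -> child
  }

  // Each operation returns a new node and leaves this node's rows untouched.
  select(predicate) {
    return new ToyNode(this.rows.filter(predicate), this, "select");
  }

  project(columns) {
    const rows = this.rows.map((row) =>
      Object.fromEntries(columns.map((col) => [col, row[col]]))
    );
    return new ToyNode(rows, this, "project");
  }

  on(listener) {
    this.listeners.push(listener);
  }

  // Notify this node first, then every node derived from it.
  emit(event) {
    this.listeners.forEach((listener) => listener(event, this));
    this.children.forEach((child) => child.emit(event));
  }
}

const root = new ToyNode([
  { Name: "buick skylark 320", Miles_per_Gallon: 15 },
  { Name: "plymouth satellite", Miles_per_Gallon: 18 },
]);
const filtered = root.select((row) => row.Miles_per_Gallon > 16);
const names = filtered.project(["Name"]);

// Record the order in which nodes receive an event fired on the root.
const visited = [];
[root, filtered, names].forEach((node) =>
  node.on((event, source) => visited.push(source.label))
);
root.emit("highlight");
```

After `root.emit("highlight")`, `visited` lists the nodes in parent-to-child order, and `root.rows` still holds both original rows even though two further nodes were derived from it.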
Getting Started
Example Data and Schema
Let's look at a simple example using car data:
const data = [
  {
    Name: "chevrolet chevelle malibu",
    Miles_per_Gallon: 18,
    Acceleration: 12,
  },
  {
    Name: "buick skylark 320",
    Miles_per_Gallon: 15,
    Acceleration: 11.5,
  },
  {
    Name: "plymouth satellite",
    Miles_per_Gallon: 18,
    Acceleration: 11,
  },
  {
    Name: "amc rebel sst",
    Miles_per_Gallon: 16,
    Acceleration: 12,
  },
];

const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Miles_per_Gallon",
    type: "measure",
  },
  {
    name: "Acceleration",
    type: "measure",
  },
];
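The schema pairs each field name with a type. As a rough illustration of how the two relate, the sketch below checks rows against a schema before loading. It assumes, for this example only, that a "measure" holds a finite number and a "dimension" holds a string; `findSchemaMismatches` is a hypothetical helper, not a DataModel function, and the second sample row is deliberately wrong to show a reported mismatch.

```javascript
// Hypothetical helper, not part of DataModel: reports values that do not match
// the assumed meaning of each schema type (measure = finite number,
// dimension = string). These meanings are assumptions made for this example.
function findSchemaMismatches(rows, schema) {
  const problems = [];
  rows.forEach((row, index) => {
    for (const { name, type } of schema) {
      const value = row[name];
      const ok =
        type === "measure"
          ? typeof value === "number" && Number.isFinite(value)
          : typeof value === "string";
      if (!ok) problems.push({ row: index, field: name, expected: type });
    }
  });
  return problems;
}

const sampleSchema = [
  { name: "Name", type: "dimension" },
  { name: "Miles_per_Gallon", type: "measure" },
];
const sampleRows = [
  { Name: "amc rebel sst", Miles_per_Gallon: 16 },
  // Deliberately broken for illustration: a string where a number is expected.
  { Name: "plymouth satellite", Miles_per_Gallon: "18" },
];

const mismatches = findSchemaMismatches(sampleRows, sampleSchema);
```

A check like this is optional; it simply surfaces rows whose values would not fit the declared field types before they reach the loader.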
Creating a DataModel Instance
// Load the DataModel module
const DataModel = muze.DataModel;

// Format the raw data for DataModel consumption.
// Note: `await` is only valid inside an async function or an ES module.
const formattedData = await DataModel.loadData(data, schema);

// Create a new DataModel instance
const dm = new DataModel(formattedData);
Viewing Your Data
console.log(dm.getData().data);
Available Operations
DataModel provides a rich set of operators for data transformation. These pure functions fall into two categories:
- Relational Algebra Operations
  - Selection: Filter rows based on conditions
  - Projection: Select specific columns
  - Grouping: Group rows with aggregation functions
- Utility Operations
  - Sorting: Arrange data by one or more columns
  - Calculated Variables: Create new columns from existing ones
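The two utility operations can be sketched in plain JavaScript as pure functions over row objects. `sortRows` and `addColumn` are hypothetical helpers, as is the `Mpg_per_Accel` column name; none of them are DataModel operators. They only illustrate what multi-column sorting and calculated variables do to the example rows.

```javascript
// Plain-JavaScript illustration; these helpers are not DataModel operators.
const cars = [
  { Name: "chevrolet chevelle malibu", Miles_per_Gallon: 18, Acceleration: 12 },
  { Name: "buick skylark 320", Miles_per_Gallon: 15, Acceleration: 11.5 },
  { Name: "plymouth satellite", Miles_per_Gallon: 18, Acceleration: 11 },
  { Name: "amc rebel sst", Miles_per_Gallon: 16, Acceleration: 12 },
];

// Multi-column sort: each criterion is [column, "asc" | "desc"]. Later
// criteria break ties left by earlier ones. Returns a new array.
function sortRows(rows, criteria) {
  return [...rows].sort((a, b) => {
    for (const [column, direction] of criteria) {
      if (a[column] === b[column]) continue; // tie: try the next criterion
      const order = a[column] < b[column] ? -1 : 1;
      return direction === "desc" ? -order : order;
    }
    return 0;
  });
}

// Calculated variable: add a column computed from existing columns.
// Every row is copied, so the input rows are left unchanged.
function addColumn(rows, name, compute) {
  return rows.map((row) => ({ ...row, [name]: compute(row) }));
}

const sorted = sortRows(cars, [
  ["Miles_per_Gallon", "desc"],
  ["Acceleration", "asc"],
]);

const withRatio = addColumn(
  cars,
  "Mpg_per_Accel",
  (row) => row.Miles_per_Gallon / row.Acceleration
);
```

Here the two cars with 18 miles per gallon tie on the first criterion, so the second criterion (ascending acceleration) decides their order.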
For reference, the example data loaded earlier has the following tabular form:
Name | Miles_per_Gallon | Acceleration |
---|---|---|
chevrolet chevelle malibu | 18 | 12 |
buick skylark 320 | 15 | 11.5 |
plymouth satellite | 18 | 11 |
amc rebel sst | 16 | 12 |
Note: While DataModel's design is inspired by SQL and relational algebra, no prior knowledge of these concepts is required to use it effectively.