
Data Wrangling with DataModel

DataModel provides a powerful set of operators for transforming your data. These operators are pure functions that return new DataModel instances, enabling immutable data transformations.

Understanding Operators

Operators in DataModel fall into two categories:

  • Relational algebra operators (selection, projection, etc.)
  • Utility operators for specific cases

Info: All operators return a new DataModel instance, preserving immutability. Because each call returns a DataModel, operators can be chained, which keeps the data logic of complex visualization systems and interactive applications compact.
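To make the immutability guarantee concrete, here is a minimal plain-JavaScript sketch. `TinyModel` is a hypothetical stand-in written for this illustration; it is not Muze's DataModel and only mimics the "every operator returns a new instance" behaviour described above. Its `select` takes a predicate function for brevity, which differs from the criteria object used by DataModel.

```javascript
// Hypothetical stand-in for illustration only; this is NOT Muze's DataModel.
class TinyModel {
  constructor(rows) {
    // Freeze shallow copies so an instance cannot be mutated after creation.
    this.rows = Object.freeze(rows.map((row) => Object.freeze({ ...row })));
  }

  // Each operator builds and returns a new TinyModel; `this` is left untouched.
  select(predicate) {
    return new TinyModel(this.rows.filter(predicate));
  }

  project(fields) {
    return new TinyModel(
      this.rows.map((row) =>
        Object.fromEntries(fields.map((field) => [field, row[field]]))
      )
    );
  }
}

const cars = new TinyModel([
  { Name: "toyota corona", Origin: "Japan" },
  { Name: "ford torino", Origin: "USA" },
]);

// Chaining works because every call hands back a fresh instance.
const japaneseNames = cars
  .select((row) => row.Origin === "Japan")
  .project(["Name"]);

console.log(cars.rows.length); // 2: the source model is unchanged
console.log(japaneseNames.rows.map((row) => row.Name)); // [ 'toyota corona' ]
```

Since no operator touches its receiver, the same source model can safely feed several independent views.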

Core Operations

Selection (Filtering)

Filter rows based on specific conditions using the select operator:

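const Datamodel = muze.DataModel;

// Load the raw data with its schema, then wrap the result in a DataModel.
const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);

// Keep only the rows whose Origin field equals "Japan".
const outputDm = dm.select({
  value: "Japan",
  field: "Origin",
  operator: Datamodel.ComparisonOperators.EQUAL,
});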

Output:

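| Name | Maker | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration | Origin | Cylinders | Year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| toyota corona mark ii | toyota | 24 | 113 | 95 | 2372 | 15 | Japan | 4 | -19800000 |
| datsun pl510 | datsun | 27 | 97 | 88 | 2130 | 14.5 | Japan | 4 | -19800000 |
| datsun pl510 | datsun | 27 | 97 | 88 | 2130 | 14.5 | Japan | 4 | 31516200000 |
| toyota corona | toyota | 25 | 113 | 95 | 2228 | 14 | Japan | 4 | 31516200000 |
| toyota corolla 1200 | toyota | 31 | 71 | 65 | 1773 | 19 | Japan | 4 | 31516200000 |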
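The criteria object passed to select can be read as "compare `field` against `value` using `operator`". The sketch below mirrors that reading over a plain array of rows. The `ComparisonOperators` table and the `selectRows` helper are hypothetical names invented for this example; the actual values behind `Datamodel.ComparisonOperators` are not shown on this page.

```javascript
// Hypothetical operator table for illustration; not Muze's real constants.
const ComparisonOperators = { EQUAL: "equal" };

// Maps each operator name to the comparison it performs.
const comparators = {
  equal: (fieldValue, target) => fieldValue === target,
};

// Returns a new array containing only the rows that satisfy the criteria.
function selectRows(rows, { field, value, operator }) {
  const compare = comparators[operator];
  if (!compare) throw new Error(`Unsupported operator: ${operator}`);
  return rows.filter((row) => compare(row[field], value));
}

const rows = [
  { Name: "toyota corona mark ii", Origin: "Japan" },
  { Name: "chevrolet chevelle malibu", Origin: "USA" },
  { Name: "datsun pl510", Origin: "Japan" },
];

const japanese = selectRows(rows, {
  field: "Origin",
  value: "Japan",
  operator: ComparisonOperators.EQUAL,
});

console.log(japanese.map((row) => row.Name));
// [ 'toyota corona mark ii', 'datsun pl510' ]
```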

Projection (Column Selection)

Select specific fields using the project operator:

const outputDm = dm.project(["Name", "Origin"]);

Output:

| Name | Origin |
| --- | --- |
| chevrolet chevelle malibu | USA |
| buick skylark 320 | USA |
| plymouth satellite | USA |
| amc rebel sst | USA |
| ford torino | USA |

Grouping

Aggregate data using the groupBy operator:

const Datamodel = muze.DataModel;
const { MAX } = Datamodel.AggregationFunctions;

const groupDm = dm.groupBy(["Origin"], ["Horsepower", MAX]);

Output:

| Origin | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration |
| --- | --- | --- | --- | --- | --- |
| USA | 20.128225806451606 | 455 | 119.60642570281125 | 1800 | 14.928458498023707 |
| Europe | 27.891428571428573 | 183 | 81 | 1825 | 16.82191780821918 |
| Japan | 30.450632911392397 | 168 | 79.83544303797468 | 1613 | 16.172151898734175 |
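As a rough mental model of grouping with a MAX aggregation, the sketch below computes the largest Horsepower per Origin over a plain array. `maxBy` is a hypothetical helper written for this example and is not part of DataModel; skipping non-numeric values is an assumption of the sketch, not documented behaviour.

```javascript
// Plain-JavaScript sketch of "max of one field per group"; not Muze's code.
function maxBy(rows, groupField, valueField) {
  const maxima = new Map(); // group key -> largest value seen so far
  for (const row of rows) {
    const value = row[valueField];
    // Assumption: rows with missing or non-numeric values are ignored.
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    const key = row[groupField];
    const current = maxima.get(key);
    if (current === undefined || value > current) maxima.set(key, value);
  }
  // Emit one output row per group, in first-seen order.
  return [...maxima].map(([key, max]) => ({
    [groupField]: key,
    [valueField]: max,
  }));
}

const sampleRows = [
  { Name: "toyota corona mark ii", Origin: "Japan", Horsepower: 95 },
  { Name: "datsun pl510", Origin: "Japan", Horsepower: 88 },
  { Name: "amc matador (sw)", Origin: "USA", Horsepower: 150 },
  { Name: "amc ambassador dpl", Origin: "USA", Horsepower: 190 },
];

const maxHorsepower = maxBy(sampleRows, "Origin", "Horsepower");
console.log(maxHorsepower); // Japan -> 95, USA -> 190
```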

Sorting

Order data using the sort operator, supporting multi-level sorting:

const sortDm = dm.sort([["Maker"], ["Weight_in_lbs", "desc"]]);

Output:

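| Name | Maker | Miles_per_Gallon | Displacement | Horsepower | Weight_in_lbs | Acceleration | Origin | Cylinders | Year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| amc matador (sw) | amc | 14 | 304 | 150 | 4257 | 15.5 | USA | 8 | 12621060000 |
| amc matador | amc | 15.5 | 304 | 120 | 3962 | 13.9 | USA | 8 | 18928260000 |
| amc matador (sw) | amc | 15 | 304 | 150 | 3892 | 12.5 | USA | 8 | 63052200000 |
| amc ambassador dpl | amc | 15 | 390 | 190 | 3850 | 8.5 | USA | 8 | -19800000 |
| amc rebel sst (sw) | amc | NaN | 360 | 175 | 3850 | 11 | USA | 8 | -19800000 |

Note: The example outputs show the first few rows of the transformed data. Your actual results will depend on your dataset.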
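Multi-level sorting means that a later criterion is consulted only when all earlier ones tie. The sketch below applies criteria in the same `[field, order]` shape to a plain array. `sortRows` is a hypothetical helper, and treating a missing order as ascending is an assumption made for this example.

```javascript
// Plain-JavaScript sketch of multi-level sorting; not Muze's implementation.
function sortRows(rows, criteria) {
  // Copy first so the input array is not reordered in place.
  return [...rows].sort((a, b) => {
    for (const [field, order = "asc"] of criteria) {
      if (a[field] === b[field]) continue; // tie: fall through to next level
      const result = a[field] < b[field] ? -1 : 1;
      return order === "desc" ? -result : result;
    }
    return 0; // equal on every level
  });
}

const unsorted = [
  { Name: "toyota corona", Maker: "toyota", Weight_in_lbs: 2228 },
  { Name: "amc ambassador dpl", Maker: "amc", Weight_in_lbs: 3850 },
  { Name: "amc matador (sw)", Maker: "amc", Weight_in_lbs: 4257 },
  { Name: "amc matador", Maker: "amc", Weight_in_lbs: 3962 },
];

const sorted = sortRows(unsorted, [["Maker"], ["Weight_in_lbs", "desc"]]);
console.log(sorted.map((row) => row.Name));
// [ 'amc matador (sw)', 'amc matador', 'amc ambassador dpl', 'toyota corona' ]
```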

Operator Chaining

Chain multiple operators for complex transformations:

const resultantDm = dm
  .select(/* selection criteria */)
  .project(/* field list */)
  .sort(/* sort criteria */);

Tip: Operator chaining provides a clean, functional approach to data transformation. Each operation in the chain receives the output of the previous operation as its input.
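For a chain with the placeholders filled in, the sketch below combines the select, project, and sort calls exactly as they appear earlier on this page. The wrapper function `japaneseCarsByWeight` is a hypothetical name, and it assumes `dm` is a DataModel built as in the Selection example and `Datamodel` is `muze.DataModel`; the projected field list is chosen only for illustration.

```javascript
// Assumes `dm` is a DataModel instance and `Datamodel` is muze.DataModel.
// The call shapes are the ones shown in the Selection, Projection, and
// Sorting examples; the wrapper itself is hypothetical.
function japaneseCarsByWeight(dm, Datamodel) {
  return dm
    // 1. Keep only rows whose Origin is "Japan".
    .select({
      field: "Origin",
      value: "Japan",
      operator: Datamodel.ComparisonOperators.EQUAL,
    })
    // 2. Keep only the fields needed downstream, including the sort keys.
    .project(["Name", "Maker", "Weight_in_lbs"])
    // 3. Order by maker, then by weight descending within each maker.
    .sort([["Maker"], ["Weight_in_lbs", "desc"]]);
}
```

Projecting before sorting keeps the sort keys (Maker and Weight_in_lbs) in the field list, so the final sort still has the fields it needs.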

Common Use Cases

Filtering by Region

// Show only Japanese cars
dm.select({
  value: "Japan",
  field: "Origin",
  operator: Datamodel.ComparisonOperators.EQUAL,
});

Creating Summary Views

// Get max horsepower by origin
dm.groupBy(["Origin"], ["Horsepower", Datamodel.AggregationFunctions.MAX]);

Multi-level Sorting

// Sort by maker, then by weight descending
dm.sort([["Maker"], ["Weight_in_lbs", "desc"]]);

Note: The examples use a car dataset with fields such as Name, Origin, and Horsepower. Use the field names defined in your own dataset's schema.