Version: 1.0.0

Basic Concepts and Terminologies in DataModel

Core Ideas

Measures

Measures are quantitative variables: numerical values that can be aggregated or otherwise transformed with mathematical functions. They represent the metrics you want to analyze, such as sales amounts or counts, and are typically summarized for each group of dimension values.
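As a rough illustration of "mathematical functions applied to a measure", the sketch below aggregates a handful of numeric values in plain JavaScript. It does not use DataModel's API; the `aggregations` object and its function names are assumptions made for this example, and the values are the Horsepower figures from the sample data further down this page.

```javascript
// Horsepower values from the sample data on this page.
const horsepower = [130, 165, 88];

// Illustrative aggregation functions (names are assumptions, not DataModel's API).
const aggregations = {
  sum: (values) => values.reduce((total, v) => total + v, 0),
  avg: (values) => values.reduce((total, v) => total + v, 0) / values.length,
  min: (values) => Math.min(...values),
  max: (values) => Math.max(...values),
  count: (values) => values.length,
};

// Apply every aggregation to the same measure.
const summary = Object.fromEntries(
  Object.entries(aggregations).map(([name, fn]) => [name, fn(horsepower)])
);

console.log(summary.sum);   // 383
console.log(summary.count); // 3
```

The same measure yields a different single number depending on which function is applied, which is why a schema lets you choose a default aggregation per measure.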

Dimensions

Dimensions are qualitative variables that help categorize data points. They provide the context for your measures and can be divided into two main types:

| Type | Description | Examples |
| --- | --- | --- |
| Categorical | Represents types of data that can be divided into groups or categories | Race, sex, age group, educational level |
| Temporal | Represents Date and Time values | Dates, timestamps |
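To make the two dimension types concrete, here is a minimal sketch in plain JavaScript (not DataModel's API). It partitions rows by a categorical dimension and orders rows along a temporal one. The helper `groupByDimension` is hypothetical, and the rows are copied from the sample data later on this page.

```javascript
const rows = [
  { Maker: "chevrolet", Horsepower: 130, Origin: "USA", Year: 1978 },
  { Maker: "buick", Horsepower: 165, Origin: "USA", Year: 1989 },
  { Maker: "datsun", Horsepower: 88, Origin: "Japan", Year: 1981 },
];

// Hypothetical helper: partition rows by the value of a categorical dimension.
function groupByDimension(data, dimension) {
  const groups = new Map();
  for (const row of data) {
    const key = row[dimension];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

const byOrigin = groupByDimension(rows, "Origin");
console.log(byOrigin.get("USA").length); // 2

// A temporal dimension has a natural order, so rows can be sorted along it.
const makersInYearOrder = [...rows]
  .sort((a, b) => a.Year - b.Year)
  .map((row) => row.Maker);
console.log(makersInYearOrder); // [ 'chevrolet', 'datsun', 'buick' ]
```

Categorical values only say which group a row belongs to, while temporal values also carry an ordering.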

What is a Schema?

A schema describes the variables (fields) present in your data. Each schema definition is a set of key-value pairs that tells DataModel what type of data a field holds and lets you override the field's default behavior.

Schema Attributes

| Attribute | Description |
| --- | --- |
| name | Describes the field name |
| type | Specifies whether the field is a dimension or measure |
| subtype | For dimensions, specifies if it is categorical or temporal |
| defAggFn | Specifies the default aggregation function to be applied on measure fields (sum by default) |
| format | For temporal fields, describes the date format string to parse the raw data |
| displayName | Specifies the field name to be shown when displaying the field |
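The attribute rules can be sketched as a small check in plain JavaScript. `normalizeField` is a hypothetical helper written for this page, not a DataModel function. It fills in the documented `sum` default for measures and rejects values that the attribute list does not mention. Treating any other `type` or `subtype` as an error is an assumption of this sketch.

```javascript
// Hypothetical helper, not part of DataModel.
function normalizeField(field) {
  if (typeof field.name !== "string" || field.name === "") {
    throw new Error("Every field needs a name");
  }
  if (field.type === "measure") {
    // Documented default: measures aggregate with "sum" unless told otherwise.
    return { ...field, defAggFn: field.defAggFn ?? "sum" };
  }
  if (field.type === "dimension") {
    const allowed = ["categorical", "temporal"];
    // Assumption: a subtype, when given, must be one of the two listed values.
    if (field.subtype !== undefined && !allowed.includes(field.subtype)) {
      throw new Error(`Field "${field.name}" has an unknown subtype`);
    }
    return { ...field };
  }
  throw new Error(`Field "${field.name}" must be a dimension or a measure`);
}

const normalized = [
  { name: "Horsepower", type: "measure" },
  { name: "Origin", type: "dimension", subtype: "categorical" },
].map(normalizeField);

console.log(normalized[0].defAggFn); // sum
```

A measure that already names its own `defAggFn` passes through unchanged.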

Example

Sample Data

[
  {
    "Maker": "chevrolet",
    "Horsepower": 130,
    "Origin": "USA",
    "Year": 1978
  },
  {
    "Maker": "buick",
    "Horsepower": 165,
    "Origin": "USA",
    "Year": 1989
  },
  {
    "Maker": "datsun",
    "Horsepower": 88,
    "Origin": "Japan",
    "Year": 1981
  }
]

Corresponding Schema

[
  {
    "name": "Maker",
    "type": "dimension",
    "subtype": "categorical",
    "displayName": "Car Manufacturer"
  },
  {
    "name": "Horsepower",
    "type": "measure",
    "defAggFn": "avg",
    "displayName": "Engine Horsepower"
  },
  {
    "name": "Origin",
    "type": "dimension",
    "subtype": "categorical",
    "displayName": "Country of Origin"
  },
  {
    "name": "Year",
    "type": "dimension",
    "subtype": "temporal",
    "format": "%Y",
    "displayName": "Manufacturing Year"
  }
]

The schema example above demonstrates how to properly define each field from the sample data:

  • Categorical dimensions: Maker and Origin
  • Temporal dimension: Year (with format for four-digit year)
  • Measure: Horsepower (with default aggregation function)

Each field carries its type, a subtype if it is a dimension, and a user-friendly display name. Horsepower sets defAggFn to avg, overriding the default of sum, and Year supplies the %Y format needed to parse its raw values.
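To show how such a schema could drive processing, the sketch below applies it to the sample data in plain JavaScript. It deliberately avoids DataModel's own API. `parseTemporal`, `aggregators`, and `summarize` are hypothetical names, only the `%Y` format is handled, and only two of the schema entries are repeated to keep the example short.

```javascript
const data = [
  { Maker: "chevrolet", Horsepower: 130, Origin: "USA", Year: 1978 },
  { Maker: "buick", Horsepower: 165, Origin: "USA", Year: 1989 },
  { Maker: "datsun", Horsepower: 88, Origin: "Japan", Year: 1981 },
];

const schema = [
  { name: "Horsepower", type: "measure", defAggFn: "avg", displayName: "Engine Horsepower" },
  { name: "Origin", type: "dimension", subtype: "categorical", displayName: "Country of Origin" },
  { name: "Year", type: "dimension", subtype: "temporal", format: "%Y", displayName: "Manufacturing Year" },
];

// Look up a schema entry by field name.
const fieldByName = new Map(schema.map((field) => [field.name, field]));

// Parse a raw temporal value; this sketch supports the four-digit-year format only.
function parseTemporal(raw, format) {
  if (format !== "%Y") throw new Error(`Unsupported format: ${format}`);
  return new Date(Date.UTC(Number(raw), 0, 1));
}

// Illustrative aggregators keyed by the defAggFn values used on this page.
const aggregators = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  avg: (values) => values.reduce((a, b) => a + b, 0) / values.length,
};

// Aggregate one measure per dimension value, using the measure's defAggFn
// and labelling the output with each field's displayName.
function summarize(rows, dimensionName, measureName) {
  const dimension = fieldByName.get(dimensionName);
  const measure = fieldByName.get(measureName);
  const aggregate = aggregators[measure.defAggFn ?? "sum"];
  const buckets = new Map();
  for (const row of rows) {
    const key = row[dimensionName];
    buckets.set(key, [...(buckets.get(key) ?? []), row[measureName]]);
  }
  return [...buckets].map(([key, values]) => ({
    [dimension.displayName]: key,
    [measure.displayName]: aggregate(values),
  }));
}

const result = summarize(data, "Origin", "Horsepower");
console.log(result[0]); // { 'Country of Origin': 'USA', 'Engine Horsepower': 147.5 }

const years = data.map((row) =>
  parseTemporal(row.Year, fieldByName.get("Year").format).getUTCFullYear()
);
console.log(years); // [ 1978, 1989, 1981 ]
```

Because Horsepower declares `avg`, the two USA rows collapse to 147.5 rather than their sum of 295.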