Dimensions
The Dimensions app lets you query dimension metadata, including the distribution of a dimension's values over a dataset.
Registry
Import this module to get access to dimension instances.
from msgvis.apps.dimensions import registry
time = registry.get_dimension('time') # returns a TimeDimension
time.get_distribution(a_dataset)
msgvis.apps.dimensions.registry.get_dimension(dimension_key)
Get a specific dimension by key.
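The key-based lookup can be pictured as a small table of dimension instances. The sketch below is a stand-in written for illustration only: `Dimension`, `_REGISTRY`, and `register` are hypothetical names, and returning `None` for an unknown key is an assumption. The real module may build and store its dimensions differently.

```python
class Dimension(object):
    """Hypothetical stand-in for a dimension class; not the msgvis model."""

    def __init__(self, key, name=None):
        self.key = key
        # Fall back to the key when no display name is given (an assumption).
        self.name = name or key


# Hypothetical module-level table mapping keys to dimension instances.
_REGISTRY = {}


def register(dimension):
    """Store a dimension under its key so it can be looked up later."""
    _REGISTRY[dimension.key] = dimension
    return dimension


def get_dimension(dimension_key):
    """Return the dimension registered under dimension_key, or None."""
    return _REGISTRY.get(dimension_key)


register(Dimension('time', name='Time'))
time_dimension = get_dimension('time')
```

Keeping one shared instance per key means every caller that asks for 'time' works with the same object.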
Models
msgvis.apps.dimensions.models.find_messages(queryset)
If the given queryset is actually a Dataset model, get its messages queryset.
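The behaviour described here amounts to a small type check that lets callers pass either a dataset or a messages queryset. A minimal sketch with plain Python objects follows; the `Dataset` class and its `messages` attribute are hypothetical stand-ins, not the actual msgvis model or its relation name.

```python
class Dataset(object):
    """Hypothetical stand-in for the Dataset model."""

    def __init__(self, messages):
        # In the real model this would be a related queryset; here it is a list.
        self.messages = messages


def find_messages(queryset):
    """Return the dataset's messages if given a Dataset, else the input unchanged."""
    if isinstance(queryset, Dataset):
        return queryset.messages
    return queryset


messages = ['first message', 'second message']
from_dataset = find_messages(Dataset(messages))
from_messages = find_messages(messages)
```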
class msgvis.apps.dimensions.models.CategoricalDimension(key, name=None, description=None, field_name=None, domain=None)
A basic categorical dimension class.
Attributes:
key (str): A string id for the dimension (e.g. ‘time’)
name (str): A nicely-formatted name for the dimension (e.g. ‘Number of Tweets’)
description (str): A longer explanation for the dimension (e.g. “The total number of tweets produced by this author.”)
field_name (str): The name of the field in the database for this dimension (defaults to the key). The name is resolved relative to the Message model: if you want the sender name, use sender__name.
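A value such as sender__name uses Django's double-underscore convention for following a relation from one model to another. Django resolves such paths in SQL; the sketch below only mimics the idea on plain Python objects. `Sender`, `Message`, and `resolve_field` are hypothetical names introduced for illustration.

```python
class Sender(object):
    """Hypothetical stand-in for the related sender record."""

    def __init__(self, name):
        self.name = name


class Message(object):
    """Hypothetical stand-in for the Message model."""

    def __init__(self, text, sender):
        self.text = text
        self.sender = sender


def resolve_field(obj, field_name):
    """Follow a double-underscore path such as 'sender__name' attribute by attribute."""
    value = obj
    for part in field_name.split('__'):
        value = getattr(value, part)
    return value


message = Message('hello', Sender('alice'))
sender_name = resolve_field(message, 'sender__name')
```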
Return True for real categorical dimensions
exclude(queryset, **kwargs)
Exclude some points from a queryset and return the new queryset.
group_by(queryset, grouping_key=None, values_list=False, values_list_flat=False, **kwargs)
Return a ValuesQuerySet that has been grouped by this dimension. The group value will be available as grouping_key in the dictionaries.
The grouping key defaults to the dimension key.
from django.db.models import Count
messages = dim.group_by(messages, 'value')
distribution = messages.annotate(count=Count('id'))
print(distribution[0])
# { 'value': 'hello', 'count': 5 }
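The shape of the grouped result can be reproduced without a database. The sketch below is a pure-Python analogy using `collections.Counter`, not the library's implementation; `group_and_count` and the sample rows are hypothetical.

```python
from collections import Counter


def group_and_count(rows, field_name, grouping_key):
    """Count rows per distinct value of field_name.

    Each result dictionary exposes the group value under grouping_key,
    mirroring the dictionaries described above.
    """
    counts = Counter(row[field_name] for row in rows)
    return [
        {grouping_key: value, 'count': count}
        for value, count in counts.most_common()
    ]


rows = [
    {'id': 1, 'language': 'en'},
    {'id': 2, 'language': 'en'},
    {'id': 3, 'language': 'fr'},
    {'id': 4, 'language': 'en'},
]
distribution = group_and_count(rows, 'language', 'value')
```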
select_grouping_expression(queryset, expression)
Add an expression for grouping to the queryset’s SELECT. Returns the queryset plus the alias for the expression.
For categorical dimensions this is a no-op. Beware if your expression refers to a related table!
class msgvis.apps.dimensions.models.ChoicesCategoricalDimension(key, name=None, description=None, field_name=None, domain=None)
A categorical dimension where the values come from a choices set.
Don’t use for related fields.
class msgvis.apps.dimensions.models.RelatedCategoricalDimension(key, name=None, description=None, field_name=None, domain=None)
A categorical dimension where the values are in a related table, e.g. sender name.
Currently doesn’t really do much beyond CategoricalDimension.
Return True for related categorical dimensions
class msgvis.apps.dimensions.models.QuantitativeDimension(key, name=None, description=None, field_name=None, default_bins=50, min_bin_size=1)
A generic quantitative dimension. This works for fields on Message or on related fields, e.g. field_name=sender__message_count.
get_range(queryset)
Find a min and max for this dimension, as a tuple. If there isn’t one, (None, None) is returned.
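The return contract, including the (None, None) case, can be illustrated on a plain list of values. This is only an analogy: the real method works on a queryset, and skipping `None` entries here is an assumption.

```python
def get_range(values):
    """Return (min, max) of the non-None values, or (None, None) if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return (None, None)
    return (min(present), max(present))


full_range = get_range([3, 1, None, 7])
empty_range = get_range([])
```

Callers should check for `None` before doing arithmetic on the result, for example when deriving a bin size from the range.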
get_grouping_expression(queryset, bins=None, bin_size=None, **kwargs)
Generate a SQL expression for grouping this dimension. If you already know the bin size you want, you may provide it; alternatively, provide the number of bins.
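One plausible way to reconcile bins, bin_size, default_bins, and min_bin_size is sketched below. The formula is an assumption made for illustration, since the documentation does not state how the estimate is computed, and the real method emits SQL rather than Python values. `choose_bin_size` and `bin_start` are hypothetical helpers.

```python
import math


def choose_bin_size(min_val, max_val, bins=None, bin_size=None,
                    default_bins=50, min_bin_size=1):
    """Pick a bin width: an explicit bin_size wins, otherwise derive it from bins."""
    if bin_size is None:
        wanted = bins if bins is not None else default_bins
        # Assumed rule: spread the range evenly over the requested bins.
        bin_size = int(math.ceil((max_val - min_val) / float(wanted)))
    # Never go below the configured minimum width.
    return max(bin_size, min_bin_size)


def bin_start(value, bin_size):
    """Floor a value to the lower edge of the bin that contains it."""
    return int(math.floor(value / float(bin_size))) * bin_size


size_from_bins = choose_bin_size(0, 1000, bins=10)
size_from_default = choose_bin_size(0, 10)
lower_edge = bin_start(237, size_from_bins)
```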
select_grouping_expression(queryset, expression)
Add an expression for grouping to the queryset’s SELECT.
Returns a queryset, grouping_key tuple. The grouping_key could be used in values to identify the grouping expression.
group_by(queryset, grouping_key=None, bins=None, bin_size=None, **kwargs)
Return a ValuesQuerySet that has been grouped by this dimension. The group value will be available as grouping_key in the dictionaries.
The grouping key defaults to the dimension key.
If neither bins nor bin_size is provided, an estimate will be used.
from django.db.models import Count
messages = dim.group_by(messages, 'value', bins=100)
distribution = messages.annotate(count=Count('id'))
print(distribution[0])
# { 'value': 'hello', 'count': 5 }
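For a quantitative dimension, grouping means mapping each value to a bin and counting per bin. The sketch below shows that idea on a plain list of integers. It is an analogy under stated assumptions, not the library's SQL: `group_by_bins` is hypothetical, bins are labelled by their lower edge, and empty bins are omitted.

```python
from collections import Counter


def group_by_bins(values, bin_size, grouping_key='value'):
    """Count integer values per bin, labelling each bin by its lower edge."""
    counts = Counter((v // bin_size) * bin_size for v in values)
    return [
        {grouping_key: start, 'count': counts[start]}
        for start in sorted(counts)
    ]


message_counts = [3, 7, 12, 18, 25, 104]
distribution = group_by_bins(message_counts, 10)
```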
class msgvis.apps.dimensions.models.RelatedQuantitativeDimension(key, name=None, description=None, field_name=None, default_bins=50, min_bin_size=1)
A quantitative dimension on a related model, e.g. sender message count.
class msgvis.apps.dimensions.models.TimeDimension(key, name=None, description=None, field_name=None, default_bins=50, min_bin_size=1)
A dimension for time fields on Message.