Ruby Tracker v0.3
🚧 The documentation for the latest version can be found on the Snowplow documentation site.
This page refers to version 0.3.0 of the Snowplow Ruby Tracker.
- 3 Adding extra data
  - 3.1 set_platform
  - 3.2 set_user_id
  - 3.3 set_screen_resolution
  - 3.4 set_viewport
  - 3.5 set_color_depth
  - 3.6 set_timezone
  - 3.7 set_lang
- 4.1 Common
  - 4.1.1 Argument validation
  - 4.1.2 Optional context argument
  - 4.1.3 Optional timestamp argument
  - 4.1.4 Example
- 4.2 track_screen_view
- 4.3 track_page_view
- 4.4 track_ecommerce_transaction
- 4.5 track_struct_event
- 4.6 track_unstruct_event
- 5.1 Overview
- 5.2 The AsyncEmitter class
- 5.3 Multiple emitters
- 5.4 Manual flushing
- 6 Contracts
- 7 Logging
The Snowplow Ruby Tracker allows you to track Snowplow events in your Ruby applications and gems and Ruby on Rails web applications.
The tracker should be straightforward to use if you are comfortable with Ruby development; any prior experience with Snowplow's Python Tracker, JavaScript Tracker, Lua Tracker, Google Analytics or Mixpanel (which have similar APIs to Snowplow) is helpful but not necessary.
The Ruby Tracker and Python Tracker have very similar functionality and APIs.
There are three main classes which the Ruby Tracker uses: subjects, emitters, and trackers.
A subject represents a single user whose events are tracked, and holds data specific to that user. If your tracker will only be tracking a single user, you don't have to create a subject - it will be done automatically.
A tracker always has exactly one active subject associated with it at a time. It constructs events with that subject and sends them to one or more emitters, which send them on to a Snowplow collector.
Assuming you have completed the Ruby Tracker Setup for your Ruby project, you are ready to initialize the Ruby Tracker.
Require the Ruby Tracker into your code like this:
require 'snowplow_tracker'
You can now initialize tracker instances.
Initialize a tracker instance like this:
emitter = SnowplowTracker::Emitter.new("my-collector.cloudfront.net")
tracker = SnowplowTracker::Tracker.new(emitter)
If you wish to send events to more than one emitter, you can provide an array of emitters to the tracker constructor.
This tracker will log events to http://my-collector.cloudfront.net/i. There are four other optional parameters:
def initialize(emitters, subject=nil, namespace=nil, app_id=nil, encode_base64=true)
- subject is a subject with which the tracker is initialized.
- namespace is a name for the tracker which will be added to every event the tracker fires. This is useful if you have initialized more than one tracker.
- app_id is the unique ID for the Ruby application.
- encode_base64 determines whether JSONs in the querystring for an event will be base64-encoded.
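To make the effect of encode_base64 concrete, here is a minimal standalone sketch of what base64-encoding a JSON for a querystring looks like. The helper name encode_json_for_querystring is hypothetical and not part of the tracker, and the use of the URL-safe Base64 variant is an assumption; the tracker's internal encoding may differ.

```ruby
require 'base64'
require 'json'

# Hypothetical helper (not part of the tracker's API): serialize a hash to
# JSON and, if requested, Base64-encode it so it is safe in a querystring.
# The URL-safe Base64 variant is an assumption made for this sketch.
def encode_json_for_querystring(payload, encode_base64 = true)
  json = JSON.generate(payload)
  encode_base64 ? Base64.urlsafe_encode64(json) : json
end

context = {
  'schema' => 'iglu:com.my_company/customer/jsonschema/1-0-0',
  'data'   => { 'segment' => 'young adult' }
}

encoded = encode_json_for_querystring(context, true)
plain   = encode_json_for_querystring(context, false)

puts encoded # opaque Base64 text
puts plain   # human-readable JSON

# Decoding the Base64 form restores the original hash
decoded = JSON.parse(Base64.urlsafe_decode64(encoded))
```

Passing false keeps the JSON readable, which can help when inspecting requests during debugging; the encoded form avoids characters such as braces and quotes in the URL.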
So a more complete tracker initialization example might look like this:
initial_subject = SnowplowTracker::Subject.new
emitter = SnowplowTracker::Emitter.new("my-collector.cloudfront.net")
tracker = SnowplowTracker::Tracker.new(emitter, initial_subject, 'cf', 'ID-ap00035', false)
Each tracker instance is completely sandboxed, so you can create multiple trackers as you see fit.
Here is an example of instantiating two separate trackers:
t1 = SnowplowTracker::Tracker.new(SnowplowTracker::AsyncEmitter.new("my-collector.cloudfront.net"), nil, "t1")
t1.set_platform("cnsl")
t1.track_page_view("http://www.example.com")
t2 = SnowplowTracker::Tracker.new(SnowplowTracker::AsyncEmitter.new("my-company.c.snplow.com"), nil, "t2")
t2.set_platform("cnsl")
t2.track_screen_view("Game HUD", "23")
t1.track_screen_view("Test", "23") # Back to first tracker
You can configure a tracker instance with additional information about your application's environment or current user. This data will be attached to every event the tracker fires regarding the subject. Here are the available methods:
Function | Description |
---|---|
set_platform | Set the application platform |
set_user_id | Set the user ID |
set_screen_resolution | Set the screen resolution |
set_viewport | Set the viewport dimensions |
set_color_depth | Set the screen color depth |
set_timezone | Set the timezone |
set_lang | Set the language |
There are two ways to call these methods:
- Call them on a Subject instance. They will update the data associated with that subject and return the subject.
- Call them on the Tracker instance. They will update the data associated with the currently active subject for that tracker and return the tracker.
For example:
s0 = SnowplowTracker::Subject.new
emitter = SnowplowTracker::Emitter.new("my-collector.cloudfront.net")
my_tracker = SnowplowTracker::Tracker.new(emitter, s0)
# The following two lines are equivalent, except that the first returns s0 and the second returns my_tracker
s0.set_platform('mob')
my_tracker.set_platform('mob')
If you are using multiple subjects, you can use the set_subject tracker method to change which Subject instance is active:
s0 = SnowplowTracker::Subject.new
emitter = SnowplowTracker::Emitter.new("my-collector.cloudfront.net")
my_tracker = SnowplowTracker::Tracker.new(emitter, s0)
# Set the viewport for the active subject, s0
my_tracker.set_viewport(300, 500)
# The data associated with s0 will be sent with this event
my_tracker.track_screen_view('title page')
# Create a new subject
s1 = SnowplowTracker::Subject.new
# Make s1 the active subject and set its viewport
my_tracker.set_subject(s1).set_viewport(600,1000)
# The data associated with s1 will be sent with this event
my_tracker.track_screen_view('another page')
# Change the subject back to s0 and track another event
my_tracker.set_subject(s0).track_screen_view('final page')
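The active-subject behaviour described above can be modelled in a few lines of plain Ruby. This is a toy sketch only: ToySubject and ToyTracker are hypothetical stand-ins and not the SnowplowTracker classes, and the hash keys are invented for illustration. It shows that each event picks up the data of whichever subject is active when it is tracked, and that returning self makes the calls chainable.

```ruby
# Toy model only: these classes are hypothetical and do not reproduce
# the real SnowplowTracker implementation.
class ToySubject
  attr_reader :details

  def initialize
    @details = {}
  end

  def set_viewport(width, height)
    @details['viewport'] = "#{width}x#{height}"
    self # return the subject so calls can be chained
  end
end

class ToyTracker
  attr_reader :sent

  def initialize(subject = ToySubject.new)
    @subject = subject
    @sent = []
  end

  # Switch the active subject; returns the tracker for chaining
  def set_subject(subject)
    @subject = subject
    self
  end

  # Delegates to the active subject, but returns the tracker
  def set_viewport(width, height)
    @subject.set_viewport(width, height)
    self
  end

  # Records an event carrying a snapshot of the active subject's data
  def track_screen_view(name)
    @sent << @subject.details.merge('name' => name)
    self
  end
end

s0 = ToySubject.new
s1 = ToySubject.new
toy_tracker = ToyTracker.new(s0)

toy_tracker.set_viewport(300, 500).track_screen_view('title page')
toy_tracker.set_subject(s1).set_viewport(600, 1000).track_screen_view('another page')
toy_tracker.set_subject(s0).track_screen_view('final page')

viewports = toy_tracker.sent.map { |event| event['viewport'] }
p viewports # => ["300x500", "600x1000", "300x500"]
```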
The platform can be any one of 'pc', 'tv', 'mob', 'cnsl', or 'iot'. The default platform is 'srv'.
tracker.set_platform('mob')
You can make the user ID a string of your choice:
tracker.set_user_id('user-000563456')
If your Ruby code has access to the device's screen resolution, you can pass it in to Snowplow. Both numbers should be positive integers; note the order is width followed by height. Example:
tracker.set_screen_resolution(1366, 768)
Similarly, you can pass the viewport dimensions in to Snowplow. Again, both numbers should be positive integers and the order is width followed by height. Example:
tracker.set_viewport(300, 200)
If your Ruby code has access to the bit depth of the device's color palette for displaying images, you can pass it in to Snowplow. The number should be a positive integer, in bits per pixel.
tracker.set_color_depth(24)
If your Ruby code has access to the timezone of the device, you can pass it in to Snowplow:
tracker.set_timezone('Europe/London')
You can set the language field like this:
tracker.set_lang('en')
Snowplow has been built to enable you to track a wide range of events that occur when users interact with your websites and apps. We are constantly growing the range of functions available in order to capture that data more richly.
Tracking methods supported by the Ruby Tracker at a glance:
Function | Description |
---|---|
track_page_view |
Track and record views of web pages. |
track_ecommerce_transaction |
Track an ecommerce transaction |
track_screen_view |
Track the user viewing a screen within the application |
track_struct_event |
Track a Snowplow custom structured event |
track_unstruct_event |
Track a Snowplow custom unstructured event |
All events are tracked with specific methods on the tracker instance, of the form track_XXX(), where XXX is the name of the event to track.
All tracker methods return the tracker instance, and so are chainable.
Each track_XXX method expects arguments of a certain type. The types are validated using the Ruby Contracts library. If a check fails, a runtime error is thrown. The section for each track_XXX method specifies the expected argument types for that method.
Each track_XXX method has context as its penultimate optional parameter. This is for an optional array of self-describing custom context JSONs attached to the event. Each element of the context argument should be a hash whose keys are "schema", containing a pointer to the JSON schema against which the context will be validated, and "data", containing the context data itself. The "data" field should contain a flat hash of key-value pairs.
Important: Even if only one custom context is being attached to an event, it still needs to be wrapped in an array.
For example, an array containing two custom contexts relating to the event of a movie poster being viewed:
# Array of contexts
[{
# First context
'schema' => 'iglu:com.my_company/movie_poster/jsonschema/1-0-0',
'data' => {
'movie_name' => 'Solaris',
'poster_country' => 'JP',
'poster_year$dt' => Date.new(1978, 1, 1) # needs require 'date'
}
},
{
# Second context
'schema' => 'iglu:com.my_company/customer/jsonschema/1-0-0',
'data' => {
'p_buy' => 0.23,
'segment' => 'young adult'
}
}]
The keys of a context hash can be either strings or Ruby symbols.
For more on how to use custom contexts, see the blog post which introduced them.
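As a quick way to check the shape described above before passing contexts to a track_XXX method, a small standalone validator could look like the following. The helper valid_contexts? is hypothetical and not part of the tracker; it only checks the structural rules stated in this section (an array of hashes, each with a "schema" and a "data" key given as strings or symbols).

```ruby
# Hypothetical helper, not part of the tracker: returns true when the
# argument is an array of hashes that each carry "schema" and "data",
# with keys given either as strings or as symbols.
def valid_contexts?(contexts)
  return false unless contexts.is_a?(Array)

  contexts.all? do |ctx|
    next false unless ctx.is_a?(Hash)

    schema = ctx['schema'] || ctx[:schema]
    data   = ctx['data'] || ctx[:data]
    schema.is_a?(String) && data.is_a?(Hash)
  end
end

single_context = {
  'schema' => 'iglu:com.my_company/customer/jsonschema/1-0-0',
  'data'   => { 'p_buy' => 0.23, 'segment' => 'young adult' }
}

symbol_context = {
  schema: 'iglu:com.my_company/customer/jsonschema/1-0-0',
  data:   { segment: 'young adult' }
}

wrapped_ok   = valid_contexts?([single_context]) # wrapped in an array
bare_hash_ok = valid_contexts?(single_context)   # not wrapped
symbols_ok   = valid_contexts?([symbol_context]) # symbol keys

p wrapped_ok   # => true
p bare_hash_ok # => false
p symbols_ok   # => true
```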
After the optional context argument, each track_XXX method supports an optional timestamp as its final argument. This allows you to manually override the timestamp attached to this event. If you do not pass this timestamp in as an argument, then the Ruby Tracker will use the current time to be the timestamp for the event. Timestamp is counted in milliseconds since the Unix epoch - the same format generated by Time.now.to_i * 1000 in Ruby.
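The millisecond format can be produced with plain Ruby, as in this short sketch. It uses only the standard Time class; the fixed date is chosen because it corresponds to the timestamp value 1368725287000 used in this documentation's example.

```ruby
# Current time in milliseconds since the Unix epoch
# (whole-second precision, as in the expression quoted above)
now_ms = Time.now.to_i * 1000

# The same idea, but keeping sub-second precision
now_ms_precise = (Time.now.to_f * 1000).to_i

# A fixed moment converted to milliseconds
event_time = Time.utc(2013, 5, 16, 17, 28, 7)
event_ms = event_time.to_i * 1000

puts event_ms # => 1368725287000
```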
Here is an example of a page view event with custom context and timestamp arguments supplied:
tracker.track_page_view('http://www.film_company.com/movie_poster', nil, nil, [{
# First context
'schema' => 'iglu:com.my_company/movie_poster/jsonschema/1-0-0',
'data' => {
'movie_name' => 'Solaris',
'poster_country' => 'JP',
'poster_year$dt' => Date.new(1978, 1, 1) # needs require 'date'
}
},
{
# Second context
'schema' => 'iglu:com.my_company/customer/jsonschema/1-0-0',
'data' => {
'p_buy' => 0.23,
'segment' => 'young adult'
}
}], 1368725287000)
Use track_screen_view() to track a user viewing a screen (or equivalent) within your app. Arguments are:
Argument | Description | Required? | Validation |
---|---|---|---|
name | Human-readable name for this screen | Yes | String |
id | Unique identifier for this screen | No | String |
context | Custom context | No | Array[Hash] |
tstamp | When the screen was viewed | No | Positive integer |
Example:
tracker.track_screen_view("HUD > Save Game", "screen23")
Use track_page_view() to track a user viewing a page within your app.
Arguments are:
Argument | Description | Required? | Validation |
---|---|---|---|
page_url | The URL of the page | Yes | String |
page_title | The title of the page | No | String |
referrer | The address which linked to the page | No | String |
context | Custom context | No | Array[Hash] |
tstamp | When the pageview occurred | No | Positive integer |
Example:
tracker.track_page_view("www.example.com", "example", "www.referrer.com")
Use track_ecommerce_transaction() to track an ecommerce transaction.
Arguments:
Argument | Description | Required? | Validation |
---|---|---|---|
transaction | Data for the whole transaction | Yes | Hash |
items | Data for each item | Yes | Array of hashes |
context | Custom context | No | Array[Hash] |
tstamp | When the transaction event occurred | No | Positive integer |
The transaction argument is a hash containing information about the transaction. Here are the fields supported in this hash:
Field | Description | Required? | Validation |
---|---|---|---|
order_id | ID of the eCommerce transaction | Yes | String |
total_value | Total transaction value | Yes | Int or Float |
affiliation | Transaction affiliation | No | String |
tax_value | Transaction tax value | No | Int or Float |
shipping | Delivery cost charged | No | Int or Float |
city | Delivery address city | No | String |
state | Delivery address state | No | String |
country | Delivery address country | No | String |
currency | Transaction currency | No | String |
The transaction parameter might look like this:
{
'order_id' => '12345',
'total_value' => 35,
'city' => 'London',
'country' => 'UK',
'currency' => 'GBP'
}
The items parameter is an array of hashes. Each hash represents one item in the transaction. Here are the fields supported for each item:
Field | Description | Required? | Validation |
---|---|---|---|
sku | Item SKU | Yes | String |
price | Item price | Yes | Int or Float |
quantity | Item quantity | Yes | Int |
name | Item name | No | String |
category | Item category | No | String |
context | Custom context | No | Array[Hash] |
The items parameter might look like this:
[{
'sku' => 'pbz0026',
'price' => 20,
'quantity' => 1,
'category' => 'film'
},
{
'sku' => 'pbz0038',
'price' => 15,
'quantity' => 1,
'name' => 'red shoes'
}]
The whole method call would look like this:
tracker.track_ecommerce_transaction({
'order_id' => '12345',
'total_value' => 35,
'city' => 'London',
'country' => 'UK',
'currency' => 'GBP'
},
[{
'sku' => 'pbz0026',
'price' => 20,
'quantity' => 1,
'category' => 'film'
},
{
'sku' => 'pbz0038',
'price' => 15,
'quantity' => 1,
'name' => 'red shoes'
}])
This will fire three events: one for the transaction as a whole, which will include the fields in the transaction argument, and one for each item. The order_id and currency fields in the transaction argument will also be attached to each of the items' events.
All three events will have the same timestamp and same randomly generated Snowplow transaction ID.
Note that each item in the transaction can have its own custom context.
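The fan-out described above (one transaction event plus one event per item, sharing a timestamp and transaction ID) can be sketched in plain Ruby. The function expand_transaction, the 'event', 'tstamp' and 'txn_id' keys, and the way the random ID is generated are all assumptions made for illustration; they do not reproduce the tracker's internal payload format.

```ruby
require 'securerandom'

# Hypothetical sketch, not the tracker's implementation: turn one
# transaction hash and its items into a list of event hashes.
def expand_transaction(transaction, items, tstamp = Time.now.to_i * 1000)
  # One random ID shared by every event of this transaction (assumed format)
  txn_id = SecureRandom.random_number(1_000_000)

  transaction_event = transaction.merge(
    'event'  => 'transaction',
    'tstamp' => tstamp,
    'txn_id' => txn_id
  )

  item_events = items.map do |item|
    item.merge(
      'event'    => 'transaction_item',
      'order_id' => transaction['order_id'], # copied from the transaction
      'currency' => transaction['currency'], # copied from the transaction
      'tstamp'   => tstamp,
      'txn_id'   => txn_id
    )
  end

  [transaction_event] + item_events
end

transaction = { 'order_id' => '12345', 'total_value' => 35, 'currency' => 'GBP' }
items = [
  { 'sku' => 'pbz0026', 'price' => 20, 'quantity' => 1 },
  { 'sku' => 'pbz0038', 'price' => 15, 'quantity' => 1 }
]

events = expand_transaction(transaction, items, 1368725287000)
puts events.size # => 3
```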
Use track_struct_event() to track a custom event happening in your app which fits the Google Analytics-style structure of having up to five fields (with only the first two required):
Argument | Description | Required? | Validation |
---|---|---|---|
category | The grouping of structured events which this action belongs to | Yes | String |
action | Defines the type of user interaction which this event involves | Yes | String |
label | A string to provide additional dimensions to the event data | No | String |
property | A string describing the object or the action performed on it | No | String |
value | A value to provide numerical data about the event | No | Int or Float |
context | Custom context | No | Array[Hash] |
tstamp | When the structured event occurred | No | Positive integer |
Example:
tracker.track_struct_event("shop", "add-to-basket", nil, "pcs", 2)
Use track_unstruct_event() to track a custom event which consists of a name and an unstructured set of properties. This is useful when:
- You want to track event types which are proprietary/specific to your business (i.e. not already part of Snowplow), or
- You want to track events which have unpredictable or frequently changing properties
The arguments are as follows:
Argument | Description | Required? | Validation |
---|---|---|---|
event_json | The properties of the event | Yes | Hash |
context | Custom context | No | Array[Hash] |
tstamp | When the unstructured event occurred | No | Positive integer |
Example:
tracker.track_unstruct_event({
"schema" => "iglu:com.example_company/save_game/jsonschema/1-0-2",
"data" => {
"saveId" => "4321",
"level" => 23,
"difficultyLevel" => "HARD",
"dlContent" => true
}
})
The event_json argument is self-describing JSON. It has two fields: "schema", containing a pointer to the JSON schema for the event, and "data", containing the event data itself. The data field must be flat: properties cannot be nested.
The keys of the event_json hash can be either strings or Ruby symbols.
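The flatness rule for the data field can be checked before the event is tracked. The sketch below is standalone Ruby; flat_hash? and self_describing_json are hypothetical helpers, not tracker methods, and treating only nested hashes as "nested" is an assumption of this sketch.

```ruby
# Hypothetical helpers, not part of the tracker.

# True when no value in the hash is itself a hash
# (assumption: only nested hashes count as nesting here).
def flat_hash?(data)
  data.is_a?(Hash) && data.values.none? { |value| value.is_a?(Hash) }
end

# Builds the two-field self-describing structure, refusing nested data.
def self_describing_json(schema, data)
  raise ArgumentError, 'data must be a flat hash' unless flat_hash?(data)

  { 'schema' => schema, 'data' => data }
end

event_json = self_describing_json(
  'iglu:com.example_company/save_game/jsonschema/1-0-2',
  { 'saveId' => '4321', 'level' => 23, 'difficultyLevel' => 'HARD', 'dlContent' => true }
)

nested_rejected =
  begin
    self_describing_json('iglu:com.example_company/save_game/jsonschema/1-0-2',
                         { 'player' => { 'name' => 'nested' } })
    false
  rescue ArgumentError
    true
  end

p event_json.keys   # => ["schema", "data"]
p nested_rejected   # => true
```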
Tracker instances must be initialized with an emitter. This section will go into more depth about the Emitter and AsyncEmitter classes.
Each tracker instance must now be initialized with an Emitter which is responsible for firing events to a Collector. An Emitter instance is initialized with two arguments: an endpoint and an optional configuration hash.
A simple example with just an endpoint:
# Create an emitter
my_emitter = SnowplowTracker::Emitter.new('my-collector.cloudfront.net')
A complicated example using every setting:
# Create an emitter
my_emitter = SnowplowTracker::Emitter.new('my-collector.cloudfront.net', {
:protocol => 'https',
:method => 'post',
:port => 80,
:buffer_size => 0,
:on_success => lambda { |success_count|
puts "#{success_count} events sent successfully"
},
:on_failure => lambda { |success_count, failures|
puts "#{success_count} events sent successfully, #{failures.size} events sent unsuccessfully"
}
})
Every setting in the configuration hash is optional. Here is what they do:
- :protocol determines whether events will be sent using HTTP or HTTPS. It defaults to "http".
- :method determines whether events will be sent using GET or POST. It defaults to "get".
- :port determines the port to use.
- :buffer_size is the number of events which will be queued before they are all sent, a process called "flushing". When using GET, it defaults to 0 because each event has its own request. When using POST, it defaults to 10, and the buffered events are all sent together in a single request.
- :on_success is a callback which is called every time the buffer is flushed and every event in it is sent successfully (meaning with status code 200). It should accept one argument: the number of requests sent this way.
- :on_failure is a callback which is called if the buffer is flushed but not every event is sent successfully. It should accept two arguments: the number of successfully sent events and an array containing the unsuccessful events.
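The interplay of :buffer_size, :on_success and :on_failure can be illustrated with a toy emitter in plain Ruby. ToyBufferedEmitter is hypothetical and not the SnowplowTracker::Emitter class: it performs no HTTP, and the block passed to the constructor stands in for sending a request and returning its status code. Sending events one by one inside flush is a simplification of this sketch.

```ruby
# Toy model only: a hypothetical emitter that buffers events and reports
# the outcome of each flush through callbacks. No network traffic happens.
class ToyBufferedEmitter
  def initialize(buffer_size:, on_success: nil, on_failure: nil, &send_request)
    @buffer_size = buffer_size
    @on_success = on_success
    @on_failure = on_failure
    @send_request = send_request # returns a status code for one event
    @buffer = []
  end

  # Queue an event and flush once the buffer has reached its size
  def input(event)
    @buffer << event
    flush if @buffer.size >= @buffer_size
    self
  end

  # Send everything in the buffer, then call exactly one callback
  def flush
    return self if @buffer.empty?

    batch = @buffer.dup
    @buffer.clear
    failures = batch.reject { |event| @send_request.call(event) == 200 }
    successes = batch.size - failures.size

    if failures.empty?
      @on_success&.call(successes)
    else
      @on_failure&.call(successes, failures)
    end
    self
  end
end

results = []
toy_emitter = ToyBufferedEmitter.new(
  buffer_size: 3,
  on_success: ->(count) { results << [:ok, count] },
  on_failure: ->(count, failed) { results << [:failed, count, failed.size] }
) { |event| event[:bad] ? 500 : 200 }

# The third event fills the buffer and triggers an automatic flush
3.times { |i| toy_emitter.input({ id: i }) }

# Two more events, one of which will fail, flushed manually
toy_emitter.input({ id: 3 })
toy_emitter.input({ id: 4, bad: true })
toy_emitter.flush

p results # => [[:ok, 3], [:failed, 1, 1]]
```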
AsyncEmitter is a subclass of Emitter. Its API is exactly the same. Its advantage is that it always creates a new thread to flush its buffer, so requests are sent asynchronously.
It is possible to initialize a tracker with an array of emitters, in which case events will be sent to all of them:
# Create a tracker with multiple emitters
my_tracker = SnowplowTracker::Tracker.new([my_sync_emitter, my_async_emitter], nil, 'my_tracker_name', 'my_app_id')
You can also add new emitters after creating a tracker with the add_emitter method:
# Add another emitter to an existing tracker
my_tracker.add_emitter(another_emitter)
You may want to force an emitter to send all events in its buffer, even if the buffer is not full. The Tracker class has a flush method which flushes all its emitters. It accepts one argument, sync, which defaults to false. If you set sync to true, the flush will be synchronous: it will block until all flushing threads are finished.
# Asynchronous flush
my_tracker.flush
# Synchronous flush
my_tracker.flush(true)
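The difference between an asynchronous and a synchronous flush comes down to whether the caller waits for the flushing threads. The following toy sketch shows the idea in plain Ruby; ToyAsyncEmitter is hypothetical, does not send any requests, and is not how the tracker itself is implemented.

```ruby
# Toy model only: a hypothetical emitter that flushes on a new thread.
class ToyAsyncEmitter
  attr_reader :delivered

  def initialize
    @buffer = []
    @threads = []
    @delivered = Queue.new # thread-safe stand-in for "sent to the collector"
  end

  def input(event)
    @buffer << event
    self
  end

  # Hand the buffered events to a new thread. With sync = true, block
  # until every flushing thread started so far has finished.
  def flush(sync = false)
    batch = @buffer.dup
    @buffer.clear

    @threads << Thread.new(batch) do |events|
      events.each { |event| @delivered << event }
    end

    if sync
      @threads.each(&:join)
      @threads.clear
    end
    self
  end
end

async_emitter = ToyAsyncEmitter.new
async_emitter.input('event-1').input('event-2')

# Synchronous flush: both events are delivered once this call returns
async_emitter.flush(true)
delivered_count = async_emitter.delivered.size

puts delivered_count # => 2
```

A synchronous flush is typically what you want just before a short-lived script exits, since otherwise the process may end while flushing threads are still running.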
The Snowplow Ruby Tracker uses the Ruby Contracts gem for typechecking. Contracts are enabled by default but can be turned on or off:
# Turn contracts off
SnowplowTracker::disable_contracts
# Turn contracts back on
SnowplowTracker::enable_contracts
The emitters.rb module has Ruby logging enabled to give you information about requests being sent. The logger prints messages about what emitters are doing. By default, only messages with priority "INFO" or higher will be logged.
To change this:
require 'logger'
SnowplowTracker::LOGGER.level = Logger::DEBUG
The levels are:
Level | Description |
---|---|
FATAL | Nothing logged |
WARN | Notification for requests with status code not equal to 200 |
INFO | Notification for all requests |
DEBUG | Contents of all requests |
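How a logger level filters messages can be seen with Ruby's standard Logger on its own. This sketch uses a separate logger writing to an in-memory buffer rather than SnowplowTracker::LOGGER, and the message texts are invented for illustration.

```ruby
require 'logger'
require 'stringio'

# A standalone logger writing to memory (not SnowplowTracker::LOGGER)
output = StringIO.new
logger = Logger.new(output)

# At INFO, DEBUG messages are dropped; INFO and WARN are kept
logger.level = Logger::INFO
logger.debug('full request contents')
logger.info('request sent')
logger.warn('request returned status 500')

# After lowering the level to DEBUG, DEBUG messages are kept too
logger.level = Logger::DEBUG
logger.debug('now visible')

puts output.string
```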
Copyright © 2012-2021 Snowplow Analytics Ltd.