Important note: On March 14, 2023, a new data set will be released as part of the Daily Activity Stream, and the path at which the exported files are stored will change. If you have scheduled jobs to read and ingest the Daily Activity Stream before this date, follow the steps described here to ensure continuity. Learn more about the new data set here.
With the Daily Activity Stream, you can export Dynamic Yield raw data and ingest it into your own analytics platform. This enables you to connect the Dynamic Yield data to additional data sources in your database and build customized reports that are tailored to your unique business needs.
The export consists of two datasets:
- Raw data: Includes raw interactions with Dynamic Yield variations, events, and pageviews.
- Attribution data [available March 14, 2023]: Includes the relationship between events and the variations they were attributed to according to your experience settings.
Data is exported daily into a secured Amazon S3 bucket, where it is stored for 30 days.
Data is exported as Apache Parquet files, a format optimized for large data sets.
Data sets
Raw data
This data set contains all interactions of the following types:
- Variation engagement: Impressions of, or clicks on, variations.
- Events: Such as Purchase or Subscription.
- Page Views
For example, if a user views 3 pages, triggers 2 events, and has 1 impression of a variation, the export contains 6 rows (3 + 2 + 1).
Each interaction, referred to in the data set as eventType, carries specific properties. For example:
eventType | Description | Additional Attributes (Examples) |
---|---|---|
UIA | Pageview | URL, page context |
DPX | Event hit | Event properties, event value |
VARIATION_ENGAGEMENT | Variation impression or click | Campaign name, variation name |
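As a rough illustration of how these types partition the export, the sketch below tallies rows by eventType. The sample rows are invented for the example and carry only the eventType field; rows in a real export carry many more attributes.

```python
from collections import Counter

# Hypothetical sample rows; a real export row carries many more attributes.
rows = [
    {"eventType": "UIA"},
    {"eventType": "UIA"},
    {"eventType": "DPX"},
    {"eventType": "VARIATION_ENGAGEMENT"},
]


def count_by_event_type(rows):
    """Return the number of rows for each eventType value."""
    return Counter(row["eventType"] for row in rows)


counts = count_by_event_type(rows)
print(counts["UIA"], counts["DPX"], counts["VARIATION_ENGAGEMENT"])
```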
The following tables include the full list of attributes available for each eventType.
Represents a page viewed by the user.
Attribute | Description | Example |
---|---|---|
eventType (String) | The type of activity. For a pageview, the value is UIA (as opposed to VARIATION_ENGAGEMENT or DPX). | "UIA" |
interactionId | Available March 14, 2023 | 1234567890 |
contextType (String) | The page type, according to the page context. | "HOMEPAGE" |
contextData (String array) | Data about the page type. | ["Women","Shoes"] |
dyId (Long) | The internal identifier Dynamic Yield assigns to each visitor to the site or app, unique per device. | 123456789012345678 |
timestamp (Long) | The time the activity occurred, in milliseconds since the UNIX epoch. | 1621798861400 |
eventType (String) | The type of event fired. It's always one of the following: | "DPX" |
sessionId (Integer) | The internal identifier Dynamic Yield assigns to a visitor's session. | 1234567890 |
url (String) | The URL from which the event was fired. | "https://www.example.com/?url_params=123" |
urlClean (String) | The URL from which the event was fired, after removing any URL parameters. | "https://www.example.com/" |
audiences (Integer array) | The list of identifiers of the audiences the user is a member of at the time the event is fired. | [1234567, 9876543] |
browser (String) | The browser type from which the event was fired. | "Safari" |
device (String) | The type of device that triggered the event. | "Tablet" |
operatingSystem (String) | The operating system of the device that fired the event. | "Mac OS X" |
screenResolution (String) | The screen resolution of the device that fired the event. | "Low (1024px and below)" |
reqTimestamp (Long) | Internal request timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
procTimestamp (Long) | Internal processing timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
resTimestamp (Long) | Internal resolution timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
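The timestamp fields are stored as milliseconds since the UNIX epoch, so most pipelines convert them before reporting. A minimal sketch of that conversion, using only the standard library (the helper name ms_to_utc is illustrative, not part of the export):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms):
    """Convert milliseconds since the UNIX epoch to a timezone-aware UTC datetime."""
    # Adding a timedelta avoids the rounding that float division can introduce.
    return EPOCH + timedelta(milliseconds=ms)


converted = ms_to_utc(1621798861400)
print(converted.isoformat())
```

The result is in UTC; converting it to the time zone configured for your site is a separate step.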
Represents a Dynamic Yield event triggered by the user.
Attribute | Description | Example |
---|---|---|
eventType (String) | The type of activity. For an event hit, the value is DPX (as opposed to VARIATION_ENGAGEMENT or UIA). | "DPX" |
interactionId | Available March 14, 2023 | 1234567890 |
eventId (Integer) | A unique identifier for each event explicitly fired from the site. | 12345 |
eventName (String) | The event name as written in the event API. | "Purchase" |
eventProperties (String, JSON) | The event properties as written in the event API. These differ depending on the eventType value. | { "transaction_id": "ABC123456", "value": 100.0, "currency": "USD", "dyType": "purchase-v1", "Brands": "Nike", "Categories": "Sneakers", "Number_of_items": 1.0, "cart": [{ "productId": "AIR-123", "quantity": 1.0, "itemPrice": 100.0 }] } |
eventValue (Long) | The total value of all items in the cart, as they appear in the eventProperties attribute. The value is in cents. | 10000 |
uniqueTransactionId (String) | The transaction ID, as it appears in the eventProperties attribute for a purchase event. | "ABC123456" |
productIds | The list of product IDs on which an action was performed. | ["12345"] |
dyId (Long) | The internal identifier Dynamic Yield assigns to each visitor to the site or app, unique per device. | 123456789012345678 |
timestamp (Long) | The time the activity occurred, in milliseconds since the UNIX epoch. | 1621798861400 |
sessionId (Integer) | The internal identifier Dynamic Yield assigns to a visitor's session. | 1234567890 |
url (String) | The URL from which the event was fired. | "https://www.example.com/?url_params=123" |
urlClean (String) | The URL from which the event was fired, after removing any URL parameters. | "https://www.example.com/" |
audiences (Integer array) | The list of identifiers of the audiences the user is a member of at the time the event is fired. | [1234567, 9876543] |
browser (String) | The browser type from which the event was fired. | "Safari" |
device (String) | The type of device that triggered the event. | "Tablet" |
operatingSystem (String) | The operating system of the device that fired the event. | "Mac OS X" |
screenResolution (String) | The screen resolution of the device that fired the event. | "Low (1024px and below)" |
reqTimestamp (Long) | Internal request timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
procTimestamp (Long) | Internal processing timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
resTimestamp (Long) | Internal resolution timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
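Because eventProperties arrives as a JSON string and eventValue is expressed in cents, a loader typically decodes one and rescales the other. The sketch below uses an abridged copy of the example values from the table; the helper names are illustrative, and dividing by 100 assumes a currency with 100 minor units per major unit.

```python
import json
from decimal import Decimal

# Abridged sample modeled on the example row; not real data.
event_properties = (
    '{"transaction_id": "ABC123456", "value": 100.0, '
    '"currency": "USD", "dyType": "purchase-v1"}'
)
event_value = 10000  # cents, as stored in the eventValue attribute


def parse_event_properties(raw):
    """Decode the eventProperties JSON string into a dict."""
    return json.loads(raw)


def cents_to_major_units(cents):
    """Convert cents to major currency units (assumes 100 cents per unit)."""
    return Decimal(cents) / Decimal(100)


props = parse_event_properties(event_properties)
amount = cents_to_major_units(event_value)
print(props["transaction_id"], amount, props["currency"])
```

Using Decimal rather than float keeps monetary values exact when they are summed later.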
Represents an impression of a variation or a click on a variation.
Attribute | Description | Example |
---|---|---|
eventType (String) | The type of activity. For a variation click or impression, the value is VARIATION_ENGAGEMENT (as opposed to UIA or DPX). | "VARIATION_ENGAGEMENT" |
interactionId | Available March 14, 2023 | 1234567890 |
engagementType (String) | The type of engagement with the variation. Possible values: IMPRESSION, CLICK. | "IMPRESSION" |
campaignId (Integer) | The ID of the campaign that this variation is part of. For engagement with Experience Email blocks, this represents the ID of the block, while parentCampaignId contains the ID of the Experience Email campaign. | 123456 |
campaignName (String) | The name of the campaign that this variation is part of. For engagement with Experience Email blocks, this represents the name of the block, while parentCampaignName contains the name of the Experience Email campaign. | "Homepage Banner" |
experienceId (Integer) | The ID of the experience that this variation is part of. | 123456 |
experienceName (String) | The name of the experience that this variation is part of. | "Summer Promo" |
experimentId (Integer) | The unique identifier of the test. | 123456 |
versionId (Integer) | The unique identifier of the test version. An A/B test might have multiple versions. | 245467 |
variationIds (Integer array) | The IDs of the variations that the user was served (if the type is IMPRESSION) or clicked (if the type is CLICK). The list usually contains a single ID, but if the campaign type is "Dynamic Content Item List", it contains multiple IDs. | [1234567, 9876543] |
variationNames (String array) | The names of the variations that the user was served (if the type is IMPRESSION) or clicked (if the type is CLICK). The list usually contains a single name, but if the campaign type is "Dynamic Content Item List", it contains multiple names. | ["Blue Button", "Red Button"] |
sku (String array) | If the event is a view of or click on a recommendation widget, the list of SKUs that were recommended or the SKU that was clicked. | ["1234", "9876"] |
strategyId (Integer array) | If the event is a view of or click on a recommendation widget, the ID of the Strategy that was served. A single variation can include multiple widgets with multiple strategies. | [126651,426356] |
strategyName | If the event is a view of or click on a recommendation widget, the name of the Strategy that was served. A single variation can include multiple widgets with multiple strategies. | ["Most Popular","Affinity"] |
touchpointId (Integer) | In touchpoints only: The ID of the touchpoint. | 245467 |
touchpointName | In touchpoints only: The name of the touchpoint. | ["Hero Banner"] |
parentVariationId (Integer) | In touchpoints only: The ID of the variation that serves this touchpoint in the multi-touch campaign. | [9876543] |
parentVariationName | In touchpoints only: The name of the variation that serves this touchpoint in the multi-touch campaign. | "Blue Design" |
parentCampaignId | If the engagement is with an Experience Email block, this field contains the campaign ID of the Experience Email campaign. | 123456 |
parentCampaignName | If the engagement is with an Experience Email block, this field contains the campaign name of the Experience Email campaign. | "Experience Email Campaign 1" |
dyId (Long) | The internal identifier Dynamic Yield assigns to each visitor to the site or app, unique per device. | 123456789012345678 |
timestamp (Long) | The time the activity occurred, in milliseconds since the UNIX epoch. | 1621798861400 |
sessionId (Integer) | The internal identifier Dynamic Yield assigns to a visitor's session. | 1234567890 |
url (String) | The URL from which the event was fired. | "https://www.example.com/?url_params=123" |
urlClean (String) | The URL from which the event was fired, after removing any URL parameters. | "https://www.example.com/" |
audiences (Integer array) | The list of identifiers of the audiences the user is a member of at the time the event is fired. | [1234567, 9876543] |
reqTimestamp (Long) | Internal request timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
procTimestamp (Long) | Internal processing timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
resTimestamp (Long) | Internal resolution timestamp of the analytics pipeline, in milliseconds since the UNIX epoch. | 1621798861400 |
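Since variationIds is an array, a single engagement row can count toward several variations. The following sketch tallies impressions and clicks per variation ID under that reading; the sample rows and the function name are made up for illustration, and rows are represented as plain dicts rather than as records read from Parquet.

```python
from collections import Counter


def tally_engagements(rows):
    """Count engagements per (variation ID, engagementType) pair."""
    tally = Counter()
    for row in rows:
        if row["eventType"] != "VARIATION_ENGAGEMENT":
            continue  # pageviews and events carry no variation fields
        for variation_id in row["variationIds"]:
            tally[(variation_id, row["engagementType"])] += 1
    return tally


# Hypothetical rows, reduced to the fields the tally needs.
rows = [
    {"eventType": "VARIATION_ENGAGEMENT", "engagementType": "IMPRESSION",
     "variationIds": [1234567, 9876543]},
    {"eventType": "VARIATION_ENGAGEMENT", "engagementType": "CLICK",
     "variationIds": [1234567]},
    {"eventType": "UIA"},
]
tally = tally_engagements(rows)
print(tally[(1234567, "IMPRESSION")], tally[(1234567, "CLICK")])
```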
Attribution Data [available March 14, 2023]
This data set contains the relationship between a variation and the events attributed to it according to the experience settings.
Each data set record represents a distinct event-variation combination.
Note that there is usually a one-to-many relationship between events and variations. For example, if 1 purchase was attributed to 2 variations (control of test A, variation of test B), the data set contains 2 rows.
Field | Description |
---|---|
interactionId (int) | The interaction's identifier (for a specific event, for example, a specific purchase). It can be used to join this data set to the raw data set. |
eventType (varchar) | The type of event being attributed to a variation. Currently, only DPX events are exported. |
dyid (int) | The identifier of the user who triggered the attributed event. |
experimentId (int) | The Experiment ID to which the event was attributed. |
versionId (int) | The Version ID of the Experiment to which the event was attributed. |
variationIds (array) | The array of Variation IDs to which the event was attributed. This array usually contains a single value, but can contain multiple values if the experience is an Item List. |
eventId (int) | The ID of the DPX event attributed to the variation. For example, for a purchase, this would be the event ID of the Purchase event. |
eventValue (int) | The value of the DPX event attributed to the variation. For example, for a purchase, this would represent revenue connected to the Purchase event. The value is in cents. |
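One way to combine the two data sets is to join them on interactionId. The sketch below does this with plain dicts and assumes interactionId is unique per raw row, which the article does not state explicitly. The function name and sample values are illustrative only.

```python
def join_attribution_to_raw(raw_rows, attribution_rows):
    """Inner-join attribution rows to raw rows on interactionId."""
    # Assumption: each interactionId appears at most once in the raw data.
    raw_by_id = {row["interactionId"]: row for row in raw_rows}
    joined = []
    for attr in attribution_rows:
        raw = raw_by_id.get(attr["interactionId"])
        if raw is not None:
            joined.append({
                "interactionId": attr["interactionId"],
                "eventName": raw["eventName"],
                "experimentId": attr["experimentId"],
                "variationIds": attr["variationIds"],
                "eventValue": attr["eventValue"],
            })
    return joined


# Hypothetical data: one purchase attributed to variations in two tests.
raw_rows = [
    {"interactionId": 1234567890, "eventType": "DPX", "eventName": "Purchase"},
]
attribution_rows = [
    {"interactionId": 1234567890, "experimentId": 111,
     "variationIds": [1234567], "eventValue": 10000},
    {"interactionId": 1234567890, "experimentId": 222,
     "variationIds": [9876543], "eventValue": 10000},
]
joined = join_attribution_to_raw(raw_rows, attribution_rows)
print(len(joined))
```

Because one event can be attributed to several variations, summing eventValue over the joined rows counts the same purchase once per attribution; aggregate per experiment to avoid double counting.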
Turning on the Daily Activity Stream
To turn on the Daily Activity Stream:
- Go to Settings › General Settings › Daily Activity Stream.
- Click Turn on daily export.
- Copy the S3 bucket path and credentials to a secure location, because they are displayed only once.
Lost your credentials?
Click the additional options icon, and then click Generate New Credentials. Keep in mind that you can generate new credentials only once.
- That's it! The first data set will be exported after midnight, following which new data will be added daily.
After activating Daily Activity Stream (turning the toggle on), the page also displays the status of the export.
You can always disable the export in the options menu. If you disable the export and then enable it again, your S3 bucket remains the same, but you'll be given a new set of credentials.
Accessing your S3 bucket
After you activate the Daily Activity Stream, you'll be given access to an S3 bucket with the following path:
- s3://dy-raw-data-export/sectionId=1234567, for the US data center
- s3://dy-raw-data-export-eu/sectionId=1234567, for the EU data center.
Under this path, you'll find a subfolder for each date, in which the Parquet files are stored.
For example, this path contains all the data collected on 2022-01-01, according to the time zone configured in your General Settings:
s3://dy-raw-data-export-eu/sectionId=1234567/date=2022-01-01
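The folder names follow a key=value pattern, so a job can recover the section ID and date from a path without hard-coding positions. A small sketch, assuming every partition segment has the form key=value (the function name is hypothetical):

```python
from datetime import date


def parse_partitions(path):
    """Extract key=value path segments, such as sectionId and date."""
    parts = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        if sep and key and value:
            parts[key] = value
    return parts


parts = parse_partitions(
    "s3://dy-raw-data-export-eu/sectionId=1234567/date=2022-01-01"
)
export_date = date.fromisoformat(parts["date"])
print(parts["sectionId"], export_date)
```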
Important changes [March 14, 2023]
As of March 14, 2023, with the release of the first Attribution data set, we'll add a new folder within each date, to separate the Raw and Attribution data sets.
- This path includes the currently exported raw data:
s3://dy-raw-data-export/sectionId=1234567/date=2023-01-01/reportType=raw
- This path includes the new attribution data:
s3://dy-raw-data-export/sectionId=1234567/date=2023-01-01/reportType=attribution
In addition, we'll modify the schema of the raw data by adding the interactionId field, which is required in both data sets.
Importing data into your analytics platform
To integrate the Daily Activity Stream data into your systems automatically, you might want to create a job to extract the Parquet files from S3 and load them into your database of choice. Here are some links to relevant articles for popular solutions:
- Google BigQuery
- Microsoft SQL Server
- Amazon Athena
- Amazon Redshift
- Snowflake
- Oracle 12c
- Databricks
- IBM InfoSphere
- Informatica
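If you write the extraction job yourself, its first step is usually to list the Parquet files under a date folder. The sketch below takes an S3 client as a parameter and uses the list_objects_v2 paginator from boto3. It assumes that the credentials you copied are an AWS access key pair, which this article does not spell out, and the function name is illustrative.

```python
def list_parquet_keys(s3_client, bucket, prefix):
    """Return the keys of all .parquet objects under the given prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys


# Usage sketch (requires boto3; the key values are placeholders):
#   import boto3
#   client = boto3.client(
#       "s3",
#       aws_access_key_id="YOUR_KEY_ID",
#       aws_secret_access_key="YOUR_SECRET",
#   )
#   keys = list_parquet_keys(
#       client, "dy-raw-data-export", "sectionId=1234567/date=2022-01-01/"
#   )
```

Passing the client in, rather than creating it inside the function, keeps credentials out of the listing logic and makes the function easy to exercise with a stub.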
I already use the Daily Activity Stream. How does the release of Attribution data on March 14 affect me?
On March 14, 2023, we'll begin exporting a new attribution data set. Together with this release, we'll change the structure of the bucket to keep the new and existing data sets separate. From that moment on, the path at which files are exported will have two new subfolders:
- /reportType=raw, containing the existing raw data set
- /reportType=attribution, containing the new data set
This means that on March 13, the data for March 12 will be located at the usual path:
- s3://dy-raw-data-export/sectionId=1234567/date=2023-03-12/*.parquet, for the US data center
- s3://dy-raw-data-export-eu/sectionId=1234567/date=2023-03-12/*.parquet, for the EU data center
But as of March 14, the 2 datasets for March 13 will be located at the following paths.
Raw data will be here:
- s3://dy-raw-data-export/sectionId=1234567/date=2023-03-13/reportType=raw/*.parquet, for the US data center
- s3://dy-raw-data-export-eu/sectionId=1234567/date=2023-03-13/reportType=raw/*.parquet, for the EU data center
Attribution data will be here:
- s3://dy-raw-data-export/sectionId=1234567/date=2023-03-13/reportType=attribution/*.parquet, for the US data center
- s3://dy-raw-data-export-eu/sectionId=1234567/date=2023-03-13/reportType=attribution/*.parquet, for the EU data center
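A job that has to read dates on both sides of the change can pick the layout from the data date. The sketch below assumes that every date from 2023-03-13 onward uses the reportType subfolders and that earlier dates keep the old layout, as the examples above suggest. The constant and function names are invented for the example.

```python
from datetime import date

BUCKETS = {"us": "dy-raw-data-export", "eu": "dy-raw-data-export-eu"}
# Assumption: first data date stored under the reportType subfolders.
FIRST_SPLIT_DATE = date(2023, 3, 13)


def export_prefix(section_id, day, report_type="raw", region="us"):
    """Build the S3 folder path for one data date and report type."""
    base = f"s3://{BUCKETS[region]}/sectionId={section_id}/date={day.isoformat()}"
    if day < FIRST_SPLIT_DATE:
        if report_type != "raw":
            raise ValueError("only raw data exists before the layout change")
        return base
    return f"{base}/reportType={report_type}"


print(export_prefix(1234567, date(2023, 3, 12)))
print(export_prefix(1234567, date(2023, 3, 13), "attribution", "eu"))
```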
If you have a job scheduled to ingest Daily Activity Stream, do the following:
- Edit the S3 path you are reading from to include the new subpath. For example, if you are in the US data center:
- From: s3://dy-raw-data-export/sectionId=1234567/date=2023-03-13/
- To: s3://dy-raw-data-export/sectionId=1234567/date=2023-03-13/reportType=raw/*.parquet
- Alternatively, if your software allows it, exclude files whose path contains "reportType=attribution" from being loaded by your current job.
- Add a new column to your target table for raw data to hold the new interactionId field.
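For jobs that filter object keys in code, the exclusion and the new column might look like the sketch below. The file names are invented, rows are modeled as plain dicts, and filling interactionId with None for rows exported before the change is an assumption about how you want to backfill, not a documented behavior.

```python
def raw_data_keys(keys):
    """Keep Parquet files that are not part of the attribution data set."""
    return [
        key for key in keys
        if key.endswith(".parquet") and "reportType=attribution" not in key
    ]


def with_interaction_id(row):
    """Return a copy of a raw row that always has an interactionId key."""
    # Rows exported before the schema change get None as a placeholder.
    return {"interactionId": None, **row}


# Hypothetical object keys under one date folder.
keys = [
    "sectionId=1234567/date=2023-03-13/reportType=raw/part-0.parquet",
    "sectionId=1234567/date=2023-03-13/reportType=attribution/part-0.parquet",
]
print(raw_data_keys(keys))
```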
If you want to start ingesting the new attribution data set as well, create a new job reading from:
s3://dy-raw-data-export/sectionId=1234567/date=2023-03-13/reportType=attribution
All the field descriptions and data types are described earlier in this article.