From a70388fb86aa46ebaa13032f9e36d5a684f849b1 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Fri, 19 Jul 2024 12:46:12 +0200 Subject: [PATCH 1/6] Tweak description to be clearer about how to track a user's id and their item id... --- docs/html/event.schema.html | 6 +++--- docs/html/query.request.schema.html | 2 +- docs/html/query.response.schema.html | 2 +- docs/schema/event-properties-client_id.md | 2 +- docs/schema/event-properties-event_attributes.md | 2 +- docs/schema/event.md | 4 ++-- docs/schema/query-1-properties-client_id.md | 2 +- docs/schema/query-1.md | 2 +- out/event.schema.json | 2 +- out/query.request.schema.json | 2 +- 10 files changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/html/event.schema.html b/docs/html/event.schema.html index a0fdb64..8ed03fe 100644 --- a/docs/html/event.schema.html +++ b/docs/html/event.schema.html @@ -3,13 +3,13 @@
"doctor-search"
 


The name of the action that triggered the event. We have a set of common defaults, however you can pass in whatever you want.

Type: enum (of string)

Must be one of:

  • "click_through"
  • "add_to_cart"
  • "click"
  • "watch"
  • "view"
  • "purchase"
Type: string

Must be at most 100 characters long


The unique identifier of a query, typically a UUID, but can be any string.

Type: stringFormat: uuid

Example:

"00112233-4455-6677-8899-aabbccddeeff"
 
Type: string

Must be at most 100 characters long


Example:

"1234-user-5678"
-

Type: string

The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.

Must be at most 100 characters long


Examples:

"5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5"
+

Type: string

The client issuing the query. This could be a unique browser, a microservice that performs searches, or a crawling bot. If only authenticated users are tracked, you could use a specific user id here; otherwise, use something permanent and track the user id as an Additional Property.

Must be at most 100 characters long


Examples:

"5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5"
 
"quepid-nightly-bot"
 
"BugsBunny::Firefox@0967084"
 

Type: stringFormat: date-time

When the event took place.


Example:

"2018-11-13T20:20:39+00:00"
 

Type: string

Group various action_name's into logical bins.

Must be at most 100 characters long


Examples:

"QUERY"
 
"CONVERSION"
-

Type: string

Optional text message for the log entry. For example, for a message_type of QUERY, we would expect the text to be about what the user is searching on.

Must be at most 1024 characters long

Type: object

Extensible details about a specific event.

Type: object

Structure which contains identifying information of the object returned from the query that the user interacts with (i.e.: a book, a product, a post, etc..).


The id that a user could look up and find the object instance within the document corpus. Examples include: ssn, isbn, ean, etc. Variants need to be incorporated in the object_id, so for a t-shirt that is red, you would need SKU level as the object_id.

Type: string

Must be at most 256 characters long


Examples:

"XYZ-12345"
+

Type: string

Optional text message for the log entry. For example, for a message_type of QUERY, we would expect the text to be about what the user is searching on.

Must be at most 1024 characters long

Type: object

Extensible details about a specific event. A common example of an Additional Property is the specific identifier of the user (user_id). Note: a user identifier is different than the required client_id attribute.

Type: object

Structure which contains identifying information of the object returned from the query that the user interacts with (e.g. a book, a product, a post, etc.).


The id that a user could look up and find the object instance within the document corpus. Examples include: ssn, isbn, ean, etc. Variants need to be incorporated in the object_id, so for a t-shirt that is red, you would need SKU level as the object_id.

Type: string

Must be at most 256 characters long


Examples:

"XYZ-12345"
 
"ISBN 0-061-96436-0"
 
"123"
 

Type: string

The name of the field that has the id of the objects that will be stored in the backend queries data store. So if you have a query for products and want to save the SKUs, then this might be sku, and if you are querying for people, maybe this is ssn. If you do not provide this value, then the default primary identifier in your search index will be used, for example _id on OpenSearch.

Must be at most 100 characters long


A unique id that an individual search engine uses internally to index the object. For example, in OpenSearch, think of the _id field in the indices.

Type: string

Must be at most 256 characters long


Examples:

"1"
@@ -17,4 +17,4 @@
 

Additional Properties of any type are allowed.

Type: object


Structure that contains information on the location of the event origin, such as screen x,y coordinates, or the nth object out of 10 results.

Type: object

Type: object

The nth position of the document on the search results page.

Type: integer

The position of the document. For grid layout this would be left to right, ignoring wrapping.


Examples:

1
 
3
 
24
-
Type: object

Type: object

The x,y coordinates on the screen for triggering an event.

Type: number

The horizontal location on the page or screen of the event.

Type: number

The vertical location on the page or screen of the event.

Additional Properties of any type are allowed.

Type: object

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +
Type: object

Type: object

The x,y coordinates on the screen for triggering an event.

Type: number

The horizontal location on the page or screen of the event.

Type: number

The vertical location on the page or screen of the event.

Additional Properties of any type are allowed.

Type: object

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file diff --git a/docs/html/query.request.schema.html b/docs/html/query.request.schema.html index efe3458..578f657 100644 --- a/docs/html/query.request.schema.html +++ b/docs/html/query.request.schema.html @@ -3,4 +3,4 @@

Type: string

The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.

Must be at most 100 characters long


Examples:

"5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5"
 
"quepid-nightly-bot"
 
"BugsBunny::Firefox@0967084"
-

Type: string

The query as the user entered it. No length limit specified.

Type: object

Any query modifiers like filter choices or pagination. Other attributes such as experiment identifiers that need to be tracked with the query.

Additional Properties of any type are allowed.

Type: object

Type: string

The name of the field that has the id of the objects that will be stored in the backend queries data store. So it you have a query for products and want to save the SKUs, then this might be sku and if you are querying for people, maybe this is ssn. If you do not provide this value then the default primary identifier in your search index will be used. For example _id on OpenSearch.

Must be at most 100 characters long

\ No newline at end of file +

Type: string

The query as the user entered it. No length limit specified.

Type: object

Any query modifiers like filter choices or pagination. Other attributes such as experiment identifiers that need to be tracked with the query.

Additional Properties of any type are allowed.

Type: object

Type: string

The name of the field that has the id of the objects that will be stored in the backend queries data store. So if you have a query for products and want to save the SKUs, then this might be sku, and if you are querying for people, maybe this is ssn. If you do not provide this value, then the default primary identifier in your search index will be used, for example _id on OpenSearch.

Must be at most 100 characters long

\ No newline at end of file diff --git a/docs/html/query.response.schema.html b/docs/html/query.response.schema.html index 88e5afa..309c071 100644 --- a/docs/html/query.response.schema.html +++ b/docs/html/query.response.schema.html @@ -1,3 +1,3 @@ Query Response When Using UBI

Query Response When Using UBI

Type: object

Version 1.0.0; last updated 2024-06-14. The response to a query made by a user should support this schema.


The unique identifier of a query, typically a UUID, but can be any string.

Type: stringFormat: uuid

Example:

"00112233-4455-6677-8899-aabbccddeeff"
 
Type: string

Must be at most 100 characters long


Example:

"1234-user-5678"
-
\ No newline at end of file + \ No newline at end of file diff --git a/docs/schema/event-properties-client_id.md b/docs/schema/event-properties-client_id.md index 2c477f2..59bf20c 100644 --- a/docs/schema/event-properties-client_id.md +++ b/docs/schema/event-properties-client_id.md @@ -4,7 +4,7 @@ https://o19s.github.io/ubi/schema/1.0.0/event.schema.json#/properties/client_id ``` -The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. +The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an *Additional Property*. | Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | :------------------ | :--------- | :------------- | :---------------------- | :---------------- | :-------------------- | :------------------ | :------------------------------------------------------------------------------ | diff --git a/docs/schema/event-properties-event_attributes.md b/docs/schema/event-properties-event_attributes.md index 04a2a98..21f3ed4 100644 --- a/docs/schema/event-properties-event_attributes.md +++ b/docs/schema/event-properties-event_attributes.md @@ -4,7 +4,7 @@ https://o19s.github.io/ubi/schema/1.0.0/event.schema.json#/properties/event_attributes ``` -Extensible details about a specific event. +Extensible details about a specific event. A common example of an *Additional Properties* is the specific identifier of the user (`user_id`). Note: a user identifier is different then the required `client_id` attribute. 
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | :------------------ | :--------- | :------------- | :----------- | :---------------- | :-------------------- | :------------------ | :------------------------------------------------------------------------------ | diff --git a/docs/schema/event.md b/docs/schema/event.md index a6d5291..ad6d12b 100644 --- a/docs/schema/event.md +++ b/docs/schema/event.md @@ -113,7 +113,7 @@ one (and only one) of ## client\_id -The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. +The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an *Additional Property*. `client_id` @@ -233,7 +233,7 @@ Optional text message for the log entry. For example, for a message\_type of QUE ## event\_attributes -Extensible details about a specific event. +Extensible details about a specific event. A common example of an *Additional Properties* is the specific identifier of the user (`user_id`). Note: a user identifier is different then the required `client_id` attribute. `event_attributes` diff --git a/docs/schema/query-1-properties-client_id.md b/docs/schema/query-1-properties-client_id.md index 6a2f6d3..712d689 100644 --- a/docs/schema/query-1-properties-client_id.md +++ b/docs/schema/query-1-properties-client_id.md @@ -4,7 +4,7 @@ https://o19s.github.io/ubi/schema/1.0.0/query.request.schema.json#/properties/client_id ``` -The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. +The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. 
If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an *Additional Property*. | Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | :------------------ | :--------- | :------------- | :---------------------- | :---------------- | :-------------------- | :------------------ | :---------------------------------------------------------------------------------------------- | diff --git a/docs/schema/query-1.md b/docs/schema/query-1.md index 27fa0b0..7802674 100644 --- a/docs/schema/query-1.md +++ b/docs/schema/query-1.md @@ -50,7 +50,7 @@ one (and only one) of ## client\_id -The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. +The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an *Additional Property*. `client_id` diff --git a/out/event.schema.json b/out/event.schema.json index b15639d..c24bcbc 100644 --- a/out/event.schema.json +++ b/out/event.schema.json @@ -1 +1 @@ -{"$schema":"https://json-schema.org/draft/2020-12/schema","$id":"https://o19s.github.io/ubi/schema/1.0.0/event.schema.json","title":"Event tracking for UBI","description":"Version 1.0.0; last updated 2024-06-14. An event that occurred, typically in response to a user.","type":"object","required":["action_name","query_id","timestamp"],"properties":{"application":{"description":"name of the application tracking UBI events.","type":"string","maxLength":100,"examples":["amazon-shop","ABC-microservice","doctor-search"]},"action_name":{"description":"The name of the action that triggered the event. 
We have a set of common defaults, however you can pass in whatever you want.","oneOf":[{"type":"string","maxLength":100,"enum":["click_through","add_to_cart","click","watch","view","purchase"]},{"type":"string","maxLength":100}]},"query_id":{"description":"The unique identifier of a query, typically a UUID, but can be any string.","oneOf":[{"type":"string","format":"uuid","examples":["00112233-4455-6677-8899-aabbccddeeff"]},{"type":"string","maxLength":100,"examples":["1234-user-5678"]}]},"client_id":{"description":"The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.","type":"string","maxLength":100,"examples":["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot","BugsBunny::Firefox@0967084"]},"timestamp":{"description":"When the event took place.","type":"string","format":"date-time","examples":["2018-11-13T20:20:39+00:00"]},"message_type":{"description":"Group various `action_name`'s into logical bins.","type":"string","maxLength":100,"examples":["QUERY","CONVERSION"],"$comment":"TDB: action_type? event_type? Should the front end even define this?"},"message":{"description":"Optional text message for the log entry. For example, for a message_type of QUERY, we would expect the text to be about what the user is searching on.","type":"string","maxLength":1024},"event_attributes":{"description":"Extensible details about a specific event.","type":"object","additionalProperties":true,"required":["position"],"properties":{"object":{"description":"Structure which contains identifying information of the object returned from the query that the user interacts with (i.e.: a book, a product, a post, etc..).","type":"object","additionalProperties":true,"required":["object_id"],"properties":{"object_id":{"description":"The id that a user could look up and find the object instance within the *document corpus*. Examples include: _ssn_, _isbn_, _ean_, etc. 
Variants need to be incorporated in the `object_id`, so for a t-shirt that is red, you would need SKU level as the `object_id`.","examples":["XYZ-12345","ISBN 0-061-96436-0","123"],"anyOf":[{"type":"string","maxLength":256},{"type":"integer"}]},"object_id_field":{"description":"The name of the field that has the id of the objects that will be stored in the backend queries data store. So it you have a query for products and want to save the SKUs, then this might be `sku` and if you are querying for people, maybe this is `ssn`. If you do not provide this value then the default primary identifier in your search index will be used. For example `_id` on OpenSearch. ","type":"string","maxLength":100},"internal_id":{"description":"A unique id that the an individual search engine uses internally to index the object via. For example, in OpenSearch, think the `_id` field in the indices.","examples":["1","123456"],"anyOf":[{"type":"string","maxLength":256},{"type":"integer"}]}}},"position":{"description":"Structure that contains information on the location of the event origin, such as screen x,y coordinates, or the nth object out of 10 results.","type":"object","additionalProperties":true,"oneOf":[{"type":"object","properties":{"ordinal":{"description":"The nth position of the document on the search results page.","type":"object","properties":{"index":{"description":"The position of the document. 
For grid layout this would be left to right, ignoring wrapping.","type":"integer","examples":[1,3,24]}},"required":["index"]}},"required":["ordinal"]},{"type":"object","properties":{"xy":{"description":"The x,y coordinates on the screen for triggering an event.","$comment":"What about bounding boxes?","type":"object","properties":{"x":{"description":"The horizontal location on the page or screen of the event.","type":"number"},"y":{"description":"The vertical location on the page or screen of the event.","type":"number"}},"required":["x","y"]}},"required":["xy"]}]}}}}} +{"$schema":"https://json-schema.org/draft/2020-12/schema","$id":"https://o19s.github.io/ubi/schema/1.0.0/event.schema.json","title":"Event tracking for UBI","description":"Version 1.0.0; last updated 2024-06-14. An event that occurred, typically in response to a user.","type":"object","required":["action_name","query_id","timestamp"],"properties":{"application":{"description":"name of the application tracking UBI events.","type":"string","maxLength":100,"examples":["amazon-shop","ABC-microservice","doctor-search"]},"action_name":{"description":"The name of the action that triggered the event. We have a set of common defaults, however you can pass in whatever you want.","oneOf":[{"type":"string","maxLength":100,"enum":["click_through","add_to_cart","click","watch","view","purchase"]},{"type":"string","maxLength":100}]},"query_id":{"description":"The unique identifier of a query, typically a UUID, but can be any string.","oneOf":[{"type":"string","format":"uuid","examples":["00112233-4455-6677-8899-aabbccddeeff"]},{"type":"string","maxLength":100,"examples":["1234-user-5678"]}]},"client_id":{"description":"The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. 
If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an _Additional Property_.","type":"string","maxLength":100,"examples":["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot","BugsBunny::Firefox@0967084"]},"timestamp":{"description":"When the event took place.","type":"string","format":"date-time","examples":["2018-11-13T20:20:39+00:00"]},"message_type":{"description":"Group various `action_name`'s into logical bins.","type":"string","maxLength":100,"examples":["QUERY","CONVERSION"],"$comment":"TDB: action_type? event_type? Should the front end even define this?"},"message":{"description":"Optional text message for the log entry. For example, for a message_type of QUERY, we would expect the text to be about what the user is searching on.","type":"string","maxLength":1024},"event_attributes":{"description":"Extensible details about a specific event. A common example of an _Additional Properties_ is the specific identifier of the user (`user_id`). Note: a user identifier is different then the required `client_id` attribute.","type":"object","additionalProperties":true,"required":["position"],"properties":{"object":{"description":"Structure which contains identifying information of the object returned from the query that the user interacts with (i.e.: a book, a product, a post, etc..).","type":"object","additionalProperties":true,"required":["object_id"],"properties":{"object_id":{"description":"The id that a user could look up and find the object instance within the *document corpus*. Examples include: _ssn_, _isbn_, _ean_, etc. 
Variants need to be incorporated in the `object_id`, so for a t-shirt that is red, you would need SKU level as the `object_id`.","examples":["XYZ-12345","ISBN 0-061-96436-0","123"],"anyOf":[{"type":"string","maxLength":256},{"type":"integer"}]},"object_id_field":{"description":"The name of the field that has the id of the objects that will be stored in the backend queries data store. So it you have a query for products and want to save the SKUs, then this might be `sku` and if you are querying for people, maybe this is `ssn`. If you do not provide this value then the default primary identifier in your search index will be used. For example `_id` on OpenSearch. ","type":"string","maxLength":100},"internal_id":{"description":"A unique id that the an individual search engine uses internally to index the object via. For example, in OpenSearch, think the `_id` field in the indices.","examples":["1","123456"],"anyOf":[{"type":"string","maxLength":256},{"type":"integer"}]}}},"position":{"description":"Structure that contains information on the location of the event origin, such as screen x,y coordinates, or the nth object out of 10 results.","type":"object","additionalProperties":true,"oneOf":[{"type":"object","properties":{"ordinal":{"description":"The nth position of the document on the search results page.","type":"object","properties":{"index":{"description":"The position of the document. 
For grid layout this would be left to right, ignoring wrapping.","type":"integer","examples":[1,3,24]}},"required":["index"]}},"required":["ordinal"]},{"type":"object","properties":{"xy":{"description":"The x,y coordinates on the screen for triggering an event.","$comment":"What about bounding boxes?","type":"object","properties":{"x":{"description":"The horizontal location on the page or screen of the event.","type":"number"},"y":{"description":"The vertical location on the page or screen of the event.","type":"number"}},"required":["x","y"]}},"required":["xy"]}]}}}}} diff --git a/out/query.request.schema.json b/out/query.request.schema.json index a29e3ea..ad8f8c7 100644 --- a/out/query.request.schema.json +++ b/out/query.request.schema.json @@ -1 +1 @@ -{"$schema":"https://json-schema.org/draft/2020-12/schema","$id":"https://o19s.github.io/ubi/schema/1.0.0/query.request.schema.json","title":"Query Tracking for UBI","description":"Version 1.0.0; last updated 2024-06-14. A query made by a user should include these attributes for UBI tracking.","type":"object","required":["user_query"],"properties":{"query_id":{"description":"The unique identifier of a query, typically a UUID, but can be any string.","oneOf":[{"type":"string","format":"uuid","examples":["00112233-4455-6677-8899-aabbccddeeff"]},{"type":"string","maxLength":100,"examples":["1234-user-5678"]}]},"client_id":{"description":"The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.","type":"string","maxLength":100,"examples":["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot","BugsBunny::Firefox@0967084"]},"user_query":{"description":"The query as the user entered it. No length limit specified.","type":"string","$comment":"Currently not required to support recommendation systems etc that might not have a user generated query."},"query_attributes":{"description":"Any query modifiers like filter choices or pagination. 
Other attributes such as experiment identifiers that need to be tracked with the query.","type":"object","additionalProperties":true},"object_id_field":{"description":"The name of the field that has the id of the objects that will be stored in the backend queries data store. So it you have a query for products and want to save the SKUs, then this might be `sku` and if you are querying for people, maybe this is `ssn`. If you do not provide this value then the default primary identifier in your search index will be used. For example `_id` on OpenSearch. ","type":"string","maxLength":100}}} +{"$schema":"https://json-schema.org/draft/2020-12/schema","$id":"https://o19s.github.io/ubi/schema/1.0.0/query.request.schema.json","title":"Query Tracking for UBI","description":"Version 1.0.0; last updated 2024-06-14. A query made by a user should include these attributes for UBI tracking.","type":"object","required":["user_query"],"properties":{"query_id":{"description":"The unique identifier of a query, typically a UUID, but can be any string.","oneOf":[{"type":"string","format":"uuid","examples":["00112233-4455-6677-8899-aabbccddeeff"]},{"type":"string","maxLength":100,"examples":["1234-user-5678"]}]},"client_id":{"description":"The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an _Additional Property_.","type":"string","maxLength":100,"examples":["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot","BugsBunny::Firefox@0967084"]},"user_query":{"description":"The query as the user entered it. No length limit specified.","type":"string","$comment":"Currently not required to support recommendation systems etc that might not have a user generated query."},"query_attributes":{"description":"Any query modifiers like filter choices or pagination. 
Other attributes such as experiment identifiers that need to be tracked with the query.","type":"object","additionalProperties":true},"object_id_field":{"description":"The name of the field that has the id of the objects that will be stored in the backend queries data store. So it you have a query for products and want to save the SKUs, then this might be `sku` and if you are querying for people, maybe this is `ssn`. If you do not provide this value then the default primary identifier in your search index will be used. For example `_id` on OpenSearch. ","type":"string","maxLength":100}}} From 68a0291a6dcc1efe0bc7e1aaafa7a9b165690602 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Fri, 19 Jul 2024 12:46:41 +0200 Subject: [PATCH 2/6] Generated after tweaking schema description --- schema/1.0.0/event.schema.json | 4 ++-- schema/1.0.0/query.request.schema.json | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/schema/1.0.0/event.schema.json b/schema/1.0.0/event.schema.json index e111c98..e6e89cb 100644 --- a/schema/1.0.0/event.schema.json +++ b/schema/1.0.0/event.schema.json @@ -46,7 +46,7 @@ ] }, "client_id": { - "description": "The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.", + "description": "The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an _Additional Property_.", "type": "string", "maxLength": 100, "examples": ["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot", "BugsBunny::Firefox@0967084"] @@ -70,7 +70,7 @@ "maxLength": 1024 }, "event_attributes": { - "description": "Extensible details about a specific event.", + "description": "Extensible details about a specific event. 
A common example of an _Additional Properties_ is the specific identifier of the user (`user_id`). Note: a user identifier is different then the required `client_id` attribute.", "type": "object", "additionalProperties": true, "required": ["position"], diff --git a/schema/1.0.0/query.request.schema.json b/schema/1.0.0/query.request.schema.json index a8876d6..6c99622 100644 --- a/schema/1.0.0/query.request.schema.json +++ b/schema/1.0.0/query.request.schema.json @@ -22,7 +22,7 @@ ] }, "client_id": { - "description": "The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot.", + "description": "The client issuing the query. This could be a unique browser, a microservice that performs searches, a crawling bot. If only authenticated users are tracked, then you could use a specific user id here, otherwise you should use something permanent and track user id as an _Additional Property_.", "type": "string", "maxLength": 100, "examples": ["5e3b2a1c-8b7d-4f2e-a3d4-c9b2e1f3a4b5","quepid-nightly-bot", "BugsBunny::Firefox@0967084"] From f493ba2740af22585908ecc69c6881a3d75ee325 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Fri, 19 Jul 2024 12:59:41 +0200 Subject: [PATCH 3/6] Start adding FAQ --- README.md | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 7afdd18..f9b1070 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ UBI (or User Behavior Insights) is a(nother) naive attempt to create **a standar [Why use it](#-why-use-it) • [How to use it](#-how-to-use-it) • +[FAQ](#-frequently-asked-questions) • [Who uses it](#-who-uses-it) • [Who we are](#-who-we-are) • [How to contribute](#%EF%B8%8F-how-to-contribute) • @@ -44,7 +45,6 @@ UBI requires coordination between the client (a browser, a mobile app, etc) and | [query.response.schema.json](https://o19s.github.io/ubi/schema/1.0.0/query.response.schema.json) | 
[query.response.schema.html](https://o19s.github.io/ubi/docs/html/query.response.schema.html) | | [event.schema.json](https://o19s.github.io/ubi/schema/1.0.0/event.schema.json) | [event.schema.html](https://o19s.github.io/ubi/docs/html/event.schema.html) | -To validate You just need to copy, download or reference one of the schema files to validate a UBI data structure, built as a JSON file from scratch, or a JSON generated previously (for example, [these samples](https://github.com/o19s/ubi/blob/master/samples/)). To get started, you can copy both schema and sample in an **online validator** like [jsonschemavalidator.net](https://www.jsonschemavalidator.net/) or [liquid-technologies.com/online-json-schema-validator](https://www.liquid-technologies.com/online-json-schema-validator). Make sure to just copy the UBI related portions, and not any of the search engine specific code. Here is the UBI portion from the file [query-solr.json](https://github.com/o19s/ubi/blob/master/samples/query-solr.json) for example: @@ -76,6 +76,37 @@ The Schema is documented by itself, but it's much easier to get "the big picture
+ +## 🤔 Frequently Asked Questions + +#### How do I handle anonymous users? +We often want to track a specific identifier for a user, but then realize that we also want to connect those events to previously unauthenticated events. Therefore, we can't just plop in an explicit user id as the `client_id` attribute. Instead, you want to track something that is permanent, across the anonymous AND logged in session as the `client_id`. To make processing simpler you can store the explicit user identifier in the Event --> Event Attributes --> Additional Properties hash. Here is an example of user "abc" who clicked on item with sku "1234": + +```json +{ + "action_name": "item_click", + "query_id": "00112233-4455-6677-8899-aabbccddeeff", + "message_type": "INFO", + "message": "User abc clicked sku 1234", + "event_attributes": { + "position":{}, + "object": { + "object_id":"1234", + "object_id_field": "sku", + "user_id":"abc" + } + } +} +``` + +#### Where do I record my user id and item id? + +Blah + + + + + ### 🏫 Learn More * OpenSearchCon EU - [User Behavior Insights](https://www.youtube.com/watch?v=dH7SPHKpxo0&list=PLzgr9zSpws14zCETcKtCBwcOuTGMccpV9&index=32) From 6bdd17b119fc4e6e64a744ae5ab805cbd0e4c8a2 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Fri, 19 Jul 2024 13:02:51 +0200 Subject: [PATCH 4/6] take out double spaces, and add a link --- README.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index f9b1070..4f297d9 100644 --- a/README.md +++ b/README.md @@ -25,11 +25,11 @@ UBI (or User Behavior Insights) is a(nother) naive attempt to create **a standar ## 🥘 Why use it -Many Search teams struggle with understanding "Why is my user doing this". 
They have great understanding of an incoming query and the documents returned, but no ability to connect that dot with an indicator of success, such as a click through event or event a add to cart. -There are A LOT of tools out there for tracking events, Google Analytics, Snowplow, etc, but each is a bit different, and each tends to lock you in. None of them think about the needs of Search teams specifically either. +There are A LOT of tools out there for tracking events, Google Analytics, Snowplow, etc, but each is a bit different, and each tends to lock you in. None of them think about the needs of Search teams specifically either. -The User Behavior Insights standard attempts to provide a search focused standard that can operate across many platforms. There are implementations for +The User Behavior Insights standard attempts to provide a search focused standard that can operate across many platforms. There are implementations for * [OpenSearch](https://github.com/o19s/documentation-website/tree/ubi-docs-consolidation/_search-plugins/ubi) * [Apache Solr](https://github.com/apache/solr/pull/2452) @@ -37,7 +37,7 @@ The User Behavior Insights standard attempts to provide a search focused standar ## 🪛 How to use it -UBI requires coordination between the client (a browser, a mobile app, etc) and the backend, which is documented using JSON Schema. +UBI requires coordination between the client (a browser, a mobile app, etc) and the backend, which is documented using JSON Schema. 
| JSON Schema | HTML Docs | | --- | --- | @@ -45,9 +45,9 @@ UBI requires coordination between the client (a browser, a mobile app, etc) and | [query.response.schema.json](https://o19s.github.io/ubi/schema/1.0.0/query.response.schema.json) | [query.response.schema.html](https://o19s.github.io/ubi/docs/html/query.response.schema.html) | | [event.schema.json](https://o19s.github.io/ubi/schema/1.0.0/event.schema.json) | [event.schema.html](https://o19s.github.io/ubi/docs/html/event.schema.html) | -You just need to copy, download or reference one of the schema files to validate a UBI data structure, built as a JSON file from scratch, or a JSON generated previously (for example, [these samples](https://github.com/o19s/ubi/blob/master/samples/)). +You just need to copy, download or reference one of the schema files to validate a UBI data structure, built as a JSON file from scratch, or a JSON generated previously (for example, [these samples](https://github.com/o19s/ubi/blob/master/samples/)). -To get started, you can copy both schema and sample in an **online validator** like [jsonschemavalidator.net](https://www.jsonschemavalidator.net/) or [liquid-technologies.com/online-json-schema-validator](https://www.liquid-technologies.com/online-json-schema-validator). Make sure to just copy the UBI related portions, and not any of the search engine specific code. Here is the UBI portion from the file [query-solr.json](https://github.com/o19s/ubi/blob/master/samples/query-solr.json) for example: +To get started, you can copy both schema and sample in an **online validator** like [jsonschemavalidator.net](https://www.jsonschemavalidator.net/) or [liquid-technologies.com/online-json-schema-validator](https://www.liquid-technologies.com/online-json-schema-validator). Make sure to just copy the UBI related portions, and not any of the search engine specific code. 
Here is the UBI portion from the file [query-solr.json](https://github.com/o19s/ubi/blob/master/samples/query-solr.json) for example: ```json { @@ -80,7 +80,7 @@ The Schema is documented by itself, but it's much easier to get "the big picture ## 🤔 Frequently Asked Questions #### How do I handle anonymous users? -We often want to track a specific identifer for a user, but then realize that we also want to connect those events to previously unauthenticated events. Therefore, we can't just plop in a explicit user id as the `client_id` attribute. Instead, you want to track something that is permanent, across the anonymous AND logged in session as the `client_id`. To make processing simpler you can store the explicit user identifier in the Event --> Event Attributes --> Additional Properties hash. Here is an example of user "abc" who clicked on item with sku "1234": +We often want to track a specific identifer for a user, but then realize that we also want to connect those events to previously unauthenticated events. Therefore, we can't just plop in a explicit user id as the `client_id` attribute. Instead, you want to track something that is permanent, across the anonymous AND logged in session as the `client_id`. To make processing simpler you can store the explicit user identifier in the [Event --> Event Attributes --> Additional Properties](https://o19s.github.io/ubi/docs/html/event.schema.html#event_attributes_additionalProperties) hash. Here is an example of user "abc" who clicked on item with sku "1234": ```json { @@ -140,4 +140,4 @@ If you want to say thank you and/or support active development of UBI: Thanks so much for your interest in growing the reach of UBI! -_This site was inspired by https://github.com/getmanfred/mac. Thank you!_ +_This site was inspired by https://github.com/getmanfred/mac. 
Thank you!_ From e71c77b9bffc8ef3d14b161b9f73a7159aa76a00 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Fri, 19 Jul 2024 13:11:54 +0200 Subject: [PATCH 5/6] Add new FAQ answers --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4f297d9..8095efd 100644 --- a/README.md +++ b/README.md @@ -99,9 +99,11 @@ We often want to track a specific identifer for a user, but then realize that we } ``` +In post processing, you can use the Client ID field to connect queries and events from the anonymous user to queries and events after they are logged in, and pluck the explicit user id from the detailed event_attributes information. + #### Where do I record my user id and item id? -Blah +If your user identification is stable, then feel free to use the [Query Request --> Client ID](https://o19s.github.io/ubi/docs/html/query.request.schema.html#client_id) and [Event --> Client ID](https://o19s.github.io/ubi/docs/html/event.schema.html#client_id). Otherwise, see the above FAQ entry for how to handle it. The item ID is tracked for an event in the [Event --> Object](https://o19s.github.io/ubi/docs/html/event.schema.html#event_attributes_object) datastructure. From 9b6e2683e6ef837dee8e3dab4d602c9387958cde Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Mon, 22 Jul 2024 12:22:59 -0400 Subject: [PATCH 6/6] fix awkward phrasing --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 8095efd..f369c6c 100644 --- a/README.md +++ b/README.md @@ -25,7 +25,7 @@ UBI (or User Behavior Insights) is a(nother) naive attempt to create **a standar ## 🥘 Why use it -Many Search teams struggle with understanding "Why is my user doing this". They have great understanding of an incoming query and the documents returned, but no ability to connect that dot with an indicator of success, such as a click through event or event a add to cart. 
+Many Search teams struggle with understanding "Why is my user doing this". They have great understanding of an incoming query and the documents returned, but no ability to connect that dot with an indicator of success, such as a click through event or add to cart event. There are A LOT of tools out there for tracking events, Google Analytics, Snowplow, etc, but each is a bit different, and each tends to lock you in. None of them think about the needs of Search teams specifically either.
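
The post-processing step described in the FAQ answer (use the Client ID to connect anonymous and logged-in activity, then pluck the explicit user id from `event_attributes`) could look roughly like the sketch below. This is a minimal illustration and not part of UBI: the function names and the sample events are hypothetical, and it assumes `user_id` sits either directly in `event_attributes` or inside its `object` entry, as in the README example.

```python
from collections import defaultdict


def find_user_id(event):
    """Return the explicit user id stored in event_attributes, or None."""
    attrs = event.get("event_attributes", {})
    # Assumption: user_id is either an additional property directly on
    # event_attributes or nested inside its "object" entry.
    return attrs.get("user_id") or attrs.get("object", {}).get("user_id")


def resolve_users(events):
    """Map each client_id to the explicit user id seen in any of its events."""
    users = {}
    for event in events:
        user_id = find_user_id(event)
        if user_id is not None:
            users[event["client_id"]] = user_id
    return users


def attribute_events(events):
    """Group action names by resolved user id, falling back to client_id."""
    users = resolve_users(events)
    grouped = defaultdict(list)
    for event in events:
        key = users.get(event["client_id"], event["client_id"])
        grouped[key].append(event["action_name"])
    return dict(grouped)


# Hypothetical sample data, for illustration only.
sample_events = [
    # Anonymous visit: only the permanent client_id is known.
    {"client_id": "client-1", "action_name": "view", "event_attributes": {}},
    # Same client after login: the explicit user id is recorded.
    {
        "client_id": "client-1",
        "action_name": "item_click",
        "event_attributes": {
            "object": {
                "object_id": "1234",
                "object_id_field": "sku",
                "user_id": "abc",
            }
        },
    },
    # A different client that never logs in.
    {"client_id": "client-2", "action_name": "view", "event_attributes": {}},
]

print(attribute_events(sample_events))
# → {'abc': ['view', 'item_click'], 'client-2': ['view']}
```

Because the `client_id` stays the same before and after login, the anonymous `view` ends up attributed to user "abc", while the client that never logs in stays keyed by its `client_id`.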