the question about the run file #14

1812030208 · 2023-07-07T09:49:40Z

Hello! Thank you for your recently uploaded GAIA dataset! I would like to ask if each line in the csv file in the run file corresponds to an injected exception?
Because I see some lines of log information such as "upload business logs on 2021-07-31 successfully", is this also an exception? If so, what type of exception? Looking forward to your reply, thank you very much!

Xander-cloudwise · 2023-07-11T03:29:44Z

Thank you for your interest in the GAIA-dataset. A run file contains two kinds of messages: WARNING level and INFO level. INFO-level messages record routine information from different data sources. WARNING-level messages record unexpected actions in the system, including unexpected user behaviors and resource-consumption anomalies.

1812030208 · 2023-07-12T04:34:34Z

Thank you for your reply! So the WARNING level message is abnormal, and the INFO level message is normal, right? When I tag business and metric, I only need to tag them according to the WARNING level message in the run file, right? Looking forward to your reply, thank you very much!

Xander-cloudwise · 2023-07-12T06:31:31Z

Yes, your understanding is correct.
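
For readers labelling data from the run file, the rule confirmed above (keep WARNING rows, ignore INFO rows) can be sketched in Python. This is a minimal sketch, not the maintainers' tooling. It assumes the run file is a CSV with a "message" column whose values follow the pipe-delimited layout of the sample message quoted later in this thread. The field names (`addr_a`, `addr_b`, `id`, and so on) are hypothetical, and in the INFO row only the message text comes from this thread; its other fields are made up for illustration.

```python
import csv
import io

# Hypothetical names for the seven pipe-separated fields of a message.
FIELD_NAMES = ("timestamp", "level", "addr_a", "addr_b", "service", "id", "text")

def split_message(message):
    """Split one pipe-delimited message into named fields, or return None."""
    parts = [p.strip() for p in message.split("|", 6)]
    if len(parts) != len(FIELD_NAMES):
        return None  # layout differs from the assumed one
    return dict(zip(FIELD_NAMES, parts))

def warning_rows(csv_text, column="message"):
    """Return the parsed fields of every row whose level is WARNING."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        fields = split_message(row[column])
        if fields is not None and fields["level"].upper() == "WARNING":
            rows.append(fields)
    return rows

# Two-row stand-in for a run file. The timestamps contain commas,
# so each message is quoted as a single CSV field.
SAMPLE = (
    "message\n"
    '"2021-07-12 18:57:42,805 | WARNING | 0.0.0.1 | 172.17.0.5 | mobservice1 | '
    'e37a99d1c689ba98 | wait for 11 seconds for follow-up operations to simulate '
    'the login failure of the QR code expired"\n'
    '"2021-07-31 00:00:00,000 | INFO | 0.0.0.1 | 172.17.0.5 | mobservice1 | '
    '0000000000000000 | upload business logs on 2021-07-31 successfully"\n'
)

if __name__ == "__main__":
    for fields in warning_rows(SAMPLE):
        print(fields["timestamp"], fields["service"], fields["text"])
```

If the real file uses a different column name or field order, only `FIELD_NAMES` and the `column` argument should need to change.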

1812030208 · 2023-08-28T09:21:51Z

Dear official, hello! The run file provides information about fault injection, but the duration of each fault is not marked separately in this information. Do we need to extract the fault duration from the "message" column? For example, does the following message mean that the failure lasted 11 seconds? "2021-07-12 18:57:42,805 | WARNING | 0.0.0.1 | 172.17.0.5 | mobservice1 | e37a99d1c689ba98 | wait for 11 seconds for follow-up operations to simulate the login failure of the QR code expired".
Looking forward to your reply, thank you very much!

Xander-cloudwise · 2023-08-29T09:11:14Z

For resource-consumption anomalies, the duration is marked in the "message" column. Usually an anomaly lasts 600 seconds.

However, the message "wait for 11 seconds" is different and needs further explanation. MicroSS supports the user login procedure of a website. When the login procedure starts, a QR code is created and shown on the screen; it expires after 10 seconds. Mobservice simulates a user scanning the QR code to log in. Sometimes a user does not scan the QR code in time, so the login fails. To simulate this scenario, mobservice waits 11 seconds for the QR code to expire and then "scans" the expired code, which leads to a login failure.
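
The scenario described above can be sketched as a small Python model. This is an illustration only, not MicroSS or mobservice code: the function names are hypothetical, and treating a scan at exactly 10 seconds as still valid is an assumption. The helper `wait_seconds` shows how the number in such a message can be read out, which is a simulated wait and not a fault duration.

```python
import re

QR_TTL_SECONDS = 10   # QR code lifetime stated in the thread
WAIT_SECONDS = 11     # how long mobservice waits before "scanning"

def scan_succeeds(created_at, scanned_at, ttl=QR_TTL_SECONDS):
    """A scan succeeds only if it happens within the QR code's lifetime.

    Assumption: a scan at exactly `ttl` seconds still counts as in time.
    """
    return (scanned_at - created_at) <= ttl

def simulate_late_scan(created_at=0.0, wait=WAIT_SECONDS):
    """Wait past the expiry, then scan; return whether the login succeeded."""
    return scan_succeeds(created_at, created_at + wait)

def wait_seconds(message_text):
    """Extract N from 'wait for N seconds', or None if the phrase is absent."""
    match = re.search(r"wait for (\d+) seconds", message_text)
    return int(match.group(1)) if match else None

if __name__ == "__main__":
    text = ("wait for 11 seconds for follow-up operations to simulate "
            "the login failure of the QR code expired")
    print(wait_seconds(text), simulate_late_scan())
```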

1812030208 · 2023-08-30T03:32:02Z

Messages such as "2021-07-12 18:57:42,805 | WARNING | 0.0.0.1 | 172.17.0.5 | mobservice1 | e37a99d1c689ba98 | wait for 11 seconds for follow-up operations to simulate the login failure of the QR code expired" record the login failure described above.

Dear official, hello! After aligning the logs, KPIs, and traces that share a timestamp, can we judge whether the entire system is abnormal (the label) from whether the log at that timestamp is abnormal, without looking at the run file? Looking forward to your reply!

wangsandlmu · 2023-09-01T01:02:38Z

Dear official, I also have the same question. When we use logs, metrics, and trace for anomaly detection, if the log at that time is "error", can I disregard the "run" file and directly determine the label corresponding to the three data at that moment as an anomaly?

Xander-cloudwise · 2023-09-01T03:23:12Z

It depends. A problem in a single trace or a single business transaction may not reflect an issue with the system as a whole, and a transient fluctuation in a KPI time series does not necessarily indicate system instability. The records in the run file are the anomalous actions we injected into the system.
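
One way to label data from the run file, and not from individual log lines, is to turn each injected anomaly into a time window and mark every timestamp that falls inside a window. The following Python sketch assumes that each injection has a start time in the timestamp format seen in this thread and a duration in seconds, falling back to 600 seconds when none is given (the thread says an anomaly "usually" lasts 600 seconds). The injection start time used below is hypothetical, and how the duration is written in a real message is not covered here.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"   # e.g. 2021-07-12 18:57:42,805
DEFAULT_DURATION_SECONDS = 600       # "usually" per the thread; an assumption otherwise

def parse_ts(text):
    """Parse a timestamp written in the run-file style."""
    return datetime.strptime(text, TS_FORMAT)

def build_windows(injections, default=DEFAULT_DURATION_SECONDS):
    """Turn (start, duration_or_None) pairs into (begin, end) datetime pairs."""
    windows = []
    for start, duration in injections:
        begin = parse_ts(start)
        seconds = default if duration is None else duration
        windows.append((begin, begin + timedelta(seconds=seconds)))
    return windows

def is_anomalous(timestamp, windows):
    """True if the timestamp lies inside any injected-anomaly window."""
    moment = parse_ts(timestamp)
    return any(begin <= moment <= end for begin, end in windows)

if __name__ == "__main__":
    # Hypothetical injection with no explicit duration, so 600 s is used.
    windows = build_windows([("2021-07-12 18:00:00,000", None)])
    for ts in ("2021-07-12 18:05:00,000", "2021-07-12 18:15:00,000"):
        print(ts, is_anomalous(ts, windows))
```

Labels built this way follow the injected actions, so a single abnormal log line outside every window is not marked as a system-level anomaly.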
