Skip to content

Commit

Permalink
feat: introduce tracing
Browse files Browse the repository at this point in the history
We replace `log` with `tracing` across the codebase. Where possible, we make use of structured logging now instead of templating strings.

`tracing` offers the ability to also record "span"s. A span lasts until its dropped and describes the entire duration that it is active for. All logs (in `tracing` term "events") are hierarchically embedded in all parent-spans).

We introduce several spans:

- On debug level: One for `new_outgoing_connection`, `new_incoming_connection` and `new_established_connection`
- On debug level: `Connection::poll`, `Swarm::poll` and `Pool::poll`
- On trace level: `NetworkBehaviour::poll` for each implementation of `NetworkBehaviour`
- On trace level: `ConnectionHandler::poll` for each implementation of (protocol) `ConnectionHandler`s

The idea here is that logging on debug level gives you a decent overview of what the system is doing. You get spans for the duration of connections and how often each connection gets polled.

Dropping down to trace level gives you an extremely detailed view of how long each individual `ConnectionHandler` was executed as part of `poll` which could be used for detailed analysis on how busy certain handlers are.

Most importantly, simply logging on `info` does not give you any spans. We consider `info` to be a good default that should be reasonably quiet.

Resolves libp2p#1533.

Pull-Request: libp2p#4282.
  • Loading branch information
eserilev authored Nov 2, 2023
1 parent 426606d commit f0374c3
Showing 5 changed files with 47 additions and 38 deletions.
6 changes: 3 additions & 3 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -19,16 +19,16 @@ libp2p-core = { workspace = true }
libp2p-swarm = { workspace = true }
libp2p-request-response = { workspace = true }
libp2p-identity = { workspace = true }
log = "0.4"
rand = "0.8"
quick-protobuf = "0.8"
rand = "0.8"
tracing = "0.1.37"
quick-protobuf-codec = { workspace = true }
asynchronous-codec = "0.6.2"

[dev-dependencies]
async-std = { version = "1.10", features = ["attributes"] }
env_logger = "0.10"
libp2p-swarm-test = { path = "../../swarm-test" }
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

# Passing arguments to the docsrs builder in order to properly document cfg's.
# More information: https://docs.rs/about/builds#cross-compiling
1 change: 1 addition & 0 deletions src/behaviour.rs
Original file line number Diff line number Diff line change
@@ -435,6 +435,7 @@ impl NetworkBehaviour for Behaviour {
<request_response::Behaviour<AutoNatCodec> as NetworkBehaviour>::ConnectionHandler;
type ToSwarm = Event;

#[tracing::instrument(level = "trace", name = "ConnectionHandler::poll", skip(self, cx))]
fn poll(
&mut self,
cx: &mut Context<'_>,
22 changes: 11 additions & 11 deletions src/behaviour/as_client.rs
Original file line number Diff line number Diff line change
@@ -112,7 +112,7 @@ impl<'a> HandleInnerEvent for AsClient<'a> {
response,
},
} => {
log::debug!("Outbound dial-back request returned {:?}.", response);
tracing::debug!(?response, "Outbound dial-back request returned response");

let probe_id = self
.ongoing_outbound
@@ -154,10 +154,10 @@ impl<'a> HandleInnerEvent for AsClient<'a> {
error,
request_id,
} => {
log::debug!(
"Outbound Failure {} when on dial-back request to peer {}.",
tracing::debug!(
%peer,
"Outbound Failure {} when on dial-back request to peer.",
error,
peer
);
let probe_id = self
.ongoing_outbound
@@ -275,13 +275,13 @@ impl<'a> AsClient<'a> {
) -> Result<PeerId, OutboundProbeError> {
let _ = self.last_probe.insert(Instant::now());
if addresses.is_empty() {
log::debug!("Outbound dial-back request aborted: No dial-back addresses.");
tracing::debug!("Outbound dial-back request aborted: No dial-back addresses");
return Err(OutboundProbeError::NoAddresses);
}
let server = match self.random_server() {
Some(s) => s,
None => {
log::debug!("Outbound dial-back request aborted: No qualified server.");
tracing::debug!("Outbound dial-back request aborted: No qualified server");
return Err(OutboundProbeError::NoServer);
}
};
@@ -293,7 +293,7 @@ impl<'a> AsClient<'a> {
},
);
self.throttled_servers.push((server, Instant::now()));
log::debug!("Send dial-back request to peer {}.", server);
tracing::debug!(peer=%server, "Send dial-back request to peer");
self.ongoing_outbound.insert(request_id, probe_id);
Ok(server)
}
@@ -344,10 +344,10 @@ impl<'a> AsClient<'a> {
return None;
}

log::debug!(
"Flipped assumed NAT status from {:?} to {:?}",
self.nat_status,
reported_status
tracing::debug!(
old_status=?self.nat_status,
new_status=?reported_status,
"Flipped assumed NAT status"
);

let old_status = self.nat_status.clone();
48 changes: 28 additions & 20 deletions src/behaviour/as_server.rs
Original file line number Diff line number Diff line change
@@ -110,9 +110,9 @@ impl<'a> HandleInnerEvent for AsServer<'a> {
let probe_id = self.probe_id.next();
match self.resolve_inbound_request(peer, request) {
Ok(addrs) => {
log::debug!(
"Inbound dial request from Peer {} with dial-back addresses {:?}.",
peer,
tracing::debug!(
%peer,
"Inbound dial request from peer with dial-back addresses {:?}",
addrs
);

@@ -140,10 +140,10 @@ impl<'a> HandleInnerEvent for AsServer<'a> {
])
}
Err((status_text, error)) => {
log::debug!(
"Reject inbound dial request from peer {}: {}.",
peer,
status_text
tracing::debug!(
%peer,
status=%status_text,
"Reject inbound dial request from peer"
);

let response = DialResponse {
@@ -167,10 +167,10 @@ impl<'a> HandleInnerEvent for AsServer<'a> {
error,
request_id,
} => {
log::debug!(
"Inbound Failure {} when on dial-back request from peer {}.",
error,
peer
tracing::debug!(
%peer,
"Inbound Failure {} when on dial-back request from peer",
error
);

let probe_id = match self.ongoing_inbound.get(&peer) {
@@ -206,10 +206,10 @@ impl<'a> AsServer<'a> {
return None;
}

log::debug!(
"Dial-back to peer {} succeeded at addr {:?}.",
peer,
address
tracing::debug!(
%peer,
%address,
"Dial-back to peer succeeded"
);

let (probe_id, _, _, channel) = self.ongoing_inbound.remove(peer).unwrap();
@@ -232,11 +232,19 @@ impl<'a> AsServer<'a> {
error: &DialError,
) -> Option<InboundProbeEvent> {
let (probe_id, _, _, channel) = peer.and_then(|p| self.ongoing_inbound.remove(&p))?;
log::debug!(
"Dial-back to peer {} failed with error {:?}.",
peer.unwrap(),
error
);

match peer {
Some(p) => tracing::debug!(
peer=%p,
"Dial-back to peer failed with error {:?}",
error
),
None => tracing::debug!(
"Dial-back to non existent peer failed with error {:?}",
error
),
};

let response_error = ResponseError::DialError;
let response = DialResponse {
result: Err(response_error.clone()),
8 changes: 4 additions & 4 deletions src/protocol.rs
Original file line number Diff line number Diff line change
@@ -129,7 +129,7 @@ impl DialRequest {
{
(peer_id, addrs)
} else {
log::debug!("Received malformed dial message.");
tracing::debug!("Received malformed dial message");
return Err(io::Error::new(
io::ErrorKind::InvalidData,
"invalid dial message",
@@ -146,7 +146,7 @@ impl DialRequest {
.filter_map(|a| match Multiaddr::try_from(a.to_vec()) {
Ok(a) => Some(a),
Err(e) => {
log::debug!("Unable to parse multiaddr: {e}");
tracing::debug!("Unable to parse multiaddr: {e}");
None
}
})
@@ -207,7 +207,7 @@ impl TryFrom<proto::ResponseStatus> for ResponseError {
proto::ResponseStatus::E_BAD_REQUEST => Ok(ResponseError::BadRequest),
proto::ResponseStatus::E_INTERNAL_ERROR => Ok(ResponseError::InternalError),
proto::ResponseStatus::OK => {
log::debug!("Received response with status code OK but expected error.");
tracing::debug!("Received response with status code OK but expected error");
Err(io::Error::new(
io::ErrorKind::InvalidData,
"invalid response error type",
@@ -251,7 +251,7 @@ impl DialResponse {
result: Err(ResponseError::try_from(status)?),
},
_ => {
log::debug!("Received malformed response message.");
tracing::debug!("Received malformed response message");
return Err(io::Error::new(
io::ErrorKind::InvalidData,
"invalid dial response message",

0 comments on commit f0374c3

Please sign in to comment.